Semantics HSK 33.1
Handbücher zur Sprach- und Kommunikationswissenschaft Handbooks of Linguistics and Communication Science Manuels de linguistique et des sciences de communication Mitbegründet von Gerold Ungeheuer (†) Mitherausgegeben 1985−2001 von Hugo Steger
Herausgegeben von / Edited by / Edités par Herbert Ernst Wiegand Band 33.1
De Gruyter Mouton
Semantics An International Handbook of Natural Language Meaning
Edited by Claudia Maienborn Klaus von Heusinger Paul Portner Volume 1
De Gruyter Mouton
ISBN 978-3-11-018470-9 e-ISBN 978-3-11-022661-4 ISSN 1861-5090 Library of Congress Cataloging-in-Publication Data Semantics : an international handbook of natural language meaning / edited by Claudia Maienborn, Klaus von Heusinger, Paul Portner. p. cm. – (Handbooks of linguistics and communication science; 33.1) Includes bibliographical references and index. ISBN 978-3-11-018470-9 (alk. paper) 1. Semantics—Handbooks, manuals, etc. I. Maienborn, Claudia. II. Heusinger, Klaus von. III. Portner, Paul. P325.S3799 2011 401'.43—dc22 2011013836 Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.d-nb.de. © 2011 Walter de Gruyter GmbH & Co. KG, Berlin/Boston Cover design: Martin Zech, Bremen Typesetting: RefineCatch Ltd, Bungay, Suffolk Printing: Hubert & Co. GmbH & Co. KG, Göttingen
∞ Printed on acid-free paper. Printed in Germany. www.degruyter.com
Preface
An essential property of language is that it is meaningful. The meaningfulness of language may be manifest in many ways: Language may be used to express emotion, take action, indicate one’s place in the social world, and so forth. But at the core of our understanding of linguistic meaning is the fact that language may be used to describe the world, and, unlike simpler semiotic systems, it can describe the world in a limitless variety of ways. Although the nature of meaning has been an issue for as long as people have discussed linguistic problems, semantics as a subdiscipline of linguistics only emerged in the 19th century as diachronic semantics. The rise of synchronic linguistics affected semantics only with some delay, as early structuralist semantic descriptions were restricted to lexical semantics (see Ullmann 1957). The modern semantic enterprise – that is, the systematic scientific study of meaning – was born within philosophy and logic as scholars began to understand better the capacity of language to describe the world. Thus, a semanticist might aim to explain how it is that the sentence snow is white connects to the world’s being in a certain way (Tarski’s (1944) famous example). Over time, the development of model-theoretic, possible worlds semantics in logic and philosophy gave rise to a credible model of semantic content, and this approach was quickly imported into linguistics. By the 1970s, many linguistic semanticists had come to see their aim as understanding how speakers of a language know that a given sentence is true in certain imaginable circumstances (i.e., possible worlds, including the world as it actually is), but not in others. That is, the task of semantics came to be the discovery of a set of principles which determine how the morphemes and words which make up a sentence, and the sentence’s grammatical structure, determine its truth conditions modeled in terms of possible worlds. Another important trend in the early days of linguistic semantics was the development of a number of theoretical frameworks based closely on generative syntax, including, for example, Katz & Postal’s (1964) integrated theory and Generative Semantics (McCawley 1968, Lakoff 1971). These theories relied on extending the technology of transformational syntax to the representation of meaning. Syntactically based approaches were ultimately found insufficient both for theory-internal reasons (they could not account for all of the phenomena of semantics in a plausible way) and for conceptual reasons (they failed to adequately address the descriptive capacity of language). It was in this context that the model-theoretic, possible worlds approach of logic and philosophy came to dominate linguistic semantics as well. Despite being so greatly influenced by philosophy, by the 1970s semantics had become fully established as a sub-field within linguistics, separate from philosophy and complete with its own theoretical apparatus to guide progress and debates. These days, most students of semantics learn far more about syntax, phonology, and morphology than they do about philosophy of language or logic. This growing differentiation from philosophy was characterized by a shift to a cognitively oriented view of language closely connected to syntax and a concern for understanding all of language, not just simple model examples like snow is white. Although other non-syntactic approaches were around at that time (e.g.
Hintikka’s Game Theoretic Semantics, see Hintikka 1973, Hintikka & Sandu 1997), by far the most influential models from the early days of linguistic semantics were the
approaches of Richard Montague (1970a, 1970b, 1973) and related work by such scholars as David Lewis (1970) and Maxwell Cresswell (1973). As mentioned above, this line of research explicitly addressed the descriptive quality of language by borrowing from formal logic the idea that the semantic content of a sentence can be modeled with possible worlds. It combined a model-theoretic, possible worlds semantics with generative syntactic models (though not necessarily orthodox ones) which looked like they might be able to be extended to cover significant portions of natural language. Through the work of a number of scholars in the 1970s, Montague’s syntactic and semantic system developed into a widely used and influential semantic framework, Montague Grammar (cf. Partee 1976; Dowty, Wall & Peters 1981), but from quite early on it was clear there would be no theoretical orthodoxy in semantics. Some scholars were developing new semantic theories (e.g. Kamp’s 1981 Discourse Representation Theory, Heim’s 1982 File Change Semantics, Barwise & Perry’s 1983 Situation Semantics, and Davidsonian theories of the kind systematized by, for example, Parsons 1990 and Larson & Segal 1995). Others focused on analyzing particular linguistic phenomena, and these scholars were not necessarily concerned with harmonizing the details of their analyses with one another (e.g. Kratzer 1977, 1978, Barwise & Cooper 1981, Jacobs 1983, Link 1983 to take a few examples chosen almost at random). Other important work in semantics did not follow a model-theoretic paradigm (e.g. Jackendoff 1972, 1990, Bierwisch 1982 and Bierwisch & Lang 1989) and was to varying degrees meant as a cognitively oriented alternative, rather than a potential complement, to the more mainstream Montague Grammar and its descendants. This picture of modern semantics is well represented in the first Handbook of Semantics (HSK 6, von Stechow & Wunderlich 1991). Perhaps the most important reason why frameworks like Montague Grammar slowly lost their orthodox status was the realization that language is simply too complex to be approached in terms of a single, shared theory, at least given our (then as well as current) level of understanding. As more and more phenomena were investigated, the number of interesting analytical tools began to grow. For example, one can think of the ideas which have been put forward to explain quantifier scope phenomena since Montague, including quantifying in, quantifier raising, storage, unselective binding, and choice functions. Moreover, a better understanding of the diversity among human languages has made it even more clear that a wide variety of ideas and approaches will be around for quite some time. This development has produced benefits: Semanticists can study many phenomena and languages simultaneously while postponing the issue of how what they learn fits together until such time as that issue can be addressed in an intelligent way; and it has inflicted costs: Sometimes the theoretical assumptions (compositional mechanisms, model theory if any, syntactic framework, etc.) in contemporary work are inexplicit or inconsistent with other semanticists’ assumptions. As semanticists have realized that a better understanding of meaning in natural language would not come from incremental progress on a single agreed-upon theoretical framework and set of theoretical tools, but rather necessitated the coexistence of and competition among a multiplicity of models, a number of important issues have come into focus.
The nature of the interfaces between semantics and neighboring linguistic disciplines (especially syntax and pragmatics) is open for debate, as are the choices of particular syntactic or pragmatic theories to be interfaced with. The role of semantics as a component discipline within cognitive science has become more important to many
semanticists investigating the nature of semantic representations and the kinds of inferences drawn in the course of producing and understanding natural language. To the extent that semanticists have begun to utilize evidence drawn from new sources such as crosslinguistic data, psycholinguistic or neurolinguistic experiments, and very large corpora, important methodological issues have come to the forefront. The current level of concern for methodological issues is a sign of the field’s maturity as a scientific discipline. In light of the contemporary situation within semantics as outlined above, the present handbook aims at the following goals:
1. To discuss the foundations and methodology of semantics.
2. To introduce important theoretical frameworks and theoretical issues.
3. To cover a wide variety of specific topics and phenomena of natural language meaning.
4. To explore the relationship between semantics and other fields, both within linguistics and outside.
The articles contained in the three volumes of this handbook not only address these tasks, but also represent the research results of a whole generation of semanticists since the state of the art recorded by its predecessor Semantics – An International Handbook of Contemporary Research in the same series (HSK 6) from 1991. We hope that the present handbook will be useful to researchers in a number of ways. It provides a reference resource of established empirical facts and theoretical results. It introduces contemporary theories and theoretical debates. It informs readers about research trends and controversies. It includes a summary of the history of, and historical background to, semantics. And finally, we hope that it will stimulate research by pointing out gaps, inconsistencies, and flaws in how semantics is currently practiced and conceptualized. It was a long journey from the initial planning to the final shape of the handbook and we are greatly indebted to many people who accompanied us along that way and helped us eventually reach the final destination. First of all, we would like to thank our authors for their continuous enthusiasm in this joint venture. Next, we wish to thank the publisher Mouton de Gruyter for their continuous support and professional assistance from the first planning until the last proof reading; special thanks are due to Barbara Karlson for continuously and patiently taking care of the various stages the handbook project had to run through. This handbook wouldn’t exist without the invaluable help of our editorial assistants, Noor van Leusen and Elena Karagjosova. They know how this handbook was built from the inside out, and assisted with (or took charge of) various facets from the planning stages to final production. Thanks also go to Janina Radó and Susanne Trissler for their assistance in proof reading and, in particular, to our student assistants who accompanied this handbook project with no less endurance and dedication than the editors themselves: Michael Fister and Dankmar Enke (University of Tübingen); Annika Deichsel, Julia Jürgens and Tatjana Tietze (University of Stuttgart); and Justin Kelly, Lissa Krawczyk, Yanyan Cui, and Julia Wise (Georgetown). Noor, Elena, Susanne and the students dealt with the demands of the style guidelines as they would be applied to over one hundred manuscripts (some of which were quite close to the mark). We were very fortunate to be able to work together with such an excellent team of collaborators and authors whose enthusiasm for the field of semantics never dwindled over the course of the project. It is this commitment to the field that we hope to bequeath to our readers.
References
Barwise, Jon & Robin Cooper 1981. Generalized quantifiers and natural language. Linguistics & Philosophy 4, 159–219.
Barwise, Jon & John Perry 1983. Situations and Attitudes. Cambridge, MA: The MIT Press.
Bierwisch, Manfred 1982. Formal and lexical semantics. Linguistische Berichte 80, 3–17.
Bierwisch, Manfred & Ewald Lang (eds.) 1989. Dimensional Adjectives. Grammatical Structure and Conceptual Interpretation. Berlin: Springer.
Cresswell, Maxwell 1973. Logics and Languages. London: Methuen.
Dowty, David, Robert Wall & Stanley Peters 1981. Introduction to Montague Semantics. Dordrecht: Reidel.
Heim, Irene 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation. University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms.
Hintikka, Jaakko 1973. Logic, Language-Games and Information: Kantian Themes in the Philosophy of Logic. Oxford: Clarendon Press.
Hintikka, Jaakko & Gabriel Sandu 1997. Game-theoretical semantics. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 361–410.
Jackendoff, Ray 1972. Semantic Interpretation in Generative Grammar. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1990. Semantic Structures. Cambridge, MA: The MIT Press.
Jacobs, Joachim 1983. Fokus und Skalen. Zur Syntax und Semantik von Gradpartikeln im Deutschen. Tübingen: Niemeyer.
Kamp, Hans 1981. A theory of truth and semantic interpretation. In: J. Groenendijk, T. Janssen & M. Stokhof (eds.). Formal Methods in the Study of Language. Amsterdam: Mathematical Centre, 277–322. Reprinted in: J. Groenendijk, T. Janssen & M. Stokhof (eds.). Truth, Interpretation, and Information. Dordrecht: Foris, 1984, 1–41.
Katz, Jerrold & Paul Postal 1964. An Integrated Theory of Linguistic Descriptions. Cambridge, MA: The MIT Press.
Kratzer, Angelika 1977. What ‘must’ and ‘can’ must and can mean. Linguistics & Philosophy 1, 337–355.
Kratzer, Angelika 1978. Semantik der Rede: Kontexttheorie, Modalwörter, Konditionalsätze. Königstein: Scriptor.
Lakoff, George 1971. On generative semantics. In: D. Steinberg & L. Jakobovits (eds.). Semantics. An Interdisciplinary Reader in Philosophy, Linguistics, and Psychology. Cambridge: Cambridge University Press, 232–296.
Larson, Richard & Gabriel Segal 1995. An Introduction to Semantic Theory. Cambridge, MA: The MIT Press.
Lewis, David 1970. General semantics. Synthese 22, 18–67.
Link, Godehard 1983. The logical analysis of plurals and mass terms: A lattice-theoretical approach. In: R. Bäuerle et al. (eds.). Meaning, Use, and the Interpretation of Language. Berlin: de Gruyter, 302–323.
McCawley, James 1968. The role of semantics in a grammar. In: E. Bach & R. T. Harms (eds.). Universals in Linguistic Theory. New York: Holt, Rinehart & Winston, 124–169.
Montague, Richard 1970a. English as a formal language. In: B. Visentini et al. (eds.). Linguaggi nella Società e nella Tecnica. Milan: Edizioni di Comunità, 189–224. Reprinted in: R. Thomason (ed.). Formal Philosophy: Selected Papers of Richard Montague. New Haven, CT: Yale University Press, 1974, 188–221.
Montague, Richard 1970b. Universal grammar. Theoria 36, 373–398. Reprinted in: R. Thomason (ed.). Formal Philosophy: Selected Papers of Richard Montague. New Haven, CT: Yale University Press, 1974, 222–246.
Montague, Richard 1973. The proper treatment of quantification in ordinary English. In: J. Hintikka, J. Moravcsik & P. Suppes (eds.). Approaches to Natural Language. Dordrecht: Reidel, 221–242. Reprinted in: R. Thomason (ed.). Formal Philosophy: Selected Papers of Richard Montague. New Haven, CT: Yale University Press, 1974, 247–270.
Parsons, Terence 1990. Events in the Semantics of English: A Study in Subatomic Semantics. Cambridge, MA: The MIT Press.
Partee, Barbara (ed.) 1976. Montague Grammar. New York: Academic Press.
von Stechow, Arnim & Dieter Wunderlich (eds.) 1991. Semantik – Semantics. Ein internationales Handbuch der zeitgenössischen Forschung – An International Handbook of Contemporary Research (HSK 6). Berlin: de Gruyter.
Tarski, Alfred 1944. The semantic conception of truth. Philosophy and Phenomenological Research 4, 341–375.
Ullmann, Stephen 1957. The Principles of Semantics. Oxford: Blackwell.
April 2011
Claudia Maienborn, Tübingen (Germany)
Klaus von Heusinger, Stuttgart (Germany)
Paul Portner, Washington, DC (USA)
Contents

Volume 1

I. Foundations of semantics
1. Meaning in linguistics · Claudia Maienborn, Klaus von Heusinger and Paul Portner
2. Meaning, intentionality and communication · Pierre Jacob
3. (Frege on) Sense and reference · Mark Textor
4. Reference: Foundational issues · Barbara Abbott
5. Meaning in language use · Georgia M. Green
6. Compositionality · Peter Pagin and Dag Westerståhl
7. Lexical decomposition: Foundational issues · Stefan Engelberg

II. History of semantics
8. Meaning in pre-19th century thought · Stephan Meier-Oeser
9. The emergence of linguistic semantics in the 19th and early 20th century · Brigitte Nerlich
10. The influence of logic on semantics · Albert Newen and Bernhard Schröder
11. Formal semantics and representationalism · Ruth Kempson

III. Methods in semantic research
12. Varieties of semantic evidence · Manfred Krifka
13. Methods in cross-linguistic semantics · Lisa Matthewson
14. Formal methods in semantics · Alice G.B. ter Meulen
15. The application of experimental methods in semantics · Oliver Bott, Sam Featherston, Janina Radó and Britta Stolterfoht

IV. Lexical semantics
16. Semantic features and primes · Manfred Bierwisch
17. Frameworks of lexical decomposition of verbs · Stefan Engelberg
18. Thematic roles · Anthony R. Davis
19. Lexical Conceptual Structure · Beth Levin and Malka Rappaport Hovav
20. Idioms and collocations · Christiane Fellbaum
21. Sense relations · Ronnie Cann
22. Dual oppositions in lexical meaning · Sebastian Löbner

V. Ambiguity and vagueness
23. Ambiguity and vagueness: An overview · Christopher Kennedy
24. Semantic underspecification · Markus Egg
25. Mismatches and coercion · Henriëtte de Swart
26. Metaphors and metonymies · Andrea Tyler and Hiroshi Takahashi

VI. Cognitively oriented approaches to semantics
27. Cognitive Semantics: An overview · Leonard Talmy
28. Prototype theory · John R. Taylor
29. Frame Semantics · Jean Mark Gawron
30. Conceptual Semantics · Ray Jackendoff
31. Two-level Semantics: Semantic Form and Conceptual Structure · Ewald Lang and Claudia Maienborn
32. Word meaning and world knowledge · Jerry R. Hobbs

VII. Theories of sentence semantics
33. Model-theoretic semantics · Thomas Ede Zimmermann
34. Event semantics · Claudia Maienborn
35. Situation Semantics and the ontology of natural language · Jonathan Ginzburg

VIII. Theories of discourse semantics
36. Situation Semantics: From indexicality to metacommunicative interaction · Jonathan Ginzburg
37. Discourse Representation Theory · Hans Kamp and Uwe Reyle
38. Dynamic semantics · Paul Dekker
39. Rhetorical relations · Henk Zeevat

Volume 2

IX. Noun phrase semantics
40. Pronouns · Daniel Büring
41. Definiteness and indefiniteness · Irene Heim
42. Specificity · Klaus von Heusinger
43. Quantifiers · Edward Keenan
44. Bare noun phrases · Veneeta Dayal
45. Possessives and relational nouns · Chris Barker
46. Mass nouns and plurals · Peter Lasersohn
47. Genericity · Gregory Carlson

X. Verb phrase semantics
48. Aspectual class and Aktionsart · Hana Filip
49. Perfect and progressive · Paul Portner
50. Verbal mood · Paul Portner
51. Deverbal nominalization · Jane Grimshaw

XI. Semantics of adjectives and adverb(ial)s
52. Adjectives · Violeta Demonte
53. Comparison constructions · Sigrid Beck
54. Adverbs and adverbials · Claudia Maienborn and Martin Schäfer
55. Adverbial clauses · Kjell Johan Sæbø
56. Secondary predicates · Susan Rothstein

XII. Semantics of intensional contexts
57. Tense · Toshiyuki Ogihara
58. Modality · Valentine Hacquard
59. Conditionals · Kai von Fintel
60. Propositional attitudes · Eric Swanson
61. Indexicality and de se reports · Philippe Schlenker

XIII. Scope, negation, and conjunction
62. Scope and binding · Anna Szabolcsi
63. Negation · Elena Herburger
64. Negative and positive polarity items · Anastasia Giannakidou
65. Coordination · Roberto Zamparelli

XIV. Sentence types
66. Questions · Manfred Krifka
67. Imperatives · Chung-hye Han
68. Copular clauses · Line Mikkelsen
69. Existential sentences · Louise McNally
70. Ellipsis · Ingo Reich

XV. Information structure
71. Information structure and truth-conditional semantics · Stefan Hinterwimmer
72. Topics · Craige Roberts
73. Discourse effects of word order variation · Gregory Ward and Betty J. Birner
74. Cohesion and coherence · Andrew Kehler
75. Discourse anaphora, accessibility, and modal subordination · Bart Geurts
76. Discourse particles · Malte Zimmermann

Volume 3

XVI. The interface of semantics with phonology and morphology
77. Semantics of intonation · Hubert Truckenbrodt
78. Semantics of inflection · Paul Kiparsky and Judith Tonhauser
79. Semantics of derivational morphology · Rochelle Lieber
80. Semantics of compounds · Susan Olsen
81. Semantics in Distributed Morphology · Heidi Harley

XVII. The syntax-semantics interface
82. Syntax and semantics: An overview · Arnim von Stechow
83. Argument structure · David Pesetsky
84. Operations on argument structure · Dieter Wunderlich
85. Type shifting · Helen de Hoop
86. Constructional meaning and compositionality · Paul Kay and Laura A. Michaelis
87. The grammatical view of scalar implicatures and the relationship between semantics and pragmatics · Gennaro Chierchia, Danny Fox and Benjamin Spector

XVIII. The semantics-pragmatics interface
88. Semantics/pragmatics boundary disputes · Katarzyna M. Jaszczolt
89. Context dependency · Thomas Ede Zimmermann
90. Deixis and demonstratives · Holger Diessel
91. Presupposition · David Beaver and Bart Geurts
92. Implicature · Mandy Simons
93. Game theory in semantics and pragmatics · Gerhard Jäger
94. Conventional implicature and expressive content · Christopher Potts

XIX. Typology and crosslinguistic semantics
95. Semantic types across languages · Emmon Bach and Wynn Chao
96. Count/mass distinctions across languages · Jenny Doetjes
97. Tense and aspect: Time across languages · Carlota S. Smith
98. The expression of space across languages · Eric Pederson

XX. Diachronic semantics
99. Theories of meaning change: An overview · Gerd Fritz
100. Cognitive approaches to diachronic semantics · Dirk Geeraerts
101. Grammaticalization and semantic reanalysis · Regine Eckardt

XXI. Processing and learning meaning
102. Meaning in psycholinguistics · Lyn Frazier
103. Meaning in first language acquisition · Stephen Crain
104. Meaning in second language acquisition · Roumyana Slabakova
105. Meaning in neurolinguistics · Roumyana Pancheva
106. Conceptual knowledge, categorization and meaning · Stephanie Kelter and Barbara Kaup
107. Space in semantics and cognition · Barbara Landau

XXII. Semantics and computer science
108. Semantic research in computational linguistics · Manfred Pinkal and Alexander Koller
109. Semantics in corpus linguistics · Graham Katz
110. Semantics in computational lexicons · Anette Frank and Sebastian Padó
111. Web Semantics · Paul Buitelaar
112. Semantic issues in machine translation · Kurt Eberle
I. Foundations of semantics

1. Meaning in linguistics

1. Introduction
2. Truth
3. Compositionality
4. Context and discourse
5. Meaning in contemporary semantics
6. References
Abstract
The article provides an introduction to the study of meaning in modern semantics. Major tenets, tools, and goals of semantic theorizing are illustrated by discussing typical approaches to three central characteristics of natural language meaning: truth conditions, compositionality, and context and discourse.
1. Introduction
Meaning is a key concept of cognition, communication and culture, and there is a diversity of ways to understand it, reflecting the many uses to which the concept can be put. In the following we take the perspective on meaning developed within linguistics, in particular modern semantics, and we aim to explain the ways in which semanticists approach, describe, test and analyze meaning. The fact that semantics is a component of linguistic theory is what distinguishes it from approaches to meaning in other fields like philosophy, psychology, semiotics or cultural studies. As part of linguistic theory, semantics is characterized by at least the following features:
1. Empirical coverage: It strives to account for meaning in all of the world’s languages.
2. Linguistic interfaces: It operates as a subtheory of the broader linguistic system, interacting with other subtheories such as syntax, pragmatics, phonology and morphology.
3. Formal explicitness: It is laid out in an explicit and precise way, allowing the community of semanticists to jointly test it, improve it, and apply it to new theoretical problems and practical goals.
4. Scientific paradigm: It is judged on the same criteria as other scientific theories, viz. coherence, conceptual simplicity, its ability to unify our understanding of diverse phenomena (within or across languages), to raise new questions and open up new horizons for research.
In the following we exemplify these four features on three central issues in modern semantic theory that define our understanding of meaning: truth conditions, compositionality, and context and discourse.
2. Truth
If one is to develop an explicit and precise scientific theory of meaning, the first thing one needs to do is to identify some of the data which the theory will respond to, and there is one type of data which virtually all work in semantics takes as fundamental: truth conditions. At an intuitive level, truth conditions are merely the most obvious way of understanding the meaning of a declarative sentence. If I say It is raining outside, I have described the world in a certain way. I may have described it correctly, in which case what I said is true, or I may have described it incorrectly, in which case it is false. Any competent speaker knows to a high degree of precision what the weather must be like for my sentence to count as true (a correct description) or false (an incorrect description). In other words, such a speaker knows the truth conditions of my sentence. This knowledge of truth conditions is extremely robust – far and wide, English speakers can make agreeing judgments about what would make my sentence true or false – and as a result, we can see the truth conditions themselves as a reliable fact about language which can serve as part of the basis for semantic theory. While truth conditions constitute some of the most basic data for semantics, different approaches to semantics reckon with them in different ways. Some theories treat truth conditions not merely as the data which semantics is to deal with, but more than this as the very model of sentential meaning. This perspective can be summarized with the slogan “meaning is truth conditions”, and within this tradition, we find statements like the following:
(1) [[ It is raining outside ]]^{t,s} = TRUE iff it is raining outside of the building where the speaker s is located at time t, and = FALSE otherwise.
The double brackets [[ X ]] around an expression X name the semantic value of X in the terms of the theory in question. Thus, (1) indicates a theory which takes the semantic value of a sentence to be its truth value, TRUE or FALSE. The meaning of the sentence, according to the truth conditional theory, is then captured by the entire statement (1). Although (1) represents a truth conditional theory according to which semantic value and meaning (i.e., the truth conditions) are distinct (the semantic value is a crucial component in giving the meaning), other truth conditional theories use techniques which allow meaning to be reified, and thus identified with semantic value, in a certain sense. The most well-known and important such approach is based on possible worlds:
(2) a. [[ It is raining outside ]]^{w,t,s} = TRUE iff it is raining outside of the building where the speaker s is located at time t in world w, and = FALSE otherwise.
b. [[ It is raining outside ]]^{t,s} = the set of worlds {w : it is raining outside of the building where the speaker s is located at time t in world w}
A possible world is a complete way the world could be. (Other theories use constructs similar to possible worlds, such as situations.) The statement in (2a) says virtually the same thing as (1), making explicit only that the meaning of It is raining outside depends not merely on the actual weather outside, but on whatever the weather may turn out to be. Crucially, by allowing the possible world to be treated as an arbitrary point of evaluation, as in (2a), we are able to identify the truth conditions with the set of all such points, as
in (2b). In (2), we have two different kinds of semantic value: the one in (2a), relativized to world, time, and speaker, corresponds to (1), and is often called the extension or reference. That in (2b), where the world point of evaluation has been transferred into the semantic value itself, is then called the intension or sense. The sense of a full sentence, for example given as a set of possible worlds as in (2b), is called a proposition. Specific theories differ in the precise nature of the extension and intension: The intension may involve more or different parameters than w, t, s, and several of these may be gathered into a set (along with the world) to form the intension. For example, in tense semantics, we often see intensions treated as sets of pairs of a world and a time. The majority of work in semantics follows the truth conditional approach to the extent of making statements like those in (1)–(2) the fundamental fabric of the theory. Scholars often produce explicit fragments, i.e. mini-theories which cover a subset of a language, which are actually functions from expressions of a language to semantic values, with the semantic values of sentences being truth conditional in the vein of (1)–(2). But not all semantic research is truth conditional in this explicit way. Descriptive linguistics, functional linguistics, typological linguistics and cognitive linguistics frequently make important claims about meaning (in a particular language, or crosslinguistically). For example, Wolfart (1973: 25), a descriptive study of Plains Cree, states: “Semantically, direction serves to specify actor and goal. In sentence (3), for instance, the direct theme sign /ā/ indicates the noun atim as goal, whereas the inverse theme sign /ekw/ in (4) marks the same noun as actor.”
(3) nisēkihānān atim
scare(1p-3) dog(3)
‘We scare the dog.’
(4) nisēkihikonān atim
scare(3-1p) dog(3)
‘The dog scares us.’
Despite not being framed as such, this passage is implicitly truth conditional. Wolfart is stating a difference in truth conditions which depends on the grammatical category of direction, using the descriptions “actor” and “goal”, and using the translations of cited examples. This example serves to illustrate the centrality of truth conditions to any attempt to think about the nature of linguistic meaning. As a corollary to the focus on truth conditions, semantic theories typically take relations like entailment, synonymy, and contradiction to provide crucial data as well. Thus, the example sentence It is raining outside entails (5), and this fact is known to any competent speaker.
(5) It is raining outside or the kids are playing with the water hose.
Obviously, this entailment can be understood in terms of truth conditions (the truth of the one sentence guarantees the truth of the other), a fact which supports the idea that the analysis of truth conditions should be a central goal of semantics. It is less satisfying to describe synonymy in terms of truth conditions, as identity of truth conditions doesn’t in most cases make for absolute sameness of meaning, in an intuitive sense – consider
Mary hit John and John was hit by Mary; nevertheless, a truth conditional definition of synonymy allows for at least a useful concept of synonymy, since people can indeed judge whether two sentences would accurately describe the same circumstances, whereas it’s not obvious that complete intuitive synonymy is even a useful concept, insofar as it may never occur in natural language. The truth conditional perspective on meaning is intuitive and powerful where it applies, but in and of itself, it is only a foundation. It doesn’t, at first glance, say anything about the meanings of subsentential constituents, the meanings or functions of nondeclarative sentences, or non-literal meaning, for example. Semantic theory is responsible for the proper analysis of each of these features of language as well, and we will see in many of the articles in this handbook how it has been able to rise to these challenges, and many others.
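The set-of-worlds picture in (2b) and the subset view of entailment can be made concrete with a small computational sketch. The following Python fragment is our illustration, not part of the handbook: the two atomic facts and their names are invented, and a “possible world” is simplified to a set of atomic facts. It checks that It is raining outside entails the disjunction in (5), but not vice versa.

```python
# A toy truth-conditional model: a "possible world" is a frozenset of the
# atomic facts that hold in it; a proposition is the set of worlds in which
# a sentence is true; entailment is the subset relation between propositions.

from itertools import combinations

ATOMS = ["raining", "kids-playing-with-hose"]   # invented atomic facts

def all_worlds(atoms):
    """Each combination of atomic facts counts as one possible world."""
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

WORLDS = all_worlds(ATOMS)

def proposition(condition):
    """The set of worlds satisfying the condition -- cf. (2b)."""
    return {w for w in WORLDS if condition(w)}

raining = proposition(lambda w: "raining" in w)
sentence5 = proposition(lambda w: "raining" in w or "kids-playing-with-hose" in w)

def entails(p, q):
    """p entails q iff every world where p is true is one where q is true."""
    return p <= q

print(entails(raining, sentence5))   # True: (5) holds wherever it is raining
print(entails(sentence5, raining))   # False: the disjunction is strictly weaker
```

Modeling entailment as set inclusion is exactly what licenses the intuition reported above: the truth of the one sentence guarantees the truth of the other.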
3. Compositionality
A crucial aspect of natural language meaning is that speakers are able to determine the truth conditions for infinitely many distinct sentences, including sentences they have never encountered before. This shows that the truth conditions for sentences (or whatever turns out to be their psychological correlate) cannot be memorized. Speakers do not associate truth conditions such as the ones given in (1) or (2) holistically with their respective sentences. Rather, there must be some principled way to compute the meaning of a sentence from smaller units. In other words, natural language meaning is essentially combinatorial. The meaning of a complex expression is construed by combining the meaning of its parts in a certain way. Obviously, syntax plays a significant role in this process. The two sentences in (6), for instance, are made up of the same lexical material. It is only the different word order that is responsible for the different sentence meanings of (6a) and (6b).
(6) a. Caroline kissed a boy.
b. A boy kissed Caroline.
In a similar vein, the ambiguity of a sentence like (7) is rooted in syntax. The two readings paraphrased in (7a) and (7b) correspond to different syntactic structures, with the PP being adjoined either to the verbal phrase or to the direct object NP.
(7) Caroline observed the boy with the telescope.
a. Caroline observed the boy with the help of the telescope.
b. Caroline observed the boy who had a telescope.
Examples such as (6) and (7) illustrate that the semantic combinatorial machinery takes the syntactic structure into account in a fairly direct way. This basic insight led to the formulation of the so-called “principle of compositionality”, attributed to Gottlob Frege (1892), which is usually formulated along the following lines:
(8) Principle of compositionality: The meaning of a complex expression is a function of the meanings of its parts and the way they are syntactically combined.
According to (8), the meaning of, e.g., Caroline sleeps is a function of the meanings of Caroline and sleeps and the fact that the former is the syntactic subject of the latter. There are stronger and weaker versions of the principle of compositionality, depending on what counts as “parts” and how exactly the semantic combinatorics is determined by the syntax. For instance, adherents of a stronger version of the principle of compositionality typically assume that the parts that constitute the meaning of a complex expression are only its immediate constituents. According to this view, only the NP [Caroline] and the VP [kissed a boy] would count as parts when computing the sentence meaning for (6a), but not (directly) [kissed] or [a boy]. Modern semantics explores many different ways of implementing the notion of compositionality formally. One particularly useful framework is based on the mathematical concept of a function. It takes the meaning of any complex expression as being the result of applying the meaning of one of its immediate parts (= the functor) to the meaning of its other immediate part (= the argument). With functional application as the basic semantic operation that is applied stepwise, mirroring the binary branching of syntax, the function-argument approach allows for a straightforward syntax-semantics mapping. Although there is wide agreement among semanticists that, given the combinatorial nature of linguistic meaning, some version of the principle of compositionality must certainly hold, it is also clear that, when taking into account the whole complexity and richness of natural language meaning, compositional semantics is faced with a series of challenges. As a response to these challenges, semanticists have come up with several solutions and amendments. These relate basically to (A) the syntax-semantics interface, (B) the relationship between semantics and ontology, and (C) the semantics-pragmatics interface.
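The function-argument conception can be pictured with a minimal sketch. The fragment below is our illustration, under invented assumptions: individuals are modeled as strings, and the denotation of the intransitive verb is a hand-written characteristic function rather than anything derived from a grammar.

```python
# A minimal sketch of composition by functional application: [[Caroline]]
# is an individual (here, a string), [[sleeps]] is a function from
# individuals to truth values, and each compositional step is application.

SLEEPERS = {"caroline"}                 # invented model: who sleeps

def den_caroline():
    return "caroline"                   # [[Caroline]]

def den_sleeps(x):
    return x in SLEEPERS                # [[sleeps]]: individual -> {True, False}

def apply_fn(functor, argument):
    """One compositional step: the functor meaning applied to the argument meaning."""
    return functor(argument)

# [[Caroline sleeps]] = [[sleeps]]([[Caroline]])
print(apply_fn(den_sleeps, den_caroline()))   # True
```

Because each application step mirrors one binary-branching node of the syntactic tree, the computation of a sentence meaning follows the derivation of the sentence itself.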
A Syntax-Semantics Interface
One way to cope with challenges to compositionality is to adjust the syntax properly. This could be done, e.g., by introducing possibly mute, i.e. phonetically empty, functional heads into the syntactic tree that nevertheless carry semantic content, or by relating the semantic composition to a more abstract level of syntactic derivation – Logical Form – that may differ from surface structure due to invisible movement. That is, the syntactic structure on which the semantic composition is based may be more or less directly linked to surface syntax, such that it fits the demands of compositional semantics. Of course, any such move should be independently motivated.
B Semantics – Ontology
Another direction that might be explored in order to reconcile syntax and semantics is to reconsider the inventory of primitive semantic objects the semantic fabric is assumed to be composed of. A famous case in point is Davidson’s (1967) plea for an ontological category of events. A crucial motivation for this move was that the standard treatment of adverbial modifiers at that time was insufficient insofar as it failed to account properly for the combinatorial behavior and entailments of adverbial expressions. By positing an additional event argument introduced by the verb, Davidson laid the grounds for a theory of adverbial modification that would overcome these shortcomings. Under this assumption, Davidson’s famous sentence (9a) takes a semantic representation along the lines of (9b):
(9) a. Jones buttered the toast in the bathroom with the knife at midnight.
b. ∃e [butter(jones, the toast, e) & in(e, the bathroom) & instr(e, the knife) & at(e, midnight)]
According to (9b), there was an event e of Jones buttering the toast, and this event was located in the bathroom. In addition, it was performed by using a knife as an instrument, and it took place at midnight. That is, Davidson’s move enabled standard adverbial modifiers to be treated as simple first-order predicates that add information about the verb’s hidden event argument. The major merits of such a Davidsonian analysis are, first, that it accounts for the typical entailment patterns of adverbial modifiers directly on the basis of their semantic representation. That is, the entailments in (10) follow from (9b) simply by virtue of the logical rule of simplification.
(10) a. Jones buttered the toast in the bathroom at midnight.
b. Jones buttered the toast in the bathroom.
c. Jones buttered the toast at midnight.
d. Jones buttered the toast.
And, secondly, Davidson paved the way for treating adverbial modifiers on a par with adnominal modifiers. In the meantime, researchers working within the Davidsonian paradigm have discovered more and more fundamental analogies between the verbal and the nominal domain, attesting to the fruitfulness of Davidson’s move. In short, by enriching the semantic universe with a new ontological category of events, Davidson solved the compositionality puzzle of adverbials and arrived at a semantic theory superior to its competitors in both conceptual simplicity and empirical coverage. Of course once again, such a solution does not come without costs. With Quine’s (1958) dictum “No entity without identity!” in mind, any ontological category a semantic theory makes use of requires a proper ontological characterization and legitimization. In the case of events, this is still the subject of ongoing debates among semanticists.
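How the entailments in (10) fall out of (9b) by dropping conjuncts can also be sketched computationally. The fragment below is our illustration under a deliberate simplification: the logical form is represented as a set of conjunct strings (ignoring the existential quantifier), so that the rule of simplification becomes a subset test.

```python
# Sketch: the Davidsonian logical form (9b) as a set of conjuncts about an
# event e. Because dropping conjuncts (the rule of simplification) can only
# weaken a pure conjunction, every subset of (9b)'s conjuncts is entailed.

SENTENCE_9B = frozenset({
    "butter(jones, the_toast, e)",
    "in(e, the_bathroom)",
    "instr(e, the_knife)",
    "at(e, midnight)",
})

def entails(premise, conclusion):
    """For bare conjunctions over the same atoms: entailment is conjunct subset."""
    return conclusion <= premise

SENTENCE_10D = frozenset({"butter(jones, the_toast, e)"})   # (10d)
SENTENCE_10B = SENTENCE_10D | {"in(e, the_bathroom)"}       # (10b)

print(entails(SENTENCE_9B, SENTENCE_10B))    # True
print(entails(SENTENCE_9B, SENTENCE_10D))    # True
print(entails(SENTENCE_10D, SENTENCE_9B))    # False: (10d) does not entail (9a)
```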
C Semantics-Pragmatics Interface
Finally, challenges to compositionality might also be taken as an invitation to reconsider the relationship between semantics and pragmatics by asking how far the composition of sentential meaning goes, and what the principles of pragmatic enrichment and pragmatic licensing are. One notorious case in point is the adequate delineation of linguistic knowledge and world knowledge. To give an example, when considering the sentences in (11), we know that each of them refers to a very different kind of opening event. Obviously, the actions underlying, for instance, the opening of a can differ substantially from those of opening one’s eyes or opening a file on a computer.
(11) a. She opened the can.
b. She opened her eyes.
c. She opened the electronic file.
To a certain extent, this knowledge is of linguistic significance, as can be seen when taking into account the combinatorial behavior of certain modifiers:
(11) a. She opened the can {with a knife, *abruptly, *with a double click}.
b. She opened her eyes {*with a knife, abruptly, *with a double click}.
c. She opened the electronic file {*with a knife, *abruptly, with a double click}.
A comprehensive theory of natural language meaning should therefore strive to account for these observations. Nevertheless, incorporating this kind of world knowledge into compositional semantics would be neither feasible nor desirable. A possible solution for this dilemma lies in the notion of semantic underspecification. Several proposals have been developed which take the lexical meaning that is fed into semantic composition to be of an abstract, context neutral nature. In the case of to open in (11), for instance, this common meaning skeleton would roughly say that some action of an agent x on an object y causes a change of state such that y is accessible afterwards. This would be the verb’s constant meaning contribution that can be found in all sentences in (11a–c) and which is also present, e.g., in (11d), where we don’t have such clear intuitions about how x acted upon y, and which is therefore more liberal as to adverbial modification.
(11) d. She opened the gift {with a knife, abruptly, with a double click}.
That is, underspecification accounts would typically take neither the type of action performed by x nor the exact sense of accessibility of y as part of the verb’s lexical meaning. To account for this part of the meaning, compositional semantics is complemented by a procedure of pragmatic enrichment, by which the compositionally derived meaning skeleton is pragmatically specified according to the contextually available world knowledge. Semantic underspecification/pragmatic enrichment accounts provide a means for further specifying a compositionally well-formed, underspecified meaning representation. A different stance towards the semantics-pragmatics interface is taken by so-called “coercion” approaches. These deal typically with the interpretation of sentences that are strictly speaking ungrammatical but might be “rescued” in a certain way. An example is given in (12).
(12) The alarm clock stood intentionally on the table.
The sentence in (12) does not offer a regular integration for the subject-oriented adverbial intentionally, i.e., the subject NP the alarm clock does not fulfill the adverbial’s request for an intentional subject. Hence, a compositional clash results and the sentence is ungrammatical. Nevertheless, although deviant, there seems to be a way to rescue the sentence so that it becomes acceptable and interpretable anyway. In the case of (12), a possible repair strategy would be to introduce an actor, who is responsible for the fact that the alarm clock stands on the table. This move would provide a suitable anchor for the adverbial’s semantic contribution. Thus, we understand (12) as saying that someone put the alarm clock on the table on purpose. That is, in case of a combinatorial clash, there seems to be a certain leeway for non-compositional adjustments of the compositionally derived meaning. The defective part is “coerced” into the right format. The exact mechanism of coercion and its grammatical and pragmatic licensing conditions are still poorly understood. In current semantic research many quite different directions are being explored with respect to the issues A–C. What version of the principle of compositionality
ultimately turns out to be the right one and how compositional semantics interacts with syntax, ontology, and pragmatics is, in the end, an empirical question. Yet, the results and insights obtained so far in this endeavor are already demonstrating the fruitfulness of reckoning with compositionality as a driving force in the constitution of natural language meaning.
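Before turning to context, the underspecification idea behind (11) can be pictured as a constant meaning skeleton plus a contextual specification step. The sketch below is our illustration only: the little knowledge base, the object names, and the decision procedure are invented, and real pragmatic enrichment is of course far richer than a table lookup.

```python
# Sketch of semantic underspecification plus pragmatic enrichment: "open"
# contributes only a constant skeleton (x acts on y so that y becomes
# accessible); which actions fit is filled in from world knowledge about y.

# Invented knowledge base: the opening actions each object supports.
WORLD_KNOWLEDGE = {
    "the can": {"with a knife"},
    "her eyes": {"abruptly"},
    "the electronic file": {"with a double click"},
    "the gift": {"with a knife", "abruptly", "with a double click"},
}

def enrich(obj):
    """Pragmatic enrichment: specify the underspecified opening event
    from contextually available world knowledge about the object."""
    return WORLD_KNOWLEDGE.get(obj, set())

def judge(obj, modifier):
    """Prefix '*' when the modifier clashes with the enriched interpretation."""
    mark = "" if modifier in enrich(obj) else "*"
    return f"She opened {obj} {{{mark}{modifier}}}"

for obj in WORLD_KNOWLEDGE:
    print(judge(obj, "with a double click"))
```

The point of the design is that none of the modifier judgments are written into the lexical entry of open itself; they all follow from the knowledge base, mirroring the division of labor the text describes.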
4. Context and discourse
Speakers do not use sentences in isolation, but in the context of an utterance situation and as part of a longer discourse. The meaning of a sentence depends on the particular circumstances of its utterance, but also on the discourse context in which it is uttered. At the same time, the meaning of a linguistic expression changes the context, e.g., the information available to speaker and hearer. The analysis of the interaction of context, discourse and meaning provides new and challenging issues for the research agenda in the semantics-pragmatics interface as described in the last section. In the following we focus on two aspects of these issues to illustrate how the concept of meaning described above can further be developed by theorizing on the interaction between sentence meaning, contextual parameters and discourse structure. So far we have characterized the meaning of a sentence by its truth conditions and, as a result, we have “considered semantics to be the study of propositions” (Stalnaker 1970: 273). It is justified by the very clear concept that meaning describes “how the world is”. However, linguistic expressions often need additional information to form propositions, as sentences contain indexical elements, such as I, you, she, here, there, now and the tenses of verbs. Indexical expressions cannot be interpreted according to possible worlds, i.e. how the conditions might be, but they are interpreted according to the actual utterance situation. Intensive research into this kind of context dependency led to the conclusion that the proposition itself depends on contextual parameters like speaker, addressee, location, time etc. This dependency is most prominently expressed in Kaplan’s (1977) notion of character for the meaning of linguistic expressions. The character of an expression is a function from the context of utterance c, which includes the values for the speaker, the hearer, the time, the location etc., to the proposition. Other expressions such as local, different, a certain, enemy, neighbor may contain “hidden” indexical parameters. They express their content dependent on one or more reference points given by the context. Thus meaning is understood as an abstract concept or function from contexts to propositions, and propositions themselves are described as functions from possible worlds into truth values. The meaning of a linguistic expression is influenced not only by such relatively concrete aspects of the situation of use as speaker and addressee, but also by intentional factors like the assumptions of the speaker and hearer about the world, their beliefs and their goals. This type of context is continuously updated by the information provided by each sentence in a discourse. We see that linguistic expressions are not only “context-consumers”, but also “context-shifters”. This can be illustrated by examples from anaphora, presuppositions and various discourse relations.
(13) a. A man walks in the park. He smokes.
b. #He smokes. A man walks in the park.
(14) a. Rebecca married Thomas. She regrets that she married him.
b. Rebecca regrets that she married Thomas. ?She married him.
(15) a. John left. Ann started to cry.
b. Ann started to cry. John left.
In (13) the anaphoric pronoun needs an antecedent; in other words, it is a context-consumer, as it takes the information provided in the context for fixing its meaning. The indefinite noun phrase a man, however, is a context-shifter. It changes the context by introducing a discourse referent into the discourse or discourse structure such that the pronoun can be linked to it. In (13a) the indefinite introduces the referent and the anaphoric pronoun can be linked to it; in (13b) the pronoun in the first sentence has no antecedent, and if the indefinite noun phrase in the second clause should refer to the same discourse referent it must not be indefinite. In (14) we see the contribution of presupposition to the context. (14b) is odd, since one can regret only something that is known to have happened. To assert this again makes the contribution of the second sentence superfluous and the small discourse incoherent. (15) provides evidence that we always assume some relation between sentences above a simple conjunction of two propositions. The relation could be a sequence of events or a causal relation between the two events, and this induces different meanings on the two small discourses as a whole. These and many more examples have led to the development of dynamic semantics, i.e. the view that meaning is shifting a given information status to a new one. There are different ways to model the context dependency of linguistic expressions and the choice among them is still an unresolved issue and a topic of considerable contemporary interest. We illustrate this by presenting one example from the literature. Stalnaker proposes to represent the context as a set of possible worlds that are shared by speaker and hearer, his “common ground”. A new sentence is interpreted with respect to the common ground, i.e. to a set of possible worlds. The interpretation of the sentence changes the common ground (given that the hearer does not reject the content of the sentence) and the updated common ground is the new context for the next sentence. Kamp (1988) challenges this view as problematic, as possible worlds do not provide enough linguistically relevant information, as the following example illustrates (due to Barbara Partee, first discussed in Heim 1982: 21).
(16) Exactly one of the ten balls is not in the bag. It is under the sofa.
(17) Exactly nine of the ten balls are in the bag. #It is under the sofa.
Both sentences in (16) and (17) have the same truth conditions, i.e. in exactly all possible circumstances in which (16) is true, (17) is true, too; still, the continuation with the second sentence is only felicitous in (16), but not in (17). (16) explicitly introduces an antecedent in the first sentence, and the pronoun in the second sentence can be anaphorically linked to it. In (17), however, no explicit antecedent is introduced and therefore we cannot resolve the anaphoric reference of the pronoun. Extensive research on these issues has proven very fruitful for the continuous development of our methodological tools and for our understanding of natural language meaning in context and its function for discourse structure.
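Stalnaker’s update picture, and the point of the Partee example, can be sketched computationally. The fragment below is our illustration under invented assumptions: each toy world is reduced to a single number recording how many of the ten balls are in the bag, so that asserting a proposition is intersection with the common ground.

```python
# Sketch of Stalnakerian update: the common ground is the set of worlds
# compatible with what has been mutually accepted; asserting proposition p
# replaces the common ground CG by its intersection with p.

WORLDS = set(range(11))          # invented toy worlds: world n = n balls in the bag

def assert_update(common_ground, prop):
    """Update by assertion: keep only the worlds where the proposition is true."""
    return common_ground & prop

sentence16 = {n for n in WORLDS if 10 - n == 1}   # exactly one ball is not in the bag
sentence17 = {n for n in WORLDS if n == 9}        # exactly nine balls are in the bag

cg = set(WORLDS)
print(assert_update(cg, sentence16) == assert_update(cg, sentence17))   # True

# Both assertions shrink the common ground to the same set ({9}), yet only
# (16) licenses the anaphoric "It": sets of worlds record truth-conditional
# information but no discourse referent for the missing ball.
```

This is precisely Kamp’s point: purely intersective update captures the shared truth conditions of (16) and (17), but something beyond sets of worlds, such as discourse referents, is needed to explain the contrast in anaphora.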
5. Meaning in contemporary semantics
Meaning is a notion investigated by a number of disciplines, including linguistics, philosophy, psychology, artificial intelligence, semiotics as well as many others. The definitions of meaning are as manifold and plentiful as the different theories and perspectives that arise from these disciplines. We have argued here that in order to use meaning as a well-defined object of investigation, we must perceive facts to be explained and have tests to expose the underlying phenomena, and we must have a well-defined scientific apparatus which allows us to describe, analyze and model these phenomena. This scientific apparatus is contemporary semantics: It possesses a clearly defined terminology, it provides abstract representations and it allows for formal modeling that adheres to scientific standards and renders predictions that can be verified or falsified. We have illustrated the tenets, tools and goals of contemporary semantics by discussing typical approaches to three central characteristics of meaning: truth conditionality, compositionality, and context and discourse. Recent times have witnessed an increased interest of semanticists in developing their theories on a broader basis of empirical evidence, taking into account crosslinguistic data, diachronic data, psycho- and neurolinguistic studies as well as corpus linguistic and computational linguistic resources. As a result of these efforts, contemporary semantics is characterized by a continuous explanatory progress, an increased awareness of and proficiency in methodological issues, and the emergence of new opportunities for interdisciplinary cooperation. Along these lines, the articles of this handbook develop an integral, many-faceted and yet well-rounded picture of this joint endeavour in the linguistic study of natural language meaning.
6. References
Davidson, Donald 1967. The logical form of action sentences. In: N. Rescher (ed.). The Logic of Decision and Action. Pittsburgh, PA: University of Pittsburgh Press, 81–95. Reprinted in: D. Davidson. Essays on Actions and Events. Oxford: Clarendon Press, 1980, 105–148.
Frege, Gottlob 1892/1980. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25–50. English translation in: P. Geach & M. Black (eds.). Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980, 56–78.
Heim, Irene 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation. University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms.
Kamp, Hans 1988. Belief attribution and context & Comments on Stalnaker. In: R.H. Grimm & D.D. Merrill (eds.). Contents of Thought. Tucson, AZ: The University of Arizona Press, 156–181, 203–206.
Kaplan, David 1977/1989. Demonstratives. An Essay on the Semantics, Logic, Metaphysics, and Epistemology of Demonstratives and Other Indexicals. Ms. Los Angeles, CA, University of California. Printed in: J. Almog, J. Perry & H. Wettstein (eds.). Themes from Kaplan. Oxford: Oxford University Press, 1989, 481–563.
Quine, Willard van Orman 1958. Speaking of objects. Proceedings and Addresses of the American Philosophical Association 31, 5–22.
Stalnaker, Robert 1970. Pragmatics. Synthese 22, 272–289.
Wolfart, H. Christoph 1973. Plains Cree: A Grammatical Study. Transactions of the American Philosophical Society, New Series, vol. 63, pt. 5. Philadelphia, PA: The American Philosophical Society.
Claudia Maienborn, Tübingen (Germany) Klaus von Heusinger, Stuttgart (Germany) Paul Portner, Washington, DC (USA)
2. Meaning, intentionality and communication
1. Introduction
2. Intentionality: Brentano’s legacy
3. Early pragmatics: ordinary language philosophy and speech act theory
4. Grice on speaker’s meaning and implicatures
5. The emergence of truth-conditional pragmatics
6. Concluding remarks: pragmatics and cognitive science
7. References
Abstract

This article probes the connections between the metaphysics of meaning and the investigation of human communication. It first argues that contemporary philosophy of mind has inherited most of its metaphysical questions from Brentano’s puzzling definition of intentionality. Then it examines how intentionality came to occupy the forefront of pragmatics in three steps. (1) By investigating speech acts, Austin and ordinary language philosophers pioneered the study of intentional actions performed by uttering sentences of natural languages. (2) Based on his novel concept of speaker’s meaning and his inferential view of human communication as a cooperative and rational activity, Grice developed a three-tiered model of the meaning of utterances: (i) the linguistic meaning of the uttered sentence; (ii) the explicit truth-conditional content of the utterance; (iii) the implicit content conveyed by the utterance. (3) Finally, the newly emerging truth-conditional trend in pragmatics urges that not only the implicit content conveyed by an utterance but its explicit content as well depends on the speaker’s communicative intention.
1. Introduction

This article lies at the interface between the scientific investigation of human verbal communication and metaphysical questions about the nature of meaning. Words and sentences of natural languages have meaning (or semantic properties) and they are used by humans in tasks of verbal communication. Much of twentieth-century philosophy of mind has been concerned with metaphysical questions raised by the perplexing nature of meaning. For example, what is it about the meaning of the English word “dog” that enables a particular token used in the USA in 2008 to latch onto hairy barking creatures that lived in Egypt four thousand years earlier (cf. Horwich 2005)?

Meanwhile, the study of human communication in the twentieth century can be seen as a competition between two models, which Sperber & Wilson (1986) call the “code model” and the “inferential model.” A decoding process maps a signal onto a message associated with the signal by an underlying code (i.e., a system of rules or conventions). An inferential process maps premises onto a conclusion, which is warranted by the premises. When an addressee understands a speaker’s utterance, how much of the content of the utterance has been coded into, and can be decoded from, the linguistic meaning of the utterance? How much content does the addressee retrieve by his ability to infer the speaker’s communicative intention? These are the basic scientific questions in the investigation of human verbal communication.
Much philosophy of mind in the twentieth century devoted to the metaphysics of meaning sprang from Brentano’s puzzling definition of the medieval word “intentionality” (section 2). Austin, one of the leading ordinary language philosophers, emphasized the fact that by uttering sentences of some natural language, a speaker may perform an action, i.e., a speech act (section 3). But he espoused a social conventionalist view of speech acts, which later pragmatics rejected in favor of an inferential approach. Grice instead developed an inferential model of verbal communication based on his concept of speaker’s meaning and his view that communication is a cooperative and rational activity (section 4). However, many of Grice’s insights have been further developed into a non-Gricean truth-conditional pragmatics (section 5). Finally, the “relevance-theoretic” approach pioneered by Sperber & Wilson (1986) fills part of the gap between the study of meaning and the cognitive sciences (section 6).
2. Intentionality: Brentano’s legacy

Brentano (1874) made a twofold contribution to the philosophy of mind: he provided a puzzling definition of intentionality and he put forward the thesis that intentionality is “the mark of the mental.” Intentionality is the power of minds to be about things, properties, events and states of affairs. As the meaning of its Latin root (tendere) indicates, “intentionality” denotes the mental tension whereby the human mind aims at so-called “intentional objects.”

The concept of intentionality should not be confused with the concept of intention. Intentions are special psychological states involved in the planning and execution of actions. But on Brentano’s view, intentionality is a property of all psychological phenomena. Nor should “intentional” and “intentionality” be confused with the predicates “intensional” and “intensionality,” which mean “non-extensional” and “non-extensionality”: they refer to logical features of sentences and utterances, some of which may describe (or report) an individual’s psychological states. “Creature with a heart” and “creature with a kidney” have the same extension: all creatures with a heart have a kidney and conversely (cf. Quine 1948). But they have different intensions because having a heart and having a kidney are different properties. This distinction mirrors Frege’s (1892) distinction between sense and reference (cf. article 3 (Textor) Sense and reference and article 4 (Abbott) Reference). In general, a linguistic context is non-extensional (or intensional) if it fails to license both the substitution of coreferential terms salva veritate and the application of the rule of existential generalization.

As Brentano defined it, intentionality is what enables a psychological state or act to represent a state of affairs, or be directed upon what he called an “intentional object.” Intentional objects exemplify the property which Brentano called “intentional inexistence” or “immanent objectivity,” by which he meant that the mind may aim at targets that do not exist in space and time or represent states of affairs that fail to obtain or even be possible. For example, unicorns do not exist in space and time and round squares are not possible geometrical objects. Nonetheless thinking about either a unicorn or a round square is not thinking about nothing. To admire Sherlock Holmes or to love Anna Karenina is to admire or love something, i.e., some intentional object. Thus, Brentano’s characterization of intentionality gave rise to a divide in twentieth-century philosophical logic between intentional-objects theorists (Meinong 1904; Parsons 1980; Zalta 1988), who claimed that there must be things that do not exist, and their opponents (Russell 1905;
Quine 1948), who denied it and rejected the distinction between being and existence. (For further discussion, cf. Jacob 2003.)

Brentano (1874) also held the thesis that intentionality is constitutive of the mental: all and only psychological phenomena exhibit intentionality. Brentano’s second thesis that only psychological (or mental) phenomena possess intentionality led him to embrace a version of the Cartesian ontological dualism between mental and physical things. Chisholm (1957) offered a linguistic version of Brentano’s second thesis, according to which the intensionality of a linguistic report is a criterion of the intentionality of the reported psychological state (cf. Jacob 2003). He further argued that the contents of sentences describing an agent’s psychological states cannot be successfully paraphrased into the behaviorist idiom of sentences describing the agent’s bodily movements and behavior.

Quine (1960) accepted Chisholm’s (1957) linguistic version of Brentano’s second thesis, which he used as a premise for an influential dilemma: if the intentional idiom is not reducible to the behaviorist idiom, then the intentional idiom cannot be part of the vocabulary of the natural sciences and intentionality cannot be “naturalized.” Quine’s dilemma was that one must choose between a physicalist ontology and intentional realism, i.e., the view that intentionality is a real phenomenon. Unlike Brentano, Quine endorsed physicalism and rejected intentional realism. Some of the physicalists who accept Quine’s dilemma (e.g., Churchland 1989) have embraced eliminative materialism and denied the reality of beliefs and desires. The short answer to this proposal is that it is difficult to make sense of the belief that there are no beliefs. Others (such as Dennett 1987) have taken the “instrumentalist” view that, although the intentional idiom is a useful stance for predicting a complex physical system’s behavior, it lacks an explanatory value. But the question arises how the intentional idiom could make useful predictions if it fails to describe and explain anything (cf. Jacob 1997, 2003 and Rey 1997).

As a result of the difficulties inherent in both eliminative materialism and interpretive instrumentalism, several physicalists have chosen to deny Brentano’s thesis that only non-physical things exhibit intentionality, and to challenge Quine’s dilemma according to which intentional realism is not compatible with physicalism. Their project is to “naturalize” intentionality and account for the puzzling features of intentionality (e.g., the fact that the mind may aim at non-existing objects and represent non-actual states of affairs), using only concepts recognizable by natural scientists (cf. section 4 on Grice’s notion of non-natural meaning). In recent philosophy of mind, the most influential proposals for naturalizing intentionality have been versions of the so-called “teleosemantic” approach championed by Millikan (1984), Dretske (1995) and others, which is based on the notion of biological function (or purpose). Teleosemantic theories are so-called because they posit an underlying connection between teleology (design or function) and content (or intentionality): a representational device is endowed with a function (or purpose). Something whose function is to indicate the presence of some property may fail to fulfill its function. If and when it does, then it may generate a false representation or represent something that fails to exist.
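The extension/intension contrast drawn earlier in this section can be made concrete by modeling an intension, in the usual possible-worlds fashion, as a function from worlds to extensions. The sketch below is illustrative only; the worlds and creatures are invented.

```python
# Same extension at the actual world, different intensions: an intension is
# modeled as a function from a world to an extension (a set of individuals).

worlds = ["actual", "counterfactual"]

def creatures_with_heart(world):
    # In the invented counterfactual world, "blob" has a heart but no kidney.
    if world == "counterfactual":
        return {"alice", "rex", "blob"}
    return {"alice", "rex"}

def creatures_with_kidney(world):
    return {"alice", "rex"}

# Identical extensions at the actual world ...
assert creatures_with_heart("actual") == creatures_with_kidney("actual")
# ... yet distinct intensions, since the functions diverge at some world.
assert any(creatures_with_heart(w) != creatures_with_kidney(w) for w in worlds)
print("same extension, different intension")
```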
Brentano’s thesis that only mental phenomena exhibit intentionality also seems open to the challenge that expressions of natural languages, which are not mental things, have intentionality in virtue of which they too can represent things, properties, events and
states of affairs. In response, many philosophers of mind, such as Grice (1957, 1968), Fodor (1987), Haugeland (1981) and Searle (1983, 1992), have endorsed the distinction between the underived intentionality of a speaker’s psychological states and the derived intentionality (i.e., the conventional meaning) of the sentences by the utterance of which she expresses her mental states. On their view, sentences of natural languages would lack meaning unless humans used them for some purpose. (But for dissent, see Dennett 1987.)

Some philosophers go one step further and posit the existence of an internal “language of thought”: thinking, having a thought or a propositional attitude, is to entertain a token of a mental formula realized in one’s brain. On this view, like sentences of natural languages, mental sentences possess syntactic and semantic properties. But, unlike sentences of natural languages, they lack phonological properties. Thus, the semantic properties of a complex mental sentence systematically depend upon the meanings of its constituents and their syntactic combination. The strongest arguments for the existence of a language of thought are based on the productivity and systematicity of thoughts, i.e., the facts that there is no upper limit on the complexity of thoughts and that a creature able to form certain thoughts must be able to form other related thoughts. On this view, the intentionality of an individual’s thoughts and propositional attitudes derives from the meanings of symbols in the language of thought (cf. Fodor 1975, 1987).
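The compositionality claim in the last paragraph, that the semantic properties of a complex expression are determined by those of its parts and their combination, can be illustrated with a toy evaluator. The two-word lexicon and the bracketing conventions are invented for the example.

```python
# A toy compositional evaluator: the meaning of a complex expression is
# computed from the meanings of its parts and their mode of combination.

LEXICON = {
    "Ann": "ann",
    "Bob": "bob",
    "cries": lambda x: x in {"ann"},                          # who cries
    "loves": lambda y: lambda x: (x, y) in {("bob", "ann")},  # who loves whom
}

def interpret(tree):
    """Binary-branching trees; combination is function application."""
    if isinstance(tree, str):
        return LEXICON[tree]
    left, right = (interpret(t) for t in tree)
    return left(right) if callable(left) else right(left)

print(interpret(("Ann", "cries")))           # True
print(interpret(("Bob", ("loves", "Ann"))))  # True
# Systematicity: the same finite rules interpret recombined parts.
print(interpret(("Ann", ("loves", "Bob"))))  # False
```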
3. Early pragmatics: ordinary language philosophy and speech act theory

Unlike sentences of natural languages, utterances are created by speakers, at particular places and times, for various purposes, including verbal communication. Not all communication, however, need be verbal. Nor do people use language solely for the purpose of communication; one can use language for clarifying one’s thoughts, reasoning and making calculations. Utterances, not sentences, can be shouted in a hoarse voice and tape-recorded. Similarly, the full meaning of an utterance goes beyond the linguistic meaning of the uttered sentence in two distinct respects: both its representational content and its so-called “illocutionary force” (i.e., whether the utterance is meant as a prediction, a threat or an assertion) are underdetermined by the linguistic meaning of the uttered sentence.

Prior to the cognitive revolution of the 1950’s, the philosophy of language was divided into two opposing approaches: so-called “ideal language” philosophy (in the tradition of Frege, Russell, Carnap and Tarski) and so-called “ordinary language” philosophy (in the tradition of Wittgenstein, Austin, Strawson and later Searle). The word “pragmatics,” which derives from the Greek word praxis (which means action), was first introduced by ideal language philosophers as part of a threefold distinction between syntax, semantics and pragmatics (cf. Morris 1938 and Carnap 1942). Syntax was defined as the study of internal relations among symbols of a language. Semantics was defined as the study of the relations between symbols and their denotations (or designata). Pragmatics was defined as the study of the relations between symbols and their users (cf. article 88 (Jaszczolt) Semantics and pragmatics).

Ideal language philosophers were interested in the semantic structures of sentences of formal languages designed for capturing mathematical truths. The syntactic structure
of any “well-formed formula” (i.e., sentence) of a formal language is defined by arbitrary rules of formation and derivation. Semantic values are assigned to simple symbols of the language by stipulation, and the truth-conditions of a sentence can be mechanically determined from the semantic values of its constituents by the syntactic rules of composition. From the perspective of ideal language philosophers, such features of natural languages as their context-dependency appeared as a defect. For example, unlike formal languages, natural languages contain indexical expressions (e.g., “now”, “here” or “I”) whose references can change with the context of utterance.

By contrast, ordinary language philosophers were concerned with the distinctive features of the meanings of expressions of natural languages and the variety of their uses. In sharp opposition to ideal language philosophers, ordinary language philosophers stressed two main points, which paved the way for later work in pragmatics. First, they emphasized the context-dependency of the descriptive content expressed by utterances of sentences of natural languages (see section 4). Austin (1962a: 110–111) denied that a sentence as such could ever be ascribed truth-conditions and a truth-value: “the question of truth and falsehood does not turn only on what a sentence is, nor yet on what it means, but on, speaking very broadly, the circumstances in which it is uttered.” Secondly, they criticized what Austin (1962b) called the “descriptive fallacy,” according to which the sole point of using language is to state facts or describe the world (cf. article 5 (Green) Meaning in language use). As indicated by the title of Austin’s (1962b) book, How to Do Things with Words, they argued that by uttering sentences of some natural language, a speaker performs an action, i.e., a speech act: she performs an “illocutionary act” with a particular illocutionary force. A speaker may give an order, ask a question, make a threat, a promise, an entreaty, an apology, an assertion and so on.

Austin (1962b) sketched a new framework for the description and classification of speech acts. As Green (2007) notes, speech acts are not to be confused with acts of speech: “one can perform an act of speech, say by uttering words in order to test a microphone, without performing a speech act.” Conversely, one can issue a warning without saying anything, by producing a gesture or a “minatory facial expression.” Austin (1962b) identified three distinct levels of action in the performance of a speech act: the “locutionary act,” the “illocutionary act,” and the “perlocutionary act,” which stand to one another in the following hierarchical structure. By uttering a sentence, a speaker performs the locutionary act of saying something, by virtue of which she performs an illocutionary act with a given illocutionary force (e.g., giving an order). Finally, by performing an illocutionary act endowed with a specific illocutionary force, the speaker performs a perlocutionary act, whereby she achieves some psychological or behavioral effect upon her audience, such as frightening him or convincing him.

Before he considered this threefold distinction within the structure of speech acts, Austin had made a distinction between so-called “constative” and “performative” utterances. The former is supposed to describe some state of affairs and is true or false according to whether the described state of affairs obtains or not.
Instead of being a (true or false) description of some independent state of affairs, the latter is supposed to constitute (or create) a state of affairs of its own. Clearly, the utterance of a sentence in either the imperative mood (“Leave this room immediately!”) or the interrogative mood (“What time is it right now?”) is performative in this sense: far from purporting to register any pre-existing state of affairs, the speaker either gives an order or asks a question.
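Austin’s three levels of action, introduced above, can be pictured as three descriptions of one utterance event. The record below is only a mnemonic sketch; the field names are mine, not Austin’s.

```python
# Illustrative record of Austin's three levels of action in one utterance.
from dataclasses import dataclass

@dataclass
class SpeechAct:
    locution: str     # the act of saying something with a certain meaning
    illocution: str   # what is done *in* saying it (its illocutionary force)
    perlocution: str  # what is done *by* saying it (the effect on the audience)

act = SpeechAct(
    locution="You will leave",
    illocution="order",  # the sentence underdetermines this: the same
                         # locution could carry a prediction or a bet
    perlocution="the addressee is frightened into leaving",
)
print(act)
```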
By drawing the distinction between constative and performative utterances, Austin was able to criticize the descriptive fallacy and emphasize the fact that many utterances of declarative sentences are performative (not constative) utterances. In particular, Austin was interested in explicit performative utterances (“I promise I’ll come,” “I order you to leave” or “I apologize”), which include a main verb that denotes the very speech act that the utterance performs. Austin’s attention was drawn towards explicit performatives whose performance is governed, not merely by linguistic rules, but also by social conventions and by what Searle (1969: 51) called “institutional facts” (as in “I hereby pronounce you husband and wife”), i.e., facts that (unlike “brute facts”) presuppose the existence of human institutions. Specific bodily movements count as a move in a game, as an act of, e.g., betting, or as part of a marriage ceremony only if they conform to some conventions that are part of some social institutions. For a performative speech act to count as an act of baptism, of marriage, or an oath, the utterance must meet some social constraints, which Austin calls “felicity” conditions. Purported speech acts of baptism, oath or marriage can fail some of their felicity conditions and thereby “misfire” if either the speaker lacks the proper authority or the addressee fails to respond with an appropriate uptake – in response to, e.g., an attempted bet sincerely made by the speaker. If a speaker makes an insincere promise, then he is guilty of an “abuse.”

Austin came to abandon his former distinction between constative and performative utterances when he realized that some explicit performatives can be used to make true or false assertions or predictions. One can make an explicit promise or an explicit request by uttering a sentence prefixed by either “I promise” or “I request.” One can also make an assertion or a prediction by uttering a sentence prefixed by either “I assert” or “I predict.”

Furthermore, two of his assumptions led Austin to embrace a social conventionalist view of illocutionary acts. First, Austin took explicit performatives as a general model for illocutionary acts. Secondly, he took explicit performatives whose felicity conditions include the satisfaction of social conventions as a paradigm of all explicit performatives. Thus Austin (1962b: 103) was led to embrace a social conventionalist view of illocutionary acts according to which the illocutionary force of a speech act is “conventional in the sense that it could be made explicit by the performative formula.”

Austin’s social conventionalist view of illocutionary force was challenged by Strawson (1964: 153–154), who pointed out that the assumption that no illocutionary act could be performed unless it conformed to some social convention would be “like supposing that there could not be love affairs which did not proceed on lines laid down in the Roman de la Rose.” Instead, Strawson argued, what confers on a speech act its illocutionary force is that the speaker intends it to be so taken by her audience. By uttering “You will leave,” the speaker may make a prediction, a bet, or order the addressee to leave. Only the context, not some socially established convention, may help the audience determine the particular illocutionary force of the utterance. Also, as noted by Searle (1975) and by Bach & Harnish (1979), speech acts may be performed indirectly.
For example, by uttering “I would like you to leave,” a speaker directly expresses her desire that her addressee leave. But in so doing, she may indirectly ask or request her addressee to do so. By uttering “Can you pass the salt?” – which is a direct question about her addressee’s ability – , the speaker may indirectly request him to pass the salt. As Recanati (1987: 92–93) argues, when a speaker utters an explicit performative such as “I order you to leave,” her utterance has the direct illocutionary force of a statement. But it may also have the indirect force of an order. There need be no socially
established convention whereby a speaker orders her audience to leave by means of an utterance with a verb that denotes the act performed by the speaker.
4. Grice on speaker’s meaning and implicatures

In his seminal 1957 paper, Grice did three things: he drew a contrast between “natural” and “non-natural” meaning; he offered a definition of the novel concept of speaker’s meaning; and he sketched a framework within which human communication is seen as a cooperative and rational activity (the addressee’s task being to infer the speaker’s meaning on the basis of her utterance, in accordance with a few principles of rational cooperation). In so doing, Grice took a major step towards an “inferential model” of human communication, and away from the “code model” (cf. article 88 (Jaszczolt) Semantics and pragmatics).

As Grice (1957) emphasized, smoke is a natural sign of fire: the former naturally means the latter in the sense that not unless there was a fire would there be any smoke. By contrast, the English word “fire” (or the French word “feu”) non-naturally means fire: if a person erroneously believes that there is a fire (or wants to intentionally mislead another into wrongly thinking that there is a fire) when there is none, then she can produce a token of the word “fire” in the absence of a fire. (Thus, the notion of non-natural meaning is Grice’s counterpart of Brentano’s intentionality.)

Grice (1957, 1968, 1969) further introduced the concept of speaker’s meaning, i.e., of someone meaning something by exhibiting some piece of behavior that can, but need not, be verbal. For a speaker S to mean something by producing some utterance x is for S to intend the utterance of x to produce some effect (or response r) in an audience A by means of A’s recognition of this very intention. Hence, the speaker’s meaning is a communicative intention, with the peculiar feature of being reflexive in the sense that part of its content is that an audience recognize it.

Strawson (1964) turned to Grice’s concept of speaker’s meaning as an intentionalist alternative to Austin’s social conventionalist account of illocutionary acts (section 3). Strawson (1964) also pointed out that Grice’s complex analysis of speaker’s meaning or communicative intention requires the distinction between three complementary levels of intention. For S to mean something by an utterance x is for S to intend:

(i) S’s utterance of x to produce a response r in audience A;
(ii) A to recognize S’s intention (i);
(iii) A’s recognition of S’s intention (i) to function at least as part of A’s reason for A’s response r.

This analysis raises two opposite problems: it is both overly restrictive and insufficiently so. First, as Strawson’s reformulation shows, Grice’s condition (i) corresponds to S’s intention to perform what Austin (1962b) called a perlocutionary act. But for S to successfully communicate with A, it is not necessary that S’s intention to perform her perlocutionary act be fulfilled (cf. Searle 1969: 46–48). Suppose that S utters: “It is raining,” intending (i) to produce in A the belief that it is raining. A may recognize S’s intention (i); but, for some reason, A may mistrust S and fail to acquire the belief that it is raining. S would have failed to convince A (that it is raining), but S would nonetheless have successfully communicated what she meant to A. Thus, fulfillment of S’s intention (i) is not
necessary for successful communication. Nor is the fulfillment of S’s intention (iii), which presupposes the fulfillment of S’s intention (i). All that is required for S to communicate what she meant to A is the recognition described in (ii): A’s recognition of S’s higher-order intention to inform A of her first-order intention to inform A of something.

Secondly, Strawson (1964) pointed out that his reformulation of Grice’s definition of speaker’s meaning is insufficiently restrictive. Following Sperber & Wilson (1986: 30), suppose that S intends A to believe that she needs his help to fix her hair-dryer, but she is reluctant to ask him openly to do so. S ostensively offers A evidence that she is trying to fix her hair-dryer, thereby intending A to believe that she needs his help. S intends A to recognize her intention to inform him that she needs his help. However, S does not want A to know that she knows that he is watching her. Since S is not openly asking A to help her, she is not communicating with A. Although S has the second-order intention that A recognize her first-order intention to inform him that she needs his help, she does not want A to recognize her second-order intention. To deal with such a case, Strawson (1964) suggested that the analysis of Grice’s speaker’s meaning include S’s third-order intention to have her second-order intention recognized by her audience. But as Schiffer (1972) pointed out, this opens the way to an infinity of higher-order intentions. Instead, Schiffer (1972) argued that for S to have a communicative intention, S’s intention to inform A must be mutually known to S and A. But as pointed out by Sperber & Wilson (1986: 18–19), people who share mutual knowledge know that they do. So the question arises: how do speaker and hearer know that they do? (We shall come back to this issue in the concluding remarks.)

Grice (1968) thought of his concept of speaker’s meaning as a basis for a reductive analysis of semantic notions such as sentence- or word-meaning. But most linguists and philosophers have expressed skepticism about this aspect of Grice’s program (cf. Chomsky 1975, 1980). By contrast, many assume that some amended version of Grice’s concept of speaker’s meaning can serve as a basis for an inferential model of human communication.

In his 1967 William James Lectures, Grice argued that what enables the hearer to infer the speaker’s meaning on the basis of her utterance is that he rationally expects all utterances to meet the “Cooperative Principle” and a set of nine maxims or norms organized into four main categories which, by reference to Kant, he labeled maxims of Quantity (informativeness), Quality (truthfulness), Relation (relevance) and Manner (clarity).

As ordinary language philosophers emphasized, in addition to what is said by an assertion – what makes the assertion true or false – the very performance of an illocutionary act with the force of an assertion has pragmatic implications. For example, consider Moore’s paradox: by uttering “It is raining but I do not believe it,” the speaker is not expressing a logical contradiction, as there is no logical contradiction between the fact that it is raining and the fact that the speaker fails to believe it. Nonetheless, the utterance is pragmatically paradoxical because by asserting that it is raining, the speaker thereby expresses (or displays) her belief that it is raining, but her utterance explicitly denies that she believes it.
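Strawson’s three-level reformulation discussed above, and the moral that success requires only recognition of the informative intention while (i) and (iii) may go unfulfilled, can be set out schematically. This is only a sketch; the class and its labels are mine.

```python
# Sketch of Strawson's three levels of intention in Grice's speaker's meaning.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Intention:
    content: str
    about: Optional["Intention"] = None  # higher-order intentions target lower ones

i1 = Intention("A forms the belief that it is raining")                # (i)
i2 = Intention("A recognizes intention (i)", about=i1)                 # (ii)
i3 = Intention("A's recognizing (i) is part of A's reason", about=i2)  # (iii)

def communication_succeeds(intentions_A_recognizes):
    """A's recognizing intention (i) just is the fulfillment of (ii), and
    that alone suffices; (i) and (iii) may remain unfulfilled."""
    return i1 in intentions_A_recognizes

# A recognizes S's intention but, mistrusting S, never forms the belief:
# S has nonetheless communicated what she meant.
print(communication_succeeds({i1}))  # True
```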
Grice’s (1967/1975) third main contribution to an inferential model of communication was his concept of conversational implicature, which he introduced as “a term of art” (cf. Grice 1989: 24). Suppose that Bill asks Jill whether she is going out and Jill replies: “It’s raining.” For Jill’s utterance about the weather to constitute a response to Bill’s question,
additional assumptions are required, such as, for example, that Jill does not like rain (i.e., that if it is raining, then Jill is not going out), which, together with Jill’s response, may entail that she is not going out. Grice’s approach to communication, based on the Cooperative Principle and the maxims, offers a framework for explaining how, from Jill’s utterance, Bill can retrieve an implicit answer to his question by supplying some additional assumptions. Bill must be aware that Jill’s utterance is not a direct answer to his question. Assuming that Jill does not violate (or “flout”) the maxim of relevance, she must have intended Bill to supply the assumption that, e.g., she does not enjoy rain, and to infer that she is not going out from her explicit utterance. Grice (1967/1975) called the additional assumption and the conclusion “conversational” implicatures. In other words, Grice’s conversational implicatures enable a hearer to reconcile a speaker’s utterance with his assumption that the speaker conforms to the Cooperative Principle.

Grice (1989: 31) insisted that “the presence of a conversational implicature must be capable of being worked out; for even if it can in fact be intuitively grasped, unless the intuition is replaceable by an argument, the implicature (if present at all) will not count as a conversational implicature.” (Instead, it would count as a so-called “conventional” implicature, i.e., a conventional aspect of meaning that makes no contribution to the truth-conditions of the utterance.) Grice further distinguished “generalized” conversational implicatures, which are generated so to speak “by default,” from “particularized” conversational implicatures, whose generation depends on special features of the context of utterance.

Grice’s application of his cooperative framework to human communication and his elaboration of the concept of (generalized) conversational implicature were motivated by his concern to block certain moves made by ordinary language philosophers. One such move was exemplified by Strawson’s (1952) claim that, unlike the truth-functional conjunction of propositional calculus, the English word “and” makes different contributions to the full meanings of the utterances of pairs of conjoined sentences. For example, by uttering “John took off his boots and got into bed” the speaker may mean that the event described first took place first. In response, Grice (1967/1975, 1981) argued that, in accordance with the truth-table of the logical conjunction of propositional calculus, the utterance of any pair of sentences conjoined by “and” is true if and only if both conjuncts are true, and false otherwise. He took the view that the temporal ordering of the sequence of events described by such an utterance need not be part of the semantic content (or truth-conditions) of the utterance. Instead, it arises as a conversational implicature retrieved by the hearer through an inferential process guided by his expectation that the speaker is following the Cooperative Principle and the maxims, e.g., the sub-maxim of orderliness (one of the sub-maxims of the maxim of Manner), according to which there is some reason why the speaker chose to utter the first conjunct first.

Also, under the influence of Wittgenstein, some ordinary language philosophers claimed that unless there are reasons to doubt whether some thing is really red, it is illegitimate to say “It looks red to me” (as opposed to “It is red”).
In response, Grice (1967/1975) argued that whether an utterance is true or false is one thing; whether it is odd or misleading is another (cf. Carston 2002a: 103; but see Travis 1991 for dissent).
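The working-out of Jill’s implicature, described above, amounts to a small derivation: Bill supplies an implicated premise under which Jill’s utterance, taken as relevant, entails an answer to his question. A schematic rendering (the propositional labels are invented):

```python
# Schematic working-out of a conversational implicature (Jill and Bill).

what_is_said = {"it_is_raining"}

# Implicated premise Bill attributes to Jill to preserve relevance:
# if it is raining, Jill is not going out.
implicated_premise = ("it_is_raining", "jill_is_not_going_out")

def modus_ponens(premises, conditional):
    """From p and (p -> q), conclude q."""
    antecedent, consequent = conditional
    return premises | {consequent} if antecedent in premises else premises

conclusions = modus_ponens(what_is_said, implicated_premise)
print(conclusions)  # the implicated conclusion answers Bill's question
```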
5. The emergence of truth-conditional pragmatics

Grice’s seminal work made it clear that verbal communication involves three layers of meaning: (i) the linguistic (conventional) meaning of the sentence uttered, (ii) the explicit content expressed (i.e., “what is said”) by the utterance, and (iii) the implicit content of the utterance (its conversational implicatures). Work in speech act theory further suggests that each layer of meaning also exhibits a descriptive dimension (e.g., the truth conditions of an utterance) and a pragmatic dimension (e.g., the fact that a speech act is an assertion). Restricting itself to the descriptive dimension of meaning, the rest of this section discusses the emergence of a new truth-conditional pragmatic approach, whose core thesis is that what is said (not just the conversational implicatures of an utterance) depends on the speaker’s meaning. By further extending the inferentialist model of communication, this pragmatic approach to what is said contravenes two deeply entrenched principles in the philosophy of language: literalism and minimalism.

Ideal language philosophers thought of indexicality and other context-sensitive phenomena as defective features of natural languages. Quine (1960: 193) introduced the concept of an eternal sentence as one devoid of any context-sensitive or ambiguous constituent, so that its “truth-value stays fixed through time and from speaker to speaker.” An instance of an eternal sentence might be: “Three plus two equals five.” Following Quine, many philosophers (see e.g., Katz 1981) subsequently accepted literalism, i.e., the view that for any statement made in some natural language, using a context-sensitive sentence in a given context, there is some eternal sentence in the same language that can be used to make the same statement in any context. Few linguists and philosophers nowadays subscribe to literalism because they recognize that indexicality is an ineliminable feature of natural languages. However, many subscribe to minimalism.

Grice urged an inferential model of the pragmatic process whereby a hearer infers the conversational implicatures of an utterance from what is said. But he embraced the minimalist view that what is said departs from the linguistic meaning of the uttered sentence only as far as is necessary for the utterance to be truth-evaluable (cf. Grice 1989: 25). If a sentence contains an ambiguous phrase (e.g., “He is in the grip of a vice”), then it must be disambiguated. If it contains an indexical, then it cannot be assigned its proper semantic value except by relying on contextual information. But according to minimalism, appeal to contextual information is always mandated by some linguistic constituent (e.g., an indexical) within the sentence. In order to determine what is said by the utterance of a sentence containing, e.g., the indexical pronoun “I,” the hearer relies on the rule according to which any token of “I” refers to the speaker who used that token. As Stanley (2000: 391) puts it, “all truth-conditional effects of extralinguistic context can be traced to logical form” (i.e., the semantic information that is grammatically encoded).

Unlike the reference of a pure indexical like “I,” however, the reference of a demonstratively used pronoun (e.g., “he”) can only be determined by representing the speaker’s meaning, not by a semantic rule. The same holds for determining the semantic value of “here” or “now.” A person may use a token of “here” to refer to a room, a street, a city, a country, the Earth, and so forth.
Similarly, a person may use a token of “now” to refer to a millisecond, an hour, a day, a year, a century, and so forth. One cannot determine the semantic value of a token of either “here” or “now” without representing the speaker’s meaning.
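The contrast just drawn can be put in terms of rules versus speaker’s meaning. In the sketch below (the context record is invented; the treatment of “I” follows the rule stated above), “I” is resolved by a semantic rule alone, while “here” needs a further parameter that only the speaker’s communicative intention supplies.

```python
# Sketch: a pure indexical is resolved by rule; "here" is not.

context = {
    "speaker": "Jill",
    "location": {"room": "Room 12", "city": "Paris", "planet": "Earth"},
}

def ref_I(ctx):
    """Semantic rule: any token of 'I' refers to the speaker of the context."""
    return ctx["speaker"]

def ref_here(ctx, granularity):
    """No rule fixes how wide 'here' reaches; the granularity parameter
    stands in for the speaker's meaning."""
    return ctx["location"][granularity]

print(ref_I(context))             # 'Jill'
print(ref_here(context, "city"))  # 'Paris'; the same token could name the room
```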
According to truth-conditional pragmatics, what is said by an utterance is determined by pragmatic processes which are not necessarily triggered by some syntactic constituent of the uttered sentence (e.g., an indexical). By contrast, minimalists reject truth-conditional pragmatics and postulate, in the logical form of the sentence uttered, the existence of hidden variables whose semantic values must be contextually determined for the utterance to be truth-evaluable (see the controversy between Stanley 2000 and Recanati 2004 over whether the logical form of an utterance of “It’s raining” contains a free variable for locations).

The rise of truth-conditional pragmatics may be interpreted (cf. Travis 1991) as vindicating the view that an utterance’s truth-conditions depend on what Searle (1978, 1983) calls “the Background,” i.e., a network of practices and unarticulated assumptions (but see Stalnaker 1999 and cf. article 38 (Dekker) Dynamic semantics for a semantic approach). Although the verb “to cut” is unambiguous, what counts as cutting grass differs from what counts as cutting a cake. Only against alternative background assumptions will one be able to discriminate the truth-conditions of “John cut the grass” and of “John cut the cake.” However, advocates of minimalism argue that if, instead of using a lawn mower, John took out his pocket-knife and cut each blade lengthwise, then by uttering “John cut the grass” the speaker would speak the truth (cf. Cappelen & Lepore 2005).

Three pragmatic processes involved in determining what is said by an utterance have been particularly investigated by advocates of truth-conditional pragmatics: free enrichment, loosening and transfer.
5.1. Free enrichment

Grice (1967/1975, 1981) offered a pragmatic account according to which the temporal or causal ordering between the events described by the utterance of a conjunction is conveyed as a conversational implicature. But consider Carston’s (1988) example: “Bob gave Mary his key and she opened the door.” Carston (1988) argues that part of what is said is that “she” refers to whoever “Mary” refers to and that Mary opened the door with the key Bob gave her. If so, then the fact that Bob gave Mary his key before Mary opened the door is also part of what is said.

Following Sperber & Wilson (1986: 189), suppose a speaker utters “I have had breakfast” as an indirect way of declining an offer of food. By minimalist standards, what the speaker said was that she has had breakfast at least once in her life prior to her utterance. According to Grice, the hearer must be able to infer a conversational implicature from what the speaker said. However, the hearer could not conclude that the speaker does not wish any food from the truism that she has had breakfast at least once in her life before her utterance. Instead, for the hearer to infer that the speaker does not wish to have food in response to his question, what the speaker must have said is that she has had breakfast just prior to the time of utterance.
5.2. Loosening

Cases of free enrichment are instances of strengthening the concept linguistically encoded by the meaning of the sentence – for example, strengthening the concept encoded by “the key” into the concept expressible by “the key Bob gave to Mary”. However,
not all pragmatic processes underlying the generation of what is said from the linguistic meaning of the sentence are processes of conceptual strengthening or narrowing. Some are processes of conceptual loosening or broadening. For example, imagine a speaker’s utterance in a restaurant of “My steak is raw” whereby what she says is not that her steak is literally uncooked but rather that it is undercooked.
5.3. Transfer

Strengthening and loosening are cases of modification of a concept linguistically encoded by the meaning of a word. Transfer is a process whereby a concept encoded by the meaning of a word is mapped onto a related but different concept. Transfer is illustrated by examples from Nunberg (1979, 1995): “The ham sandwich left without paying” and “I am parked out back.” In the first example, the property expressed by the predicate “left without paying” is ascribed to the person who ordered the sandwich, not to the sandwich itself. In the second example, the property of being parked out back is ascribed not to the speaker, but to the car that stands in the ownership relation to her.

The gist of truth-conditional pragmatics is that speaker’s meaning is involved in determining both the conversational implicatures of an utterance and what is said. As the following example shows, however, it is not always easy to decide whether a particular assumption is a conversational implicature of an utterance or part of what is said. Consider “The picnic was awful. The beer was warm.” For the second sentence to offer a justification (or explanation) of the claim expressed by the first, the assumption must be made that the beer was part of the picnic. According to Carston (2002b), the assumption that the beer was part of the picnic is a conversational implicature (an implicated premise) of the utterance. According to Recanati (2004), the concept linguistically encoded by “the beer” is strengthened into the concept expressible by “the beer that was part of the picnic” and is part of what is said.
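For a crude side-by-side picture, one can treat a concept, just for illustration, as the set of things it applies to; enrichment then narrows the set, loosening widens it, and transfer maps it onto a related set. Real accounts operate on concepts rather than extensions, and all the sets below are invented.

```python
# Crude extensional sketch of free enrichment, loosening and transfer.

def enrich(encoded, restriction):
    """Free enrichment: strengthen (narrow) the encoded concept."""
    return encoded & restriction

def loosen(encoded, extra_cases):
    """Loosening: broaden the encoded concept."""
    return encoded | extra_cases

def transfer(concept, relation):
    """Transfer: map the concept onto a related but different one."""
    return {relation[x] for x in concept}

KEY = {"key1", "key2", "key3"}
print(enrich(KEY, {"key2"}))               # "the key Bob gave Mary"
RAW = {"uncooked_steak"}
print(loosen(RAW, {"undercooked_steak"}))  # "raw" used for "undercooked"
orders = {"ham_sandwich": "customer_at_table_7"}
print(transfer({"ham_sandwich"}, orders))  # Nunberg's deferred reference
```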
6. Concluding remarks: pragmatics and cognitive science

Sperber & Wilson’s (1986) relevance-theoretic approach squarely belongs to truth-conditional pragmatics: it makes three contributions towards bridging the gap between pragmatics and the cognitive sciences.

First, it offers a novel account of speaker’s meaning. As Schiffer (1972) pointed out, not unless S’s intention to inform A is mutually known to S and A could S’s intention count as a genuine communicative intention (cf. section 4). But how could S and A know that they mutually know S’s intention to inform A of something? Sperber & Wilson (1986) argue that they cannot, and urge that the mutual knowledge requirement be replaced by the idea of mutual manifestness. An assumption is manifest to S at t if and only if S is capable of representing and accepting it as true at t. A speaker’s informative intention is an intention to make (more) manifest to an audience a set of assumptions {I}. A speaker’s communicative intention is her intention to make it mutually manifest that she has the above informative intention. Hence, a communicative intention is a second-order informative intention.

Secondly, relevance theory is so-called because Sperber & Wilson (1986) accept a Cognitive principle of relevance, according to which human cognition is geared towards the maximization of relevance. Relevance is a property of an input for an individual at t: it depends on both the set of contextual effects and the cost of processing, where the
contextual effect of an input might be the set of assumptions derivable from processing the input in a given context. Other things being equal, the greater the set of contextual effects achieved by processing an input, the more relevant the input. The greater the effort required by the processing, the lower the relevance of the input. They further accept a Communicative principle of relevance, according to which every ostensively produced stimulus conveys a presumption of its own relevance: an ostensive stimulus is optimally relevant if and only if it is relevant enough to be worth the audience’s processing effort and it is the most relevant stimulus compatible with the communicator’s abilities and preferences.

Finally, the relevance-theoretic approach squarely anchors pragmatics in what cognitive psychologists call “third-person mindreading,” i.e., the ability to represent others’ psychological states (cf. Leslie 2000). In particular, it emphasizes the specificity of the task of representing an agent’s communicative intention underlying her (communicative) ostensive behavior. The observer of some non-ostensive intentional behavior (e.g., hunting) can plausibly ascribe an intention to the agent on the basis of the desirable outcome of the latter’s behavior, which can be identified (e.g., hitting his target), whether or not the behavior is successful. However, the desirable outcome of a piece of communicative behavior (i.e., the addressee’s recognition of the agent’s communicative intention) cannot be identified unless the communicative behavior succeeds (cf. Sperber 2000; Origgi & Sperber 2000 and Wilson & Sperber 2002). Thus, the development of pragmatics takes us from the metaphysical issues about meaning and intentionality inherited from Brentano to the cognitive scientific investigation of the human mindreading capacity to metarepresent others’ mental representations.

Thanks to Neftali Villanueva Fernández, Paul Horwich and the editors for comments on this article.
7. References

Austin, John L. 1962a. Sense and Sensibilia. Oxford: Oxford University Press.
Austin, John L. 1962b. How to Do Things with Words. Oxford: Clarendon Press.
Bach, Kent & Robert M. Harnish 1979. Linguistic Communication and Speech Acts. Cambridge, MA: The MIT Press.
Brentano, Franz 1874/1911/1973. Psychology from an Empirical Standpoint. London: Routledge & Kegan Paul.
Cappelen, Herman & Ernie Lepore 2005. Insensitive Semantics. Oxford: Blackwell.
Carnap, Rudolf 1942. Introduction to Semantics. Chicago, IL: The University of Chicago Press.
Carston, Robyn 1988. Implicature, explicature and truth-theoretic semantics. In: R. Kempson (ed.). Mental Representations: The Interface between Language and Reality. Cambridge: Cambridge University Press, 155–181.
Carston, Robyn 2002a. Thoughts and Utterances. The Pragmatics of Explicit Communication. Oxford: Blackwell.
Carston, Robyn 2002b. Linguistic meaning, communicated meaning and cognitive pragmatics. Mind & Language 17, 127–148.
Chisholm, Robert M. 1957. Perceiving: A Philosophical Study. Ithaca, NY: Cornell University Press.
Chomsky, Noam 1975. Reflections on Language. New York: Pantheon Books.
Chomsky, Noam 1980. Rules and Representations. New York: Columbia University Press.
Churchland, Paul M. 1989. A Neurocomputational Perspective: The Nature of Mind and the Structure of Science. Cambridge, MA: The MIT Press.
Dennett, Daniel C. 1987. The Intentional Stance. Cambridge, MA: The MIT Press.
Dretske, Fred 1995. Naturalizing the Mind. Cambridge, MA: The MIT Press.
Fodor, Jerry A. 1975. The Language of Thought. New York: Crowell.
Fodor, Jerry A. 1987. Psychosemantics: The Problem of Meaning in the Philosophy of Mind. Cambridge, MA: The MIT Press.
Frege, Gottlob 1892/1980. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25–50. English translation in: P. Geach & M. Black (eds.). Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980, 56–78.
Green, Mitchell 2007. Speech acts. Stanford Encyclopedia of Philosophy, http://plato.stanford.edu/entries/speech-acts, July 20, 2008.
Grice, H. Paul 1957. Meaning. The Philosophical Review 66, 377–388. Reprinted in: H.P. Grice. Studies in the Way of Words. Cambridge, MA: Harvard University Press, 1989, 212–223.
Grice, H. Paul 1967/1975. Logic and conversation. In: P. Cole & J. Morgan (eds.). Syntax and Semantics 3: Speech Acts. New York: Academic Press, 41–58. Reprinted in: H.P. Grice. Studies in the Way of Words. Cambridge, MA: Harvard University Press, 1989, 22–40.
Grice, H. Paul 1968. Utterer’s meaning, sentence meaning and word meaning. Foundations of Language 4, 225–242. Reprinted in: H.P. Grice. Studies in the Way of Words. Cambridge, MA: Harvard University Press, 1989, 117–137.
Grice, H. Paul 1969. Utterer’s meaning and intentions. The Philosophical Review 78, 147–177. Reprinted in: H.P. Grice. Studies in the Way of Words. Cambridge, MA: Harvard University Press, 1989, 86–116.
Grice, H. Paul 1978. Further notes on logic and conversation. In: P. Cole (ed.). Syntax and Semantics 9: Pragmatics. New York: Academic Press, 113–128. Reprinted in: H.P. Grice. Studies in the Way of Words. Cambridge, MA: Harvard University Press, 1989, 41–57.
Grice, H. Paul 1981. Presuppositions and conversational implicatures. In: P. Cole (ed.). Radical Pragmatics. New York: Academic Press, 183–198. Reprinted in: H.P. Grice. Studies in the Way of Words. Cambridge, MA: Harvard University Press, 1989, 269–282.
Grice, H. Paul 1989. Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Haugeland, John 1981. Semantic engines: An introduction to mind design. In: J. Haugeland (ed.). Mind Design: Philosophy, Psychology, Artificial Intelligence. Cambridge, MA: The MIT Press, 1–34.
Horwich, Paul 2005. Reflections on Meaning. Oxford: Oxford University Press.
Jacob, Pierre 1997. What Minds Can Do. Cambridge: Cambridge University Press.
Jacob, Pierre 2003. Intentionality. Stanford Encyclopedia of Philosophy, http://plato.stanford.edu/entries/intentionality/, June 15, 2006.
Katz, Jerrold J. 1981. Language and Other Abstract Objects. Oxford: Blackwell.
Leslie, Alan 2000. ‘Theory of Mind’ as a mechanism of selective attention. In: M. Gazzaniga (ed.). The New Cognitive Neurosciences. Cambridge, MA: The MIT Press, 1235–1247.
Meinong, Alexius 1904. Über Gegenstandstheorie. In: A. Meinong (ed.). Untersuchungen zur Gegenstandstheorie und Psychologie. Leipzig: Barth, 1–50. English translation in: R.M. Chisholm (ed.). Realism and the Background of Phenomenology. Glencoe: The Free Press, 1960, 76–117.
Millikan, Ruth G. 1984. Language, Thought, and Other Biological Categories. Cambridge, MA: The MIT Press.
Morris, Charles 1938. Foundations of the Theory of Signs. Chicago, IL: The University of Chicago Press.
Nunberg, Geoffrey 1979. The non-uniqueness of semantic solutions: Polysemy. Linguistics & Philosophy 3, 143–184.
Nunberg, Geoffrey 1995. Transfers of meaning. Journal of Semantics 12, 109–132.
Origgi, Gloria & Dan Sperber 2000. Evolution, communication and the proper function of language. In: P. Carruthers & A. Chamberlain (eds.). Evolution and the Human Mind: Language, Modularity and Social Cognition. Cambridge: Cambridge University Press, 140–169.
Parsons, Terence 1980. Nonexistent Objects. New Haven, CT: Yale University Press.
Quine, Willard van Orman 1948. On what there is. Reprinted in: W.V.O. Quine. From a Logical Point of View. Cambridge, MA: Harvard University Press, 1953, 1–19.
Quine, Willard van Orman 1960. Word and Object. Cambridge, MA: The MIT Press.
Recanati, François 1987. Meaning and Force. Cambridge: Cambridge University Press.
Recanati, François 2004. Literal Meaning. Cambridge: Cambridge University Press.
Rey, Georges 1997. Contemporary Philosophy of Mind: A Contentiously Classical Approach. Oxford: Blackwell.
Russell, Bertrand 1905. On denoting. Mind 14, 479–493. Reprinted in: R.C. Marsh (ed.). Bertrand Russell, Logic and Knowledge, Essays 1901–1950. New York: Capricorn Books, 1956, 41–56.
Schiffer, Stephen 1972. Meaning. Oxford: Oxford University Press.
Searle, John R. 1969. Speech Acts. Cambridge: Cambridge University Press.
Searle, John R. 1975. Indirect speech acts. In: P. Cole & J. Morgan (eds.). Syntax and Semantics 3: Speech Acts. New York: Academic Press, 59–82.
Searle, John R. 1978. Literal meaning. Erkenntnis 13, 207–224.
Searle, John R. 1983. Intentionality. Cambridge: Cambridge University Press.
Searle, John R. 1992. The Rediscovery of the Mind. Cambridge, MA: The MIT Press.
Sperber, Dan 2000. Metarepresentations in an evolutionary perspective. In: D. Sperber (ed.). Metarepresentations: A Multidisciplinary Perspective. Oxford: Oxford University Press, 117–137.
Sperber, Dan & Deirdre Wilson 1986. Relevance: Communication and Cognition. Cambridge, MA: Harvard University Press.
Stalnaker, Robert 1999. Context and Content. Oxford: Oxford University Press.
Stanley, Jason 2000. Context and logical form. Linguistics & Philosophy 23, 391–434.
Strawson, Peter F. 1952. Introduction to Logical Theory. London: Methuen.
Strawson, Peter F. 1964. Intention and convention in speech acts. The Philosophical Review 73, 439–460. Reprinted in: P.F. Strawson. Logico-linguistic Papers. London: Methuen, 1971, 149–169.
Travis, Charles 1991. Annals of analysis: Studies in the Way of Words, by H.P. Grice. Mind 100, 237–264.
Wilson, Deirdre & Dan Sperber 2002. Truthfulness and relevance. Mind 111, 583–632.
Zalta, Edward N. 1988. Intensional Logic and the Metaphysics of Intentionality. Cambridge, MA: The MIT Press.
Pierre Jacob, Paris (France)
3. (Frege on) Sense and reference

1. Sense and reference: a short overview
2. Introducing sense and reference
3. Extending and exploring sense and reference
4. Criticising sense and reference
5. Summary
6. References
Abstract

Gottlob Frege argues in “Über Sinn und Bedeutung” that every intuitively meaningful expression has a sense. He characterises the main ingredient of sense as a ‘mode of presentation’ of at most one thing, namely the referent. Theoretical development of sense
and reference has been a fundamental task for the philosophy of language. In this article I will reconstruct Frege’s motivation for the distinction between sense and reference (section 2). I will then go on to discuss how the distinction can be applied to predicates, sentences and context-dependent expressions (section 3). The final section 4 shows how discussions of Frege’s theory have led to important proposals in semantics.
1. Sense and reference: a short overview

Gottlob Frege argues in Über Sinn und Bedeutung that every intuitively meaningful expression has a sense (“Sinn”). The main ingredient of sense is suggestively characterised as a ‘mode of presentation’ of at most one thing, namely the referent (“Bedeutung”). Frege’s theory of sense and reference has been at the centre of the philosophy of language in the 20th century. The discussion about it is driven by the question whether an adequate semantic theory needs to or even can coherently ascribe sense and reference to natural language expressions.

On the side of Frege’s critics are, among others, Russell and Kripke. Russell (1905) argues that the introduction of sense creates more problems than it solves. If there are senses, one should be able to speak about them. But according to Russell, every attempt to do so leads to an ‘inextricable tangle’. Russell goes on to develop his theory of definite descriptions to avoid the problems Frege’s theory seems to create. How and whether Russell’s argument works is still a matter of dispute. (For recent discussions see Levine 2004 and Makin 2000.) More recently, Kripke (1972) has marshalled modal and epistemic arguments against what he called “the Frege-Russell picture”. “Neo-Millians” or “Neo-Russellians” are currently busy closing the holes in Kripke’s arguments against Frege.

Frege’s friends have tried to develop and defend the distinction between sense and reference. In Meaning and Necessity, Carnap (1956) takes Frege to task for multiplying senses beyond necessity and therefore proposes to replace sense and reference with intension and extension. I will return to Carnap’s view in section 4.1. Quine (1951) has criticised notions like sense and intension as not being definable in scientifically acceptable terms. Davidson has tried to preserve Frege’s insights in view of Quine’s criticism. According to Davidson, “a theory of truth patterned after a Tarski-type truth definition tells us all we need to know about sense. Counting truth in the domain of reference, as Frege did, the study of sense thus comes down to the study of reference” (Davidson 1984: 109). And reference is, even by Quinean standards, a scientifically acceptable concept. Davidson’s proposal to use theories of truth as theories of sense has been pursued in recent years and has led to fruitful research into the semantics of proper names. (See McDowell 1977 and Sainsbury 2005. See also Wiggins 1997.) Dummett has criticised Davidson’s proposal (see Dummett 1993, essays 1 and 2). A theory of sense should be graspable by someone who does not yet master a language, and a theory of truth does not satisfy this constraint. A theory of meaning (sense) should be a theory of understanding, and a theory of understanding should take understanding to consist in the possession of an ability, for example, the ability to recognise the bearer of a name.

The jury is still out on the question whether the distinction between sense and reference should be preserved or abandoned. The debate has shed light on further
notions like compositionality, co-reference and knowledge of meaning. In this article I want to introduce the basic notions, sense and reference, and the problems surrounding them. A note of caution. Frege’s theory is not clearly a descriptive theory about natural language, but applies primarily to an ideal language like the Begriffsschrift (see Frege, Philosophical and Mathematical Correspondence (PMC): 101). If a philosopher of language seeks inspiration in the work of Frege, he needs to consider the question whether the view under consideration is about natural language or a language for inferential thought.
2. Introducing sense and reference

2.1. Conceptual content and cognitive value

Frege starts his logico-philosophical investigations with judgement and inference. (See Burge 2005: 14f.) An inference is a judgement made “because we are cognizant of other truths as providing a justification for it” (Posthumous Writings (PW): 3). The premises of an inference are not sentences, but acknowledged truths. In line with this methodology, Frege introduces in BS the conceptual content of a judgement as an abstraction from the role the judgement plays in inference:

Let me observe that there are two ways in which the contents of two judgements can differ: it may, or it may not, be the case that all inferences that can be drawn from the first judgement when combined with certain other ones can always be drawn from the second when combined with the same judgements. […] Now I call the part of the content that is the same in both the conceptual content. (BS, §3: 2–3)
Frege’s ‘official’ criterion for sameness of conceptual content is: s has the same conceptual content as s* iff, given truths t1 to tn as premises, the same consequences can be inferred from s together with t1 to tn as from s* together with t1 to tn. (See BS, §3.)
Frege identifies the conceptual content of a sentence with a complex of the things for which the sentence constituents stand (BS §8). This raises problems for the conceptual content of identity sentences. In identity sentences the sign of identity is flanked by what Frege calls ‘proper names’. What is a proper name? For Frege, every expression that can be substituted for a genuine proper name salva grammaticale and salva veritate qualifies as a proper name. Fregean proper names include genuine proper names, complex demonstratives (‘That man’) when completed by contextual features and definite descriptions (‘the negation of the thought that 2 = 3’). Following Russell (1905), philosophers and linguists have argued that definite descriptions do not belong on this list. Definite descriptions do not stand for objects, but for properties of properties. For example, ‘The first man on the moon is American’ asserts that (the property) being a first man on the moon is uniquely instantiated and that whatever uniquely has this property has also the property of
being American. The discussion about the semantics of definite descriptions is ongoing, but we can sidestep this issue by focusing on genuine proper names. (For an overview of the recent discussion about definite descriptions see Bezuidenhout & Reimer 2004.) Equations figure as premises in inferences that extend our knowledge. How can this be so? In BS Frege struggles to give a convincing answer. If one holds that “=” stands for the relation of identity, the sentences

(S1) The evening star = the evening star
and

(S2) The evening star = the morning star.
have the same conceptual content CC: since “the evening star” and “the morning star” both stand for Venus,

CC(S1) = ⟨Venus; identity; Venus⟩ = CC(S2).
But although (S1) and (S2) stand for the same complex, they differ in inferential potential. For example, if we combine the truth that (S1) expresses with the truth that the evening star is a planet, we can derive nothing new; but if we combine the latter truth with the truth that (S2) expresses, we can derive the new truth that the morning star is a planet. Prima facie, (i) Frege’s sameness criteria for conceptual content are in conflict with (ii) his claim that expressions are everywhere merely placeholders for objects. In BS Frege responds by restricting (ii): in a sentence of the form “a = b” the signs “a” and “b” stand for themselves; the sentence says that “a” and “b” have the same conceptual content. The idea that designations refer to themselves in some contexts helps to explain the difference in inferential potential. It can be news to learn that “Dr. Jekyll” stands for the same person as “Mr. Hyde”. Although promising, we will see in a moment that this move alone does not distinguish (S1) and (S2) in the way required. Treating identity statements as the exception to the rule that every expression just stands for an object allows Frege to keep his identification of the conceptual content of a sentence with a complex of objects and properties. The price is that the reference of a non-ambiguous and non-indexical term will shift from one sentence to another. Consider a quantified statement like:

(∀x)(∀y)((x = y & Fx) → Fy)
If signs stand for themselves in identity-statements, one can coherently suppose neither that the variables in the formula range over signs nor that they range over particulars. (See Furth 1964: xix, and Mendelsohn 1982: 297f.) These considerations prime us for the argument that motivates Frege’s introduction of the distinction between sense and reference. Frege’s new argument uses the notion of cognitive value (“Erkenntniswert”). For the purposes of Frege’s argument it is not important to answer the question “What is cognitive value?”; more important is the
question “When do two sentences s and s* differ in cognitive value?” Answer: if the justified acceptance of s puts one in a position to come to know something that one neither knew before nor could infer from what one knew before, while the justified acceptance of s* does not do so (or the other way around). Frege hints at this notion of cognitive value in a piece of self-criticism in The Foundations of Arithmetic (FA). If a true equation connected two terms that refer to the same thing in the same way,

[a]ll equations would then come down to this, that whatever is given to us in the same way is to be recognized as the same. But this is so self-evident and so unfruitful that it is not worth stating. Indeed, no conclusion could ever be drawn here that was different from any of the premises. (FA, §67: 79. My emphasis)
If a rational and minimally logically competent person accepts what “The evening star is a planet” says, and she also comes to accept the content of “The evening star = the morning star”, she is in a position to acquire the knowledge that the morning star is a planet. This new knowledge was not inferable from what she already knew. Things are different if she only has reason to accept the content of “The evening star = the evening star”. Exercising logical knowledge won’t take her from this premise to a conclusion that she was not already in a position to know on the basis of the other premise or her general logical knowledge. On the basis of this understanding of cognitive value, Frege’s argument can now be rendered in the following form:

(P1) (S1) “The evening star = the evening star” differs in cognitive value from (S2) “The evening star = the morning star”.

(P2) The difference in cognitive value between (S1) and (S2) cannot be constituted by a difference in reference of the signs composing the sentences.

(P3) “If the sign ‘a’ is distinguished from the sign ‘b’ only as object (here by means of its shape), not as sign (i.e., not by the manner in which it designates something), the cognitive value of a = a becomes essentially equal to the cognitive value of a = b.”

(P4) “A difference can only arise if the difference between the signs corresponds to a difference between the mode of presentation of that which is designated.”

(C1) The difference in cognitive value between (S1) and (S2) is constituted by the fact that different signs compose (S1) and (S2) AND that the different signs designate something in different ways.

We have now explained (P1), and Frege has already given convincing reasons for holding (P2) in BS. Let us look more closely at the other premises.
2.2. Sense, sign and logical form

In On Sense and Reference (S&R) Frege gives several reasons for the rejection of the Begriffsschrift view of identity statements. The most compelling one is (P3). Frege held in BS that (S1) has a different cognitive value from (S2) because (i) in (S2) different
singular terms flank the identity sign that (ii) stand for themselves. If this difference is to ground the difference in cognitive value, Frege must provide an answer to the question “When do different signs and when does the same sign (tokens of the same sign) flank the identity sign?” that is adequate for this purpose. (P3) says that the individuation of signs in terms of their form alone is not adequate. Frege calls signs that are individuated by their form ‘figures’. The distinctive properties of a figure are geometrical and physical properties. There are identity-statements that contain two tokens of the same figure (“Paderewski = Paderewski”), which have the same cognitive value as identity statements that contain two tokens of different figures (“Paderewski = the prime minister of Poland between 1917 and 1919”). There are also identity-statements that contain tokens of different figures (“Germany’s oldest bachelor is Germany’s oldest unmarried eligible male”) that have the same cognitive value as identity statements that contain two tokens of the same figure (“Germany’s oldest bachelor is Germany’s oldest bachelor”). If sameness (difference) of figure does not track sameness (difference) of cognitive value, then the Begriffsschrift view needs to be supplemented by a method of sign individuation that is not merely form-based. Frege’s constructive suggestion is (P4). Let us expand on (P4). I will assume that I have frequently seen what I take to be the brightest star in the evening sky. I have given the star the name “the evening star”. I will apply “the evening star” to an object if and only if it is the brightest star in the evening sky. I am also fascinated by what I take to be the brightest star in the morning sky. I have given this star the name “the morning star” and I will apply “the morning star” to an object if and only if it is the brightest star in the morning sky. Now if the difference in form between “the evening star” and “the morning star” indicates a difference in mode of designation, one can explain the difference in cognitive value between (S1) and (S2). (S1) and (S2) contain different figures AND the different figures refer to something in a different way (they are connected in my idiolect to different conditions of correct reference). Frege also assumes that a difference in cognitive value can only arise if different terms designate in different ways. But (S1) and (S2) would be translated differently into the language of first-order logic with identity: (S1) as “a = a”, (S2) as “a = b”. Why not hold with Putnam (1954) that the difference in logical form grounds the difference in cognitive value? This challenge misses the point of (P3). How can we determine the logical form of a statement independently of a method of individuating signs? As we have already seen, if “b” is a mere stylistic alternative for “a” (the difference in form indicates no difference in mode of presentation), the logical form of “a = b” is a = a. If difference of cognitive value is to depend on differences of logical form, logical form cannot merely be determined by the form of the signs contained in the sentence. Logical form depends on sense. The appeal to logical form cannot make the notion of sense superfluous. If Frege’s argument is convincing, it establishes the conclusion that the sense of an expression is distinct from its reference. The argument tells us what sense is not; it does not tell us what it is.
The premises of the argument are compatible with every assumption about the nature of sense that allows the sense of an expression to differ from its reference.
3. Extending and exploring sense and reference

3.1. The sense and reference of predicates and sentences

Not only proper names have sense and reference. Frege extends the distinction to concept-words (predicates) and sentences. A letter to Husserl contains a diagram that outlines the complete view (PMC: 63):
Fig. 3.1: Frege’s diagram
The sense of a complete assertoric sentence is a thought. The composition of the sentence out of sentence-parts mirrors the composition of the thought it expresses. If (i) a sentence S is built up from a finite list of simple parts e1 to en according to a finite number of modes of combination, (ii) e1 to en express senses, and (iii) the thought expressed by S contains the senses of e1 to en arranged in a way that corresponds to the arrangement of e1 to en in S, we can express new thoughts by re-combining sentence-parts we already understand. Since we can express new thoughts, we should accept (i) to (iii). (See Frege, Collected Papers (CP): 390 and PMC: 79.) Compositionality is the property of a language L that the meaning of its complex expressions is determined by the meaning of their parts and their mode of combination. (See article 6 (Pagin & Westerståhl) Compositionality.) According to Frege, natural language and his formal language have a property that is like compositionality, but not the same. First, the claim that the composition of a sentence mirrors the composition of the thought expressed implies that the sense of the parts and their mode of combination determine the thought expressed, but the determination thesis does not imply the mirroring thesis. Second, Fregean thoughts are not meanings. The meaning of an indexical sentence (“I am hungry”) does not vary with the context of utterance; the thought expressed does. Third, the assertoric sentence that expresses a thought often contains constituents that are not words, while compositionality is usually defined in terms of expression parts that are themselves expressions. For example, Frege takes pointings, glances and the time of utterance to be part of the sign that expresses a thought. (See T, CP: 358 (64). For discussion see Künne 1992 and Textor 2007.) If the thought expressed by a sentence s consists of the senses expressed by the parts of s in an order, a sentence s expresses a different thought from a sentence s* if s and s* are not composed from parts with the same sense in the same order. (See Dummett 1981a: 378–379.) Hence, the Mirror Thesis implies that only isomorphic sentences can express the same thought. This conclusion contradicts Frege’s view that the same thought
can be decomposed differently into different parts, and that no decomposition can be selected on independent grounds as identifying the thought. Frege argues, for example, that ‘There is at least one square root of 4’ and ‘The number 4 has the property that there is something of which it is the square’ express the same thought, but the wording of the sentences suggests different decompositions of the thought expressed. (See CP: 188 (199–200).) Progress in logic often consists in discovering that different sentences express the same thought. (See PW: 153–154.) Hence, there is no such thing as the structure of a thought. A fortiori, the sentence structure cannot be isomorphic with the structure of the thought expressed. The Mirror Thesis and the Multiple Decomposition Thesis are in conflict. Dummett (1989) and Rumfitt (1994) want to preserve the Mirror View, which has it that thoughts are complex entities; Geach (1975), Bell (1987) and Kemmerling (1990) argue against it. Textor (2009) tries to integrate both. Does an assertoric sentence have a referent, and, if it has one, what is it? Frege introduces the concept of a function into semantic analysis. Functional signs are incomplete. In ‘1 + ξ’ the Greek letter simply marks an argument place. If we complete the functional expression ‘1 + ξ’ with ‘1’, we ‘generate’ a complex designator for the value of the function for the argument 1, the number 2. The analysis of the designator ‘1 + 1’ into functional and completing expressions mirrors the determination of its referent. Frege extends this analysis from designators like ‘1 + 1’ to equations like ‘2² = 4’. In order to do so, he assumes that the equation can be decomposed into functional (for example ‘ξ² = 4’) and non-functional expressions (for example ‘2’). The functional expression stands for a function, the non-functional one for an argument. Like ‘1 + 1’, the equation ‘2² = 4’ shall stand for the value the concept x² = 4 assumes for the argument 2. Frege identifies the values of these functions as truth-values: the True and the False (CP: 144 (13)). To generalise: an assertoric sentence s expresses a thought and refers to a truth-value. It can be decomposed into functional expressions (concept-words) that stand for functions from arguments to truth-values (concepts). The truth-value of s is the value which the concept referred to by the concept-word distinguishable in s returns for the argument referred to by the proper name(s) in s. In this way the semantics of sentences and their constituents interlock. The plausibility of the given description depends crucially on the assumption that talk of truth-values is more than a stylistic variant of saying that what a sentence says is true (false). Frege’s arguments for this assumption are still under scrutiny. (See Burge 2005, essay 3 and Ricketts 2003.)
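Frege’s function/argument analysis is easy to make concrete. The following Haskell sketch is my illustration, not anything in Frege’s texts: the two-element domain, the names and the predicate planet are all stipulated for the example. Concept-words are modelled as functions from objects to truth-values, and the referent of an atomic sentence is computed by applying the concept to the argument.

```haskell
-- A toy model of Frege's function/argument analysis: a concept-word stands
-- for a function from objects to truth-values, and the referent of an atomic
-- sentence is the value of that function for the referent of the name.

data Entity = Venus | Mars deriving (Eq, Show)

-- The referent of a concept-word: a function from entities to truth-values.
type Concept = Entity -> Bool

planet :: Concept
planet x = x `elem` [Venus, Mars]

-- Co-referential proper names: both stand for the object Venus.
eveningStar, morningStar :: Entity
eveningStar = Venus
morningStar = Venus

-- The referent (truth-value) of "a is F": apply the concept to the object.
refOfSentence :: Concept -> Entity -> Bool
refOfSentence f a = f a

main :: IO ()
main = do
  print (refOfSentence planet eveningStar)  -- True
  print (refOfSentence planet morningStar)  -- True: same argument, same value
```

Since reference is here computed from reference alone, the sketch also makes vivid why substituting co-referential names can never change a truth-value at this level; the difference in cognitive value must live elsewhere, namely in sense.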
3.2. Sense determines reference

The following argument is valid: Hesperus is a planet; Hesperus shines brightly; therefore, something is a planet and shines brightly. By contrast, this argument isn’t: Hesperus is a planet; Phosphorus shines brightly; therefore, something is a planet and shines brightly. Why? A formally valid argument is an argument whose logical form guarantees its validity. All arguments of the form “Fa, Ga, therefore: (∃x)(Fx & Gx)” are valid, i.e. if the premises are true, the conclusion must also be true. If the first argument is formally valid, the “Hesperus” tokens must have the same sense and reference. If the sense were different, the argument would be a fallacy of equivocation. If the reference could be
different, although the sense was the same, the argument could take us from true premises to a false conclusion, because the same sense could determine different objects in the premises and the conclusion. This consideration makes a thesis plausible that is frequently attributed to Frege, although never explicitly endorsed by him:

Sense-Determines-Reference: Necessarily, if α and β have the same sense, α and β have the same referent.
Sense-Determines-Reference is central to Frege’s theory. But isn’t it refuted by Putnam’s (1975) twin earth case? My twin and I may connect the same mode of presentation with the syntactically individuated word “water”, but the watery stuff in my environment is H2O, in his XYZ. While I refer to H2O with “water”, he refers to XYZ with “water”. It is far from clear that this is a convincing counterexample to Sense-Determines-Reference. If we consider context-dependent expressions, it will be necessary to complicate Sense-Determines-Reference to Sense-Determines-Reference-in-a-context-of-utterance. But this complication does not touch the core of the principle. Among other things, Putnam may be taken to have shown that “water” is a context-dependent expression.
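The relativised principle can be pictured with a small Haskell sketch. Everything in it is stipulated for illustration (the Stuff type, the Context record, the reading of “water” as context-dependent); it is not Putnam’s own analysis. One and the same sense, evaluated at different contexts of utterance, determines different referents.

```haskell
-- Sense-Determines-Reference, relativised to a context of utterance:
-- a (simplified) sense maps each context to a referent.

data Stuff = H2O | XYZ deriving (Eq, Show)

-- The relevant feature of a context: which watery stuff surrounds the speaker.
newtype Context = Context { waterySample :: Stuff }

type Sense = Context -> Stuff

-- The mode of presentation my twin and I share: "the watery stuff around here".
waterSense :: Sense
waterSense = waterySample

earth, twinEarth :: Context
earth     = Context H2O
twinEarth = Context XYZ

main :: IO ()
main = do
  print (waterSense earth)      -- H2O: the referent of my token of "water"
  print (waterSense twinEarth)  -- XYZ: same sense, different context, different referent
```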
3.3. Transparency and homogeneity

In the previous section we were concerned with the truth-preserving character of formally valid arguments. In this section the knowledge-transmitting character of such arguments will become important: a formally valid argument enables us to come to know the conclusion on the basis of our knowledge of the premises and the logical laws (see Campbell 1994, chap. 3.1 and 3.2). Formally valid arguments can only have this epistemic virtue if sameness of sense is transparent in the following way:

Transparency-of-Sense-Sameness: Necessarily, if α and β have the same sense, everyone who grasps the sense of α and β thereby knows that α and β have the same sense, provided that there is no difficulty in grasping the senses involved.
Assume for reductio that sameness of sense is not transparent in the first argument in section 3.2. Then you might understand and accept “Hesperus is a planet” (first premise) and understand and accept “Hesperus shines brightly” (second premise), but fail to recognise that the two “Hesperus” tokens have the same sense. Hence, your understanding of the premises does not entitle you to give them the form “Fa” and “Ga”; hence, you cannot discern the logical form that ensures the formal validity of the argument. Consequently, you are not in a position to come to know the conclusion on the basis of your knowledge of the premises alone. The argument would be formally valid, but one could not come to know its conclusion without adding the premise

Hesperus mentioned in premise 1 is the same thing as Hesperus mentioned in premise 2.
Since we take some arguments like the one above to be complete and knowledge-transmitting, sameness of sense must be transparent.
If we assume, in addition to Transparency-of-Sense-Sameness:

Competent speakers know that synonymous expressions co-refer, if they refer at all,
we arrive at:

Sense-Determines-Reference: Necessarily, if α and β have the same sense, and α refers to a and β refers to b, everyone who grasps the sense of α and β knows that a = b.
Frege makes use of this principle in Thoughts to argue for the view that an indexical and a proper name, although co-referential, differ in sense. Transparency-of-Sense-Sameness has further theoretical ramifications. Russell wrote to Frege:

I believe that in spite of all its snowfields Mont Blanc itself is a component part of what is actually asserted in the proposition ‘Mont Blanc is more than 4000 meters high’. (PMC: 169)
If Mont Blanc is part of the thought expressed by uttering “Mont Blanc is a mountain”, every piece of rock of Mont Blanc is part of this thought. To Frege, this conclusion seems absurd (PMC: 79). Why can a part of Mont Blanc that is unknown to me not be part of a sense I apprehend? Because if there were unknown constituents of the sense of “Mont Blanc”, I could not know whether the thought expressed by one utterance of “Mont Blanc is a mountain” is the same as that expressed by another merely by grasping the thought. Hence, Frege does not allow Mont Blanc to be a constituent of the thought that Mont Blanc is a mountain. He endorses the Homogeneity Principle that a sense can only have other senses as constituents (PMC: 127). How plausible is Transparency-of-Sense-Sameness? Dummett (1975: 131) takes it to be an ‘undeniable feature of the notion of meaning’. Frege’s proviso to Transparency-of-Sense-Sameness shows that he is aware of difficulties. Many factors (the actual wording, pragmatic implicatures, background information) can bring it about that one assents to s while doubting s*, although these sentences express the same thought. Frege has not worked out a reply to these difficulties. However, he suggests that two sentences express the same thought if one can explain away appearances to the contrary by appeal to pragmatics etc. Recently, some authors have argued that it is not immediate knowledge of co-reference, but the entitlement to ‘trade on identity’, which is the basic notion in a theory of sense (see Campbell 1994: 88). For example, if I use the demonstrative pronoun repeatedly to talk about an object that I visually track (“That bird is a hawk. Look, now that bird pursues the other bird”), I am usually not in a position to know immediately that I am referring to the same thing. But the fact that I am tracking the bird gives me a defeasible right to presuppose that the same bird is in question. There is no longer any suggestion that the grasp of sense gives one privileged access to sameness of reference.
3.4. Sense without reference

According to Frege, the conditions for having a sense and the conditions for having a referent are different. Mere grammaticality is sufficient for an expression that can stand in
for (genuine) proper names (“the King of France”) to have a sense, but it is not sufficient for it to have a referent (S&R: 159 (28)). What about genuine proper names? The name ‘Nausikaa’ in Homer’s Odyssey is probably empty. “But it behaves as if it names a girl, and it is thus assured of a sense.” (PW: 122). Frege’s message is:

If α is a well-formed expression that can take the place of a proper name or α is a proper name that purports to refer to something, α has a sense.
An expression purports to refer if it has some of the properties of an expression that does refer, and the possession of these properties entitles the uninformed speaker to treat it like a referring expression. For instance, in understanding “Nausikaa” I will bring information that I have collected under this name to bear upon the utterance. It is very plausible to treat (complex) expressions and genuine proper names differently when it comes to significance. We can ask a question like “Does a exist?” or “Is a so-and-so?”, even if “a” does not behave as if it refers to something. “The number which exceeds itself” does not behave as if it names something. Yet, it seems implausible to say that “The number which exceeds itself cannot exist” does not express a thought. What secures a sense for complex expressions is that they are composed out of significant expressions in a grammatically correct way. Complex expressions need not behave as if they name something; simple ones must. Since satisfaction of the sufficient conditions for sense-possession does not require satisfaction of the sufficient conditions for referentiality, there can be sense without reference:

Sense-without-Reference: If the expression α is of a type whose tokens can stand for an object, the sense of α determines at most one referent.
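The ‘at most one’ clause can be modelled directly. In the following Haskell sketch (an illustration only; the two-object domain and the uniqueness test are my stipulations, not Frege’s), a sense is a condition on objects and reference is a partial function: the unique satisfier if there is one, nothing otherwise.

```haskell
-- A sketch of Sense-without-Reference: a sense determines *at most* one
-- referent, so reference is partial, modelled here with Maybe.

data Entity = Entity { entityName :: String, brightestInEvening :: Bool }

-- A stipulated two-object domain.
domain :: [Entity]
domain = [Entity "Venus" True, Entity "Mars" False]

-- A (simplified) sense: a condition an object must uniquely satisfy.
type Sense = Entity -> Bool

-- Reference: the unique satisfier, if any; Nothing if empty or non-unique.
referent :: Sense -> Maybe Entity
referent s = case filter s domain of
  [x] -> Just x
  _   -> Nothing

eveningStarSense, nausikaaSense :: Sense
eveningStarSense = brightestInEvening
nausikaaSense e  = entityName e == "Nausikaa"  -- satisfied by nothing in the domain

main :: IO ()
main = do
  print (entityName <$> referent eveningStarSense)  -- Just "Venus"
  print (entityName <$> referent nausikaaSense)     -- Nothing: significant but empty
```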
If Sense without Reference is true, the mode of presentation metaphor is misleading. For there can be no mode of presentation without something that is presented. Some Neo-Fregeans take the mode of presentation metaphor to be so central that they reject Sense without Reference (see Evans 1982: 26). But one feels that what has to give here is the mode of presentation metaphor, not the idea of empty but significant terms. Are there more convincing reasons to deny that empty singular terms are significant? The arguments pro and contra Sense without Reference are based on existence conditions for thoughts. According to Evans, a thought must be either true or false; it cannot lack a truth-value (Evans 1982: 25). Frege is less demanding:

The being of a thought may also be taken to lie in the possibility of different thinkers grasping the same thought as one and the same thought. (CP: 376 (146))
A thought is the sense of a complete propositional question. One grasps the thought expressed by a propositional question iff one knows when the question deserves the answer “Yes” and when it deserves the answer “No”. In the passage above, Frege is driving at the following existence condition:

(E!) The thought that p exists if the propositional question “p?” can be raised and addressed by different thinkers in a rational way.
There is a mere illusion of a thought if different thinkers cannot rationally engage with the same question. On the basis of (E!), Frege rejects the view that false thoughts aren’t thoughts. It was rational to ask whether the circle can be squared, and different thinkers could rationally engage with this question. Hence, there is a thought and not merely a thought illusion, although the thought is necessarily false. The opposition to Sense-without-Reference is also fuelled by the idea (i) that a thought expressed by a sentence s is what one grasps when one understands (an utterance of) s and (ii) that one cannot understand (an utterance of) a sentence with an empty singular term. A representative example of an argument against Sense without Reference that goes through understanding is Evans’ argument from diversity (Evans 1982: 336). He argues that the satisfaction of the conditions for understanding a sentence s with a proper name n requires the existence of the referent of n. Someone who holds Sense without Reference can provide a communication-allowing relation in the empty case: speaker and audience grasp the same mode of presentation. But Evans rightly insists that this proposal is ad hoc. For usually we don’t require an exact match of mode of presentation. The Fregean is challenged to provide a common communication-allowing relation that is present both in the case where the singular term is empty and in the case where it is referential. Sainsbury (2005, chapter 3.6) proposes that the common communication-allowing relation is causal: an exact match of mode of presentation is not required, only that there is a potentially knowledge-transmitting relation between the speaker’s use and the audience’s episode of understanding the term.
3.5. Indirect sense and reference

If we assume that the reference of a sentence is determined by the reference of its parts, the substitution of co-referential terms of the same grammatical category should leave the reference of the sentence (e.g. its truth-value) unchanged. However, Frege points out that there are exceptions to this rule:

Gottlob said (believes) that the evening star shines brightly in the evening sky.
The evening star = the morning star.
Gottlob said (believes) that the morning star shines brightly in the evening sky.
Here the exchange of co-referential terms changes the truth-value of our original statement. (In order to keep the discussion simple I will ignore propositional attitude ascriptions in which the singular terms are not within the scope of ‘S believes that …’; see Burge (2005: 198). For a more general discussion of these issues see article 60 (Swanson) Propositional attitudes.) Frege takes these cases to be exceptions to the rule that the reference of a sentence is determined by the referents of its constituents. What the exceptions have in common is that they involve a shift of reference:

In reported speech one talks about the sense, e.g. of another person’s remarks. It is quite clear that in this way of speaking words do not have their customary reference but designate what is usually their sense. In order to have a short expression, we will say: In reported speech, words are used indirectly or have their indirect reference. We distinguish accordingly the customary sense from its indirect sense. (S&R: 159 (28))
Indirect speech is a defect of natural language. In a language that can be used for scientific purposes the same sign should everywhere have the same referent. But in indirect discourse the same words differ in sense and reference from normal discourse. In a letter to Russell Frege proposes a small linguistic reform to bring natural language closer to his ideal: “we ought really to have special signs in indirect speech, though their connection with the corresponding signs in direct speech should be easy to recognize” (PMC: 153). Let us implement Frege’s reform of indirect speech by introducing indices that indicate sense types. The index 0 stands for the customary sense expressed by an unembedded occurrence of a sign, 1 for the sense the sign expresses when embedded under one indirect speech operator like “S said that …”. By iterating indirect speech operators we can generate an infinite hierarchy of senses:

Direct speech:
The-evening-star0 shines0 brightly0 in0 the-evening-sky0.
Indirect speech:
Gottlob0 believes0 that the-evening-star1 shines1 brightly1 in1 the-evening-sky1.
Doubly indirect speech:

Bertrand0 believes0 that Gottlob1 believes1 that the-evening-star2 shines2 brightly2 in2 the-evening-sky2.

Triply indirect speech:
Ludwig0 believes0 that Bertrand1 believes1 that Gottlob2 believes2 that the-evening-star3 shines3 brightly3 in3 the-evening-sky3.
….
“The evening star0” refers to the planet Venus and expresses a mode of presentation of it; “the evening star1” refers to the customary sense of “the evening star”, a mode of presentation of the planet Venus. “The evening star2” refers to a mode of presentation of the mode of presentation of Venus and expresses a mode of presentation of a mode of presentation of a mode of presentation of Venus. Let us first attend to an often-made point that requires a conservative modification of Frege’s theory. It seems natural to say that the name of a thought (“that the evening star shines brightly in the evening sky”) is composed of the nominaliser “that” and the names of the thought constituents in an order (“that” + “the evening star” + “shines” …). If “the evening star” names a sense in “Gottlob said that the evening star shines brightly in the evening sky”, then “Gottlob said that the evening star shines brightly in the evening sky and it does shine brightly in the evening sky” should be false (the sense of “the evening star” does not shine brightly). But the sentence is true! Does this refute Frege’s reference shift thesis? No, for can’t one designator refer to two things? The designator “the evening star1” refers to the sense of “the evening star0” AND to the evening star. Since such anaphoric back reference seems always possible, we have no reference shift, but a systematic increase of referents. Fine (1989: 267f) gives independent examples of terms with multiple referents. Frege himself should be sympathetic to this suggestion. In S&R he gives the following example:

John is under the false impression that London is the biggest city in the world.
According to Frege, the italicised sentence ‘counts double’ (“ist doppelt zu nehmen” S&R: 175, (48)). Whatever the double count exactly is, the sentence above will be true iff
John is under the impression that London is the biggest city in the world AND it is not the case that London is the biggest city in the world.
Instead of doubly counting the same ‘that’-designator, we can let it stand for a truth-value and the thought that determines the truth-value. Reference increase seems more adequate than reference shift. Philosophers have been skeptical of Frege’s sense hierarchy. What is so bad about an infinite hierarchy of senses and references? If there is such a hierarchy, the language of propositional attitude ascriptions is unlearnable, argues Davidson. One cannot provide a finitely axiomatised theory of truth for propositional attitude ascriptions if a “that”-clause is infinitely ambiguous (Davidson 1984: 208). The potentially infinitely many meanings of “that p” would have to be learned one by one. Hence, we could not understand propositional attitude ascriptions we have not heard before, although they are composed of words we already master. This argument loses its force if the sense of a “that”-designator is determined by the sense of the parts of the sense named and their mode of combination (see Burge 2005: 172). One must learn that “that” is a name-forming operator that takes a sentence s and returns a designator of the thought expressed by s. Since one can iterate the name-forming operator, one can understand infinitely many designations for thoughts on the basis of finite knowledge. But is there really an infinite hierarchy of senses and references? We get an infinite hierarchy of senses and references on the following assumptions:

(P1) Indirect reference of w in n embeddings = sense of w in n−1 embeddings.
(P2) Indirect sense of w in n embeddings ≠ sense of w in n embeddings.
(P3) Indirect sense of w in n embeddings ≠ indirect sense of w in m embeddings if n ≠ m.
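To see how (P1)–(P3) generate the hierarchy, consider the following Haskell sketch (my illustration, not Frege’s apparatus): each further embedding wraps one more ‘mode of presentation of’ layer around the customary sense, and nothing stops the iteration.

```haskell
-- The Fregean hierarchy: level 0 is an object, level n+1 a mode of
-- presentation of a level-n item.

data Item = Object String | ModeOf Item deriving Show

venus :: Item
venus = Object "Venus"

-- The customary sense of "the evening star": a mode of presentation of Venus.
customarySense :: Item
customarySense = ModeOf venus

-- By (P1), the indirect referent under n embeddings is the sense expressed at
-- n-1 embeddings, so each embedding applies ModeOf once more.
senseAt :: Int -> Item
senseAt n = iterate ModeOf customarySense !! n

main :: IO ()
main = mapM_ (print . senseAt) [0, 1, 2]
-- ModeOf (Object "Venus"), ModeOf (ModeOf (Object "Venus")), ...: no top level
```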
Dummett has argued against (P2) that “the replacements of an expression in double oratio obliqua which will leave the truth-value of the whole sentence unaltered are – just as in single oratio obliqua – those which have the same sense” (Dummett 1981a: 269). Assume for illustration that “bachelor” is synonymous with “unmarried eligible male”. Dummett is right that we can replace “bachelor” salva veritate with “unmarried eligible male” in (i) “John believes that all bachelors are dirty” and (ii) “John believes that Peter believes that all bachelors are dirty”. But this does not show that “bachelor” retains its customary sense in multiple embeddings. For Frege argues that the sense of “unmarried eligible male” also shifts under the embedding. No wonder the two can be replaced salva veritate. However, if singly and doubly embedded words differ in sense and reference, one should expect that words with the same sense cannot be substituted salva veritate in different embeddings. The problem is that it is difficult to check in a convincing way for such failures. If we stick to the natural language sentences, substitution of one token of the same word will not change anything; if we switch to the revised language with indices, we can no longer draw on intuitions about truth-value difference. In view of the lack of direct arguments one should remain agnostic about the Fregean hierarchy. I side here with Parsons that the choice between a theory of sense and reference with or without the hierarchy “must be a matter of taste and elegance” (Parsons 1997:
408). The reader should consult the Appendix of Burge’s (2005) essay 4 for attempts to give direct arguments for the hierarchy, Peacocke (1999: 245) for arguments against, and Parsons (1997: 402ff) for a diagnosis of the failure of a possible Fregean argument for the hierarchy.
4. Criticising sense and reference

4.1. Carnap's alternative: extension and intension

Carnap’s Meaning and Necessity plays an important role in the reception of Frege’s ideas. However, as Carnap himself emphasizes, he does not have the same explanatory aims as Frege. Carnap aims to explicate and extend the distinction between denotation and connotation, while he ascribes to Frege the goal of explicating the distinction between the object named by a name and its meaning (Carnap 1956: 127). This description of Frege does not stand up to closer investigation. Frege argues that predicates (“ξ > 1”) don’t name their referents, yet they have a sense that determines a referent. Predicates refer, although they are not names. Carnap charges Frege with multiplying senses without good reason (Carnap 1956: 157 and 137). If every name has exactly one referent and one sense, and the referent of a name must shift in propositional attitude contexts, Frege must have names for senses, which in turn must have new senses that determine senses, and so on. Whether this criticism is convincing is, as we have seen in the previous section, still an open issue. Carnap lays the blame for the problem at the door of the notion of naming. Carnap’s method of intension and extension avoids using this notion. Carnap takes equivalence relations between sentences and subsentential expressions as given. In a second step he assigns an object to all equivalent expressions. For example, if s is true in all state-descriptions in which s* is true, and vice versa, they have the same intension. If s and s* have the same truth-value, they have the same extension. Expressions have intension and extension, but they don’t name either. It seems, however, unclear how Carnap can avoid appeal to naming when it comes to equivalence relations between names. ‘Hesperus’ and ‘Phosphorus’ have the same extension because they name the same object (see Davidson 1962: 326f). An intension is a function from a possible world w (a state-description) to an extension. The intension of a sentence s is, for example, a function that maps a possible world w onto the truth-value of s in w; the intension of a singular term is a function that maps a possible world w onto an object in w. Carnap states that the intension of every expression is the same as its (direct) sense (Carnap 1956: 126). Now, intensions are more coarsely individuated than Fregean senses. The thought that 2 + 2 = 4 is different from the thought that 123 + 23 = 146, but the intension expressed by “2 + 2 = 4” is the same as the intension expressed by “123 + 23 = 146”: every true mathematical statement has the same intension, the function that maps every possible world to the True. Although Carnap’s criticism does not hit its target, his method of extension and intension has been widely used. Lewis (1970), Kaplan (1977/1989) and Montague (1973) have refined and developed it to assign intensions and extensions to a variety of natural language expressions.
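Carnap’s method lends itself to a compact illustration. The following Haskell sketch is mine, not Carnap’s: the two-world model stands in for his state-descriptions. It shows the coarseness point: two sentences that express different Fregean thoughts can nonetheless have the same intension.

```haskell
-- Intensions à la Carnap: functions from possible worlds (state-descriptions)
-- to extensions; for sentences, the extension is a truth-value.

data World = W1 | W2 deriving (Eq, Show, Enum, Bounded)

type SentIntension = World -> Bool

-- Two mathematical truths: true in every world, hence the same intension,
-- although the Fregean thoughts they express differ.
twoPlusTwoIsFour, sum123And23Is146 :: SentIntension
twoPlusTwoIsFour _ = True
sum123And23Is146 _ = True

-- Sameness of intension: agreement in every world (decidable in a finite model).
sameIntension :: SentIntension -> SentIntension -> Bool
sameIntension p q = all (\w -> p w == q w) [minBound .. maxBound]

main :: IO ()
main = print (sameIntension twoPlusTwoIsFour sum123And23Is146)  -- True
```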
4.2. Rigidity and sense

What, then, is the sense of a proper name? Frege suggests, but does not clearly endorse, the following answer: the sense of a proper name (in the idiolect of a speaker) is given by a definite description that the speaker would use to distinguish the proper name’s bearer from all other things. For example, my sense of “Aristotle” might be given by the definite description “the inventor of formal logic”. I will now look at two of Kripke’s objections against this idea that have figured prominently in recent discussion. First, the modal objection (for a development of this argument see Soames 1998):

1. Proper names are rigid designators.
2. The descriptions commonly associated with proper names by speakers are non-rigid.
3. A rigid designator and a non-rigid one do not have the same sense.
Therefore:
4. No proper name has the same sense as a definite description speakers associate with it.

A singular term α is a rigid designator if, and only if, α refers to x in every possible world in which x exists (Kripke 1980: 48–49). One can make sense of rigid designation without invoking a plurality of possible worlds. If the sentence which results from uniform substitution of singular terms for the dots in the schema

… might not have been …
expresses a proposition which we intuitively take to be false, then the substituted term is a rigid designator; if not, not. Why accept (3)? If “Aristotle” had the same sense as “the inventor of formal logic”, we should be able to substitute “the inventor of formal logic” for “Aristotle” in all sentences without altering the truth-value (salva veritate). But compare:

Aristotle might not have been the inventor of formal logic. (Yes!)
and Aristotle might not have been Aristotle. (No!)
But in making this substitution we go from a true sentence to a false one. How can this be? Kripke’s answer is: even if everyone agrees that Aristotle is the inventor of formal logic, the description “the inventor of formal logic” is not synonymous with “Aristotle”. The description is only used to fix the reference of the proper name. If someone answers the question “Who is Aristotle?” by using the description, he gives no information about the sense of “Aristotle”. He just gives you advice on how to pick out the right object. The modal argument is controversial for several reasons. Can the difference between sentences with (non-rigid) descriptions and proper names not be explained without positing a difference in sense? Dummett gives a positive answer that explains the difference in terms of a difference in scope conventions between proper names and definite descriptions (see Dummett 1981a: 127ff; Kripke 1980, Introduction: 11ff replies; Dummett 1981b, Appendix 3, replies to the reply).
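The contrast the modal argument trades on can be pictured by treating designators as functions from worlds to objects. The following Haskell sketch is an illustration of Kripke’s distinction, with a stipulated two-world model and an invented alternative inventor; it is not meant as an analysis of either view.

```haskell
-- Rigid vs. non-rigid designation: a designator as a function from worlds
-- to persons. A rigid designator is a constant function.

data World  = Actual | Counterfactual deriving (Eq, Show)
data Person = Aristotle | SomeoneElse  deriving (Eq, Show)

type Designator = World -> Person

-- The proper name: the same individual in every world (rigid).
aristotle :: Designator
aristotle _ = Aristotle

-- The description: whoever satisfies it in each world (non-rigid).
theInventorOfFormalLogic :: Designator
theInventorOfFormalLogic Actual         = Aristotle
theInventorOfFormalLogic Counterfactual = SomeoneElse

-- "a might not have been b": true iff the designators diverge in some world.
mightNotHaveBeen :: Designator -> Designator -> Bool
mightNotHaveBeen a b = any (\w -> a w /= b w) [Actual, Counterfactual]

main :: IO ()
main = do
  print (mightNotHaveBeen aristotle theInventorOfFormalLogic)  -- True  ("Yes!")
  print (mightNotHaveBeen aristotle aristotle)                 -- False ("No!")
```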
Other authors accept the difference, but argue that there are rigid definite descriptions that are associated with proper names. (See, for example, Jackson 2005.) The debate about this question is ongoing. Second, the epistemic objection: most people associate with proper names only definite descriptions that are not uniquely satisfied (“Aristotle is the great Greek philosopher”) or that are satisfied by the wrong object (“Peano is the mathematician who first proposed the axioms for arithmetic”). The ‘Peano’ axioms were first proposed by Dedekind (see Kripke 1972: 84). These objections do not refute Frege’s conclusion that co-referential proper names can differ in something that is more finely individuated than their reference. What the objections show to be implausible is the idea that the difference between co-referential proper names consists in a difference of associated definite descriptions. But arguments that show that a proper name does not have the same sense as a particular definite description do not show the more general thesis to be false, namely that a proper name has a sense that is distinct from its referent. In order to refute the general thesis an additional argument is needed. The constraint that the sense of an expression must be distinct from its reference leaves Frege ample room for manoeuvre. Consider the following two options. First, the sense of “Aristotle” might be conceptually primitive: it is not possible to say what this sense is without using the same proper name again. This is not implausible; we usually don’t define a proper name. This move of course raises the question how the sense of proper names should be explained if one does not want to make it ineffable. More on this soon. Second, the sense of “Aristotle” in my mouth is identical with a definite description, but not with a famous-deeds description like “the inventor of bi-focals”. Neo-Descriptivists offer a variety of meta-linguistic definite descriptions (“the bearer of ‘N’ ”) to specify the sense of the name (see, for example, Forbes 1990: 536ff and Jackson 2005).
4.3. Too far from the ideal?

Frege has provided a good reason to distinguish sense from reference. The expressions composing an argument must be distinguished by form and mode of presentation, if the form of an argument is to reveal whether it is formally valid or not. If figures that differ in form also differ in sense, the following relations obtain between sign, sense and reference:

The regular connection between a sign, its sense, and its reference is of such a kind that to the sign there corresponds a definite sense and to that in turn a definite referent, while to a given referent (an object) there does not belong only a single sign. (S&R: 159 (27))
The “no more than one” requirement is, according to Frege, the most important rule that logic imposes on language. (For discussion see May 2006.) A language that can be used to conduct proofs must be unambiguous. A proof is a series of judgements that terminates in the logically justified acknowledgement that a thought stands for the True. If “2 + 2 = 4” expressed different thoughts, the designation “that 2 + 2 = 4” would not identify the thought one is entitled to judge on the basis of the proof (CP: 316, Fn. 3). Frege is under no illusion: there is, in general, no such regular connection between sign, sense and reference in natural language:

To every expression belonging to a complete totality of signs [Ganzes von Bezeichnungen], there should certainly correspond a definite sense; but natural languages often do not satisfy
this condition, and one must be content if the same word has the same sense in the same context [Zusammenhänge]. (S&R: 159 (27–28))
In natural language, the same shape-individuated sign may have more than one sense (“Bank”), and different occurrences of the same sign with the same reference may vary in sense (“Aristotle” in my mouth and in your mouth). Similar things hold for concept-words. In natural language some concept-words are incompletely defined, others are vague. The concept-word “is a natural number” is only defined for numbers; the sense of the word does not determine its application to flowers. “Is bald” is vague: it is neither true nor false of me. Hence, some sentences containing such concept-words violate the law of excluded middle: they are neither true nor false. To prevent truth-value gaps Frege bans vague and incompletely defined concept-words from the language of inference. Even the language of mathematics contains complex signs with more than one referent (“√2”). Natural language doesn’t comply with the rule that every expression has in all contexts exactly one sense. Different speakers associate different senses with the figure “Aristotle” referring to the same person at the same time, and/or the same speaker associates different senses with the same proper name at different times. Frege points out that different speakers can correctly use the same proper name for the same bearer on the basis of different definite descriptions (S&R: 27, fn. 2). More importantly, such a principle seems superfluous, since differences in proper name sense don’t prevent speakers of natural language from understanding each other’s utterances. Hence, we seem to be driven to the consequence that differences in sense between proper names and other expressions don’t matter; what matters is that one talks about the same thing:

So long as the reference remains the same, such variations of sense may be tolerated, although they are to be avoided in the theoretical structure of a demonstrative science (“beweisende Wissenschaft”) and ought not to occur in a perfect language. (S&R: 158, & fn. 4 (27))
Why is variation of sense tolerable outside the demonstrative sciences? Frege answers:

The task of vernacular languages is essentially fulfilled if people engaged in communication with one another connect the same thought, or approximately the same thought, with the same sentence. For this it is not at all necessary that the individual words should have a sense and reference of their own, provided that only the whole sentence has a sense. (PMC: 115. In part my translation)
Frege makes several interesting points here, but let us focus on the main one: if you and I connect approximately the same thought with “Aristotle was born in Stagira”, we have communicated successfully. What does ‘approximately the same’ amount to? What is shared when you understand my question “Is Aristotle a student of Plato?” is not the thought I express with “Aristotle is a student of Plato”. If one wants to wax metaphysical, what is shared is a complex consisting of Aristotle (the philosopher) and Plato (the philosopher) standing in the relation of being a student, connected in such a way that the
complex is true iff Aristotle is a student of Plato. There are no conventional, community-wide senses for ordinary proper names; there is only a conventional, community-wide reference. Russell sums this up nicely when he writes to Frege:

In the case of a simple proper name like “Socrates” I cannot distinguish between sense and reference; I only see the idea which is something psychological and the object. To put it better: I don’t acknowledge the sense, only the idea and the reference. (Letter to Frege 12.12.1904. PMC: 169)
If we go along with Frege’s use of “sign” for a symbol individuated in terms of the mode of presentation expressed, we must say that often we don’t speak the same Fregean language, but that this does not matter for communication. If we reject it, we will speak the same language, but proper names turn out to be ambiguous. (Variations of this argument can be found in Russell 1910/11: 206–207; Kripke 1979: 108; Evans 1982: 399f and Sainsbury 2005: 12ff.) This line of argument makes the Hybrid View plausible (see Heck 1995: 79). The Hybrid View takes Frege to be right about the content of beliefs expressed by sentences containing proper names, but wrong about what atomic sentences literally say. “Hesperus is a planet” and “Phosphorus is a planet” have the same content, because the proper name senses are too idiosyncratic to contribute to what one literally says with an utterance containing the corresponding name. Grasping what an assertoric utterance of an atomic sentence literally says is, in the basic case, latching on to the right particulars and properties combined in the right way. The mode in which they are presented does not matter. By contrast, “S believes that Phosphorus is a planet” and “S believes that Hesperus is a planet” attribute different beliefs to S. The argument for the Hybrid View requires the Fregean to answer the question “Why is it profitable to think of natural language in terms of the Fregean ideal in which every expression has one determinate sense?” (see Dummett 1981a: 585). Dummett himself answers that we will gradually approximate the Fregean ideal because we can only rationally decide controversies involving proper names (“Did Napoleon really exist?”) when we agree about the sense of these names (Dummett 1981a: 100f). Evans’ reply is spot on: “[I]t is the actual practice of using the name ‘a’, not some ideal substitute, that interests us […]” (Evans 1982: 40). The semanticist studies English, not a future language that will be closer to the Fregean ideal. Heck has given another answer that is immune to the objection that the Fregean theory is not a theory for a language anyone (already) speaks. Proponents of the Hybrid View assume (i) that one has to know to which thing “Hesperus” refers in order to understand it and (ii) that there is no constraint on the ways or methods in which one can come to know this. But if understanding your utterance of “George Orwell wrote 1984” consists at least in part in coming to know of George Orwell that the utterance is about him, one cannot employ just any mode of presentation of George Orwell. The speaker will assume that his audience can come to know what he is talking about on the basis of his utterance and features of the context. If the audience’s method of finding out who is talked about does not draw on these reasons, they might still get it right. But it might easily have been the case that the belief they actually acquired was false. In this situation they would not know who the speaker is talking about with “George Orwell”. Hence, the idea that in understanding an assertoric utterance one acquires knowledge limits the
ways in which one may think of something in order to understand an utterance about it (Heck 1995: 102). If this argument for the application of the sense/reference distinction to natural language is along the right lines, the bearers of sense and reference are no longer form-individuated signs. The argument shows at best that the constraints on understanding an utterance are more demanding than getting the references of the uttered words and their mode of combination right. This allows different utterances of the same form-individuated sentence to have different senses. There is no requirement that the audience and the speaker grasp the same sense in understanding (making) the utterance. The important requirement is that they all know what they are referring to. There is another line of argument for the application of the sense/reference distinction to natural language. Frege often seems to argue that one only needs sense AND reference in languages ‘designed’ for inferential thinking. But of course we also make inferences in natural languages. Communication in natural language is often joint reasoning. Take the following argument:

You: Hegel was drunk.
Me: And Hegel is married.
We both: Hegel was drunk and is married.
Neither you nor I know both premises independently of each other; each person knows one premise and transmits knowledge of this premise to the other person via testimony. Together we can come to know the conclusion by deduction from the premises. But the argument above can only be valid and knowledge-transferring if you and I are entitled to take for granted that “Hegel” in the first premise names the same person as “Hegel” in the second premise without further justification. Otherwise the argument would be incomplete; its rational force would rest on implicit background premises. According to Frege, whenever coincidence in reference is obvious, we have sameness of sense (PMC: 234). Every theory that wants to account for inference must acknowledge that sometimes we are entitled to take co-reference for granted. Hence, we have a further reason to assume that utterances of natural language sentences have senses.
4.4. Sense and reference for context-dependent expressions

Natural language contains unambiguous signs, tokens of which stand for different things in different utterances because the sign means what it does. Among such context-dependent expressions are:

– personal pronouns: ‘I’, ‘you’, ‘my’, ‘he’, ‘she’, ‘it’
– demonstrative pronouns: ‘that’, ‘this’, ‘these’ and ‘those’
– adverbs: ‘here’, ‘now’, ‘today’, ‘yesterday’, ‘tomorrow’
– adjectives: ‘actual’, ‘present’ (a rather controversial entry)
An expression has a use as a context-dependent expression if it is used in such a way that its reference in that use can vary with the context of utterance while its linguistic meaning stays the same.
Context-dependent expressions are supposed to pose a major problem for Frege. Perry (1977) started a fruitful discussion about Frege’s view on context-dependent expressions. Let us use the indexical ‘now’ to make clear what the problem is supposed to be:

1. If tokens of the English word ‘now’ are produced at different times, the tokens refer to different times.
2. If two signs differ in reference, they differ in sense.
Hence,
3. Tokens of the English word ‘now’ that are produced at different times differ in sense.
4. It is not possible that two tokens of ‘now’ co-refer if they are produced at different times.
Hence,
5. It is not possible that two tokens of ‘now’ that are produced at different times have the same sense.

Every token of ‘now’ that is produced at time t differs in sense from all other tokens of ‘now’ not produced at t. Now one will ask: what is the particular sense of a token of ‘now’ produced at t? Perry calls this ‘the completion problem’. Is the sense of a particular token of ‘now’ the same as the sense of a definite description of the time t at which ‘now’ is uttered? No: take any definite description d of the time t that does not itself contain ‘now’ or a synonym of ‘now’. No statement of the form ‘d is now’ is trivial. Take as a representative example ‘The start of my graduation ceremony is now’. Surely it is not self-evident that the start of my graduation ceremony is now. Hence, Perry takes Frege to be saddled with the unattractive conclusion that for each time t there is a primitive and particular way in which t is presented to us at t, which gives rise to thoughts accessible only at t, and expressible then with ‘now’ (Perry 1977: 491). Perry (1977) and Kaplan (1977/1989) have argued that one should, for this and further reasons, replace Fregean thoughts with two successor notions: character and content. The character of an indexical is, roughly, a rule that fixes the referent of the context-dependent expression in a context; the content is, roughly, the referent that has been so fixed. This revision of Frege has now become the orthodox theory.
4.5. The mode of presentation problem

Frege’s theory of sense and reference raises the question “What are modes of presentation?” If modes of presentation are not the meanings of definite descriptions, what are they? This question can be understood as a request to reduce modes of presentation to scientifically more respectable things. I have no argument that one cannot reduce sense to something more fundamental and scientifically acceptable, but there is inductive evidence that there is no such reduction: many people have tried very hard for a long time, and none of them has succeeded (see Schiffer 1990 and 2003). Modes of presentation may simply be modes of presentation and not other things (see Peacocke 1992: 121).
Dummett has tried to work around this problem by making a theory of sense a theory of understanding:

[T]here is no obstacle to our saying what it is that someone can do when he grasps that sense; and that is all that we need the notion of sense for. (Dummett 1981a: 227)
If this is true, we can capture the interesting part of the notion of sense by explaining what knowledge of sense consists in. Does knowledge of sense reduce to something scientifically respectable? Dummett proposes the following condition for knowing the sense of a proper name:

S knows the sense of a proper name N iff S has a criterion for recognising, for any given object, whether it is the bearer of N. (Dummett 1981a: 229)
Does your understanding of “Gottlob Frege” consist in a criterion for recognising him when he is presented to you? (See Evans 1982, sec. 4.2.) No: he can no longer be presented to you in the way he could be presented to his contemporaries. Did the sense of “Gottlob Frege” change when he died?

The real trouble for modes of presentation seems not to be their irreducibility, but the problem of saying, in general, what the difference between two modes of presentation consists in. Consider Fine’s (2007: 36) example. You live in a symmetrical universe. At a certain point you are introduced to two identical twins. Perversely, you simultaneously give them the same name, ‘Bruce’. Even in this situation it is rational to assert the sentence “Bruce is not the same person as Bruce”. The Fregean description of this case is that the figure “Bruce” expresses in one idiolect two modes of presentation. But what is the difference between these modes of presentation? The difference cannot be specified in purely qualitative terms: the twins have all their qualitative properties in common. Can the difference be specified in indexical terms? After all, you originally saw one Bruce over there and the other over here. This solution seems ad hoc. For, in general, one can forget the features that distinguished the bearer of a name when one introduced the name and yet continue to use the name. Can the difference in mode of presentation be specified in terms of the actual position of the two Bruces? Maybe, but this difference cannot ground the continued use of the name, and it raises the question of what makes the current distinction sustain the use of the name originally introduced on the basis of other features. As long as these and related questions are not answered, alternatives to sense and reference merit a fair hearing.
5. Summary

Frege’s work on sense and reference has set the agenda for over a century of research. The main challenge for Frege’s friends is to find a plausible way to apply his theoretical apparatus to natural languages. Whether or not one believes that this challenge can be met, Frege has pointed us to a pre-philosophical datum – the fact that true identity statements about the same object can differ in cognitive value – that is a crucial touchstone for philosophical semantics.

I want to thank Sarah-Jane Conrad, Silvan Imhof, Laura Mercolli, Christian Nimtz, the editors and an anonymous referee for helpful suggestions.
6. References

Works by Frege:
Begriffsschrift (BS, 1879). Reprinted: Darmstadt: Wissenschaftliche Buchgesellschaft, 1977.
The Foundations of Arithmetic (FA, 1884). Oxford: Blackwell, 1974. (C. Thiel (ed.). Die Grundlagen der Arithmetik. Hamburg: Meiner, 1988.)
Nachgelassene Schriften (NS), edited by Hans Hermes, Friedrich Kambartel & Friedrich Kaulbach. 2nd rev. edn. Hamburg: Meiner Verlag, 1983. 1st edn. translated by Peter Long & Roger White as Posthumous Writings (PW). Oxford: Basil Blackwell, 1979.
Wissenschaftlicher Briefwechsel (BW), edited by Gottfried Gabriel, Hans Hermes, Friedrich Kambartel, Christian Thiel & Albert Veraart. Hamburg: Meiner Verlag, 1976. Abridged by Brian McGuinness and translated by Hans Kaal as Philosophical and Mathematical Correspondence (PMC). Oxford: Basil Blackwell, 1980.
McGuinness, Brian (ed.) 1984. Gottlob Frege: Collected Papers on Mathematics, Logic and Philosophy (CP). Oxford: Basil Blackwell.
‘Über Sinn und Bedeutung’ (1892). Zeitschrift für Philosophie und philosophische Kritik 100, 25–50. Translated as ‘On Sense and Meaning’ (S&R), in: McGuinness 1984, 157–177.
‘Der Gedanke’ (1918). Beiträge zur Philosophie des deutschen Idealismus 1, 58–77. Translated as ‘Thoughts’ (T), in: McGuinness 1984, 351–373.

Other Works:
Bell, David 1987. Thoughts. Notre Dame Journal of Formal Logic 28, 36–51.
Bezuidenhout, Anne & Marga Reimer (eds.) 2004. Descriptions and Beyond. Oxford: Oxford University Press.
Burge, Tyler 2005. Truth, Thought, Reason: Essays on Frege. Oxford: Clarendon Press.
Campbell, John 1994. Past, Space and Self. Cambridge, MA: The MIT Press.
Carnap, Rudolf 1956. Meaning and Necessity. 2nd edn. Chicago, IL: The University of Chicago Press.
Church, Alonzo 1951. A formulation of the logic of sense and denotation. In: P. Henle, H. M. Kallen & S. K. Langer (eds.). Structure, Method and Meaning. Essays in Honour of Henry M. Sheffer. New York: Liberal Arts Press, 3–24.
Church, Alonzo 1973. Outline of a revised formulation of the logic of sense and denotation (Part I). Noûs 7, 24–33.
Church, Alonzo 1974. Outline of a revised formulation of the logic of sense and denotation (Part II). Noûs 8, 135–156.
Davidson, Donald 1962. The method of extension and intension. In: P. A. Schilpp (ed.). The Philosophy of Rudolf Carnap. La Salle, IL: Open Court, 311–351.
Davidson, Donald 1984. Inquiries into Truth and Interpretation. Oxford: Oxford University Press.
Dummett, Michael 1975. Frege’s distinction between sense and reference [Spanish translation under the title ‘Frege’]. Teorema V, 149–188. Reprinted in: M. Dummett. Truth and Other Enigmas. London: Duckworth, 1978, 116–145.
Dummett, Michael 1981a. Frege: Philosophy of Language. 2nd edn. London: Duckworth.
Dummett, Michael 1981b. The Interpretation of Frege’s Philosophy. London: Duckworth.
Dummett, Michael 1989. More about thoughts. Notre Dame Journal of Formal Logic 30, 1–19.
Dummett, Michael 1993. The Seas of Language. Oxford: Oxford University Press.
Evans, Gareth 1982. Varieties of Reference. Oxford: Oxford University Press.
Fine, Kit 1989. The problem of de re modality. In: J. Almog, J. Perry & H. K. Wettstein (eds.). Themes from Kaplan. Oxford: Oxford University Press, 197–272.
Fine, Kit 2007. Semantic Relationism. Oxford: Blackwell.
Forbes, Graeme 1990. The indispensability of Sinn. The Philosophical Review 99, 535–563.
Furth, Montgomery 1964. Introduction. In: M. Furth (ed.). The Basic Laws of Arithmetic: Exposition of the System. Berkeley, CA: University of California Press, v–lix.
Geach, Peter T. 1975. Names and identity. In: S. Guttenplan (ed.). Mind and Language. Oxford: Clarendon Press, 139–158.
Heck, Richard 1995. The sense of communication. Mind 104, 79–106.
Jackson, Frank 2005. What are proper names for? In: J. C. Marek & M. E. Reicher (eds.). Experience and Analysis. Proceedings of the 27th International Wittgenstein Symposium. Vienna: hpt-öbv, 257–269.
Kaplan, David 1989. Demonstratives. In: J. Almog, J. Perry & H. K. Wettstein (eds.). Themes from Kaplan. Oxford: Oxford University Press, 481–563.
Kemmerling, Andreas 1990. Gedanken und ihre Teile. Grazer Philosophische Studien 37, 1–30.
Kripke, Saul 1979. A puzzle about belief. In: A. Margalit (ed.). Meaning and Use. Dordrecht: Reidel, 239–283. Reprinted in: N. Salmon & S. Soames (eds.). Propositions and Attitudes. Oxford: Oxford University Press, 1989, 102–148.
Kripke, Saul 1980. Naming and Necessity. Cambridge, MA: Harvard University Press. (Separate reissue of: S. Kripke. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 1972, 253–355 and 763–769.)
Künne, Wolfgang 1992. Hybrid proper names. Mind 101, 721–731.
Levine, James 2004. On the ‘Gray’s elegy’ argument and its bearing on Frege’s theory of sense. Philosophy and Phenomenological Research 64, 251–295.
Lewis, David 1970. General semantics. Synthese 22, 18–67.
Makin, Gideon 2000. The Metaphysicians of Meaning. London: Routledge.
May, Robert 2006. The invariance of sense. The Journal of Philosophy 103, 111–144.
McDowell, John 1977. On the sense and reference of a proper name. Mind 86, 159–185.
Mendelsohn, Richard L. 1982. Frege’s Begriffsschrift theory of identity. Journal of the History of Philosophy 20, 279–299.
Montague, Richard 1973. The proper treatment of quantification in ordinary English. In: J. Hintikka, J. Moravcsik & P. Suppes (eds.). Approaches to Natural Language. Dordrecht: Reidel, 221–242. Reprinted in: R. Thomason (ed.). Formal Philosophy. Selected Papers of Richard Montague. New Haven, CT: Yale University Press, 1974, 247–270.
Parsons, Terence 1997. Fregean theories of truth and meaning. In: M. Schirn (ed.). Frege: Importance and Legacy. Berlin: de Gruyter, 371–409.
Peacocke, Christopher 1992. A Study of Concepts. Cambridge, MA: The MIT Press.
Peacocke, Christopher 1999. Being Known. Oxford: Oxford University Press.
Perry, John 1977. Frege on demonstratives. The Philosophical Review 86, 474–497.
Putnam, Hilary 1954. Synonymy and the analysis of belief sentences. Analysis 14, 114–122.
Putnam, Hilary 1975. The meaning of ‘meaning’. In: K. Gunderson (ed.). Language, Mind and Knowledge. Minneapolis, MN: University of Minnesota Press. Reprinted in: H. Putnam. Mind, Language and Reality: Philosophical Papers, Volume 2. Cambridge: Cambridge University Press, 1975, 215–271.
Quine, Willard van Orman 1951. Two dogmas of empiricism. The Philosophical Review 60, 20–43. Reprinted in: W.V.O. Quine. From a Logical Point of View. Cambridge, MA: Harvard University Press, 1953, 20–47.
Ricketts, Thomas 2003. Quantification, sentences, and truth-values. Manuscrito: Revista Internacional de Filosofia 26, 389–424.
Rumfitt, Ian 1994. Frege’s theory of predication: An elaboration and defense, with some new applications. The Philosophical Review 103, 599–637.
Russell, Bertrand 1905. On denoting. Mind 14, 479–493.
Russell, Bertrand 1910/11. Knowledge by acquaintance and knowledge by description. Proceedings of the Aristotelian Society New Series 11, 108–128. Reprinted in: B. Russell. Mysticism and Logic. London: Unwin, 1986, 201–221.
Sainsbury, Mark R. 2005. Reference without Referents. Oxford: Oxford University Press.
Schiffer, Stephen 1990. The mode of presentation problem. In: A. C. Anderson & J. Owens (eds.). Propositional Attitudes. Stanford, CA: CSLI Publications, 249–269.
Schiffer, Stephen 2003. The Things We Mean. Oxford: Oxford University Press.
Soames, Scott 1998. The modal argument: Wide scope and rigidified descriptions. Noûs 32, 1–22.
Textor, Mark 2007. Frege’s theory of hybrid proper names developed and defended. Mind 116, 947–982.
Textor, Mark 2009. A repair of Frege’s theory of thoughts. Synthese 167, 105–123.
Wiggins, David 1997. Meaning and truth conditions: From Frege’s grand design to Davidson’s. In: B. Hale & C. Wright (eds.). A Companion to the Philosophy of Language. Oxford: Blackwell, 3–29.
Mark Textor, London (United Kingdom)
4. Reference: Foundational issues

1. Introduction
2. Direct reference
3. Frege’s theory of sense and reference
4. Reference vs. quantification and Russell’s theory of descriptions
5. Strawson’s objections to Russell
6. Donnellan’s attributive-referential distinction
7. Kaplan’s theory of indexicality
8. Proper names and Kripke’s return to Millian nondescriptionality
9. Propositional attitude contexts
10. Indefinite descriptions
11. Summary
12. References
Abstract

This chapter reviews issues surrounding theories of reference. The simplest theory is the “Fido”-Fido theory – that reference is all that an NP has to contribute to the meaning of phrases and sentences in which it occurs. Two big problems for this theory are coreferential NPs that do not behave as though they were semantically equivalent, and meaningful NPs without a referent. These problems are especially acute in sentences about beliefs and desires – propositional attitudes. Although Frege’s theory of sense, and Russell’s quantificational analysis, seem to solve these problems for definite descriptions, they do not work well for proper names, as Kripke shows. And Donnellan and Strawson have other objections to Russell’s theory. Indexical expressions like “I” and “here” create their own issues; we look at Kaplan’s theory of indexicality, and several solutions to the problem indexicals create in propositional attitude contexts. The final section looks at indefinite descriptions, and some more recent theories that make them appear more similar to definite descriptions than was previously thought.
1. Introduction

Reference, it seems, is what allows us to use language to talk about things, and is thus vital to the functioning of human language. That being said, there remain several parameters to be fixed in order to determine a coherent field of study. One important parameter is whether reference is best viewed as a semantic phenomenon – a relation between linguistic expressions and the objects referred to – or whether it is best viewed as pragmatic – a three-place relation among language users, linguistic expressions, and things. The ordinary everyday meaning of words like “refer” and “reference” would incline us toward the pragmatic view (we say, e.g., Who were you referring to?, not Who was your phrase referring to?). However there is a strong tradition, stemming from work in logic and early modern philosophy of language, of viewing reference as a semantic relation, so that will be the main focus of our attention at the outset, although we will turn before long to pragmatic views.
1.1. Reference vs. predication

Another parameter to be determined is what range of expressions (can be used to) refer. The traditional model sentence, e.g. Socrates runs, consists of a simple noun phrase (NP), a proper name in this case, and a simple verb phrase (VP). The semantic function of the NP is to pick out (i.e. refer to) some entity, and the function of the VP is to predicate a property of that entity. The sentence is true if the entity actually has that property, and false otherwise. Seen in this light, reference and predication are quite different operations.

Of course many natural language sentences do not fit this simple model. In a semantics which is designed to treat the full panoply of expression types and to provide truth conditions for sentences containing them, the distinction between reference and predication may not be so clear or important. In classical Montague Grammar, for example, expressions of all categories (except determiners and conjunctions) are assigned an extension, where extension is a formal counterpart of reference (Montague 1973; see also article 11 (Kempson) Formal semantics and representationalism). (In Montague Grammar expressions are also assigned an intension, which corresponds to a Fregean sense – see below, and cf. article 3 (Textor) Sense and reference.) An expression’s extension may be an ordinary type of entity, or, more typically, it may be a complex function of some kind. Thus the traditional bifurcation between reference and predication is not straightforwardly preserved in this approach.

However for our purposes we will assume this traditional bifurcation, or at least that there is a difference between NPs and other types of expressions, and we will consider reference only for NPs. Furthermore our primary focus will be on definite NPs, which are the most likely candidates for referring expressions, though they may have other, non-referring uses too (see the articles in Chapter IX, Noun phrase semantics). The category of definite NPs includes proper names (e.g. Amelia Earhart), definite descriptions (e.g. the book Sally is reading), demonstrative descriptions (e.g. this house), and pronouns (e.g. you, that). We will have only a limited amount to say about pronouns; for the full story, see article 40 (Büring) Pronouns. (Another important category comprises generic NPs; for these, see article 47 (Carlson) Genericity.) Whether other types of NP, such as indefinite descriptions (e.g. a letter from my mother), can be properly said to be referring expressions is an issue of some dispute – see below, section 10.
1.2. The metaphysical problem of reference

Philosophers have explored the question of what it is in virtue of which an expression has a reference – what links an expression to its reference, and how did it come to do so? Frege (1892) argued that expressions express a sense, which is best thought of as a collection of properties. The reference of an expression is that entity which possesses exactly the properties contained in the sense. Definite descriptions are the clearest examples for this theory: the inventor of bifocals specifies a complex property of having been the first to think up and create a special type of spectacles, and that NP refers to Benjamin Franklin because he is the one who had that property. One problem with this answer to the question of what determines reference is the mysterious nature of senses. Frege insisted they were not to be thought of as mental entities, but he did not say much positive about what they are, and that makes them philosophically suspect. (See article 3 (Textor) Sense and reference.)

Another answer, following Kripke (1972), is what is usually referred to as a “causal (or historical) chain”. The model in this case is proper names, and the idea is that there is some initial kind of naming event whereupon an entity is bestowed with a name, and then that name is passed down through the speech community as a name of that entity.

In this article we will not be so much concerned with the metaphysical problem of what determines reference, but instead with the linguistic problem of reference – determining what it is that referring expressions contribute semantically to the phrases and sentences in which they occur.
1.3. The linguistic problem of reference

The two answers to the metaphysical problem of reference correlate with two answers to the linguistic question of what it is that NPs contribute to the semantic content of phrases and sentences in which they appear. Frege’s answer to the linguistic question is that expressions contribute their reference to the reference of the phrases in which they occur, and they contribute their sense to the sense of those phrases. But there are complications to this simple answer that we will review below in section 3. The other answer is that expressions contribute only their reference to the semantic content of the phrases and sentences in which they occur. Since this is a simpler answer, we will begin by looking at it in some more detail, in section 2, in order to be able to understand why Frege put forward his more complex theory.
2. Direct reference

The theory according to which an NP contributes only its reference to the phrases and sentences in which it occurs is currently called the “direct reference” theory (the term was coined by Kaplan 1989). It is also sometimes called the “Fido-Fido” theory, the idea being that you have the name Fido and its reference is the dog, Fido, and that’s all there is to reference and all there is to the meaning of such phrases. One big advantage of this simple theory is that it does not result in the postulation of any suspect entities. However there are two serious problems for it: one is the failure of coreferential NPs to be fully equivalent semantically, and the other is presented by seemingly meaningful NPs which do not have a reference – so-called “empty NPs”. We will look more closely at each of these problems in turn.
2.1. Failure of substitutivity

According to the direct reference theory, coreferential NPs – NPs which refer to the same thing – should be able to be substituted for each other in any sentence without a change in the semantics of the sentence or its truth value. (This generalization about intersubstitutivity is sometimes referred to as “Leibniz’ Law”.) If all that an NP has to contribute semantically is its reference, then it should not matter how that reference is contributed. However almost any two coreferential NPs will not seem to be intersubstitutable – they will seem to be semantically different. Frege (1892) worried in particular about two different kinds of sentence that showed this failure of substitutivity.
2.1.1. Identity sentences

The first kind are identity sentences – sentences of the form a = b, or (a little more colloquially) a is (the same entity as) b. If such a sentence is true, then the NPs a and b are coreferential so, according to the direct reference theory, it shouldn’t matter which NP you use, including in identity sentences themselves! That is, the two sentences in (1) should be semantically equivalent (these are Frege’s examples).

(1) a. The morning star is the morning star.
    b. The morning star is the evening star.
However, as Frege noted, the two sentences are very different in their cognitive impact. (1a) is a trivial sentence, whose truth is known to anyone who understands English (it is analytic). (1b), on the other hand, reports a major astronomical discovery. Thus even though the only difference between (1a) and (1b) is that we have substituted coreferential NPs (the evening star for the morning star), there is still a semantic difference between them.
2.1.2. Propositional attitude sentences

The other kind of sentence that Frege worried about was sentences about propositional attitudes – the attitudes of sentient beings about situations or states of affairs. Such sentences will have a propositional attitude verb like believe, hope, know, doubt, want, etc. as their main verb, plus a sentential complement saying what the subject of the verb believes, hopes, wants, etc. Just as with identity statements, coreferential NPs fail to be intersubstitutable in the complements of such sentences. However the failure is more serious in this case. Intersubstitution of coreferential NPs in identity sentences always preserves truth value, but in propositional attitude sentences the truth value may change. Thus (2a) could be true while (2b) was false.

(2) a. Mary knows that the morning star is a planet.
    b. Mary knows that the evening star is a planet.
We can easily imagine that Mary has learned the truth about the morning star, but not about the evening star.
2.2. Empty NPs

The other major problem for the direct reference theory is presented by NPs that do not refer to anything – NPs like the golden mountain or the round square. The direct reference theory seems to make the prediction that sentences containing such NPs should be semantically defective, since they contain a part which has no reference. Yet sentences like those in (3) do not seem defective at all.

(3) a. Lee is looking for the golden mountain.
    b. The philosopher’s stone turns base metals into gold.
Sentences about existence, especially those that deny existence, pose special problems here. Consider (4):

(4) The round square does not exist.
Not only is (4) not semantically defective, it is even true! So this is the other big problem for the Fido-Fido direct reference theory.
3. Frege’s theory of sense and reference

As noted above, Frege proposed that referring expressions have semantic values on two levels, reference and sense. Frege was anticipated in this by Mill (1843), who had proposed a similar distinction between denotation (reference) and connotation (sense). (Mill’s use of the word “connotation” must be kept distinct from its current use to mean hints or associations connected with a word or phrase. Mill’s connotations functioned like Frege’s senses – to determine reference (denotation).)

One important aspect of Frege’s work was the elegance of his arguments. He assumed two fundamental principles of compositionality. At the level of senses he assumed that the sense of a complex expression is determined by the senses of its component parts plus their syntactic mode of combination. Similarly at the level of reference, the reference of a complex expression is determined by the references of its parts plus their mode of combination. (Following Frege, it is commonly assumed today that meanings, whatever they are, are compositional. That is thought to be the only possible explanation of our ability to understand novel utterances. See article 6 (Pagin & Westerståhl) Compositionality.) Using these two principles Frege argued further that the reference of a complete sentence is its truth value, while the sense of a sentence is the proposition it expresses.
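The two compositionality principles can be made concrete in a small computational sketch. The following Python fragment is purely illustrative and anachronistic: it models senses as functions from circumstances to references, in the style of the later possible-worlds formalizations discussed in section 3.3.3, and the two-circumstance model and lexicon are invented.

# Toy illustration of Frege's two compositionality principles. Senses are
# modeled, anachronistically, as functions from circumstances to
# references; the circumstances and lexicon below are invented.

actual = {"morning star": "Venus", "evening star": "Venus",
          "planets": {"Venus", "Mars"}}
counterfactual = {"morning star": "Venus", "evening star": "Sirius",
                  "planets": {"Venus", "Mars"}}

sense = {
    "the morning star": lambda w: w["morning star"],
    "the evening star": lambda w: w["evening star"],
    "is a planet":      lambda w: w["planets"],
}

def sentence_sense(subj, pred):
    # Compositionality of sense: the sense of 'Subj Pred' is built from
    # the senses of its parts plus their mode of combination.
    return lambda w: sense[subj](w) in sense[pred](w)

def sentence_reference(subj, pred, w):
    # Compositionality of reference: the truth value of 'Subj Pred' in a
    # circumstance w is fixed by the references of its parts in w.
    return sense[subj](w) in sense[pred](w)

p1 = sentence_sense("the morning star", "is a planet")
p2 = sentence_sense("the evening star", "is a planet")
print(sentence_reference("the morning star", "is a planet", actual),
      sentence_reference("the evening star", "is a planet", actual))  # True True
# Same reference in the actual circumstance, but the senses differ:
print(p1(counterfactual), p2(counterfactual))  # True False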
3.1. Solution to the problem of substitutivity

Armed with senses and the principles of compositionality, plus an additional assumption that we will get to shortly, Frege was able to solve most, though not all, of the problems pointed out above.
3.1.1. Identity sentences

First, for the problem of failure of substitutivity of coreferential NPs in identity statements, Frege presents a simple solution. Recall example (1), repeated here.

(1) a. The morning star is the morning star.
    b. The morning star is the evening star.
Although the morning star and the evening star have the same reference (making (1b) a true sentence), the two NPs differ in sense. Thus (1b) has a different sense from (1a) and so we can account for the difference in cognitive impact.
3.1.2. Propositional attitude sentences

Turning to the problem of substitutivity in propositional attitude contexts, Frege again offers us a solution, although the story is a bit more complicated in this case. Here are the examples from (2) above.

(2) a. Mary knows that the morning star is a planet.
    b. Mary knows that the evening star is a planet.
Simply observing that the evening star has a different sense from the morning star will account for why (2b) has a different sense from (2a), but by itself does not yet account for the possible change in truth value. This is where the extra piece of machinery mentioned above comes in. Frege pointed out that expressions can sometimes shift their reference in particular contexts. When we quote expressions, for example, those expressions no longer have their customary reference, but instead refer to themselves. Consider the example in (5):

(5) “The evening star” is a definite description.
The phrase the evening star as it occurs in (5) does not refer to the planet Venus any more, but instead refers to itself. Frege argued that a similar phenomenon occurs in propositional attitude contexts. In such contexts, Frege argued, expressions also shift their reference, but here they refer to their customary sense. This means that the reference of (2a) (its truth value) involves the customary sense of the phrase the morning star rather than its customary reference, while the reference/truth value of (2b) involves instead the customary sense of the phrase the evening star. Since the two sentences now involve two different components (the two customary senses), it is not unexpected that we could have two different references – truth values – for the two sentences.
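The reference shift can be mimicked in a few lines of code. The Python sketch below is only a cartoon of Frege’s proposal: senses are opaque labels, and what Mary knows is an invented set of grasped sense-structures.

# Toy model of Frege's reference shift in attitude contexts: inside
# "Mary knows that ...", an NP contributes its customary sense (its
# indirect reference) rather than its customary reference.

customary_reference = {"the morning star": "Venus",
                       "the evening star": "Venus"}
customary_sense = {"the morning star": "MORNING-STAR-SENSE",
                   "the evening star": "EVENING-STAR-SENSE"}

# Knowledge is individuated by senses, not by references (invented data):
mary_knows = {("MORNING-STAR-SENSE", "is a planet")}

def mary_knows_that(np, predicate):
    # The embedded NP shifts its reference to its customary sense, so
    # co-referring NPs need not preserve the truth value of the report.
    return (customary_sense[np], predicate) in mary_knows

print(mary_knows_that("the morning star", "is a planet"))  # True: (2a)
print(mary_knows_that("the evening star", "is a planet"))  # False: (2b),
# even though customary_reference assigns both NPs the same object.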
3.2. Empty NPs

NPs that have no reference were the other main problem area for the direct reference theory. One problem was the apparent meaningfulness of sentences containing such NPs. It is easy to see how Frege’s theory solved this problem. As long as such NPs have a sense, the sentences containing them can have a well-formed sense as well, so their meaningfulness is not a problem. We should note, though, that Frege’s theory predicts that such sentences will not have a truth value. That is because the truth value, as we have noted, is determined by the references of the constituent expressions in a sentence, and if one of those expressions doesn’t have a reference then the whole sentence will not have one either. This means that true negative existential sentences, such as (4), repeated here:

(4) The round square does not exist.

remain a problem for Frege. Since the subject does not have a reference, the whole sentence should not have a truth value, but it does.
3.3. Further comments on Frege’s work

Several further points concerning Frege’s work will be relevant in what follows.
3.3.1. Presupposition

As we have just observed, Frege’s principle of compositionality at the level of reference, together with his conclusion that the reference of a sentence is its truth value, means that a sentence containing an empty NP will fail to have a truth value. Frege held that the use of an NP involves a presupposition, rather than an assertion, that the NP in question has a reference. (See article 91 (Beaver & Geurts) Presupposition.) Concerning example (6),

(6) The one who discovered the elliptical shape of the planetary orbits died in misery.

Frege said that if one were to hold that part of what one asserts in the use of (6) is that there is a person who discovered the elliptical shape of the planetary orbits, then one would have to say that the denial of (6) is (7).

(7) Either the one who discovered the elliptical shape of the planetary orbits did not die in misery, or no one discovered the elliptical shape of the planetary orbits.

But the denial of (6) is not (7) but rather simply (8).

(8) The one who discovered the elliptical shape of the planetary orbits did not die in misery.

Instead, both (6) and (8) presuppose that the definite description the one who discovered the elliptical shape of the planetary orbits has a reference, and if the NP were not to have a reference, then neither sentence would have a truth value.
3.3.2. Proper names

It was mentioned briefly above that Mill’s views were very similar to Frege’s in holding that referring expressions have two kinds of semantic significance – both sense and reference, or in Mill’s terms, connotation and denotation. Mill, however, made an exception for proper names, which he believed did not have connotation but only denotation. Frege appeared to differ from Mill on that point. We can see that, given Frege’s principle of compositionality of sense, it would be important for him to hold that proper names do have a sense, since sentences containing them can clearly have a sense, i.e. express a proposition. His most famous comments on the subject occur in a footnote to “On sense and reference” in which he appeared to suggest that proper names have a sense which is similar to the sense which a definite description might have, but which might vary from person to person. The name Aristotle, he seemed to suggest, could mean ‘the pupil of Plato and teacher of Alexander the Great’ for one person, but ‘the teacher of Alexander the Great who was born in Stagira’ for another person (Frege 1892, fn. 2).
3.3.3. Propositions

According to Frege, the sense of a sentence is the proposition it expresses, but it has been very difficult to determine what propositions are. Frege used the German word “Gedanke” (‘thought’), and as we have seen, for Frege (as for many others) propositions are not only what sentences express, they are also the objects of propositional attitudes. Subsequent formalizations of Frege’s ideas have used the concept of possible worlds to analyze them. Possible worlds are simply alternative ways things (in the broadest sense) might have been. E.g. I am sitting in my office at this moment, but I might instead have gone out for a walk; there might have been only 7 planets in our solar system, instead of 8 or 9. Using this notion, propositions were analyzed as functions from possible worlds to truth values (or equivalently, as sets of possible worlds). This meshes nicely with the idea that the sense of a sentence combined with facts about the way things are (a possible world) determines a truth value. However there are problems with this view; for example, all mathematical truths are necessarily true, and thus true in every possible world, but the sentences expressing them do not seem to have the same meaning (in some pre-theoretic sense of meaning), and it seems that one can know the truth of one without knowing the truth of all of them – e.g. someone could know that two plus two is four, but not know that there are an infinite number of primes. (For continued defense of the possible worlds view of the objects of thought, see Stalnaker 1984, 1999.) Following Carnap (1956), David Lewis (1972) suggested that sentence meanings are best viewed as entities with syntactic structure, whose elements are the senses (intensions) of the constituent expressions. We will see an additional proposal concerning what at least some propositions are shortly, in section 4 on Russell.
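The possible-worlds construction, and the granularity problem just noted, can be displayed in a short Python sketch. The four-world model is invented for illustration; nothing hangs on its details.

# Propositions modeled as sets of possible worlds (equivalently, as
# characteristic functions from worlds to truth values).

WORLDS = frozenset({"w1", "w2", "w3", "w4"})

def proposition(condition):
    # The set of worlds in which the condition holds.
    return frozenset(w for w in WORLDS if condition(w))

rain = proposition(lambda w: w in {"w1", "w2"})
cold = proposition(lambda w: w in {"w2", "w3"})

rain_and_cold = rain & cold      # conjunction is intersection
not_rain = WORLDS - rain         # negation is complement
print(rain_and_cold <= rain)     # entailment is the subset relation: True

# The granularity problem: necessary truths hold in every world, so all
# mathematical truths collapse into one and the same proposition.
two_and_two_is_four = proposition(lambda w: True)
infinitely_many_primes = proposition(lambda w: True)
print(two_and_two_is_four == infinitely_many_primes)  # True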
3.4. Summary comments on Frege

Frege’s work was neglected for some time, both within and outside Germany. Eventually it received the attention it deserved, especially during the development of formal semantic treatments of natural language by Carnap, Kripke, Montague, and others (see article 10 (Newen & Schröder) Logic and semantics, article 11 (Kempson) Formal semantics and representationalism). Although the distinction between sense and reference (or intension and extension) is commonly accepted, Frege’s analysis of propositional attitude contexts has fallen out of favor; Donald Davidson has famously declared Frege’s theory of a shift of reference in propositional attitude contexts to be “plainly incredible” (Davidson 1968/1984: 108), and many others seem to have come to the same conclusion.
4. Reference vs. quantification and Russell’s theory of descriptions

In his classic 1905 paper “On denoting”, Bertrand Russell proposed an alternative to Frege’s theory of sense and reference. To understand Russell’s work it helps to know that he was fundamentally concerned with knowledge. He distinguished knowledge by acquaintance, which is knowledge we gain directly via perception, from knowledge by description (cf. Russell 1917), and he sought to analyze the latter in terms of the former. Russell was a direct reference theorist, and rejected Frege’s postulation of senses (though he did accept properties, or universals, as the semantic content of predicative expressions). The only genuine referring expressions, for Russell, were those that could guarantee a referent, and the propositions expressed by sentences containing such expressions are singular propositions, which contain actual entities. If I were to point at Mary, and utter (9a), I would be expressing the singular proposition represented in (9b).

(9) a. She is happy.
    b. ⟨Mary, the property of being happy⟩
Note that the first element of (9b) is not the name Mary, but Mary herself. Any NP which is unable to guarantee a referent cannot contribute an entity to a singular proposition. Russell’s achievement in “On denoting” was to show how such NPs could be analyzed away into the expression of general, quantificational propositions.
4.1. Quantification

In traditional predicate logic, overtly quantificational NPs like every book, no chair do not have an analysis per se, but only in the context of a complete sentence. (In logics with generalized quantifiers, developed more recently, this is not the case; see article 43 (Keenan) Quantifiers.) Look at the examples in (10) and (11).

(10) a. Every book is blue.
     b. ∀x[book(x) ⊃ blue(x)]

(11) a. No table is sturdy.
     b. ~∃x[table(x) & sturdy(x)]

(10b), the traditional logical analysis of (10a), says (when translated back into English) For every x, if x is a book then x is blue. We can see that every, the quantificational element in (10a), has been elevated to the sentence level, in effect, so that it expresses a relationship between two properties – the property of being a book and the property of being blue. Similarly (11b) says, roughly, It is not the case that there is an x such that x is a table and x is sturdy. It can be seen that this has the same truth conditions as No table is sturdy, and once again the quantificational element (no in this case) has been analyzed as expressing a relation between two properties, in this case the properties of being a table and being sturdy.
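The truth conditions in (10b) and (11b) can be checked mechanically over a finite model. The following Python sketch is illustrative only; the domain and the extensions of book, blue, table and sturdy are invented.

# Evaluating (10b) and (11b) over a small invented model: the
# quantificational element relates two properties, modeled here as
# predicates over a finite domain.

domain = ["b1", "b2", "t1", "t2"]
book   = lambda x: x in {"b1", "b2"}
blue   = lambda x: x in {"b1", "b2", "t1"}
table  = lambda x: x in {"t1", "t2"}
sturdy = lambda x: False

# (10b)  "for every x, if x is a book then x is blue"
every_book_is_blue = all((not book(x)) or blue(x) for x in domain)

# (11b)  "it is not the case that some x is a table and x is sturdy"
no_table_is_sturdy = not any(table(x) and sturdy(x) for x in domain)

print(every_book_is_blue, no_table_is_sturdy)  # True True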
4.2. Russell’s analysis of definite descriptions

Russell’s analysis of definite descriptions was called “the paradigm of philosophy” (by Frank Ramsey), and if analysis is the heart of philosophy then indeed it is that. One of the examples Russell took to illustrate his method is given in (12a), and its analysis is in (12b).

(12) a. The present king of France is bald.
     b. ∃x[king-of-France(x) & ∀y[king-of-France(y) ⊃ y=x] & bald(x)]

The analysis in (12b) translates loosely into the three propositions expressed by the sentences in (13).

(13) a. There is a king of France.
     b. There is at most one king of France.
     c. He is bald.

(He, in (13c), must be understood as bound by the initial There is a… in (13a).) As can be seen, the analysis in (12b) contains no constituent that corresponds to the present king of France. Instead the definite article the is analyzed as expressing a complex relation between the properties of being king of France and being bald. Let us look now at how Russell’s analysis solves the problems for the direct reference theory.
4.3. Failure of substitutivity

4.3.1. Identity sentences

Although Russell was a direct reference theorist, we can see that, under his analysis, English definite descriptions have more to contribute to the sentences in which they occur than simply their reference. In fact they no longer contribute their reference at all (because they are no longer referring expressions, and do not have a reference). Instead they contribute the properties expressed by each of the predicates occurring in the description. It follows that two different definite descriptions, such as the morning star and the evening star, will make two different contributions to their containing sentences. And thus it is no mystery why the two identity sentences from (1) above, repeated here in (14), have different cognitive impact.

(14) a. The morning star is the morning star.
     b. The morning star is the evening star.

The meaning of the second sentence involves the property of being seen in the evening as well as that of being seen in the morning.
4.3.2. Propositional attitude sentences

When we come to propositional attitude sentences the story is a little more complicated. Recall that Russell’s analysis does not apply to a definite description by itself, but only in the context of a sentence. It follows that when a definite description occurs in an embedded sentence, as in the case of propositional attitude sentences, there will be two ways to unpack it according to the analysis. Thus Russell predicts that such sentences are ambiguous. Consider our example from above, repeated here as (15).

(15) Mary knows that the morning star is a planet.

According to Russell’s analysis we may unpack the phrase the morning star with respect to either the morning star is a planet or Mary knows that the morning star is a planet. The respective results are given in (16).

(16) a. Mary knows that ∃x[morning star(x) & ∀y[morning star(y) ⊃ y=x] & planet(x)]
     b. ∃x[morning star(x) & ∀y[morning star(y) ⊃ y=x] & Mary knows that planet(x)]

The unpacking in (16a) is what is called the narrow scope or de dicto (roughly, about the words) interpretation of (15). The proposition that Mary is said to know involves the semantic content that the object in question is the star seen in the morning. The unpacking in (16b) is called the wide scope or de re (roughly, about the thing) interpretation of (15). It attributes to Mary knowledge concerning a certain entity, but not under any particular description of that entity. The short answer to the question of how Russell’s analysis solves the problem of failure of substitutivity in propositional attitude contexts is that, since there are no referring constituents in the sentence after its analysis, there is nothing to substitute anything for. However Russell acknowledged that one could, in English, make a verbal substitution of one definite description for a coreferential one, but only on the wide scope, or de re, interpretation.

If we consider a slightly more dramatic example than (15), we can see that there seems to be some foundation for Russell’s prediction of ambiguity for propositional attitude sentences. Observe (17):

(17) Oedipus wanted to marry his mother.

Our first reaction to this sentence is probably to think that it is false – after all, when Oedipus found out that he had married his mother, he was very upset. This reaction is to the narrow scope, or de dicto, reading of the sentence, which attributes to Oedipus a desire which involves being married to specifically his mother, and which is false. However there is another way to take the sentence according to which it seems to be true: there was a woman, Jocasta, who happened to be Oedipus’s mother and whom he wanted to marry. This second interpretation is the wide scope, or de re, reading of (17), according to which Oedipus has a desire concerning a particular individual, but where the individual is not identified for the purposes of the desire itself by any description.
4.4. Empty NPs

Recall our initial illustration of Russell’s analysis of definite descriptions, repeated here.

(18) a. The present king of France is bald.
     b. ∃x[king-of-France(x) & ∀y[king-of-France(y) ⊃ y=x] & bald(x)]
The example shows Russell’s solution to the problem of empty NPs. While for Frege such sentences have a failed presupposition and lack a truth value, under Russell’s analysis they assert the existence of the entity in question, and are therefore simply false. Furthermore Russell’s analysis of definite descriptions solves the more pressing problem of empty NPs in existence sentences. Under his analysis The round square does not exist would be analyzed as in (19),

(19) ~∃x[round(x) & square(x) & ∀y[[round(y) & square(y)] ⊃ y=x]]

which is meaningful and true.
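Russell’s analyses in (18b) and (19) can likewise be checked over a finite model. The Python sketch below uses an invented two-element domain; the point is only that an empty description makes (18a) come out false, and (19) come out true, rather than truth-valueless.

# Russell's uniqueness analysis of definite descriptions, evaluated over
# an invented two-element domain.

domain = ["louis", "brick"]
king_of_france = lambda x: False          # no present king of France
bald = lambda x: x == "louis"

def the_F_is_G(F, G, dom):
    # (18b): exists x [F(x) & forall y [F(y) -> y = x] & G(x)]
    return any(F(x)
               and all((not F(y)) or y == x for y in dom)
               and G(x)
               for x in dom)

# (18a) 'The present king of France is bald' is false, not undefined:
print(the_F_is_G(king_of_france, bald, domain))   # False

# (19) 'The round square does not exist' comes out true:
round_square = lambda x: False            # nothing is round and square
exists_the_round_square = any(round_square(x)
                              and all((not round_square(y)) or y == x
                                      for y in domain)
                              for x in domain)
print(not exists_the_round_square)                # True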
4.5. Proper names

Russell’s view of proper names was very similar to Frege’s view; he held that they are abbreviations for definite descriptions (which might vary from person to person) and thus that they have semantic content in addition to, or more properly in lieu of, a reference (cf. Russell 1917).
4.6. Referring and denoting

As we have seen, for Russell, definite descriptions are not referring expressions, though he did describe them as denoting. For Russell almost any NP is a denoting phrase, including, e.g., every hat and nobody. One might ask what, if any, expressions were genuine referring expressions for Russell. Ultimately he held that only a demonstrative like this, used demonstratively, would meet the criterion, since only such an expression could guarantee a referent. These were the only true proper names, in his view. (See Russell 1917: 216 and fn. 5.) It seems clear that we often use language to convey information about individual entities; if Russell’s analysis is correct, it means that the propositions containing that information must almost always be inferred rather than being directly expressed. Russell’s analysis of definite descriptions (though not of proper names) has been defended at length by Neale (1990).
5. Strawson’s objections to Russell

Russell’s paper “On denoting” stood without opposition for close to 50 years, but in 1950 P. F. Strawson’s classic reply “On referring” appeared. Strawson had two major objections to Russell’s analysis – his neglect of the indexicality of definite descriptions like the king of France, and his claim that sentences with definite descriptions in them were used to assert the existence of a reference for the description. Let us look at each of these more closely.
5.1. Indexicality

Indexical expressions are those whose reference depends in part on aspects of the utterance context, and thus may vary depending on context. Obvious examples are pronouns like I and you, and adverbs like here and yesterday. Such expressions make vivid the difference between a sentence and the use of a sentence to make a statement – a difference which may be ignored for logical purposes (given that mathematical truths are nonindexical) but whose importance in natural language was stressed by Strawson. Strawson pointed out that a definite description like the king of France could have been used at different past times to refer to different people – Louis XV in 1750, but Louis XVI in 1770, for example. Hence he held that it was a mistake to speak of expressions as referring; instead we can only speak of using an expression to refer on a particular occasion. This is the pragmatic view of reference that was mentioned at the outset of this article. Russell lived long enough to publish a rather tart response, “Mr. Strawson on referring”, in which he pointed out that the problem of indexicality was independent of the problems of reference which were his main concern in “On denoting”. However indexicality does raise interesting and relevant issues, and we return to it below, in section 7. (See also article 61 (Schlenker) Indexicality and de se.)
5.2. Empty NPs and presupposition

The remainder of Strawson’s paper was primarily concerned with arguing that Russell’s analysis of definite descriptions was wrong in its implication that sentences containing them would be used to assert the existence (and uniqueness) of entities meeting their descriptive content. Instead, he said, a person using such a sentence would only imply “in a special sense of ‘imply’ ” (Strawson 1950: 330) that such an entity exists. (Two years later he introduced the term presuppose for this special sense of imply, Strawson 1952: 175.) And in cases where there is no such entity – that is, for sentences with empty NPs, like The king of France is bald as uttered in 1950 – one could not make either a true or a false statement. The question of truth or falsity, in such cases, simply does not arise. (See article 91 (Beaver & Geurts) Presupposition.) We can see that Strawson’s position on empty NPs is very much the same as Frege’s, although Strawson did not appear to be familiar with Frege’s work on the subject.
6. Donnellan’s attributive-referential distinction

In 1966 Keith Donnellan challenged Russell’s theory of definite descriptions, as well as Strawson’s commentary on that theory. He argued that both Russell and Strawson had failed to notice that there are two distinct uses of definite descriptions.
6.1. The basic distinction

When one uses a description in the attributive way in an assertion, one “states something about whoever or whatever is the so-and-so” (Donnellan 1966: 285). This use corresponds pretty well to Russell’s theory, and in this case, the description is an essential part of the thought being expressed. The main novelty was Donnellan’s claim of a distinct referential use of definite descriptions. Here one “uses the description to enable his audience to pick out whom or what he is talking about and states something about that person or thing” (Donnellan 1966: 285). In this case the description used is simply a device for getting one’s addressee to recognize whom or what one is talking about, and is not an essential part of the utterance. Donnellan used the example in (20) to illustrate his distinction.

(20) Smith’s murderer is insane.

For an example of the attributive use, imagine the police detective at a gruesome crime scene, thinking that whoever could have murdered dear old Smith in such a brutal way would have to have been insane. For a referential use, we might imagine that Jones has been charged with the murder and that everybody is pretty sure he is the guilty party. He behaves very strangely during the trial, and an onlooker utters (20) by way of predicating insanity of Jones. The two uses involve different presuppositions: on the attributive use there is a presupposition that the description has a reference (or denotation in Russell’s terms), but on the referential use the speaker presupposes more specifically of a particular entity (Jones, in our example) that it is the one meeting the description.

Note, though, that a speaker can know who or what a definite description denotes and still use that description attributively. For example, I might be well acquainted with the dean of my college, but when I advise my student, who has a grievance against the chair of the department, by asserting (21),

(21) Take this issue to the dean of the college.

I use the phrase the dean of the college attributively. I mean to convey the thought that the dean’s office is the one appropriate for the issue, regardless of who happens to be dean at the current time.
6.2. Contentious issues

While it is generally agreed that an attributive-referential distinction exists, there have been several points of dispute. The most crucial one is the status of the distinction – whether it is semantic or pragmatic (something about which Donnellan himself seemed unsure).
6.2.1. A pragmatic analysis

Donnellan had claimed that, on the referential use, a speaker can succeed in referring to an entity which does not meet the description used, and can make a true statement in so doing. The speaker who used (20) referentially to make a claim about Jones, for instance, would have said something true if Jones was indeed insane, whether or not he murdered Smith. Kripke (1977) used this aspect to argue that Donnellan’s distinction is nothing more than the difference between speaker’s reference (the referential use) and semantic reference (the attributive use). He pointed out that similar kinds of misuses or errors can arise with proper names, for which Donnellan’s distinction, if viewed as semantic, could not be invoked (see below, section 8). Kripke argued further that since the same kind of attributive-referential difference in use of definite descriptions would arise in a language stipulated to be Russellian – that is, in which the only interpretation for definite descriptions was that proposed by Russell in “On denoting” – the fact that it occurs in English does not argue that English is not Russellian, and thus does not argue that the distinction is semantic. (See Reimer 1998 for a reply.)
6.2.2. A semantic analysis

On the other hand, David Kaplan (1978) (among others, but cf. Salmon 2004) noted the similarity of Donnellan’s distinction to the de dicto/de re ambiguity which occurs in propositional attitude sentences. Kaplan suggested an analysis on which referential uses are involved in the expression of Russellian singular propositions. Suppose Jones is Smith’s murderer; then the referential use of (20) expresses the singular proposition consisting of Jones himself plus the property of being insane. (In suggesting this analysis Kaplan was rejecting Donnellan’s claim about the possibility of making true statements about misdescribed entities. Others have also adopted this revised view of the referential use, e.g. Wettstein 1983, Reimer 1998.) Kaplan’s analysis of the referential understanding is similar to the analysis of the complement of a propositional attitude verb when it is interpreted de re, while the ordinary Russellian analysis seems to match the analysis of the complement interpreted de dicto. However the two understandings of a sentence like (20), without a propositional attitude predicate, always have the same truth value while, as we have seen, in the context of a sentence about someone’s propositional attitude, the two interpretations can result in a difference in truth value. In explaining his analysis, Kaplan likened the referential use of definite descriptions to demonstrative NPs. This brings us back to the topic of indexicality.
7. Kaplan’s theory of indexicality

A major contribution to our understanding of reference and indexicality came with Kaplan’s (1989) classic, but mistitled, article “Demonstratives”. The title should have been “Indexicals”. Acknowledging the error, Kaplan distinguished pure indexicals like I and tomorrow, which do not require an accompanying indication of the intended reference, from demonstrative indexicals, e.g. this, that book, whose uses do require such an indication (or demonstration, as Kaplan dubbed it). The paper itself was equally concerned with both subcategories.
7.1. Content vs. character

The most important contribution of Kaplan’s paper was his distinction between two elements of meaning – the content of an expression and its character.
7.1.1. Content

The content of the utterance of a sentence is the proposition it expresses. Assuming compositionality, this content is determined by the contents of the expressions which go to make up the uttered sentence. The existence of indexicals means that the content of an utterance is not determined simply by the expressions in it, but also by the context of utterance. Thus different utterances of, e.g., (22)

(22) I like that.

will determine different propositions depending on who is speaking, the time at which they are speaking, and what they are pointing at or otherwise indicating. In each case the proposition involved would be, on Kaplan’s view (following Russell), a singular proposition.
7.1.2. Character

Although on Kaplan’s view the contribution of indexicals to propositional content is limited to their reference, they do have a meaning: I, for example, has a meaning involving the concept of being the speaker. These latter types of linguistically encoded meaning are what Kaplan referred to as “character”. In general, the character of an expression is a function which, given a context of utterance, returns the content of that expression in that context. In a way Kaplan is showing that Frege’s concept of sense actually needs to be subdivided into these two elements of character and content. (Cf. Kaplan 1989, fn. 26.)
7.2. An application of the distinction
Indexicals, both pure and demonstrative, have a variable character – their character determines different contents in different contexts of utterance. However the content so determined is constant (an actual entity, on Kaplan's view). Using the distinction between character and content, Kaplan is able to explain why (23)
(23) I am here now.
is in a sense necessary, but in another sense contingent. Its character is such that anyone uttering (23) would be making a true statement, but the content determined on any such occasion would be a contingent proposition. For instance if I were to utter (23) now, my utterance would determine the singular proposition containing me; my office; 2:50 pm on January 18, 2007; and the relation of being located, which relates an entity, a place, and a time. That proposition is true at the actual world, but false in many others.
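The two modal statuses of (23) can be kept apart in the same notation (again our reconstruction):

  \mathit{character}(\text{(23)}) = \lambda c.\,\lambda w.\,\mathrm{located}(\mathrm{speaker}(c), \mathrm{place}(c), \mathrm{time}(c), w)

For every context c, the content this character determines is true at the world of c itself, which is the source of the air of necessity; but that very content is false at worlds where the speaker is somewhere else at time(c), which is the source of the contingency.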
7.3. The problem of the essential indexical
Perry (1979), following Castañeda (1968), pointed out that indexicality seems to pose a special problem in propositional attitude sentences. Suppose Mary, a ballet dancer, has an accident resulting in amnesia. She sees a film of herself dancing, but does not recognize herself. As a result of seeing the film she comes to believe, de re, of the person in the film (i.e. herself) that that person is a good dancer. Still she lacks the knowledge that it is she herself who is a good dancer. As of now we have no way of representing this missing piece of knowledge. Perry proposed recognizing belief states, in addition to the propositions which are the objects of belief, in order to solve this problem; Mary grasps the proposition, but does not grasp it in the first person way. Lewis (1979) proposed instead viewing belief as attribution to oneself of a property; he termed this "belief de se". Belief concerning a nonindexical proposition would then be self-attribution of the property of
belonging to a possible world where that proposition was true. Mary has the latter kind of belief with respect to the proposition that she is a good dancer, but does not (yet) attribute to herself good dancing capability.
7.4. Other kinds of NP
As we have seen, Strawson pointed out that some definite descriptions which would not ordinarily be thought of as indexical can have an element of indexicality to them. An indexical definite description like the (present) king of France would, on Kaplan's analysis, have both variable character and variable content. As uttered in 1770, for example, it would yield a function whose value in any possible world is whoever is king of France in 1770. That would be Louis XV in the actual world, but other individuals in other worlds depending on contingent facts about French history. As uttered in 1950 the definite description has no reference in the actual world, but does in other possible worlds (since it is not a necessary fact that France is a republic and not a monarchy in 1950 – the French Revolution might have failed). A nonindexical definite description like the inventor of bifocals has a constant character but variable content. That is, in any context of utterance its content is the function from possible worlds that picks out whoever it is who invented bifocals in that world. For examples of NPs with constant character and constant content, we must turn to the category of proper names.
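It may help to tabulate the four combinations of character and content discussed in this section; the proper-names cell anticipates the conclusion defended in section 8.

                       constant content                      variable content
  variable character   pure indexicals and demonstratives    indexical definite descriptions
                       (I, that)                             (the present king of France)
  constant character   proper names                          nonindexical definite descriptions
                       (Aristotle)                           (the inventor of bifocals)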
8. Proper names and Kripke's return to Millian nondescriptionality
It may be said that in asserting that proper names have denotation without connotation, Mill captured our ordinary pre-theoretic intuition. That is, it seems intuitively clear that proper names do not incorporate or express any properties like having taught Alexander the Great or having invented bifocals. On the other hand we can also understand why both Frege and Russell would be driven to the view that, despite this intuition, they do express some kind of property. That is because the alternative would be the direct reference, or Fido-Fido view, and the two kinds of problems that we saw arising for that view arise for proper names as well as definite descriptions. Thus identity sentences of the form a = b are informative with proper names as they are with definite descriptions, as exemplified in (24).
(24) a. Mark Twain is Mark Twain.
b. Samuel Clemens is Mark Twain.
Intuitively (24b) conveys information over and above that conveyed by (24a). Similarly exchanging co-referential proper names in propositional attitude sentences can seem to change truth value.
(25) a. Mary knows that Mark Twain wrote Tom Sawyer.
b. Mary knows that Samuel Clemens wrote Tom Sawyer.
We can well imagine someone named Mary for whom (25a) would be true yet for whom (25b) would seem false. Furthermore there are many proper names which are
non-referential, and for which negative identity sentences like (26) would seem true and not meaningless.
(26) Santa Claus does not exist.
If proper names have a sense, or are otherwise equivalent to definite descriptions, then some or all of these problems are solved. Thus it was an important development when Kripke argued for a return to Mill's view on proper names. But before we get to that, we should briefly review a kind of weakened description view of proper names.
8.1. The 'cluster' view
Both Wittgenstein (1953) and Searle (1958) argued for a view of proper names according to which they are associated semantically with a cluster of descriptions – something like a disjunction of properties commonly associated with the bearer of the name. Wittgenstein's example used the name Moses, and he suggested that there is a variety of descriptions, such as "the man who led the Israelites through the wilderness", "the man who as a child was taken out of the Nile by Pharaoh's daughter", which may give meaning to the name or support its use (Wittgenstein 1953, §79). No single description is assumed to give the meaning of the name. However, as Searle noted, on this view it would be necessarily true that Moses had at least one of the properties commonly attributed to him (Searle 1958: 172).
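Schematically (our notation, not Searle's), the cluster view associates a name with a family of properties P_1, …, P_n and treats its sense as something like their disjunction:

  \mathit{sense}(\textit{Moses}) \approx \lambda x.\,(P_1(x) \lor \ldots \lor P_n(x))

Searle's observation then follows directly: if the reference of the name is fixed by this sense, it is necessarily true that Moses, if he existed, had at least one of P_1, …, P_n.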
8.2. The return to Mill's view
In January of 1970 Saul Kripke gave an important series of lectures titled "Naming and necessity" which were published in an anthology in 1972 and in 1980 reissued as a book. In these lectures Kripke argued against both the Russell-Frege view of proper names as abbreviated definite descriptions and the Wittgenstein-Searle view of proper names as associated semantically with a cluster of descriptions, and in favor of a return to Mill's nondescriptional view of proper names. Others had come to the same conclusion (e.g. Marcus 1961, Donnellan 1972), but Kripke's defense of the nondescriptional view was the most thorough and influential. The heart of Kripke's argument depends on intuitions about the reference of expressions in alternative possible worlds. These intuitions indicate a clear difference in behavior between proper names and definite descriptions. A definite description like the student of Plato who taught Alexander the Great refers to Aristotle in the actual world, but had circumstances been different – had Xenocrates rather than Aristotle taught Alexander the Great – then the student of Plato who taught Alexander the Great would refer to Xenocrates, and not to Aristotle. Proper names, on the other hand, do not vary their reference from world to world. Kripke dubbed them "rigid designators". Thus sentences like (27) seem true to us.
(27) Aristotle might not have taught Alexander the Great.
Furthermore, Kripke pointed out, a sentence like (27) would be true no matter what contingent property description is substituted for the predicate. In fact something like (28) seems to be true:
(28) Aristotle might have had none of the properties commonly attributed to him.
But the truth of (28) seems inconsistent with both the Frege-Russell definite description view of proper names and the Wittgenstein-Searle cluster view. On the other hand a sentence like (29) seems false.
(29) Aristotle might not have been Aristotle.
This supports Kripke's claim of rigid designation for proper names; since the name Aristotle must designate the same individual in any possible world, there is no possible world in which that individual is not Aristotle. And thus, to put things in Kaplan's terms, proper names have both constant character and constant content.
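In modal notation (our rendering), with a a rigid constant for Aristotle and T(x) for taught Alexander the Great, the contrast between (27) and (29) comes out as:

  (27′)  \Diamond\,\lnot T(a)   – true: at some worlds that very individual lacks T
  (29′)  \Diamond\,(a \neq a)   – false: a picks out the same individual at every world

A description like the student of Plato who taught Alexander the Great, by contrast, may pick out different individuals at different worlds, so the corresponding possibility claim with the description in place of a can come out true.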
8.3. Natural kind terms
Although it goes beyond our focus on NPs, it is worth mentioning that Kripke extended his theory of nondescriptionality to at least some common nouns – those naming species of plants or animals, like elm and tiger, as well as those for well-defined naturally occurring substances or phenomena, such as gold and heat, and some adjectives like loud and red. In this Kripke's views differed from Mill, but were quite similar to those put forward by Putnam (1975). Putnam's most famous thought experiment involved imagining a "twin earth" which is identical to our earth except that the clear, colorless, odorless substance which falls from the sky as rain and fills the lakes and rivers, and which is called water by twin-English speaking twin earthlings, is not H2O but instead a complex compound whose chemical formula Putnam abbreviates XYZ. Putnam argues that although Oscar1 on earth and Oscar2 on twin earth are exactly the same mentally when they think "I would like a glass of water", nevertheless the contents of their thoughts are different. His famous conclusion: " 'Meanings' just ain't in the head" (Putnam 1975: 227; see Segal 2000 for an opposing view).
8.4. Summary
Let us take stock of the situation. We saw that the simplest theory of reference, the Fido-Fido or direct reference theory, had problems with accounting for the apparent semantic inequivalence of coreferential NPs – the fact that true identity statements could be informative, and that exchanging coreferential NPs in propositional attitude contexts could even result in a change in truth value. This theory also had a problem with non-referring or empty NPs, a problem which became particularly acute in the case of true negative existence statements. Frege's theory of sense seemed to solve most of these problems, and Russell's analysis of definite descriptions seemed to solve all of them. However, though the theories of Frege and Russell are plausible for definite descriptions, as Kripke made clear they do not seem to work well for proper names, for which the direct reference theory is much more plausible. But the same two groups of problems – those involving co-referential NPs and those involving empty NPs – arise for proper names just as they do for definite descriptions. Of these problems, the one involving substituting coreferential NPs in propositional attitude
contexts has attracted the most attention. (See also article 60 (Swanson) Propositional attitudes.)
9. Propositional attitude contexts
Kripke's arguments for a return to Mill's view of proper names have generally been found to be convincing (although exceptions will be noted below). This appears to leave us with the failure of substitutivity of coreferential names in propositional attitude contexts. However Kripke (1979) argued that the problem was not actually one of substitutivity, but a more fundamental problem in the attribution of propositional attitudes.
9.1. The Pierre and Peter puzzles
Kripke's initial example involved a Frenchman, Pierre, who when young came to believe on the basis of postcards and other indirect evidence that London was a beautiful city. He would sincerely assert (30) whenever asked.
(30) Londres est jolie.
Eventually, however, he was kidnapped and transported to a very bad section of London, and learned English by the direct method. His circumstances did not allow him to explore the city (which he did not associate with the city he knew as Londres), and thus based on his part of town, he would assert (31).
(31) London is not pretty.
The question Kripke presses us to answer is that posed in (32):
(32) Does Pierre, or does he not, believe that London is pretty?
An alternative, monolingual, version of the puzzle involves Peter, who has heard of Paderewski the pianist, and Paderewski the Polish statesman, but who does not know that they were the same person and who is furthermore inclined to believe that anyone with musical talent would never go into politics. The question is (33):
(33) Does Peter, or does he not, believe that Paderewski had musical talent?
Kripke seems to indicate that these questions do not have answers: "…our normal practices of interpretation and attribution of belief are subjected to the greatest possible strain, perhaps to the point of breakdown. So is the notion of the content of someone's assertion, the proposition it expresses" (Kripke 1979: 269; italics in original). Others, however, have not been deterred from answering Kripke's questions in (32) and (33).
9.2. Proposed solutions
Many solutions to the problem of propositional attitude attribution have been proposed. We will look here at several of the more common kinds of approaches.
9.2.1. Metalinguistic approaches
Metalinguistic approaches to the problem involve linguistic expressions as components of belief in one way or another. Quine (1956) had suggested the possibility of viewing propositional attitudes as relations to sentences rather than propositions. This would solve the problem of Pierre, but would seem to leave Peter's problem, given that we have a single name Paderewski in English (but see Fiengo & May 1998). Others (Bach 1987, Katz 2001) have put forward metalinguistic theories of proper names, rejecting Kripke's arguments for their nondescriptionality. The idea here is that a name N means something like "the bearer of N". This again would seem to solve the Pierre puzzle (Pierre believes that the bearer of Londres is pretty, but not the bearer of London), but not Peter's Paderewski problem. Bach argues that (33) would need contextual supplementation to be a complete question about Peter's beliefs (see Bach 1987: 165ff).
9.2.2. Hidden indexical theories
The remaining two groups of theories are consistent with Kripke's nondescriptional analysis of proper names. Hidden indexical theories involve postulating an unmentioned (or hidden) element in belief attributions, which is "a mode of presentation" of the proposition belief in which is being attributed. (Cf. Schiffer 1992, Crimmins & Perry 1989.) Thus belief is viewed as a three-place relation, involving a believer, a proposition believed, and a mode of presentation of that proposition. Furthermore these modes of presentation are like indexicals in that different ones may be invoked in different contexts of utterance. The answer to (32) or (33) could be either Yes or No, depending upon which kind of mode of presentation was understood. The approach of Richard (1990) is similar, except that the third element is intended as a translation of a mental representation of the subject of the propositional attitude verb.
9.2.3. Pragmatic theories
Our third kind of approach is similar to the hidden indexical theories in recognizing modes of presentation. However the verb believe (like other propositional attitude verbs) is seen as expressing a two-place relation between a believer and a proposition, and no particular mode of presentation is entailed. Instead, this relation is defined in such a way as to entail only that there is at least one mode of presentation under which the proposition in question is believed. (Cf. Salmon 1986.) This kind of theory would answer either (32) or (33) with a simple Yes since there is at least one mode of presentation under which Pierre believes that London is pretty, and at least one under which Peter believes that Paderewski had musical talent. A pragmatic explanation is offered for our tendency to answer No to (32) on the basis of Pierre's English assertion London is not pretty.
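The contrast between the last two kinds of theories can be put schematically (our formulation), using a three-place relation BEL among a believer, a proposition, and a mode of presentation:

  hidden indexical:  "a believes that p" is true in context c iff \mathrm{BEL}(a, p, m_c), where m_c is the mode of presentation supplied by c
  pragmatic:         \mathrm{believes}(a, p) iff \exists m\,\mathrm{BEL}(a, p, m)

On the first analysis (32) and (33) can be answered Yes or No depending on which m_c is contextually understood; on the second the semantics delivers an unqualified Yes, and the inclination to answer No is explained pragmatically.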
10. Indefinite descriptions
We turn now to indefinite descriptions – NPs which in English begin with the indefinite article a/an.
10.1. Indefinite descriptions are not referring expressions
As we saw above, Russell did not view definite descriptions as referring expressions, so it will come as no surprise that he was even more emphatic about indefinite descriptions. He had several arguments for this view (cf. Russell 1919: 167ff). Consider his example in (34).
(34) I met a man.
Suppose that the speaker of (34) had met Mr. Jones, and that that meeting constituted her grounds for uttering (34). In that case, were a man referential, it would have to refer to Mr. Jones. Nevertheless someone who did not know Jones at all could easily have a full understanding of (34). And were the speaker of (34) to add (35) to her utterance
(35) …but it wasn't Jones.
she would not be contradicting herself (though of course she would be lying, under the circumstances). On the other hand if it should turn out that the speaker of (34) did not meet Jones after all, but did meet some other man, it would be very hard to regard (34) as false. Russell's arguments have been reiterated and augmented by Ludlow & Neale (1991).
10.2. Indefinite descriptions are referring expressions
Since Russell's time others (e.g. Strawson 1952) have argued that indefinite descriptions do indeed have referring uses. The clearest kinds of cases are ones in which chains of reference occur, as in (36) (from Chastain 1975: 202).
(36) A man was sitting underneath a tree eating peanuts. A squirrel came by, and the man fed it some peanuts.
Both a man and a squirrel in (36) seem to be coreferential with subsequent expressions that many people would consider to be referring – if not the man, then at least it. Chastain argues that there is no reason to deny referentiality to the indefinite NPs which initiate such chains of reference, and that indeed, that is where the subsequent expressions acquired their referents. It should be noted, though, that if serving as antecedent for one or more pronouns is considered adequate evidence for referentiality, then overtly quantificational NPs should also be considered referential, as shown in (37).
(37) a. Everybody who came to my party had a good time. They all thanked me afterward.
b. Most people don't like apples. They only eat them for their health.
10.3. Parallels between indefinite and definite descriptions
Another relevant consideration is the fact that indefinite descriptions seem to parallel definite descriptions in several ways. They show an ambiguity similar to the de dicto-de re ambiguity in propositional attitude contexts, as shown in (38).
(38) Mary wants to interview a diplomat.
(38) could mean either that there is a particular diplomat whom Mary is planning to interview (where a diplomat has wide scope corresponding to the de re reading for definite descriptions), or that she wants to interview some diplomat or other – say to boost the prestige of her newspaper. (This reading, where a diplomat has narrow scope with respect to the verb wants, corresponds to the de dicto reading of definite descriptions.) Neither of these readings entails the other – either could be true while the other is false. Furthermore indefinite descriptions participate in a duality of usage, the specific-nonspecific ambiguity (see article 42 (von Heusinger) Specificity), which is very similar to Donnellan's referential-attributive ambiguity for definite descriptions. Thus while the indefinites in Chastain's example above are most naturally taken specifically, the indefinite in (39) must be taken nonspecifically unless something further is added (since otherwise the request would be infelicitous).
(39) Please hand me a pencil.
(See also Fodor & Sag 1982.) In casual speech specific uses of indefinite descriptions can be unambiguously paraphrased using non-demonstrative this, as in (40).
(40) This man was sitting underneath a tree eating peanuts.
However non-demonstrative this cannot be substituted for the non-specific a in (39) without causing anomaly. So if at least some occurrences of definite descriptions are viewed as referential, these parallels provide an argument that the corresponding occurrences of indefinite descriptions should also be so viewed. (Devitt 2004 argues in favor of this conclusion.)
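The two readings of (38) can be represented as a scope difference (simplified, abstracting away from the internal semantics of want, with m a constant for Mary):

  wide scope (specific):      \exists x\,(\mathrm{diplomat}(x) \land \mathrm{want}(m, \mathrm{interview}(m, x)))
  narrow scope (nonspecific): \mathrm{want}(m, \exists x\,(\mathrm{diplomat}(x) \land \mathrm{interview}(m, x)))

As noted above, neither formula entails the other.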
10.4. Discourse semantics
More recently approaches to semantics have been developed which provide an interpretation for sentences in succession, or discourses. Initially developed independently by Heim (1982) and Kamp (1981), these approaches treat both definite and indefinite descriptions as similar in some ways to quantificational terms and in some ways to referring expressions such as proper names. Indefinite descriptions introduce new discourse entities, and subsequent references, whether achieved with pronouns or with definite descriptions, add information about those entities. (See article 37 (Kamp & Reyle) Discourse Representation Theory, article 38 (Dekker) Dynamic semantics.)
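For illustration, the first sentence of Chastain's (36) might be rendered (in linear DRS notation, ignoring tense and event structure) as:

  [x, y | man(x), tree(y), sat-under(x, y), ate-peanuts(x)]

The second sentence extends this structure with a new referent z such that squirrel(z), came-by(z), fed(x, z), where the definite the man is resolved to the already-introduced x and the pronoun it to z. Definites and pronouns thus pick up referents that indefinites have introduced.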
10.5. Another puzzle about belief
We noted above that indefinite descriptions participate in scope ambiguities in propositional attitude contexts. The example below in (41) was introduced by Geach (1967), who argued that it raises a new problem of interpretation.
(41) Hob thinks a witch has blighted Bob's mare, and Nob wonders whether she (the same witch) killed Cob's sow.
Neither the ordinary wide scope nor the narrow scope interpretation is correct for (41). The wide scope interpretation (there is a witch such that…) would entail the existence of a witch, which does not seem to be required for the truth of (41). On the other hand the narrow scope interpretation (Hob thinks that there is a witch such that…) would fail to capture the identity between Hob's witch and Nob's witch. This problem, which Geach referred to as one of "intentional identity", has, like many others, remained unsolved.
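In first-order terms (our rendering, with h, n, m, s for Hob, Nob, Bob's mare, and Cob's sow), the two unsatisfactory options are:

  wide scope:   \exists x\,(\mathrm{witch}(x) \land \mathrm{think}(h, \mathrm{blighted}(x, m)) \land \mathrm{wonder}(n, \mathrm{killed}(x, s)))
  narrow scope: \mathrm{think}(h, \exists x\,(\mathrm{witch}(x) \land \mathrm{blighted}(x, m))) \land \mathrm{wonder}(n, \mathrm{killed}(x, s))

The first entails that a witch exists; in the second the existential quantifier is trapped inside think, leaving the x corresponding to she unbound in the second conjunct – just the failure of intentional identity Geach pointed to.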
11. Summary
As we have seen, opinions concerning referentiality vary widely, from Russell's position on which almost no NPs are referential to a view on which almost any NP has at least some referential uses. The differences may seem inconsequential, but they are central to issues surrounding the relations among language, thought, and communication – issues such as the extent to which we can represent the propositional attitudes of others in our speech, and even the extent to which our own thoughts are encoded in the sentences we utter, as opposed to being inferred from hints provided by our utterances.
12. References
Bach, Kent 1987. Thought and Reference. Oxford: Oxford University Press.
Carnap, Rudolf 1956. Meaning and Necessity: A Study in Semantics and Modal Logic. 2nd edn. Chicago, IL: The University of Chicago Press.
Castañeda, Hector-Neri 1968. On the logic of attributions of self-knowledge to others. Journal of Philosophy 65, 439–456.
Chastain, Charles 1975. Reference and context. In: K. Gunderson (ed.). Minnesota Studies in the Philosophy of Science, vol. 7: Language, Mind and Knowledge. Minneapolis, MN: University of Minnesota Press, 194–269.
Crimmins, Mark & John Perry 1989. The prince and the phone booth. Journal of Philosophy 86, 685–711.
Davidson, Donald 1968. On saying that. Synthese 19, 130–146. Reprinted in: D. Davidson. Inquiries into Truth and Interpretation. Oxford: Clarendon Press, 1984, 93–108.
Devitt, Michael 2004. The case for referential descriptions. In: M. Reimer & A. Bezuidenhout (eds.). Descriptions and Beyond. Oxford: Clarendon Press, 280–305.
Donnellan, Keith S. 1966. Reference and definite descriptions. Philosophical Review 75, 281–304.
Donnellan, Keith S. 1972. Proper names and identifying descriptions. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 356–379.
Fiengo, Robert & Robert May 1998. Names and expressions. Journal of Philosophy 95, 377–409.
Fodor, Janet D. & Ivan Sag 1982. Referential and quantificational indefinites. Linguistics & Philosophy 5, 355–398.
Frege, Gottlob 1892. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25–50. English translation in: P. Geach & M. Black (eds.). Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980, 56–78.
Geach, Peter T. 1967. Intentional identity. Journal of Philosophy 64, 627–632.
Heim, Irene 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation. University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms.
Kamp, Hans 1981. A theory of truth and semantic representation. In: J. Groenendijk, T. M. V. Janssen & M. Stokhof (eds.). Formal Methods in the Study of Language. Amsterdam: Mathematical Centre, 277–322. Reprinted in: J. Groenendijk, T. M. V. Janssen & M. Stokhof (eds.). Truth, Interpretation and Information: Selected Papers from the Third Amsterdam Colloquium. Dordrecht: Foris, 1984, 1–41.
Kaplan, David 1978. Dthat. In: P. Cole (ed.). Syntax and Semantics 9: Pragmatics. New York: Academic Press, 221–243.
Kaplan, David 1989. Demonstratives: An essay on the semantics, logic, metaphysics, and epistemology of demonstratives and other indexicals. In: J. Almog, J. Perry & H. Wettstein (eds.). Themes from Kaplan. Oxford: Oxford University Press, 481–563.
Katz, Jerrold J. 2001. The end of Millianism: Multiple bearers, improper names, and compositional meaning. Journal of Philosophy 98, 137–166.
Kripke, Saul 1972. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 253–355 and 763–769. Reissued separately with Preface, Cambridge, MA: Harvard University Press, 1980.
Kripke, Saul 1977. Speaker's reference and semantic reference. In: P. A. French, T. E. Uehling, Jr. & H. Wettstein (eds.). Midwest Studies in Philosophy, vol. II: Studies in the Philosophy of Language. Morris, MN: University of Minnesota, 255–276.
Kripke, Saul 1979. A puzzle about belief. In: A. Margalit (ed.). Meaning and Use. Dordrecht: Reidel, 139–183.
Lewis, David 1972. General semantics. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 169–218.
Lewis, David 1979. Attitudes de dicto and de se. Philosophical Review 88, 513–543.
Ludlow, Peter & Stephen Neale 1991. Indefinite descriptions. Linguistics & Philosophy 14, 171–202.
Marcus, Ruth Barcan 1961. Modalities and intensional languages. Synthese 13, 303–322.
Mill, John Stuart 1843. A System of Logic, Ratiocinative and Inductive, Being a Connected View of the Principles of Evidence, and the Methods of Scientific Investigation. London: John W. Parker.
Montague, Richard 1973. The proper treatment of quantification in ordinary English. In: J. Hintikka, J. Moravcsik & P. Suppes (eds.). Approaches to Natural Language: Proceedings of the 1970 Stanford Workshop on Grammar and Semantics. Dordrecht: Reidel, 221–242. Reprinted in: R. Thomason (ed.). Formal Philosophy: Selected Papers of Richard Montague. New Haven, CT: Yale University Press, 1974, 247–270.
Neale, Stephen 1990. Descriptions. Cambridge, MA: The MIT Press.
Perry, John 1979. The problem of the essential indexical. Noûs 13, 3–21.
Putnam, Hilary 1975. The meaning of 'meaning'. In: K. Gunderson (ed.). Language, Mind and Knowledge. Minneapolis, MN: University of Minnesota Press. Reprinted in: H. Putnam. Philosophical Papers, vol. 2: Mind, Language and Reality. Cambridge: Cambridge University Press, 1975, 215–271.
Quine, Willard van Orman 1956. Quantifiers and propositional attitudes. Journal of Philosophy 53, 177–187.
Reimer, Marga 1998. Donnellan's distinction/Kripke's test. Analysis 58, 89–100.
Richard, Mark 1990. Propositional Attitudes. Cambridge: Cambridge University Press.
Russell, Bertrand 1905. On denoting. Mind 14, 479–493.
Russell, Bertrand 1917. Knowledge by acquaintance and knowledge by description. Reprinted in: B. Russell. Mysticism and Logic. Paperback edn. Garden City, NY: Doubleday, 1957, 202–224.
Russell, Bertrand 1919. Introduction to Mathematical Philosophy. London: Allen & Unwin. Reissued n.d., New York: Touchstone Books.
Russell, Bertrand 1957. Mr. Strawson on referring. Mind 66, 385–389.
Salmon, Nathan 1986. Frege's Puzzle. Cambridge, MA: The MIT Press.
Salmon, Nathan 2004. The good, the bad, and the ugly. In: M. Reimer & A. Bezuidenhout (eds.). Descriptions and Beyond. Oxford: Clarendon Press, 230–260.
Schiffer, Stephen 1992. Belief ascription. Journal of Philosophy 89, 499–521.
Searle, John R. 1958. Proper names. Mind 67, 166–173.
Segal, Gabriel M. 2000. A Slim Book about Narrow Content. Cambridge, MA: The MIT Press.
Stalnaker, Robert C. 1984. Inquiry. Cambridge, MA: The MIT Press.
Stalnaker, Robert C. 1999. Context and Content. Oxford: Oxford University Press.
Strawson, Peter F. 1950. On referring. Mind 59, 320–344.
Strawson, Peter F. 1952. Introduction to Logical Theory. London: Methuen.
Wettstein, Howard K. 1983. The semantic significance of the referential-attributive distinction. Philosophical Studies 44, 187–196.
Wittgenstein, Ludwig 1953. Philosophical Investigations. Translated into English by G. E. M. Anscombe. Oxford: Blackwell.
Barbara Abbott, Lake Leelanau, MI (USA)
5. Meaning in language use
1. Overview
2. Deixis
3. The Gricean revolution: The relation of utterance meaning to sentence meaning
4. Implications for word meaning
5. The nature of context and the relationship of pragmatics to semantics
6. Summary
7. References
Abstract
In a speech community, meaning attaches to linguistic forms through the ways in which speakers use those forms, intending and expecting to communicate with their interlocutors. Grice's (1957) insight was that conventional linguistic meaning amounts to the expectation by members of a speech community that hearers will recognize speakers' intentions in saying what they say the way they say it. This insight enabled him to sketch how conventional meaning relates to lexical meaning and presupposition, and (in more detail) implied meaning. The first substantive section of this article briefly recapitulates the work of Bar-Hillel (1954) on indexicals, leading to the conclusion that even definite descriptions have an indexical component. Section 3 describes Grice's account of the relation of intention to intensions and takes up the notion of illocutionary force. Section 4 explores the implications of the meaning-use relationship for the determination of word meanings. Section 5 touches briefly on the consequences of the centrality of communication for the nature of context and the relation of context and pragmatic considerations to formal semantic accounts.
1. Overview
At the beginning of the 20th century a tension existed between those who took languages to be representable as formal systems with complete truth-conditional semantics (Carnap, Russell, Frege, Tarski) – and who viewed natural language as rather defective in that regard – and those who took the contextual use of language to be determinative of the meanings of its forms (Austin, Strawson, the later Wittgenstein). (See also article 3 (Textor) Sense and reference.) For a very readable account of this intellectual
battleground, see Recanati (2004). The debate has been largely resolved with acceptance (not always explicitly recognized) of the views of Bar-Hillel and Grice on the character of the human use of language. In a speech community, meaning attaches to linguistic forms through the ways in which speakers use those forms, intending and expecting to communicate with their interlocutors. This is immediately obvious in the case of the reference of indexical terms like I and you, and the determination of the illocutionary force of utterances. It extends also to matters of reference generally, implicature, coherence of texts and discourse understanding, as well as to the role of linguistic forms in maintaining social relations (politeness). Grice's (1957) insight that conventional linguistic meaning amounts to the expectation by members of a speech community that hearers will recognize speakers' intentions in saying what they say the way they say it enabled him to sketch how this related to lexical meaning and presupposition, and (in more detail) implied meaning. While the view presented here has its origins in Grice's insight, it is considerably informed by elaborations and extensions of it over the past 40 years.
2. Deixis
Bar-Hillel (1954) demonstrated that it is not linguistic forms that carry pragmatic information, but the facts of their utterance, and this notion was elaborated in Stalnaker (1972). Bar-Hillel claimed that indexicality is an inherent and unavoidable aspect of natural language, speculating that more than 90% of the declarative sentences humans utter have use-dependent meanings in that they involve implicit references to the speaker, the addressee and/or the speech time. The interpretation of first and second person pronouns, tenses, and deictic adverbials is only the tip of the iceberg. A whole host of other relational terms (not to mention deictic and anaphoric third-person references, and even illocutionary intentions) require an understanding of the speaker's frame of reference for interpretation. To demonstrate, it is an elementary observation that utterances of sentences like those in (1) require an indication of when and where the sentence was uttered to be understood well enough to judge whether they are true or false.
(1)
a. I am hungry. b. It’s raining.
In fact, strictly speaking, the speaker's intention is just as important as the location in spacetime of the speech act. Thus, (1b) can be intended to refer to the location of the speaker at speech-time, or to some location (like the location of a sports event being viewed on television) that the speaker believes to be salient in the mind of the addressee, and will be evaluated as true or false accordingly. Of course, the reference of past and future tenses is also a function of the context in which they are uttered, referring to times that are a function of the time of utterance. Thus, if I utter (2a) at t0, I mean that I am hungry at t0; if I utter (2b) at t0, I mean that I was hungry at some point before t0, and if I say (2c) at t0, I mean that I will be hungry at some point after t0.
(2)
a. I am hungry. b. I was hungry. c. I will be hungry.
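A simple Priorian rendering of these truth conditions (our notation, with t_0 the time of utterance) makes the pattern explicit:

  (2a)  \mathrm{hungry}(\mathrm{speaker}, t_0)
  (2b)  \exists t\,(t < t_0 \land \mathrm{hungry}(\mathrm{speaker}, t))
  (2c)  \exists t\,(t > t_0 \land \mathrm{hungry}(\mathrm{speaker}, t))

The bare existential quantification over times in (2b) and (2c) is what leaves the temporal distance from t_0 entirely unspecified, a point taken up directly below.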
Although (2a) indicates a unique time, (2b) and (2c) refer very vaguely to some time before or after t0; nothing in the utterance specifies whether it is on the order of minutes, days, weeks, years, decades, or millennia distant. Furthermore, it is not clear whether the time indicated by (2a) is a moment or an interval of indefinite duration which includes the moment of utterance. See McCawley (1971), Dowty (1979), Partee (1973), and Hinrichs (1986) for discussion of some issues and ramifications of the alternatives. Although Bar-Hillel never refers to the notion of intention, he seems to have recognized in discussing the deictic this that indexicals are multiply indeterminate, despite being linked to the context of utterance: "'This' is used to call attention to something in the centre of the field of vision of its producer, but, of course, also to something in his spatial neighborhood, even if not in his centre of vision or not in his field of vision at all, or to some thing or some event or some situation, etc., mentioned by himself or by somebody else in utterances preceding his utterance, and in many more ways" (Bar-Hillel 1954: 373). The indexical this can thus be intended (a) deictically to refer to something gesturally indicated, (b) anaphorically to refer to a just completed bit of discourse, (c) cataphorically, to refer to a bit of discourse that will follow directly, or (d) figuratively, to refer to something evoked by whatever is indicated and deictically referred to (e.g., to evoke the content of a book by indicating an image of its dust jacket), as in (3), where this refers not to a part of a photograph that is gesturally indicated, or to the dust jacket on which it appears, but to the text of the book the dust jacket was designed to protect. (3)
Oh, I’ve read this!
Similarly, there is an indexical component to the interpretation of connectives and relational adverbials in that their contribution to the meaning of an utterance involves determining what bit of preceding discourse they are intended to connect whatever follows them to, as illustrated in (4). (4)
a. Therefore, Socrates is mortal. b. For that reason, we oppose this legislation.
In addition to such primary indexicals – linguistic elements whose interpretation is bound to the circumstances of their utterance – there are several classes of expressions whose interpretation is directly bound to the interpretation of such primary indexicals. Partee (1989) discussed covert pronominals, for example, local as in (5). (5)
Dan Rather went to a local bar.
Is the bar local relative to Rather's characteristic location? to his location at speech time? to the location of the speaker at speech time? to the characteristic location of the speaker? In addition, as has long been acknowledged in references to "the universe of discourse," interpretation of the definite article invokes indexical reference; the uniqueness presupposition associated with use of the definite article amounts to a belief by the speaker that the intended referent of the definite NP is salient to the addressee. Indeed, insofar as the interpretation of ordinary kind names varies with the expectations about the universe of discourse that the speaker imputes to the addressee (Nunberg 1978), this
larger class of indexicals encompasses the immense class of descriptive terms, including cat, mat, window, hamburger, red, and the like as in (6). (6)
a. The blond hamburger spilled her coffee. b. The cat shattered next to the mat.
(Some of the conclusions of Nunberg (1978) are summarized in Nunberg (1979), but the latter work does not give an adequate representation of Nunberg’s explanation of how the use of words by speakers in contexts facilitates reference. For example, Nunberg (1979) does not contain accounts of either the unextractability of interpretation from context (and the non-existence of null contexts), or the notion of a system of normal beliefs in a speech community. Yet both notions are central to understanding Nunberg’s arguments for his conclusions about polysemy and interpretation.) The important point here is that with these secondary indexicals (and in fact, even with the so-called primary indexicals, Nunberg 1993), interpretation is not a simple matter of observing properties of the speech situation (source of the speech sound, calendric time, etc.), but involves making judgements about possible mappings from signal to referent (essentially the same inferential processes as are involved in disambiguation – cf. Green (1995)). Bar-Hillel (1954) and Morgan (1978) pointed out that the interpretation of demonstratives involves choosing from among many (perhaps indefinitely many) objects the speaker may have been pointing at, as well as what distinct class or individual the speaker may have intended to refer to by indicating that object (as pointed out by Nunberg 1978). Thus, the interpretation of indexicals involves choices from the elements of a set (possibly an infinite set) that is limited differently in each speech situation. For further discussion, see also article 90 (Diessel) Deixis and demonstratives. Bar-Hillel’s observations on the nature of indexicals form the background for the development of context-dependent theories of semantics by Stalnaker (1970), Kamp (1981), Heim (1982) and others. See also articles 37 (Kamp & Reyle) Discourse Representation Theory and 38 (Dekker) Dynamic semantics. Starting from a very different set of considerations, Grice (1957) outlined how a speaker’s act of using a linguistic form to communicate so-called literal meaning makes critical reference to speaker and hearer, with far-reaching consequences. Section 3 takes up this account in detail.
3. The Gricean revolution: The relation of utterance meaning to sentence meaning
3.1. Meaning according to Grice
Grice (1957) explored the fact that we use the same word (mean) for what an event entails, what a speaker intends to communicate by uttering something, and what a linguistic expression denotes. How these three kinds of meaning are related was illuminated by his later (and regrettably widely misunderstood) work on implicature (see Neale 1992 for discussion). Grice (1957) defined natural meaning (meaningN) as natural entailment: the subject of mean refers to an event or state, as in (7). (7)
Red spots on your body means you have measles.
He reserved the notion non-natural meaning (meaningNN) for signification by convention: the subject refers to an agent or a linguistic instrument, as in (8). (8)
a. By “my better half”, John meant his wife. b. In thieves’ cant, “trouble and strife” means ‘wife’.
Thus, Grice treated the linguistic meaning of expressions as strictly conventional. He declined to use the word sign for it as he equated sign with symptom, which characterizes natural meaning. He showed that non-natural meaning cannot be reduced via a behaviorist causal strategy to 'has a tendency to produce in an audience a certain attitude, dependent on conditioning,' as that would not distinguish non-natural meaning from natural meaning, and in addition, would fail to distinguish connotation from denotation. He argued that the causal account can characterize so-called "standard" meanings, but not meaning on an occasion of use, although meaning on an occasion of use is just what should explain standard meaning, especially on a causal account.

Grice also rejected the notion that X meantNN Y is equivalent to 'the speaker S intended the expression X to make the addressee H believe Y', because that would include, in addition to the conventional meaning of a linguistic expression, an agent's utterly nonconventional manipulation of states and events. Finally, he ruled out interpretations of X meantNN Y as 'the speaker intended X to make H believe Y, and to recognize S's intention', because it still includes in the same class as conventional linguistic meaning any intentional, nonconventional acts whose intention is intended to be recognized – such as presenting the head of St. John the Baptist on a platter.

Grice supported instead a theory where A meantNN something by X entails that (a) an agent intended the utterance of X to induce a belief or intention in H, and (b) the agent intended X to be recognized as intended to induce that belief/intention. This formulation implies that A does not believe (a) will succeed without recognition of A's intention. Grice's argument begins by showing that in directive cases such as getting someone to leave, or getting a driver to stop a car (examples which do not crucially involve language), the intended effect must be the sort of thing that is within the control of the addressee. Thus, X meansNN something amounts to 'people intend X to induce some belief/intention P by virtue of H taking his recognition of the intention to produce effect P as a reason to believe/intend P'. Grice pointed out that only primary intentions need to be recognized: the utterer/agent is held to intend what is normally conveyed by the utterance or a consequence of the act, and ambiguous cases are resolved with evidence from the context that bears on identifying a plausible intention.

Grice's (1975) account of communicated meaning is a natural extension of his (1957) account of meaning in general (cf. also Green 1990, Neale 1992), in that it characterizes meaning as inherently intentional: recognizing an agent's intention is essential to recognizing what act she is performing (i.e., what she meantNN by her act). Grice's reference to the accepted purpose or direction of the talk exchange (Grice 1975: 45) in the characterization of the Cooperative Principle implies that speaker and hearer are constantly involved (usually not consciously) in interpreting what each other's goals must be in saying what they say.
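Grice's final analysis can be set out as a schema (a standard reconstruction of Grice 1957, not his exact wording): S meantNN that P by uttering X iff S uttered X intending
(i) that the hearer H come to believe (or intend) P;
(ii) that H recognize intention (i); and
(iii) that H's recognition of (i) function as at least part of H's reason for fulfilling (i).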
Disambiguating structurally or lexically ambiguous expressions like old men and women, or ear (i.e., of corn, or to hear with), inferring what referent a speaker intends to be picked out from her use of a definite noun phrase like the coffee place, and inferring what a speaker meant to implicate by an utterance that might seem unnecessary or irrelevant all depend equally on the assumptions that the speaker did intend something
to be conveyed by her utterance that was sufficiently specific for the goal of the utterance, that she intended the addressee to recognize this intention, and by means of recognizing the intention, to recognize what the speaker intended to be conveyed. In semantics, as intended, the idea that the act of saying something communicates more than just what is said allowed researchers to distinguish constant, truth-conditional meanings that are associated by arbitrary convention with linguistic forms from aspects of understanding that are a function of a meaning being conveyed in a particular context (by whatever means). This in turn enabled syntacticians to abandon the hopeless quest for hidden structures whose analysis would predict non-truth-conditional meanings, and to concentrate on articulating syntactic theories that were compatible with theories of compositional semantics, articulating the details of the relation between form and conventional, truth-conditional meaning. Finally, it inspired a prodigious amount of research in language behavior (e.g., studies of rhetoric and politeness), where it has, unfortunately, been widely misconstrued.

The domain of the principles described in Grice (1975) is actually much broader than usually understood: all intentional use of language, whether literal or not, and regardless of purpose. That is, Grice intended a broad rather than a narrow interpretation of the term conversation. Grice's view of the overarching Cooperative Principle: "Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged" (Grice 1975: 45) is in fact that it is just the linguistic reflex of a more general principle which governs, in fact, defines, rational behavior: behavior intended to accomplish or enable the achievement of some purpose or goal. Insofar as these are notions universally attributable to human beings, such a principle should be universally applicable with regard to language use. Supposed counterexamples have not held up; for discussion, see Keenan (1976), Prince (1983), Green (1990). Thus, the Cooperative Principle will figure in the interpretation of language generally, not just clever talk, and in fact, will figure in the interpretation of behavior generally, whether communicative or not. Additional facets of this perspective are discussed in articles 91 (Beaver & Geurts) Presupposition, and 92 (Simons) Implicature.
3.2. The Cooperative Principle
Since Grice was explicit about the Cooperative Principle not being restricted to linguistic acts, and because the imperative formulation has led to so much misunderstanding, it is useful to rephrase it more generally, and declaratively, where it amounts to this: Individuals act in accordance with their goals. Grice described four categories (Quantity, Quality, Relevance and Manner) of special cases of this principle, that is, applications of it to particular kinds of requirements, and gave examples of their application in both linguistic and non-linguistic domains. It is instructive to translate these into general, declarative formulations as well:
An agent will do as much as is required for the achievement of the current goal. (QUANTITY I)
An agent will not do more than is required. (QUANTITY II)
Agents will not deceive co-agents. (QUALITY) Consequently, an agent will try to make any assertion one that is true.
(I) An agent will not say what she believes to be false.
(II) An agent will not say that for which she lacks adequate evidence.
An agent's action will be relevant to and relative to an intention of the agent. (RELATION)
An agent will make her actions perspicuous to others who share a joint intention. (MANNER)
(I) Agents will not disguise actions from co-agents. Consequently, agents will not speak obscurely in attempting to communicate.
(II) Agents will act so that intentions they intend to communicate are unambiguously reconstructible.
(III) Agents will spend no more energy on actions than is necessary.
(IV) Agents will execute sub-parts of a plan in an order that will maximize the perceived likelihood of achieving the goal.
The status of the maxims as just special cases of the Cooperative Principle implies that they are not (contra Lycan 1984: 75) logical consequences (corollaries), because they don’t follow as necessary consequences in all possible worlds. Nor are they additional stipulations, or an exhaustive list of special cases. Likewise, the maxims do not constitute the Cooperative Principle, as some writers have thought (e.g., Sperber & Wilson 1986: 36). On the contrary, the Cooperative Principle is a very general principle which determines, depending on the values shared by participants, any number of maxims instantiating ways of conforming to it. The maxims are not rules or norms that are taught or learned, as some writers would have it (e.g., Pratt 1981: 11, Brown & Yule 1983: 32, Blum-Kulka & Olshtain 1986: 175, Allwood, Anderson & Dahl 1977: 37, Ruhl 1989: 96, Sperber & Wilson 1986: 162). See Green (1990) for discussion. Rather, they are just particular ways of acting in accordance with one’s goals; all other things being equal, conforming to the Cooperative Principle involves conforming to all of them. When you can’t conform to all of them, as Grice discusses, you do the best you can. The premise of the Cooperative Principle, that individuals act in accordance with their goals is what allows Grice to refer to the Cooperative Principle as a definition of what it means to be rational (Grice 1975: 45, 47, 48–49), and the maxims as principles that willy-nilly govern interpersonal human behavior. If X’s goal is to get Y to do some act A, or believe some proposition P, it follows that X will want to speak in such a way that A or P is clearly identifiable (Maxims of Quality, Quantity, Manner), and X will not say things that will distract Y from getting the point (Maxim of Relevance). Furthermore, most likely, X will not want to antagonize Y (everybody’s Maxim of Politeness). Cohen & Levesque (1991) suggest that their analysis of joint intention enables one to understand “the social contract implicit in engaging in a dialogue in terms of the conversants’ jointly intending to make themselves understood, and to understand the other” (Cohen & Levesque 1991: 509), observing that this would predict the back-channel comprehension checks that pervade dialogue, as “means to attain the states of mutual belief that discharge this joint intention of understanding” (Cohen & Levesque 1991: 509).
The characterization of the Cooperative Principle as the assumption that speakers act rationally (i.e., in accordance with their goals) makes a variety of predictions about the interpretation of behavior. First of all, it predicts that people will try to interpret weird, surprising, or unanticipated behavior as serving some unexpected goal before they discount it as irrational. The tenacity with which we assume that speakers observe the Cooperative Principle, and in particular the maxim of relevance, was illustrated in Green (1990). That is, speakers assume that other speakers do what they do, say what they say, on purpose, intentionally, and for a reason (cf. Brown & Levinson 1978: 63). In other words, they assume that speech behavior, and indeed, all behavior that isn't involuntary, is goal-directed. Speakers "know" the maxims as strategies and tactics for efficiently achieving goals, especially through speech. A person's behavior will be interpreted as conforming to the maxims, even when it appears not to, because of the assumption of rationality (goal-directedness). Hearers will make inferences about the world or the speaker, or both, whenever novel (that is, previously unassumed) propositions have to be introduced into the context to make the assumption of goal-directedness and the assumption of knowledge of the strategies consistent with the behavior. Implicatures arise when the hearer additionally infers that the speaker intended those inferences to be made, and intended that intention to be recognized (cf. article 92 (Simons) Implicature). If no such propositions can be imagined, the speaker will be judged irrational, but irrationality will consist in believing the unbelievable, or believing something unfathomable, not in having imperfect knowledge of the maxims, or in not observing the maxims. Only our imagination limits the goals, and beliefs about the addressee's goals, that we might attribute to the speaker. If we reject the assumption that the speaker is irrational, then we must at the very least assume that there is some goal to which his utterance is relevant in the given context, even if we can't imagine what it is.

Another illustration of the persistence of the assumption that utterances are acts executed in accordance with a plan to achieve a goal: even if we know that a sequence of sentences was produced by a computer, making choices entirely at random within the constraints of some grammar provided to it, if the sentences can be construed as connected and produced in the service of a single goal, it is hard not to understand them that way. That is why output like (9) from random sentence generators frequently produces giggles.
(9) a. Sandy called the dog.
b. Sandy touched the dog.
c. Sandy wanted the dog.
d. The dog arrived.
e. The dog asked for Kim.
One further point requires discussion here. Researchers eager to challenge or to apply a Gricean perspective have often failed to appreciate how important it is that discourse is considered purposive behavior. Grice presumes that participants have goals in participating (apparently since otherwise they wouldn’t be participating). This is the gist of his remark that “each participant recognizes in [talk exchanges], to some extent, a common purpose or set of purposes, or at least a mutually accepted direction” (Grice 1975: 45).
This is perhaps the most misunderstood passage in "Logic and Conversation". Grice is very vague about these purposes: how many there are, how shared they have to be. With decades of hindsight, we can see that the purposes are first of all not unique. Conversants typically have hierarchically embedded goals. Second, goals are not so much shared or mutual, as they are mutually modelled (Cohen & Perrault 1979, Cohen & Levesque 1980, Green 1982, Appelt 1985, Cohen & Levesque 1990, Perrault 1990): for George to understand Martha's utterance of "X" to George, George must have beliefs about Martha which include Martha's purpose in uttering "X" to George, which in turn subsumes Martha's model of George, including George's model of Martha, etc. Grice's assertion (1975: 48) that "in characteristic talk exchanges, there is a common aim even if […] it is a second-order one, namely, that each party should, for the time being, identify himself with the transitory conversational interests of the other" is an underappreciated expression of this view. The idea that participants will at least temporarily identify with each other's interests, i.e., infer what each other is trying to do, is what allows quarrels, monologues, and the like to be included in the talk exchanges that the Cooperative Principle governs. Grice actually cited quarrels and letter writing as instances that did not fit an interpretation of the Cooperative Principle that he rejected (Grice 1975: 48), vitiating critiques by Pratt (1981) and Schauber & Spolsky (1986). The participants may have different values and agendas, but given Grice's (1957) characterization of conventional meaning, for any communication to occur, each must make assumptions about the other's goals, at least the low-level communicative goals. This is the sense in which participants recognize a "common goal". When the assumptions participants make about each other's goals are incorrect, and this affects non-trivial beliefs about each other, we say they are "talking at cross-purposes", precisely because of the mismatch between actual goals and beliefs, and attributed goals and beliefs.

Interpreting the communicative behavior of other human beings as intentional, and as relevant, necessary, and sufficient for the achievement of some presumed goal seems to be unavoidable. As Gould (1991: 60) noted, in quite a different context, "humans are pattern-seeking animals. We must find cause and meaning in all events." This is, of course, equally true of behavior that isn't intended as communicative. Crucially, we do not seem to entertain the idea that someone might be acting for no reason. That alternative, along with the possibility that he isn't even doing what we perceive him to be doing, that it's just happening to him, is one that we seem reluctant to accept without any independent support, such as knowledge that people do that sort of thing as a nervous habit, like playing with their hair.

Thus, "cooperative" in the sense of the Cooperative Principle does not entail each party accepting all of their interlocutor's goals as their own and helping to achieve them. Rather, it is most usefully understood as meaning no more – and no less – than trying to understand the interaction from the other participants' point of view, i.e., trying to understand what their goals and assumptions must be. When Grice refers to the Cooperative Principle as "rational", it is just this assumption that actions are undertaken to accomplish goals that he has in mind.
3.3. Illocutionary intentions

The second half of the 20th century saw focussed investigation of speech acts in general, and illocutionary aspects of meaning in particular, preeminently in the work of Austin
(1962), Searle (1969) and Bach & Harnish (1979). Early on, the view within linguistics was that (i) illocutionary force was indicated by performative verbs (Austin 1962) or other Illocutionary-Force-Indicating-Devices (IFIDs, Stampe 1975), such as intonation or "markers" like preverbal please, and that (ii) where illocutionary force was ambiguous or unclear, that was because the performative clause prefixed to the sentence (Lakoff 1968, Ross 1970, Fraser 1974, Sadock 1974) was "abstract" and invisible. This issue has not been much discussed since the mid-1970s, and Dennis Stampe's (1975) conclusion that so-called performative verbs are really constative, so that the appearance of performativity is an inference from the act of utterance, may now be the default assumption. It makes performativity more a matter of the illocutionary intentions of the speaker than of the classification of visible or invisible markers of illocutionary force.

The term "illocutionary intentions" is meant to include the relatively large but limited set of illocutionary forces described and classified by Austin (1962) and Searle (1969) and many others (stating, requesting, promising and the like). So, expositives are utterances which a speaker (S) makes with the intention that the addressee (A) recognize S's intention that A believe that S believes their content. Promises are made with the intention that A recognize S's intention that A believe that S will be responsible for making their content true. Interrogatives are uttered with the belief that A will recognize S's intention that A provide information which S indicates. From the hearer's point of view such illocutionary intentions do not seem significantly different from intentions about the essentially unlimited class of possible perlocutionary effects that a speaker might intend an addressee to recognize as intended on an occasion of use. So, reminders are expositives uttered with the intention that A will recognize that S believes A has believed what S wants A to believe. Warnings are utterances the uttering of which is intended to be recognized as intended to inform A that some imminent or contingent state of affairs will be bad for A. An insult is uttered with the intention that A recognize S's intent to convey S's view of A as having properties normally believed to be bad.

Both kinds of intentions figure in conditions on the use of linguistic expressions of various sorts. For example, there are a whole host of phrasal verbs (many meaning roughly 'go away') such as butt out, which can be described as directive-polarity expressions (instantiating, reporting, or evoking directives), as illustrated in (10), where the asterisk prefixed to an expression indicates that it is ungrammatical.

(10) a. Butt out!
     b. *They may butt out.
     c. *Why do they butt out?
     d. They were asked to butt out.
     e. They refused to butt out.
     f. Why don't you butt out?
But there are also expressions whose distribution depends on mental states of the speaker regarding the addressee that are not matters of illocutionary force. Often it is an intention to produce a particular perlocutionary effect, as with threat-polarity expressions like Or else!, exemplified in (11).

(11) a. Get out, or else!
     b. *I want to get out, or else!
     c. *Who's on first, or else?
     d. They said we had to get out, or else.
     e. They knew they had to get out, or else.
     f. *They were surprised that we had to get out, or else.
Sometimes, however, the relevant mental state of the speaker does not involve intentions at all. This is the case with ignorance-polarity constructions and idioms which are acceptable only in contexts where a relevant being (typically the speaker) is ignorant of the answer to an indicated question, as in (12) (cf. Horn 1972, 1978).

(12) a. Where the hell is it?
     b. I couldn't figure out where the hell it was.
     c. *We went back to the place where the hell we left it.
     d. *I knew where the hell it would be found.
It is possible to represent this kind of selection formally, say, in terms of conjoined and/or nested propositions representing speaker intentions and speaker and addressee beliefs about intentions, actions, existence, knowledge and values. But it is not the linguistic sign which is the indicator of illocutionary intentions, it is the act of uttering it. An utterance is a warning just in case S intends A to recognize the uttering of it as intended to cause A to recognize that some situation may result in a state which A would consider bad. In principle, such conditions could be referenced as constraints in the lexical entries for restricted forms (a sketch of such a representation follows below). A threat-polarity item like or else! presupposes a warning context, with the additional information that the speaker intends the addressee to recognize (that the speaker intends the addressee to recognize) that if the warning is not heeded, some individual or situation will be responsible for some state of affairs that endangers the referent of the subject of the clause to which or else is appended.

Illocutionary force, like deixis, is an aspect of meaning that is fairly directly derivative of the use of a linguistic expression. At the other end of the spectrum, lexical meaning, which appears much more concrete and fixed, has also been argued to depend on the sort of social contract that the Cooperative Principle engenders, as explicated by Nunberg (1978, 2004). This is the topic of Section 4.
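Before turning to word meaning, the kind of nested-attitude condition just described can be made concrete. The following is a minimal sketch, not a formalism from the literature: the attitude constructors (Intend, Recognize) and the dictionary-style lexical entry are illustrative assumptions only.

```python
from dataclasses import dataclass

# A toy representation of propositions and propositional attitudes.
# All names here are illustrative assumptions, not an established formalism.

@dataclass(frozen=True)
class Prop:
    content: str                # e.g. 'situation X endangers A'

@dataclass(frozen=True)
class Intend:
    agent: str
    prop: object                # a Prop or a further attitude

@dataclass(frozen=True)
class Recognize:
    agent: str
    prop: object

def warning_condition(S: str, A: str, danger: Prop) -> Intend:
    """An utterance is a warning just in case S intends A to recognize
    the uttering of it as intended to cause A to recognize the danger;
    here rendered as nested Intend/Recognize layers."""
    return Intend(S, Recognize(A, Intend(S, Recognize(A, danger))))

# A lexical entry for a restricted form can then reference such a
# condition as a constraint on its contexts of use:
OR_ELSE = {
    "form": "or else!",
    "presupposes": warning_condition(
        "S", "A",
        Prop("if the warning is not heeded, some state of affairs "
             "will endanger the referent of the subject")),
}
```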
4. Implications for word meaning

In general, to the extent that we are able to understand each other, it is because we all use language in accordance with the Cooperative Principle. This entails (cf. Grice 1957) that we will only use a referential term when we believe that our addressee will be able to identify our intended referent from our reference to it by that term in that context, and will believe that we intended him to do so. But the bottom line is that the task of the addressee is to deduce what sense the speaker most likely intended, and he will use all available clues to do so, without regard to whether they come from within or outside the immediate sentence or bit of discourse at hand. This means that even so-called literal meanings have an indexical character in depending on the speaker's ability to infer correctly what an addressee will assume a term is intended to refer to on an occasion of use (cf. Nunberg 1978). Even the sense of predicative lexical items in an utterance is not fixed by or in a linguistic system, but can only be deduced in connection
with assumptions about the speaker's beliefs about the knowledge and expectations of the addressee. To illustrate, as has been often noted (cf. Ruhl 1989, Green 1998), practically any word can be used to denote an almost limitless variety of kinds of objects or functions: in addition to referring to a fruit, or a defective automobile, lemon might refer to the wood of the lemon tree, as in (13a), to the flavor of the juice of the fruit (13b), to the oil from the peel of the fruit (13c), to an object which has the color of the fruit (13d), to something the size of the fruit (13e), and to a substance with the flavor of the fruit (13f). These are only the most obvious uses from an apparently indefinitely large set.

(13) a. Lemon has an attractive grain, much finer than beech or cherry.
     b. I prefer the '74 because the '73 has a lemon aftertaste.
     c. Lemon will not penetrate as fast as linseed.
     d. The lemon is too stretchy, but the coral has a snag in it.
     e. Shape the dough into little lemons, and let rise.
     f. Two scoops of lemon, please, and one of Rocky Road.
The idea that what a word can be used to refer to might vary indefinitely is clearly unsettling. It makes the fact that we (seem to) understand each other most of the time something of a miracle, and it makes the familiar, comfortable Conduit Theory of communication (critiqued in Reddy 1979), according to which speakers encode ideas in words and sentences and send them to addressees to decode, quite irrational. But the conclusion that as language users we are free to use any word to refer to anything at all, any time we want, is unwarranted. Lexical choice is always subject to the pragmatic constraint that we have to consider how likely it is that our intended audience will be able to correctly identify our intended referent from our use of the expression we choose. What would really be irrational would be using a word to refer to anything other than what we estimate our addressee is likely to take it to refer to, because it would be self-defeating.

Thus, in spite of all the apparent freedom afforded to language users, rationality severely limits what a speaker is likely to use a term to refer to in a given context. Since people assume that people's actions are goal-directed (so that any act will be assumed to have been performed for a reason), a speaker must be assumed to believe that, all things considered, the word she chooses is the best word to further her goals in its context and with respect to her addressee. Speakers frequently exploit the freedom they have, within the bounds of this constraint, referring to movies as turkeys, cars as lemons, and individuals in terms of objects associated with them, as when we say that the flute had to leave to attend his son's soccer game, or that the corned beef spilled his beer.

If this freedom makes communication sound very difficult to effect, and very fragile, it is important to keep in mind that we are probably less successful at it than we think we are, and generally oblivious of the work that is required as well. But it is probably not really that fragile. Believing as an operational principle in the convenient fiction that words have fixed meanings is what makes using them to communicate appear to require no effort. If we were aware of how much interpretation we depended on each other to do to understand us, we might hesitate to speak. Instead, we all act as if we believe, and believe that everyone else believes, that the denotation an individual word may have on an occasion of use is limited, somewhat arbitrarily, as a matter of linguistic convention. Nunberg (1978), extending observations
made by Lewis (1969), called this sort of belief a normal belief, defined so that the relation normally-believe holds of a speech community and a proposition P when people in that community believe that it is normal (i.e., unremarkable, to be expected) in that community to believe P and to believe that everyone in that community believes that it is normal in that community to believe P. See also Stalnaker (1974) and Atlas (2004). (The term speech community, following Nunberg (1978), is not limited to geographical or political units, or even institutionalized social units, but encompasses any group of individuals with common interests. Thus, we all belong simultaneously to a number of speech communities, depending on our interests and backgrounds; we might be women and mothers and Lutherans and lawyers and football fans and racquetball players, who knit and surf the internet, and are members of countless speech communities besides these.)

Illustrating the relevance to traditional semantic concerns of the notion of normal belief, it is normal beliefs about cars and trucks, and about what properties of them good ol' boys might find relevant, that would lead someone to understand the coordination in (14a) with narrow adjectival scope and that in (14b) with wider scope: for example, because of the aerodynamic properties of trucks relative to cars, fast truck is almost an oxymoron.

(14) a. The good ol' boys there drive fast cars and trucks.
     b. The good ol' boys there drive red cars and trucks.

This technical use of normal belief should not be confused with other notions that may have the same name. A normal belief in the sense intended is only remotely related to an individual's belief about how things normally are, and only remotely related (in a different direction) to a judgement that it is unremarkable to hold such a belief. The beliefs that are normal within a community are those that "constitute the background against which all utterances in that community are rationally made" (Nunberg 1978: 94–95).

Addressing the issue of using words to refer to things, properties, and events, what it is considered normal to use a word like tack or host or rock or metal to refer to varies with the community. These are social facts, facts about societies, and only incidentally and contingently and secondarily facts about words. More precisely, they are facts about what speakers believe other speakers believe about conventions for using words. Thus, it is normal among field archaeologists to use mesh bound in frames to sift through excavated matter for remnants of material culture, and it is normally believed among them that this is normal, and that it is normal to refer to the sieves as screens. Likewise, among users of personal computers, it is normally believed that the contents of a data file may be inspected by projecting representations of portions of it on an electronic display, and it is normally believed that this belief is normally held, and that it is normal to refer to the display as a screen. Whether screen is (intended to be) understood as (normally) referring to a sort of sieve or to a video display depends on assumptions made by speaker and hearer about the assumptions each makes about the other's beliefs, including beliefs about what is normal in a situation of the sort being described, and about what sort of situation (each believes the other believes) is being discussed at the moment of utterance.
This is what makes word meaning irreducibly a matter of language use. Although different senses of a word may sometimes have different syntactic distributions (so-called selectional restrictions), McCawley (1968) showed that this is not so much a function of the words
themselves as it is a function of properties that language users attribute to the presumed intended referents of the words. Normal use is defined in terms of normal belief, and normal belief is an intensional concept. If everybody believes that everybody believes that it is normal to believe P, then belief in P is a normal belief, even if nobody actually believes P.

In light of this, we are led to a view of word usage in which, when a speaker rationally uses a word w to indicate some intended individual or class a, she must assume that the addressee will consider it rational to use w to indicate a in that context. She must assume that if she and her addressee do not in fact have the same assumptions about what beliefs are normal in the community-at-large, and in every relevant subgroup, at least the addressee will be able to infer what relevant beliefs the speaker imputes to the addressee, or expects the addressee to impute to the speaker, and so on, in order to infer the intended referent. If we define an additional, recursive relation mutually-believe as holding among two sentient beings A and B and a proposition when A believes the proposition, believes that B believes the proposition, believes that B believes that A believes the proposition, and so on (cf. Cohen & Levesque 1990), then we can articulate the notion normal meaning (not to be confused with 'normal referent out of context' – cf. Green 1995: 14f, Green 1996: 59 for discussion): some set (or property) m is a normal meaning (or denotation) of an expression w insofar as it is normally believed that w is used to indicate m.
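The recursion in mutually-believe can be displayed directly. The following is a minimal sketch, assuming (purely for illustration) that attributed beliefs are represented as nested strings; it merely unfolds the definition to a finite depth, since the relation itself is an infinite conjunction of belief attributions.

```python
def unfold_mutual_belief(a: str, b: str, p: str, depth: int) -> list:
    """Return the first `depth` conjuncts of mutually-believe(a, b, p):
    believe(a, p), believe(a, believe(b, p)),
    believe(a, believe(b, believe(a, p))), ... (cf. Cohen & Levesque 1990).
    The relation itself is the infinite conjunction of all of these."""
    conjuncts = []
    for k in range(1, depth + 1):
        # Believers alternate a, b, a, b, ... from the outside in.
        believers = [a if i % 2 == 0 else b for i in range(k)]
        expr = p
        for agent in reversed(believers):
            expr = f"believe({agent}, {expr})"
        conjuncts.append(expr)
    return conjuncts

# unfold_mutual_belief("speaker", "addressee", "dogs are called dogs", 3)
# -> ['believe(speaker, dogs are called dogs)',
#     'believe(speaker, believe(addressee, dogs are called dogs))',
#     'believe(speaker, believe(addressee, believe(speaker, dogs are called dogs)))']
```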
A meaning m for an expression w is normal in a context insofar as speaker and addressee mutually believe that it is normally believed that w is used to indicate m in that context. We can then say that 'member of the species canis familiaris' is a normal meaning for the word dog insofar as speaker and addressee mutually believe that it is normally believed in their community that such entities are called dogs. Some uses of referential expressions like those exemplified in (13) are not so much abnormal or less normal than others as they are normal in a more narrowly defined community.

In cases of systematic polysemy, all the use-types (or senses), whether normal in broadly defined or very narrowly exclusive communities, are relatable to one another in terms of functions like 'source of', 'product of', 'part of', 'mass of', which Nunberg (1978) characterized as referring functions (for discussion, see Nunberg 1978, 2004, Green 1996, 1998, Pelletier & Schubert 1986, Nunberg & Zaenen 1992, Copestake & Briscoe 1995, Helmreich 1994). For example, using the word milkshake as in (15) to refer to someone who orders a milkshake exploits the referring function 'purchaser of', and presumes a mutual belief that it is normal for restaurant personnel to use the name of a menu item to refer to a purchaser of that item, or more generally, for sales agents to use a description of a purchase to refer to the purchaser.

(15) The milkshake claims you kicked her purse.

This is in addition, of course, to the mutual belief it presumes about what the larger class of English speakers normally use milkshake to refer to, and the mutual belief that the person identified as the claimant ordered a milkshake.

The assumption that people's actions are purposeful, so that any act will be assumed to have been performed for a reason, is a universal normal belief – everyone believes
it and believes that everyone believes it (cf. Green 1993). The consequence of this for communicative acts is that people intend and expect that interpreters will attribute particular intentions to them, so consideration of just what intention will be attributed to speech actions must enter into rational utterance planning (cf. Green 1993, also Sperber & Wilson 1986). This is the Gricean foundation of this theory (cf. also Neale 1992).

If the number of meanings for a given lexical term is truly indefinitely extendable (as it appears to be), or even merely very large, it is impractical in the extreme to try to list them. But the usual solution to the problem of representing an infinite class in a finite (logical) space is as available here as anywhere else, at least to the extent that potential denotations can be described in terms of composable functions on other denotations, and typically, this is the case (Nunberg 1978: 29–62). It is enough to know, Nunberg argues, that if a term can be used to refer to some class X, then it can be used, given appropriate context, to refer to objects describable by a recognizable function on X. This principle can be invoked recursively, and applies to functions composed of other functions, and to expressions composed of other expressions, enabling diverse uses like those for lemon in (13) to be predicted in a principled manner (a sketch of such composition is given below).

Because the intended sense (and thence the intended referent) of an utterance of any referential term ultimately reflects what the speaker intends the hearer to understand from what the speaker says by recognizing that the speaker intends him to understand that, interpreting an utterance containing a polysemic ambiguity (or indeed, any sort of ambiguity) involves doping out the speaker's intent, just as understanding a speaker's discourse goals does. For additional discussion, see also articles 25 (de Swart) Mismatches and coercion and 26 (Tyler & Takahashi) Metaphors and metonymies.
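Nunberg's strategy of representing an indefinitely large set of potential denotations by composable referring functions can be sketched as follows. The particular functions and the string representation of senses are illustrative assumptions only, not part of Nunberg's proposal.

```python
# Referring functions map a description of one (class of) referent(s)
# to a description of another; composing them yields derived uses.

def wood_of(x):      return f"wood of the {x} tree"
def juice_of(x):     return f"juice of the {x} fruit"
def flavor_of(x):    return f"flavor of the {x}"
def purchaser_of(x): return f"purchaser of a {x}"

def compose(*fs):
    """Compose referring functions right-to-left; the principle applies
    recursively, to functions composed of other functions."""
    def composed(x):
        for f in reversed(fs):
            x = f(x)
        return x
    return composed

# 'lemon' in (13a) exploits wood_of:
sense_13a = wood_of("lemon")                    # 'wood of the lemon tree'

# 'lemon' in (13b) exploits flavor_of composed with juice_of:
sense_13b = compose(flavor_of, juice_of)("lemon")
# -> 'flavor of the juice of the lemon fruit'

# 'milkshake' in (15) exploits purchaser_of:
sense_15 = purchaser_of("milkshake")            # 'purchaser of a milkshake'
```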
5. The nature of context, and the relationship of pragmatics to semantics

5.1. Disambiguation and interpretation

Pragmatic information is information about the speaker's mental models. Consequently, such semantic issues as truth conditions and determination of ambiguity are interdependent with issues of discourse interpretation, and the relevant contexts essential to the resolution of both are not so much the surrounding words and phrases as they are abstractions from the secular situations of utterance, filtered through the minds of the participants (cf. Stalnaker 1974). As a result, it is unreasonable to expect that ambiguity resolution independent of models of those situations can be satisfactory.

Linguistic pragmatics irreducibly involves the speaker's model of the addressee, and the hearer's model of the speaker (potentially recursively). For George to understand Martha's utterance of "X" to him, he must not only recognize (speech perception, parsing) that she has said "X," he must have beliefs about her which allow him to infer what her purpose was in uttering "X," which means that he has beliefs about her model of him, including her model of his model of her, and so on. Any of these beliefs is liable to be incorrect at some level of granularity. George's model of Martha (and hers of him) is more like a sketch than a photograph: there are lots of potentially relevant things they don't know about each other, and they most likely have got a few (potentially relevant) things wrong right off the bat as well. Consequently, even under the
best of circumstances, whatever proposition George interprets Martha's utterance to be expressing may not exactly match the proposition she intended him to understand. The difference, which often is not detected, may or may not matter in the grand scheme of things. Since acts are interpreted at multiple levels of granularity, this holds for the interpretation of acts involved in choosing words and construction types, as well as for acts of uttering sentences containing or instantiating them. From this it follows that the distinction between pragmatic effects called "intrusive pragmatics" (Levinson 2000) or explicature (Sperber & Wilson 1986) and those described as implicature proper has little necessary significance for an information-based account of language structure. Because the computation of pragmatic effects by whatever name involves analysis of what is underspecified in the actual utterance, and how what is uttered compares to what might reasonably have been expected, it involves importing propositions, a process beyond the bounds of computation within a finite domain, even if supplemented by a finite set of functions.

When a reader or hearer recognizes that an utterance is ambiguous, resolving that ambiguity amounts to determining which interpretation was intended. When recognizing an ambiguity affects parsing, resolving it may involve grammar and lexicon, as for example, when it involves a form which could be construed as belonging to different syntactic categories (e.g., The subdivision houses most of the officers vs. The subdivision houses are very similar, or Visiting relatives is a lot of fun vs. Visiting relatives are a lot of fun). But grammar and lexicon may not be enough to resolve such ambiguities, as in the case of familiar examples like (16).

(16) a. I saw her duck.
     b. Visiting relatives can be a lot of fun.

They will rarely suffice to resolve polysemies or attachment ambiguities like I returned the key to the library. In all of these cases, it is necessary to reconstruct what it would be reasonable for the speaker to have intended, given what is known or believed about the beliefs and goals of the speaker, exactly as when seeking to understand the relevance of an unambiguous utterance in a conversation – that is, to understand why the speaker bothered to utter it, or to say it the way she said it.

The literature on natural language understanding contains numerous demonstrations that the determination of how an ambiguous or vague term is intended to be understood depends on identifying the most likely model of the speaker's beliefs and intentions about the interaction. Nunberg (1978: 84–87) discusses the beliefs and goals that have to be attributed to him in order for his uncle to understand what he means by jazz when he asks him if he likes jazz. Crain & Steedman (1985) and Altmann & Steedman (1988) offer evidence that experimentally controllable aspects of context that reflect speakers' beliefs about situations affect processing in more predictable ways than mechanistic parsing strategies. Green (1996: 119–122) describes what is involved in identifying what is meant by IBM, at, and seventy-one in Sandy bought IBM at 71.
At the other end of the continuum of grammatical and pragmatic uncertainty is Sperber & Wilson’s (1986: 239–241) discussion of the process of understanding irony and sarcasm, as when one says I love people who don’t signal, intending to convey ‘I hate people who don’t signal.’ (A further step in the process is required to interpret I love people who signal! as intended to convey the same thing; the difference is subtle, because
the contextual conditions likely to provoke the two utterances are in a subset relation. One might say I love people who don't signal to inform an interlocutor of one's annoyance at someone who the speaker noticed did not signal, but I love people who signal is likely to be used sarcastically only when the speaker believes that it is obvious to the addressee that someone should have signaled and did not.)

Nonetheless, it may be instructive here to examine the resolution of a salient lexical ambiguity. Understanding the officials' statement in example (17) involves comparing how different assumptions about mutual beliefs about the situation are compatible with different interpretations, in order to determine whether plant refers to a vegetable organism or to the production facility of a business (or perhaps just to the apparatus for controlling the climate within it, or maybe to the associated grounds, offices and equipment generally), or even to some sort of a decoy.

(17) Officials at International Seed Co. beefed up security at their South Carolina facility in the face of rumors that competitors would stop at nothing to get specimens of a newly-engineered variety, saying, "That plant is worth $5 million."

If the interpreter supposes that what the company fears is simply theft of samples of the organism, she will take the official as intending plant to refer to the (type of the) variety: being able to market tokens of that type represents a potential income of $5 million. On the other hand, if the interpreter supposes that the company fears damage to their production facility or the property surrounding it – say, because she knows that samples of the organism are not even located at the production facility any more, and/or that extortionists have threatened to vandalize company property if samples of the organism are not handed over – she is likely to take plant as intended to refer to buildings and grounds or equipment. Believing that the company believes that potential income from marketing the variety is many times greater than $5 million would have the same effect. If the interpreter believes that the statement was made in the course of an interview where a company spokesperson discussed the cost of efforts to protect against industrial espionage, and mentioned how an elaborate decoy system had alerted them to a threat to steal plant specimens, she might even take plant as intended to refer to a decoy. The belief that officials believe that everyone relevant believes that both the earnings potential of the organism, and the value of relevant structures and infrastructure, are orders of magnitude more or less than $5 million would contribute to this conclusion, and might even suffice to induce it on its own.

Two points are relevant here. First, depending on how much of the relevant information is salient in the context in which the utterance is being interpreted, the sentence might not even be recognized as ambiguous. This is equally true in the case of determining discourse intents. For example, identifying sarcastic intent is similar to understanding the reference of an expression in that it depends on attributing to the speaker intent to be sarcastic. (Such an inference is supported by finding a literal meaning to be in conflict with propositions assumed to be mutually believed, but this is neither necessary nor sufficient for interpreting an utterance as sarcastic.)
Being misled in the attribution of intent is a common sort of misunderstanding, indeed, a sort that is likely to go undetected. These issues are discussed at length in articles 23 (Kennedy) Ambiguity and vagueness, 38 (Dekker) Dynamic semantics, 88 (Jaszczolt) Semantics
and pragmatics, 89 (Zimmermann) Context dependency and 94 (Potts) Conventional implicature.

Second, there is nothing linguistic about the resolution of the lexical ambiguity in (17). All of the knowledge that contributes to the identification of a likely intended referent is encyclopedic or contextual knowledge of (or beliefs about) the relevant aspects of the world, including the beliefs of relevant individuals in it (e.g., the speaker, the (presumed) addressees of the quoted speech, the reporter of the quoted speech, and the (presumed) addressees of the report). That disambiguation of an utterance in its context may require encyclopedic knowledge of a presumed universe of discourse is hardly a new observation; it has been a commonplace in the linguistics and Artificial Intelligence literature for decades. Its pervasiveness and its significance sometimes seem to be surprisingly underappreciated. A similar demonstration could be made for many structural ambiguities, including some of the ones mentioned at the beginning of this section.

Insofar as language users resolve ambiguities that they recognize by choosing the interpretation most consistent with their model of the speaker and of the speaker's model of the world, modelling this ability of theirs by means of probability-based grammars and lexicons (e.g., Copestake & Briscoe 1995) is likely to provide an arbitrarily limited solution. When language users fail to recognize ambiguities in the first place, it is surely because beliefs about the speaker's beliefs and intentions in the context at hand which would support alternative interpretations are not salient to them. This view treats disambiguation, at all levels, as indistinct from interpretation, insofar as both involve comparing an interpretation of a speaker's utterance with goals and beliefs attributed to that speaker, and rejecting interpretations which in the context are not plausibly relevant to the assumed joint goal for the discourse (a schematic sketch is given below). This is a conclusion that is unequivocally based in Grice's seminal work (Grice 1957, 1975), and yet it is one that he surely did not anticipate. See also Atlas (2004).
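A schematic rendering of this view of disambiguation, applied to plant in (17), follows. The sketch is an assumption-laden toy rather than a proposed system; in particular, the hypothetical 'defeater' propositions merely stand in for genuine inference over a model of the speaker's beliefs.

```python
# Hypothetical defeaters: beliefs attributed to the speaker under which
# a given reading of 'plant' in (17) would be irrational to intend.
DEFEATERS = {
    "organism": {"samples are no longer at the facility",
                 "marketing income far exceeds $5 million"},
    "facility": {"only theft of samples is feared"},
    "decoy":    {"no decoy system has been mentioned"},
}

def consistent(reading: str, attributed_beliefs: set) -> bool:
    """A stand-in for real inference: a reading survives if no attributed
    belief is listed as ruling it out."""
    return not (DEFEATERS[reading] & attributed_beliefs)

def interpret_plant(attributed_beliefs: set) -> list:
    """Return the readings of 'plant' in (17) compatible with the
    interpreter's model of the speaker's beliefs; disambiguation is
    just interpretation relative to that model."""
    return [r for r in DEFEATERS if consistent(r, attributed_beliefs)]

# If the interpreter supposes only sample theft is feared and no decoy
# system has been mentioned, only the organism reading survives:
# interpret_plant({"only theft of samples is feared",
#                  "no decoy system has been mentioned"}) -> ['organism']
```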
5.2. Computational modelling

Morgan (1973) showed that determining the presuppositions of an utterance depends on beliefs attributed to relevant agents (e.g., the speaker, and agents and experiencers of propositional attitude verbs) and is not a strictly linguistic matter. Morgan's account, and Gazdar's (1979) formalization of it, show that the presuppositions associated with lexical items are filtered in being projected as presuppositions of the sentence of which they form a part, by conversational implicature, among other things. (See also articles 37 (Kamp & Reyle) Discourse Representation Theory, 91 (Beaver & Geurts) Presupposition and 92 (Simons) Implicature.)

Conversational implicature, as discussed in section 3 of this article, is a function of a theory of human behavior generally, not something specifically linguistic, because it is based on inference of intentions for actions generally, not on properties of the artifacts that are the result of linguistic actions: conversational implicatures arise from the assumption that it is reasonable (under the particular circumstances of the speech event in question) to expect the addressee A to infer that the speaker S intended A to recognize S's intention from the fact that the speaker uttered whatever she uttered. It would be naive to expect that the filtering in the projection of presuppositions could be represented as a constraint or set of constraints on values of
any discrete linguistic feature, precisely because conversational implicature is inherently indeterminate (Grice 1975, Morgan 1973, Gazdar 1979).

Despite the limitations on the computation and attribution of assumptions, much progress has been made in recent years on simultaneous disambiguation and parsing of unrestricted text. To take only the example that I am most familiar with, Russell (1993) describes a system which unifies syntactic and semantic information from partial parses to postulate (partial) syntactic and semantic information for unfamiliar words. This progress suggests that a promising approach to the problem of understanding unrestricted texts would be to expand the technique to systematically include information about contexts of utterance, especially since it can be taken for granted that words will be encountered which are being used in unfamiliar ways. The chief requirement for such an enterprise is to reject (following Reddy 1979) the simplistic view of linguistic expressions as simple conduits for thoughts, and model natural language use as action of rational agents who treat the exchange of ideas as a joint goal, as Grice, Cohen, Perrault, and Levesque have suggested in the articles cited. This is a nontrivial task, and if it does not offer an immediate payoff in computational efficiency, ultimately it will surely pay off in increased accuracy, not to mention in understanding the subtlety of communicative and interpretive techniques.
6. Summary

Some of the conclusions outlined here are surely far beyond what Bar-Hillel articulated in 1954, and what Grice may have had in mind in 1957 or 1967, the date of the William James Lectures in which the Cooperative Principle was articulated. Nonetheless, our present understanding rests on the shoulders of their work. Bar-Hillel (1954) demonstrated that it is not linguistic forms that carry pragmatic information, but the facts of their utterance. The interpretation of first and second person pronouns, tenses, and deictic adverbials is only the most superficial aspect of this. Bar-Hillel's observations on the nature of indexicals form the background for the development of context-dependent theories of semantics by Stalnaker (1970), Kamp (1981), Heim (1982) and others.

Starting from a very different set of considerations, Grice (1957) outlined how a speaker's act of using a linguistic form to communicate so-called literal meaning makes critical reference to speaker and hearer, with far-reaching consequences. Grice's (1957) insight that conventional linguistic meaning depends on members of a speech community recognizing speakers' intentions in saying what they say the way they say it enabled him to sketch how this related to lexical meaning and presupposition, and (in more detail) implied meaning. While the view presented here has its origins in Grice's insight, it is considerably informed by elaborations and extensions of it over the past 40 years. The distinction between natural meaning (entailment) and non-natural meaning (conventional meaning) provided the background against which his theory of the social character of communication was developed in the articulation of the Cooperative Principle. While the interpersonal character of illocutionary acts was evident from the first discussions, it was less obvious that lexical meaning, which appears much more concrete and fixed, could also be argued to depend on the sort of social contract that the Cooperative Principle engenders. Subsequent explorations into the cooperative aspects of action generally make it reasonable to anticipate a much more integrated understanding of the interrelations of content and context, of meaning and use, than seemed likely forty years ago.
7. References

Allwood, Jens, Lars-Gunnar Anderson & Östen Dahl 1977. Logic in Language. Cambridge: Cambridge University Press.
Altmann, Gerry & Mark Steedman 1988. Interaction with context during sentence processing. Cognition 30, 191–238.
Appelt, Douglas E. 1985. Planning English Sentences. Cambridge: Cambridge University Press.
Atlas, Jay David 2004. Presupposition. In: L. R. Horn & G. Ward (eds.). The Handbook of Pragmatics. Oxford: Blackwell, 29–52.
Austin, John L. 1962. How to Do Things with Words. Cambridge, MA: Harvard University Press.
Bach, Kent & Robert M. Harnish 1979. Linguistic Communication and Speech Acts. Cambridge, MA: The MIT Press.
Bar-Hillel, Yehoshua 1954. Indexical expressions. Mind 63, 359–379.
Blum-Kulka, Shoshana & Elite Olshtain 1986. Too many words: Length of utterance and pragmatic failure. Studies in Second Language Acquisition 8, 165–180.
Brown, Gillian & George Yule 1983. Discourse Analysis. Cambridge: Cambridge University Press.
Brown, Penelope & Stephen Levinson 1978. Universals in language usage: Politeness phenomena. In: E. Goody (ed.). Questions and Politeness: Strategies in Social Interaction. Cambridge: Cambridge University Press, 56–311. (Expanded version published 1987 in book form as Politeness, by Cambridge University Press.)
Cohen, Philip R. & Hector J. Levesque 1980. Speech acts and the recognition of shared plans. Proceedings of the Third National Conference of the Canadian Society for Computational Studies of Intelligence. Victoria, BC: University of Victoria, 263–271.
Cohen, Philip R. & Hector J. Levesque 1990. Rational interaction as the basis for communication. In: P. Cohen, J. Morgan & M. Pollack (eds.). Intentions in Communication. Cambridge, MA: The MIT Press, 221–256.
Cohen, Philip R. & Hector J. Levesque 1991. Teamwork. Noûs 25, 487–512.
Cohen, Philip R. & C. Ray Perrault 1979. Elements of a plan based theory of speech acts. Cognitive Science 3, 177–212.
Copestake, Ann & Ted Briscoe 1995. Semi-productive polysemy and sense extension. Journal of Semantics 12, 15–67.
Crain, Stephen & Mark Steedman 1985. On not being led up the garden path: The use of context by the psychological parser. In: D. R. Dowty, L. Karttunen & A. M. Zwicky (eds.). Natural Language Parsing. Cambridge: Cambridge University Press, 320–358.
Dowty, David 1979. Word Meaning and Montague Grammar. Dordrecht: Reidel.
Fraser, Bruce 1974. An examination of the performative analysis. Papers in Linguistics 7, 1–40.
Gazdar, Gerald 1979. Pragmatics, Implicature, Presupposition, and Logical Form. New York: Academic Press.
Gould, Stephen J. 1991. Bully for Brontosaurus. New York: W. W. Norton.
Green, Georgia M. 1982. Linguistics and the pragmatics of language use. Poetics 11, 45–76.
Green, Georgia M. 1990. On the universality of Gricean interpretation. In: K. Hall et al. (eds.). Proceedings of the 16th Annual Meeting of the Berkeley Linguistics Society. Berkeley, CA: Berkeley Linguistics Society, 411–428.
Green, Georgia M. 1993. Rationality and Gricean Inference. Cognitive Science Technical Report UIUC-BI-CS-93-09. Urbana, IL: University of Illinois.
Green, Georgia M. 1995. Ambiguity resolution and discourse interpretation. In: K. van Deemter & S. Peters (eds.). Semantic Ambiguity and Underspecification. Stanford, CA: CSLI Publications, 1–26.
Green, Georgia M. 1996. Pragmatics and Natural Language Understanding. 2nd edn. Hillsdale, NJ: Lawrence Erlbaum Associates.
Green, Georgia M. 1998. Natural kind terms and a theory of the lexicon. In: E. Antonsen (ed.). Studies in the Linguistic Sciences 28. Urbana, IL: Department of Linguistics, University of Illinois, 1–26.
Grice, H. Paul 1957. Meaning. Philosophical Review 66, 377–388.
Grice, H. Paul 1975. Logic and conversation. In: P. Cole & J. L. Morgan (eds.). Syntax and Semantics 3: Speech Acts. New York: Academic Press, 41–58.
Heim, Irene 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation. University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms.
Helmreich, Stephen 1994. Pragmatic Referring Functions in Montague Grammar. Ph.D. dissertation. University of Illinois, Urbana, IL.
Hinrichs, Erhard 1986. Temporal anaphora in discourses of English. Linguistics & Philosophy 9, 63–82.
Horn, Laurence R. 1972. On the Semantic Properties of Logical Operators in English. Ph.D. dissertation. University of California, Los Angeles, CA. Reprinted: Bloomington, IN: Indiana University Linguistics Club, 1976.
Horn, Laurence R. 1978. Some aspects of negation. In: J. Greenberg, C. Ferguson & E. Moravcsik (eds.). Universals of Human Language, vol. 4: Syntax. Stanford, CA: Stanford University Press, 127–210.
Kamp, Hans 1981. A theory of truth and semantic representation. In: J. Groenendijk, T. M. V. Janssen & M. Stokhof (eds.). Formal Methods in the Study of Language. Amsterdam: Mathematical Centre, 277–321.
Keenan, Elinor O. 1976. The universality of conversational implicature. Language in Society 5, 67–80.
Kripke, Saul 1972. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 253–355 and 763–769.
Lakoff, Robin L. 1968. Abstract Syntax and Latin Complementation. Cambridge, MA: The MIT Press.
Levinson, Stephen C. 2000. Presumptive Meanings: The Theory of Generalized Conversational Implicature. Cambridge, MA: The MIT Press.
Lewis, David 1969. Convention: A Philosophical Study. Cambridge, MA: Harvard University Press.
Lycan, William G. 1984. Logical Form in Natural Language. Cambridge, MA: The MIT Press.
McCawley, James D. 1968. The role of semantics in a grammar. In: E. Bach & R. T. Harms (eds.). Universals in Linguistic Theory. New York: Holt, Rinehart & Winston, 124–169.
McCawley, James D. 1971. Tense and time reference in English. In: C. Fillmore & D. T. Langendoen (eds.). Studies in Linguistic Semantics. New York: Holt, Rinehart & Winston, 97–113.
Morgan, Jerry L. 1973. Presupposition and the Representation of Meaning: Prolegomena. Ph.D. dissertation. University of Chicago, Chicago, IL.
Morgan, Jerry L. 1978. Two types of convention in indirect speech acts. In: P. Cole (ed.). Syntax and Semantics 9: Pragmatics. New York: Academic Press, 261–280.
Neale, Stephen 1992. Paul Grice and the philosophy of language. Linguistics & Philosophy 15, 509–559.
Nunberg, Geoffrey 1978. The Pragmatics of Reference. Ph.D. dissertation. CUNY, New York. Reprinted: Bloomington, IN: Indiana University Linguistics Club. (Also published by Garland Publishing, Inc.)
Nunberg, Geoffrey 1979. The non-uniqueness of semantic solutions: Polysemy. Linguistics & Philosophy 3, 145–185.
Nunberg, Geoffrey 1993. Indexicality and deixis. Linguistics & Philosophy 16, 1–44.
Nunberg, Geoffrey 2004. The pragmatics of deferred interpretation. In: L. R. Horn & G. Ward (eds.). The Handbook of Pragmatics. Oxford: Blackwell, 344–364.
Nunberg, Geoffrey & Annie Zaenen 1992. Systematic polysemy in lexicology and lexicography. In: H. Tommola, K. Varantola & J. Schopp (eds.). Proceedings of Euralex 92, Part II. Tampere: The University of Tampere, 387–398.
Partee, Barbara 1973. Some structural analogues between tenses and pronouns in English. Journal of Philosophy 70, 601–609.
Partee, Barbara 1989. Binding implicit variables in quantified contexts. In: C. Wiltshire, B. Music & R. Graczyk (eds.). Papers from the 25th Regional Meeting of the Chicago Linguistic Society. Chicago, IL: Chicago Linguistic Society, 342–365.
Pelletier, Francis J. & Lenhart K. Schubert 1986. Mass expressions. In: D. Gabbay & F. Guenthner (eds.). Handbook of Philosophical Logic, vol. 4. Dordrecht: Reidel, 327–407.
Perrault, Ray 1990. An application of default logic to speech act theory. In: P. Cohen, J. Morgan & M. Pollack (eds.). Intentions in Communication. Cambridge, MA: The MIT Press, 161–186.
Pratt, Mary L. 1981. The ideology of speech-act theory. Centrum (N.S.) I:1, 5–18.
Prince, Ellen 1983. Grice and Universality. Ms. Philadelphia, PA: University of Pennsylvania. http://www.ling.upenn.edu/~ellen/grice.ps, August 3, 2008.
Putnam, Hilary 1975. The meaning of 'meaning.' In: K. Gunderson (ed.). Language, Mind, and Knowledge. Minneapolis, MN: University of Minnesota Press, 131–193.
Recanati, François 2004. Pragmatics and semantics. In: L. R. Horn & G. Ward (eds.). The Handbook of Pragmatics. Oxford: Blackwell, 442–462.
Reddy, Michael 1979. The conduit metaphor – a case of frame conflict in our language about language. In: A. Ortony (ed.). Metaphor and Thought. Cambridge: Cambridge University Press, 284–324.
Ross, John R. 1970. On declarative sentences. In: R. Jacobs & P. S. Rosenbaum (eds.). Readings in English Transformational Grammar. Waltham, MA: Ginn, 222–272.
Ruhl, Charles 1989. On Monosemy. Albany, NY: SUNY Press.
Russell, Dale W. 1993. Language Acquisition in a Unification-Based Grammar Processing System Using a Real-World Knowledge Base. Ph.D. dissertation. University of Illinois, Urbana, IL.
Sadock, Jerrold M. 1974. Toward a Linguistic Theory of Speech Acts. New York: Academic Press.
Schauber, Ellen & Ellen Spolsky 1986. The Bounds of Interpretation. Stanford, CA: Stanford University Press.
Searle, John 1969. Speech Acts. Cambridge: Cambridge University Press.
Sperber, Dan & Deirdre Wilson 1986. Relevance: Communication and Cognition. Cambridge, MA: Harvard University Press.
Stalnaker, Robert C. 1970. Pragmatics. Synthese 22, 272–289.
Stalnaker, Robert C. 1972. Pragmatics. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 380–397.
Stalnaker, Robert C. 1974. Pragmatic presuppositions. In: M. K. Munitz & P. K. Unger (eds.). Semantics and Philosophy. New York: New York University Press, 197–214.
Stampe, Dennis 1975. Meaning and truth in the theory of speech acts. In: P. Cole & J. L. Morgan (eds.). Syntax and Semantics 3: Speech Acts. New York: Academic Press, 1–40.
Georgia M. Green, Urbana, IL (USA)
6. Compositionality

1. Background
2. Grammars and semantics
3. Variants and properties of compositionality
4. Arguments in favor of compositionality
5. Arguments against compositionality
6. Problem cases
7. References
Abstract

This article is concerned with the principle of compositionality, i.e. the principle that the meaning of a complex expression is a function of the meanings of its parts and its mode of composition. After a brief historical background, a formal algebraic framework for syntax and semantics is presented. In this framework, both syntactic operations and semantic functions are (normally) partial. Using the framework, the basic idea of compositionality is given a precise statement, and several variants, both weaker and stronger, as well as related properties, are distinguished. Several arguments for compositionality are discussed, and the standard arguments are found inconclusive. Also, several arguments against compositionality, and for the claim that it is a trivial property, are discussed, and are found to be flawed. Finally, a number of real or apparent problems for compositionality are considered, and some solutions are proposed.
1. Background

Compositionality is a property that a language may have and may lack, namely the property that the meaning of any complex expression is determined by the meanings of its parts and the way they are put together. The language can be natural or formal, but it has to be interpreted. That is, meanings, or more generally, semantic values of some sort must be assigned to linguistic expressions, and compositionality concerns precisely the distribution of these values.

Particular semantic analyses that are in fact compositional were given already in antiquity, but apparently without any corresponding general conception. For instance, in Sophist, chapters 24–26, Plato discusses subject-predicate sentences, and suggests (pretty much) that such a sentence is true [false] if the predicate (verb) attributes to what the subject (noun) signifies things that are [are not]. Notions that approximate the modern concept of compositionality did emerge in medieval times. In the Indian tradition, in the 4th or 5th century CE, Śabara says that

The meaning of a sentence is based on the meaning of the words.

and this is proposed as the right interpretation of a sūtra by Jaimini from sometime 3rd–6th century BCE (cf. Houben 1997: 75–76). The first to propose a general principle of this nature in the Western tradition seems to have been Peter Abelard (2008, 3.00.8) in the first half of the 12th century, saying that

Just as a sentence materially consists in a noun and a verb, so too the understanding of it is put together from the understandings of its parts. (Translation by and information from Peter King 2007: 8.)
Abelard's principle directly concerns only subject-predicate sentences, it concerns the understanding process rather than meaning itself, and he is unspecific about the nature of the putting-together operation. The high scholastic conception is different in all three respects. In the early-to-mid 14th century, John Buridan (1998, 2.3, Soph. 2 Thesis 5, QM 5.14, fol. 23vb) states what has become known as the additive principle:

The signification of a complex expression is the sum of the signification of its non-logical terms. (Translation by and information from Peter King 2001: 4.)
The additive principle, with or without the restriction to non-logical terms, appears to have become standard during the late middle ages (for instance, in 1372, Peter of Ailly refers to the common view that it 'belongs to the [very] notion of an expression that every expression has parts each one of which, when separated, signifies something of what is signified by the whole'; 1980: 30). The medieval theorists apparently did not possess the general concept of a function, and instead proposed a particular function, that of summing (collecting). Mere collecting is inadequate, however, since the sentences All A's are B's and All B's are A's have the same parts, hence the same collection of part-meanings, and hence by the additive principle have the same meaning.

With the development of mathematics and concern with its foundations came a renewed interest in semantics. Gottlob Frege is generally taken to be the first person to have formulated explicitly the notion of compositionality and to claim that it is an essential feature of human language (although some writers have doubted that Frege really expressed, or really believed in, compositionality; e.g. Pelletier 2001 and Janssen 2001). In "Über Sinn und Bedeutung", 1892, he writes:

Let us assume for the time being that the sentence has a reference. If we now replace one word of the sentence by another having the same reference, this can have no bearing upon the reference of the sentence. (Frege 1892: 62)
This is (a special case of) the substitution version of the idea of semantic values being determined; if you replace parts by others with the same value, the value of the whole doesn't change. Note that the values here are Bedeutungen (referents), such as truth values (for sentences) and individual objects (for individual-denoting terms). Both the substitution version and the function version (see below) were explicitly stated by Rudolf Carnap in (1956) (for both extension and intension), and collectively labeled 'Frege's Principle'.

The term 'compositional' was introduced by Hilary Putnam (a former student of Carnap's) in Putnam (1975a: 77), read in Oxford in 1960 but not published until the
collection Putnam (1975b). Putnam says "[. . .] the concept of a compositional mapping should be so defined that the range of a complex sentence should depend on the ranges of sentences of the kinds occurring in the 'derivational history' of the complex sentence." The first use of the term in print seems to be due to Jerry Fodor (a former student of Putnam's) and Jerrold Katz (1964), to characterize meaning and understanding in a similar sense.

Today, compositionality is a key notion in linguistics, philosophy of language, logic, and computer science, but there are divergent views about its exact formulation, methodological status, and empirical significance. To begin to clarify some of these views we need a framework for talking about compositionality that is sufficiently general to be independent of particular theories of syntax or semantics and yet allows us to capture the core idea behind compositionality.
2. Grammars and semantics

The function version and the substitution version of compositionality are two sides of the same coin: that the meaning (value) of a compound expression is a function of certain other things (other meanings (values) and a 'mode of composition'). To formulate these versions, two things are needed: a set of structured expressions and a semantics for them. Structure is readily taken as algebraic structure, so that the set E of linguistic expressions is a domain over which certain syntactic operations or rules are defined, and moreover E is generated by these operations from a subset A of atoms (e.g. words).

In the literature there are essentially two ways of fleshing out this idea. One, which originates with Montague (see 1974a), takes as primitive the fact that linguistic expressions are grouped into categories or sorts, so that a syntactic rule comes with a specification of the sorts of each argument as well as of the value. This use of many-sorted algebra as an abstract linguistic framework is described in Janssen (1986) and Hendriks (2001). The other approach, first made precise in Hodges (2001), is one-sorted but uses partial algebras instead, so that rather than requiring the arguments of an operation to be of certain sorts, the operation is simply undefined for unwanted arguments. (A many-sorted algebra can in a straightforward way be turned into a one-sorted partial one (but not always vice versa), and under a natural condition the sorts can be recovered in the partial algebra; see Westerståhl 2004 for further details and discussion. Some theorists combine partiality with primitive sorts; for example, Keenan & Stabler 2004 and Kracht 2007.) The partial approach is in a sense simpler and more general than the many-sorted one, and we follow it here.

Thus, let a grammar E = (E, A, Σ) be a partial algebra, where E and A are as above and Σ is a set of partial functions over E of finite arity which generate all expressions in E from A. To illustrate, the familiar rules

NP → Det N    (NP-rule)
S → NP VP    (S-rule)
correspond to binary partial functions, say α, β ∈ Σ, such that, if most, dog, and bark are atoms in A, one derives as usual the sentence Most dogs bark in E, by first applying α to most and dog, and then applying β to the result of that and bark. These functions are necessarily partial; for example, β is undefined whenever its second argument is dog.

It may happen that one and the same expression can be generated in more than one way, i.e. the grammar may allow structural ambiguity. So it is not really the expressions in E but rather their derivation histories, or analysis trees, that should be assigned semantic values. These derivation histories can be represented as terms in a (partial) term algebra corresponding to E, and a valuation function is then defined from terms to surface expressions (usually finite strings of symbols). However, to save space we shall ignore this complication here, and formulate our definitions as if semantic values were assigned directly to expressions. More precisely, the simplifying assumption is that each expression is generated in a unique way from the atoms by the rules. One consequence is that the notion of a subexpression is well-defined: the subexpressions of t are t itself and all expressions used in the generation of t from atoms (it is fairly straightforward to lift the uniqueness assumption, and reformulate the definitions given here so that they apply to terms in the term algebra instead; see e.g. Westerståhl 2004 for details).

The second thing needed to talk about compositionality is a semantics for E. We take this simply to be a function μ from a subset of E to some set M of semantic values ('meanings'). In the term algebra case, μ takes grammatical terms as arguments. Alternatively, one may take disambiguated expressions such as phrase structure markings by means of labeled brackets. Yet another option is to have an extra syntactic level, like Logical Form, as the semantic function domain. The choice between such alternatives is largely irrelevant from the point of view of compositionality. The semantic function μ is also allowed to be partial. For example, it may represent our partial understanding of some language, or our attempts at a semantics for a fragment of a language. Further, even a complete semantics will be partial if one wants to maintain a distinction between meaningfulness (being in the domain of μ) and grammaticality (being derivable by the grammar rules).

No assumption is made about meanings. What matters for the abstract notion of compositionality is not meanings as such, but synonymy, i.e. the partial equivalence relation on E defined by: u ≡μ t iff μ(u), μ(t) are both defined and μ(u) = μ(t). (We use s, t, u, with or without subscripts, for arbitrary members of E.)
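For concreteness, the framework just described can be rendered as a small program. This is a sketch under simplifying assumptions: expressions are strings, undefinedness of a partial function is represented by returning None, and the toy meanings are placeholder strings rather than model-theoretic objects.

```python
# Atoms A, with the syntactic operations alpha, beta as partial functions.

DETS, NS, VPS = {"most"}, {"dog"}, {"bark"}

def alpha(det, n):                     # NP -> Det N
    if det in DETS and n in NS:
        return f"{det} {n}s"           # crude morphology, for the example
    return None                        # partial: undefined otherwise

def beta(np, vp):                      # S -> NP VP
    if np and " " in np and vp in VPS: # crude stand-in for NP-hood
        return f"{np} {vp}"
    return None                        # e.g. undefined when vp is 'dog'

s = beta(alpha("most", "dog"), "bark")   # -> 'most dogs bark'

# A (partial) semantics mu, with placeholder strings as meanings:
MEANINGS = {
    "most dogs": "the set of properties most dogs have",
    "bark": "the property of barking",
    "most dogs bark": "true iff most dogs bark",
}
def mu(e):
    return MEANINGS.get(e)             # None = not in the domain of mu

def synonymous(u, t):
    """u and t are synonymous iff mu is defined on both and agrees."""
    return mu(u) is not None and mu(u) == mu(t)
```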
3. Variants and properties of compositionality

3.1. Basic compositionality

Both the function version and the substitution version of compositionality can now be easily formulated, given a grammar E and a semantics μ as above.
Funct(μ) For every rule α ∈ Σ there is a meaning operation rα such that if α(u1, . . ., un) is meaningful, then μ(α(u1, . . ., un)) = rα(μ(u1), . . ., μ(un)).
Note that Funct(μ) presupposes the Domain Principle (DP): subexpressions of meaningful expressions are also meaningful. The substitution version of compositionality is given by
Subst(≡μ) If s[u1, . . ., un] and s[t1, . . ., tn] are both meaningful expressions, and if ui ≡μ ti for 1 ≤ i ≤ n, then s[u1, . . ., un] ≡μ s[t1, . . ., tn].
The notation s[u1, . . ., un] indicates that s contains (not necessarily immediate) disjoint occurrences of subexpressions among u1, . . ., un, and s[t1, . . ., tn] results from replacing each ui by ti. Restricted to immediate subexpressions Subst(≡μ) says that ≡μ is a partial congruence relation: If α(u1, . . ., un) and α(t1, . . ., tn) are both meaningful and ui ≡μ ti for 1 ≤ i ≤ n, then α(u1, . . ., un) ≡μ α(t1, . . ., tn). Under DP, this is equivalent to the unrestricted version. Subst(≡μ) does not presuppose DP, and one can easily think of semantics for which DP fails. However, a first observation is:
(1) Under DP, Funct(μ) and Subst(≡μ) are equivalent.
That Funct(μ) implies Subst(≡μ) is obvious when Subst(≡μ) is restricted to immediate subexpressions, and otherwise proved by induction over the generation complexity of expressions. In the other direction, the operations rα must be found. For m1, . . ., mn ∈ M, let rα(m1, . . ., mn) = μ(α(u1, . . ., un)) if there are expressions ui such that μ(ui) = mi, 1 ≤ i ≤ n, and μ(α(u1, . . ., un)) is defined. Otherwise, rα(m1, . . ., mn) can be undefined (or arbitrary). This is enough, as long as we can be certain that the definition is independent of the choice of the ui, but that is precisely what Subst(≡μ) says. The requirements of basic compositionality are in some respects not so strong, as can be seen from the following observations:
(2) If μ gives the same meaning to all expressions, then Funct(μ) holds.
(3) If μ gives different meanings to all expressions, then Funct(μ) holds.
(2) is of course trivial. For (3), consider Subst(≡μ) and observe that if no two expressions have the same meaning, then ui ≡μ ti entails ui = ti, so Subst(≡μ), and therefore Funct(μ), holds trivially.
3.2. Recursive semantics

The function version of compositional semantics is given by recursion over syntax, but that does not imply that the meaning operations are defined by recursion over meaning; when they are, we have recursive semantics. Standard semantic theories are typically both recursive and compositional, but the two notions are mutually independent. In the recursive case we have:
Rec(μ) There is a function b and for every α ∈ Σ an operation rα such that for every meaningful expression s,

μ(s) = b(s), if s is atomic
μ(s) = rα(μ(u1), . . ., μ(un), u1, . . ., un), if s = α(u1, . . ., un)

For μ to be recursive, the basic function b and the meaning composition operation rα must themselves be recursive, but this is not required in the function version of compositionality. In the other direction, the presence of the expressions u1, . . ., un themselves as arguments to rα has the effect that the compositional substitution laws need not hold (cf. Janssen 1997). If we drop the recursiveness requirement on b and rα, Rec(μ) becomes vacuous. This is because rα(m1, . . ., mn, u1, . . ., un) can simply be defined to be μ(α(u1, . . ., un)) whenever mi = μ(ui) for all i and α(u1, . . ., un) is meaningful (and undefined otherwise). Since intersubstitution of synonymous but distinct expressions changes at least one argument of rα, no counterexample is possible.
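To see how access to the expressions themselves can undermine the substitution laws, consider the following sketch (ours; the atoms and meanings are invented), where the single composition operation, in the style of Rec(μ), returns its expression argument rather than using its meaning argument:

```python
# Sketch of Rec(mu): r_alpha receives the subexpressions as extra arguments.
# Terms are atomic strings or tuples ('op', subterm); meanings are invented.

def b(atom):                                   # basic meaning function
    return {"brother": "MALE-SIBLING", "male sibling": "MALE-SIBLING"}[atom]

def r_quote(meanings, subexprs):               # uses the expression itself
    return subexprs[0]

def mu(term):
    if isinstance(term, str):                  # atomic case
        return b(term)
    op, *subs = term                           # complex case (one toy op)
    return r_quote([mu(t) for t in subs], subs)

# 'brother' and 'male sibling' are mu-synonyms, yet substituting one for the
# other under the operation changes the value: Subst fails for this mu.
print(mu(("quote", "brother")))        # brother
print(mu(("quote", "male sibling")))   # male sibling
```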
3.3. Weaker versions

Basic (first-level) compositionality takes the meaning of a complex expression to be determined by the meanings of the immediate subexpressions and the top-level syntactic operation. We get a weaker version – second-level compositionality – if we require only that the operations of the two highest levels, together with the meanings of expressions at the second level, determine the meaning of the whole complex expression. A possible example comes from constructions with quantified noun phrases where the meanings of both the determiner and the restricting noun – i.e. two levels below the head of the construction in question – are needed for semantic composition, a situation that may occur with possessives and some reciprocals. In Peters & Westerståhl (2006, ch. 7) and in Westerståhl (2008) it is argued that, in general, the corresponding semantics is second-level but not (first-level) compositional. Third-level compositionality is defined analogously, and is weaker still. In the extreme case we have bottom-level, or weak functional compositionality, if the meaning of the complex term is determined only by the meanings of its atomic constituents and the entire syntactic construction (i.e. the derived operation that is extracted from a complex expression by knocking out the atomic constituents). A function version of this becomes somewhat cumbersome (but see Hodges 2001, sect. 5), whereas the substitution version becomes simply:

AtSubst(≡μ) Just like Subst(≡μ) except that the ui and ti are all atomic.

Although weak compositionality is not completely trivial (a language could lack the property), it does not serve the language users very well: the meaning operation rα that corresponds to a complex syntactic operation α cannot be predicted from its build-up out of simpler syntactic operations and their corresponding meaning operations. Hence, there will be infinitely many complex syntactic operations whose semantic significance must be learned one by one. It may be noted here that terminology concerning compositionality varies somewhat. David Dowty (2007) calls (an approximate version of) weak functional compositionality Frege's Principle, and refers to Funct(μ) as homomorphism compositionality, or strictly local compositionality, or context-free semantics. In Larson & Segal (1995), this
is called strong compositionality. The labels second-level compositionality, third-level, etc. are not standard in the literature but seem appropriate.
3.4. Stronger versions

We get stronger versions of compositionality by enlarging the domain of the semantic function, or by placing additional restrictions on meaningfulness or on meaning composition operations. An example of the first is Zoltán Szabó's (2000) idea that the same meaning operations define semantic functions in all possible human languages, not just for all sentences in each language taken by itself. That is, whenever two languages have the same syntactic operation, they also associate the same meaning operation with it. An example of the second option is what Wilfrid Hodges has called the Husserl property (going back to ideas in Husserl 1900):

(Huss) Synonymous expressions belong to the same (semantic) category.

Here the notion of category is defined in terms of substitution; say that u ∼μ t if, for every s in E, s[u] ∈ dom(μ) iff s[t] ∈ dom(μ). So (Huss) says that synonymous terms can be inter-substituted without loss of meaningfulness. This is often a reasonable requirement (though Hodges 2001 mentions some putative counter-examples). (Huss) also has the consequence that Subst(≡μ) can be simplified to Subst1(≡μ), which only deals with replacing one subexpression by another. Then one can replace n subexpressions by applying Subst1(≡μ) n times; (Huss) guarantees that all the 'intermediate' expressions are meaningful. An example of the third kind is that of requiring the meaning composition operations to be computable. To make this more precise we need to impose more order on the meaning domain, viewing meanings too as given by an algebra M = (M, B, Ω), where B ⊆ M is a finite set of basic meanings, Ω is a finite set of elementary operations from n-tuples of meanings to meanings, and M is generated from B by means of the operations in Ω. This allows the definition of meaning operations by recursion over M. The semantic function μ is then defined simultaneously by recursion over syntax and by recursion over the meaning domain. Assuming that the elementary meaning operations are computable in a sense relevant to cognition, the semantic function itself is computable. A further step in this direction is to require that the meaning operations be easy to compute, thereby reducing or minimizing the complexity of semantic interpretation. For instance, meaning operations that are either elementary or else formed from elementary operations by function composition and function application would be of this kind (cf. Pagin 2011 for work in this direction). Another strengthening, also introduced in Hodges (2001), concerns Frege's so-called Context Principle. A famous but cryptic saying by Frege (1884, x) is: "Never ask for the meaning of a word in isolation, but only in the context of a sentence". This principle has been much discussed in the literature (for example, Dummett 1973, Dummett 1981, Janssen 2001, Pelletier 2001), and sometimes taken to conflict with compositionality. However, if not seen as saying that words somehow lose their meaning in isolation, it can be taken as a constraint on meanings, in the form of what we might call the Contribution Principle:
(CP) The meaning of an expression is the contribution it makes to the meanings of complex expressions of which it is a part.

This is vague, but Hodges notes that it can be made precise with an additional requirement on the synonymy ≡μ. Assume (Huss), and consider:
InvSubst∃(≡μ) If u ≢μ t, there is an expression s such that either exactly one of s[u] and s[t] is meaningful, or both are and s[u] ≢μ s[t].
So if two expressions of the same category are such that no complex expression of which the first is a part changes meaning when the first is replaced by the second, they are synonymous. That is, if they make the same contribution to all such complex expressions, their meanings cannot be distinguished. This can be taken as one half of (CP), and compositionality in the form of Subst1(≡μ) as the other. Remark: Hodges' main application of these notions is to what has become known as the extension problem: given a partial compositional semantics μ, under what circumstances can μ be extended to a larger fragment of the language? Here (CP) can be used as a requirement, so that the meaning of a new word w, say, must respect the (old) meanings of complex expressions of which w is a part. This is especially suited to situations when all new items are parts of expressions that already have meanings (cofinality). Hodges defines a corresponding notion of fregean extension of μ, and shows that in the situation just mentioned, and given that μ satisfies (Huss), a unique fregean extension always exists. Another version of the extension problem is solved in Westerståhl (2004). An abstract account of compositional extension issues is given in Fernando (2005). End of remark We can take a step further in this direction by requiring that replacement of expressions by expressions with different meanings always changes meaning:
InvSubst∀(≡μ) If for some i, 1 ≤ i ≤ n, ui ≢μ ti, then for every expression s, either exactly one of s[u1, . . ., un] and s[t1, . . ., tn] is meaningful, or both are and s[u1, . . ., un] ≢μ s[t1, . . ., tn].
This disallows synonymy between complex expressions transformable into each other by substitution of constituents at least some of which are non-synonymous, but it does allow synonymous expressions with different structure. Carnap's principle of synonymy as intensional isomorphism forbids this, too. With the concept of intension from possible-worlds semantics it can be stated as
(RC) t ≡μ u iff i) t, u are atomic and co-intensional, or ii) for some α, t = α(t1, . . ., tn), u = α(u1, . . ., un), and ti ≡μ ui, 1 ≤ i ≤ n.
(RC) entails both Subst(≡µ) and InvSubst∀(≡µ), but is very restrictive. It disallows synonymy between brother and male sibling as well as between John loves Susan and Susan is loved by John, and allows different expressions to be synonymous only if they differ at most in being transformed from each other by substitution of synonymous atomic expressions.
(RC) seems too strong. We get an intermediate requirement as follows. First, define μ-congruence, ≅μ, in the following way:

(≅μ) t ≅μ u iff i) t or u is atomic, t ≡μ u, and neither is a constituent of the other, or ii) t = α(t1, . . ., tn), u = β(u1, . . ., un), ti ≅μ ui, 1 ≤ i ≤ n, and for all s1, . . ., sn, α(s1, . . ., sn) ≡μ β(s1, . . ., sn), if either is defined.
Then require synonymous expressions to be congruent:

(Cong) If t ≡μ u, then t ≅μ u.

By (Cong), synonymous expressions cannot differ much syntactically, but they may differ in the two crucial respects forbidden by (RC). (Cong) does not hold for natural language if logically equivalent sentences are taken as synonymous. That it holds otherwise remains a conjecture (but see Johnson 2006). It follows from (Cong) that meanings are (or can be represented as) structured entities: entities uniquely determined by how they are built, i.e. entities from which constituents can be extracted. We then have projection operations:
(Rev) For every meaning operation r : Mⁿ → M there are projection operations sr,i such that sr,i(r(m1, . . ., mn)) = mi.
Together with the fact that the operations rα are meaning operations for a compositional semantic function μ, (Rev) has semantic consequences, the main one being a kind of inverse functional compositionality:
InvFunct(μ) The syntactic expression of a complex meaning m is determined, up to μ-congruence, by the composition of m and the syntactic expressions of its parts.
For the philosophical significance of inverse compositionality, see sections 4.6 and 5.2 below. For (≅μ), (Cong), InvFunct(μ), and a proof that (Rev) is a consequence of (Cong) (really of the equivalent statement that the meaning algebra is a free algebra), see Pagin (2003a). (Rev) seems to be what Jerry Fodor understands by 'reverse compositionality' in e.g. Fodor (2000: 371).
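The idea behind (Rev) can be sketched in a few lines (ours; the meanings are placeholders): if composed meanings are structured records that remember their parts, the projection operations come for free:

```python
# Structured meanings as tagged tuples: the composed meaning records which
# operation built it and keeps its argument meanings, so the projections
# s_{r,i} of (Rev) can simply read them off. Names are illustrative only.

def r_alpha(*ms):
    return ("alpha", ms)            # a meaning that remembers its parts

def s(i):
    """Projection s_{r,i}: recover the i-th argument meaning (1-indexed)."""
    return lambda m: m[1][i - 1]

m = r_alpha("MOST-DOGS", "BARK")
assert s(1)(m) == "MOST-DOGS" and s(2)(m) == "BARK"
```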
3.5. Direct and indirect compositionality

Pauline Jacobson (2002) distinguishes between direct and indirect compositionality, as well as between strong direct and weak direct compositionality. This concerns how the analysis tree of an expression maps onto the expression itself, an issue we have avoided here, for simplicity. Informally, in strong direct compositionality, a complex expression t is built up from sub-expressions (corresponding to subtrees of the analysis tree for t) simply by means of concatenation. In weak direct compositionality, one expression may
wrap around another (as call up wraps around him in call him up). In indirect compositionality, there is no such simple correspondence between the composition of analysis trees and elementary operations on strings. Even under our assumption that each expression has a unique analysis, our notion of compositionality here is indirect in the above sense: syntactic operations may delete strings, reorder strings, make substitutions and add new elements. Strictly speaking, however, the direct/indirect distinction is not a distinction between kinds of semantics, but between kinds of syntax. Still, discussion of it tends to focus on the role of compositionality in linguistics, e.g. whether to let the choice of syntactic theory be guided by compositionality (cf. Dowty 2007 and Kracht 2007; for discussion of the general significance of the distinction, see Barker & Jacobson 2007).
3.6. Compositionality for "interpreted languages"

Some linguists, among them Jacobson, tend to think of grammar rules as applying to signs, where a sign is a triple 〈e, k, m〉 consisting of a string, a syntactic category, and a meaning. This is formalized by Marcus Kracht (see 2003, 2007), who defines an interpreted language to be a set L of signs in this sense, and a grammar G as a set of partial functions from signs to signs, such that L is generated by the functions in G from a subset of atomic (lexical) signs. Thus, a meaning assignment is built into the language, and grammar rules are taken to apply to meanings as well. This looks like a potential strengthening of our notion of grammar, but is not really used that way, partly because the grammar is taken to operate independently (though in parallel) at each of the three levels. Let p1, p2, and p3 be the projection functions on triples yielding their first, second, and third elements, respectively. Kracht calls a grammar compositional if for each n-ary grammar rule α there are three operations rα,1, rα,2, and rα,3 such that for all signs σ1, . . ., σn for which α is defined,

α(σ1, . . ., σn) = 〈rα,1(p1(σ1), . . ., p1(σn)), rα,2(p2(σ1), . . ., p2(σn)), rα,3(p3(σ1), . . ., p3(σn))〉

and moreover α(σ1, . . ., σn) is defined if and only if each rα,i is defined for the corresponding projections. In a sense, however, this is not really a variant of compositionality but rather another way to organize grammars and semantics. This is indicated by (4) and (5) below, which are not hard to verify. First, call G strict if α(σ1, . . ., σn) defined and p1(σi) = p1(τi) for 1 ≤ i ≤ n entails α(τ1, . . ., τn) defined, and similarly for the other projections. All compositional grammars are strict.
(4) Every grammar G in Kracht's sense for an interpreted language L is a grammar (E, A, Σ) in the sense of section 2 (with E = L, A = the set of atomic signs in L, and Σ = the set of partial functions of G). Provided G is strict, G is compositional (in Kracht's sense) iff each of p1, p2, and p3, seen as assignments of values to signs (so p3 is the meaning assignment), is compositional (in our sense).
(5) Conversely, if E = (E, A, Σ) is a grammar and μ a semantics for E, let L = {〈u, u, μ(u)〉 : u ∈ dom(μ)}. Define a grammar G for L (with the obvious atomic signs) by letting α(〈u1, u1, μ(u1)〉, . . ., 〈un, un, μ(un)〉) = 〈α(u1, . . ., un), α(u1, . . ., un), μ(α(u1, . . ., un))〉 whenever α ∈ Σ is defined for u1, . . ., un and α(u1, . . ., un) ∈ dom(μ) (undefined otherwise). Provided μ is closed under subexpressions and has the Husserl property, μ is compositional iff G is compositional.
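The parallel operation on the three levels is easy to picture in code. The following sketch (with toy operations of our own, not Kracht's) applies a rule to signs by applying one operation per projection, and is defined exactly when all three component operations are:

```python
# Signs as triples (string, category, meaning); a rule alpha acts on signs by
# acting independently on each projection. All concrete choices are invented.

def r1(s1, s2): return f"{s1} {s2}"                                # strings
def r2(k1, k2): return "S" if (k1, k2) == ("NP", "VP") else None   # categories
def r3(m1, m2): return ("APPLY", m1, m2)                           # meanings

def alpha(sig1, sig2):
    parts = (r1(sig1[0], sig2[0]), r2(sig1[1], sig2[1]), r3(sig1[2], sig2[2]))
    # alpha is defined iff each r_i is defined on the corresponding projections
    return parts if all(p is not None for p in parts) else None

np = ("most dogs", "NP", "MOST-DOGS")
vp = ("bark", "VP", "BARK")
print(alpha(np, vp))  # ('most dogs bark', 'S', ('APPLY', 'MOST-DOGS', 'BARK'))
print(alpha(vp, np))  # None: r2 is undefined for (VP, NP)
```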
3.7. Context dependence

In standard possible-worlds semantics the role of meanings is played by intensions: functions from possible worlds to extensions. For instance, the intension of a sentence returns a truth value, when the argument is a world for which the function is defined. Montague (1968) extended this idea to include not just worlds but arbitrary indices i from some set I, as ordered tuples of contextual factors relevant to semantic evaluation. Speaker, time, and place of utterance are typical elements in such indices. The semantic function μ then assigns a meaning μ(t) to an expression t, which is itself a function such that for an index i ∈ I, μ(t)(i) gives an extension as value. Kaplan's (1989) two-level version of this first assigns a function (character) to t taking certain parts of the index (the context, typically including the speaker) to a content, which is in turn a function from selected parts of the index to extensions. In both versions, the usual concept of compositionality straightforwardly applies. The situation gets more complicated when semantic functions themselves take contextual arguments, e.g. if a meaning-in-context for an expression t in context c is given as μ(t, c). The reason for such a change might be the view that the contextual meanings are contents in their own right, not just extensional fall-outs of the standing, context-independent meaning. But with context as an additional argument we have a new source of variation. The most natural extension of compositionality to this format is given by
C-Funct(μ) For every rule α ∈ Σ there is a meaning operation rα such that for every context c, if α(u1, . . ., un) has meaning in c, then μ(α(u1, . . ., un), c) = rα(μ(u1, c), . . ., μ(un, c)).
C-Funct(μ) seems like a straightforward extension of compositionality to a contextual semantics, but it can fail in a way non-contextual semantics cannot, by a context-shift failure. For we can suppose that although μ(ui, c) = μ(ui, c'), 1 ≤ i ≤ n, we still have μ(α(u1, . . ., un), c) ≠ μ(α(u1, . . ., un), c'). One might see this as a possible result of so-called unarticulated constituents. Maybe the meaning of the sentence
(6) It rains.
is sensitive to the location of utterance, while none of the constituents of that sentence (say, it and rains) is sensitive to location. Then the contextual meaning of the sentence at a location l is different from the contextual meaning of the sentence at another location l', even though there is no such difference in contextual meaning for any of the parts. This may hold even if substitution of expressions is compositional.
There is therefore room for a weaker principle that cannot fail in this way, where the meaning operation itself takes a context argument:
C-Funct(μ)c For every rule α ∈ Σ there is a meaning operation rα such that for every context c, if α(u1, . . ., un) has meaning in c, then μ(α(u1, . . ., un), c) = rα(μ(u1, c), . . ., μ(un, c), c).
The only difference is the last argument of rα. Because of this argument, C-Funct(μ)c is not sensitive to the counterexample above, and is more similar to non-contextual compositionality in this respect. This kind of semantic framework is discussed in Pagin (2005); a general format, and properties of the various notions of compositionality that arise, are presented in Westerståhl (2011). For example, it can be shown that (weak) compositionality for contextual meaning entails compositionality for the corresponding standing meaning, but the converse does not hold. So far, we have dealt with extra-linguistic context, but one can also extend compositional semantics to dependence on linguistic context. The semantic value of some particular occurrence of an expression may then depend on whether it is an occurrence in, say, an extensional context, or an intensional context, or a hyperintensional context, a quotation context, or yet something else. A framework for such a semantics needs a set C of context types, an initial null context type θ ∈ C for unembedded occurrences, and a binary function ψ from context types and syntactic operators to context types. If α(t1, . . ., tn) occurs in context type ci, then t1, . . ., tn will occur in context type ψ(ci, α). The context type for a particular occurrence of an expression ti in a host expression t is then determined by its immediately embedding operator α1, α1's immediately embedding operator, and so on up to the topmost operator occurrence. The semantic function μ takes an expression t and a context type c into a semantic value. The only thing that will differ for linguistic context from C-Funct(μ)c above is that the context of the subexpressions may be different (according to the function ψ) from the context of the containing expression:
LC-Funct(μ)c For every α ∈ Σ there is an operation rα such that for every context c, if α(u1, . . ., un) has meaning in c, then μ(α(u1, . . ., un), c) = rα(μ(u1, c'), . . ., μ(un, c'), c), where c' = ψ(c, α).
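To see the difference between C-Funct(μ) and C-Funct(μ)c at work, here is a toy model of (6) (entirely our own invention), with contexts as utterance locations: the parts have location-insensitive contextual meanings, so no context-free composition operation could compute the sentence's contextual meaning, whereas an operation that also takes the context can:

```python
# Toy contextual semantics for 'It rains', with contexts as locations.
# Meanings and weather facts are invented placeholders.

RAINING_AT = {"Paris"}                        # stipulated facts

def r_S(m1, m2, c):                           # composition op with context arg,
    return ("RAIN-AT", c, c in RAINING_AT)    # as licensed by C-Funct(mu)^c

def mu(expr, c):
    if expr == "it":    return "IT"           # same value in every context
    if expr == "rains": return "RAIN"         # same value in every context
    if expr == ("S", "it", "rains"):
        return r_S(mu("it", c), mu("rains", c), c)

# The parts mean the same in Paris and Oslo, but the sentence does not:
# no context-free r_S could compute both values, so C-Funct(mu) fails here.
print(mu(("S", "it", "rains"), "Paris"))  # ('RAIN-AT', 'Paris', True)
print(mu(("S", "it", "rains"), "Oslo"))   # ('RAIN-AT', 'Oslo', False)
```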
4. Arguments in favor of compositionality

4.1. Learnability

Perhaps the most common argument for compositionality is the argument from learnability: A natural language has infinitely many meaningful sentences. It is impossible for a human speaker to learn the meaning of each sentence one by one. Rather, it must be
possible for a speaker to learn the entire language by learning the meaning of a finite number of expressions, and a finite number of construction forms. For this to be possible, the language must have a compositional semantics. The argument was to some extent anticipated already in Sanskrit philosophy of language. During the first or second century BCE Patañjali writes:

. . . Bṛhaspati addressed Indra during a thousand divine years going over the grammatical expressions by speaking each particular word, and still he did not attain the end. . . . But then how are grammatical expressions understood? Some work containing general and particular rules has to be composed . . . (Cf. Staal 1969: 501–502. Thanks to Brendan Gillon for the reference.)
A classic modern passage plausibly interpreted along these lines is due to Donald Davidson:

It is conceded by most philosophers of language, and recently by some linguists, that a satisfactory theory of meaning must give an account of how the meanings of sentences depend upon the meanings of words. Unless such an account could be supplied for a particular language, it is argued, there would be no explaining the fact that we can learn the language: no explaining the fact that, on mastering a finite vocabulary and a finite set of rules, we are prepared to produce and understand any of a potential infinitude of sentences. I do not dispute these vague claims, in which I sense more than a kernel of truth. Instead I want to ask what it is for a theory to give an account of the kind adumbrated. (Davidson 1967: 17)
Properly spelled out, the problem is not that of learning the meaning of infinitely many meaningful sentences (given that one has command of a syntax), for if I learn that they all mean that snow is white, I have already accomplished the task. Rather, the problem is that there are infinitely many propositions that are each expressed by some sentence in the language (with contextual parameters fixed), and hence infinitely many equivalence classes of synonymous sentences. Still, as an argument for compositionality, the learnability argument has two main weaknesses. First, the premise that there are infinitely many sentences that have a determinate meaning although they have never been used by any speaker is a very strong premise, in need of justification. That is, at a given time t0, it may be that the speaker or speakers employ a semantic function μ defined for infinitely many sentences, or it may be that they employ an alternative function μ0 which agrees with μ on all sentences that have in fact been used but is simply undefined for all that have not been used. On the alternative hypothesis, when using a new sentence s, the speaker or the community gives some meaning to s, thereby extending μ0 to μ1, and so on. Phenomenologically, of course, the new sentence seemed to the speakers to come already equipped with meaning, but that was just an illusion. On this alternative hypothesis, there is no infinite semantics to be learned. To argue that there is a learnability problem, we must first justify the premise that we employ an infinite semantic function. This cannot be justified by induction, for we cannot infer from finding sentences meaningful that they were meaningful before we found them, and exactly that would have to be the induction base.
The second weakness is that even with the infinity premise in place, the conclusion of the argument would be that the semantics must be computable, but computability does not entail compositionality, as we have seen.
4.2. Novelty

Closely related to the learnability argument is the argument from novelty: speakers are able to understand sentences they have never heard before, which is possible only if the language is compositional. When the argument is interpreted so that, as in the learnability argument, we need to explain how speakers reliably track the semantics, i.e. assign to new sentences the meaning that they independently have, then the argument from novelty shares the two main weaknesses with the learnability argument.
4.3. Productivity

According to the pure argument from productivity, we need an explanation of why we are able to produce infinitely many meaningful sentences, and compositionality offers the best explanation. Classically, productivity is appealed to by Noam Chomsky as an argument for generative grammar. One of the passages runs:

The most striking aspect of linguistic competence is what we may call the 'creativity of language', that is, the speaker's ability to produce new sentences that are immediately understood by other speakers although they bear no physical resemblance to sentences that are 'familiar'. The fundamental importance of this creative aspect of normal language use has been recognized since the seventeenth century at least, and it was the core of Humboldtian general linguistics. (Chomsky 1971: 74)
This passage does not appeal to pure productivity, since it makes an appeal to the understanding by other speakers (cf. Chomsky 1980: 76–78). The pure productivity aspect has been emphasized by Fodor (e.g. 1987: 147–148), i.e. that natural language can express an open-ended set of propositions. However, the pure productivity argument is very weak. On the premise that a human speaker can think indefinitely many propositions, all that is needed is to assign those propositions to sentences. The assignment does not have to be systematic in any way, and all the syntax that is needed for the infinity itself is simple concatenation. Unless the assignment is to meet certain conditions, productivity requires nothing more than the combination of infinitely many propositions and infinitely many expressions.
4.4. Systematicity

A related argument by Fodor (1987: 147–150) is that of systematicity. It can be stated either as a property of speaker understanding or as an expressive property of a language. Fodor tends to favor the former (since he is ultimately concerned with the mental). In the simplest case, Fodor points out that if a language user understands a sentence of the
form tRu, she will also understand the corresponding sentence uRt, and argues that this is best explained by appeal to compositionality. Formally, the argument is to be generalized to cover the understanding of any new sentence that is formed by recombination of constituents that occur, and construction forms that are used, in sentences already understood. Hence, in this form it reduces to one of three different arguments: either to the argument from novelty, or to the productivity argument, or, finally, to the argument from intersubjectivity (below), and only spells out a bit the already familiar idea of old parts in new combinations. It might be taken to add an element, for it not only aims at explaining the understanding of new sentences that is in fact manifested, but also predicts what new sentences will be understood. However, Fodor himself points out the problem with this aspect, for if there is a sentence s formed by a recombination that we do not find meaningful, we will not take it as a limitation of the systematicity of our understanding, but as revealing that the sentence s is not in fact meaningful, and hence that there is nothing to understand. Hence, we cannot come to any other conclusion than that the systematicity of our understanding is maximal. The systematicity argument can alternatively be understood as concerning natural language itself, namely as the argument that sentences formed by grammatical recombination are meaningful. It is debatable to what extent this really holds, and sentences (or so-called sentences) like Chomsky's Colorless green ideas sleep furiously have been used to argue that not all grammatical sentences are meaningful. But even if we were to find meaningful all sentences that we find grammatical, this does not in itself show that compositionality, or any kind of systematic semantics, is needed for explaining it. If it is only a matter of assigning some meaning or other, without any further condition, it would be enough that we can think new thoughts and have a disposition to assign them to new sentences.
4.5. Induction on synonymy

We can observe that our synonymy intuitions conform to Subst(≡μ). In case after case, we find the result of substitution synonymous with the original expression, if the new part is taken as synonymous with the old. This forms the basis of an inductive generalization that such substitutions are always meaning preserving. In contrast to the argument from novelty, where the idea of tracking the semantics is central, this induction argument may concern our habits of assigning meaning to, or reading meaning into, new sentences: we tend to do it compositionally. There is nothing wrong with this argument, as far as it goes, beyond what is in general problematic with induction. It should only be noted that the conclusion is weak. Typically, arguments for compositionality aim at the conclusion that there is a systematic pattern to the assignment of meaning to new sentences, and that the meaning of new sentences can be computed somehow. This is not the case in the induction argument, for the conclusion is compatible with the possibility that substitutivity is the only systematic feature of the semantics. That is, the assignment of meaning to new sentences may be completely random, except for respecting substitutivity. If the substitutivity version of compositionality holds, then (under DP) so does the function version, but the semantic function need not be computable, and need not even be finitely specifiable. So, although the argument may
be empirically sound, it does not establish what arguments for compositionality usually aim at.
4.6. Intersubjectivity and communication

The problems with the idea of tracking semantics when interpreting new sentences can be eliminated by bringing in intersubjective agreement in interpretation. For by our common sense standards of judging whether we understand sentences the same way or not, there is overwhelming evidence (e.g. from discussing broadcast news reports) that in an overwhelming proportion of cases, speakers of the same language interpret new sentences similarly. This convergence of interpretation, far above chance, does not presuppose that the sentences heard were meaningful before they were used. The phenomenon needs an explanation, and it is reasonable to suppose that the explanation involves the hypothesis that the meanings of the sentences are computable, and so it isn't left to guesswork or mere intuition what the new sentences mean. The appeal to intersubjectivity disposes of an unjustified presupposition about semantics, but two problems remain. First, when encountering new sentences, these are almost invariably produced by a speaker, and the speaker has intended to convey something by the sentence; the speaker hasn't interpreted the sentence, but fitted it to an antecedent thought. Secondly, we have an argument for computability, but not for compositionality. The first observation indicates that it is at bottom the success rate of linguistic communication with new sentences that gives us a reason for believing that sentences are systematically mapped on meanings. This was the point of view in Frege's famous passage from the opening of 'Compound Thoughts':

It is astonishing what language can do. With a few syllables it can express an incalculable number of thoughts, so that even a thought grasped by a terrestrial being for the very first time can be put into a form of words which will be understood by someone to whom the thought is entirely new. This would be impossible, were we not able to distinguish parts in the thoughts corresponding to the parts of a sentence, so that the structure of the sentence serves as the image of the structure of the thought. (Frege 1923: 55)
As Frege depicts it here, the speaker is first entertaining a new thought, or proposition, finds a sentence for conveying that proposition to a hearer, and by means of that sentence the hearer comes to entertain the same proposition as the speaker started out with. Frege appeals to semantic structure for explaining how this is possible. He claims that the proposition has a structure that mirrors the structure of the sentence (so that the semantic relation may be an isomorphism), and goes on to claim that without this structural correspondence, communicative success with new propositions would not be possible. It is natural to interpret Frege as expressing a view that entails that compositionality holds as a consequence of the isomorphism idea. The reason Frege went beyond compositionality (or homomorphism, which does not require a one-one relation) seems to be an intuitive appeal to symmetry: the speaker moves from proposition to sentence, while the hearer moves from sentence to proposition. An isomorphism is a one-one relation, so that each relatum uniquely determines the other.
Because of synonymy, a sentence that expresses a proposition in a particular language is typically not uniquely determined within that language by the proposition expressed. Still, we might want the speaker to be able to work out what expression to use, rather than searching around for suitable sentences by interpreting candidates one after the other. The inverse functional compositionality principle, InvFunct(μ), of section 3.4, offers such a method. Inverse compositionality is also connected with the idea of structured meanings, or thoughts, while compositionality by itself isn't, and so in this respect Frege is vindicated (these ideas are developed in Pagin 2003a).
4.7. Summing up

Although many share the feeling that there is "more than a kernel of truth" (cf. section 4.1) in the usual arguments for compositionality, some care is required to formulate and evaluate them. One must avoid question-begging presuppositions; for example, if a presupposition is that there is an infinity of propositions, the argument for that had better not be that standardly conceived natural or mental languages allow the generation of such an infinite set. Properly understood, the arguments can be seen as inferences to the best explanation, which is a respectable but somewhat problematic methodology. (One usually hasn't really tried many other explanations than the proposed one.) Another important (and related) point is that virtually all arguments so far only justify the principle that the meaning is computable or recursive, and the principle that up to certain syntactic variation, an expression of a proposition is computable from that proposition. Why should the semantics also be compositional, and possibly inversely compositional? One reason could be that compositional semantics, or at least certain simple forms of compositional semantics, is very simple, in the sense that a minimal number of processing steps are needed by the hearer for arriving at a full interpretation (or, for the speaker, a full expression, cf. Pagin 2011), but these issues of complexity need to be further explored.
5. Arguments against compositionality

Arguments against compositionality of natural language can be divided into four main categories: a) arguments that certain constructions are counterexamples and make the principle false, b) arguments that compositionality is an empirically vacuous, or alternatively trivially correct, principle, c) arguments that compositional semantics is not needed to account for actual linguistic communication, d) arguments that actual linguistic communication is not suited for compositional semantics. The first category, that of counterexamples, will be treated in a separate section dealing with a number of problem cases. Here we shall discuss arguments in the last three categories.
5.1. Vacuity and triviality arguments
Vacuity. Some claims about the vacuity of compositionality in the literature are based on mathematical arguments. For example, Zadrozny (1994) shows that for every semantics μ there is a compositional semantics ν such that ν(t)(t) = μ(t) for every expression t, and uses this fact to draw a conclusion of that kind. But note that the mathematical fact is itself trivial: let ν(t) = μ for each t and the result is immediate from (2) in section 3.1 above (other parts of Zadrozny's results use non-wellfounded sets and are less trivial). Claims like these tend to have the form: for any semantics μ there is a compositional semantics ν from which μ can be easily recovered. But this too is completely trivial as it stands: if we let ν(t) = 〈μ(t), t〉, ν is 1-1, hence compositional by (3) in section 3.1, and μ is clearly recoverable from ν. In general, it is not enough that the old semantics can be computed from the new compositional semantics: for the new semantics to have any interest it must agree with the old one in some suitable sense. As far as we know there are no mathematical results showing that such a compositional alternative can always be found (see Westerståhl 1998 for further discussion). Triviality. Paul Horwich (e.g. in 1998) has argued that compositionality is not a substantial property of a semantics, but is trivially true. He exemplifies with the sentence dogs bark, and says (1998: 156–157) that the meaning property
(7) x means DOGS BARK
consists in the so-called construction property
(8) x results from putting expressions whose meanings are DOG and BARK, in that order, into a schema whose meaning is NS V.
As far as it goes, the compositionality of the resulting semantics is a trivial consequence of Horwich’s conception of meaning properties. Horwich’s view here is equivalent to Carnap’s conception of synonymy as intensional isomorphism. Neither allows that an expression with different structure or composed from parts with different meanings could be synonymous with an expression that means DOGS BARK. However, for supporting the conclusion that compositionality is trivial, these synonymy conditions must themselves hold trivially, and that is simply not the case.
5.2. Superfluity arguments

Mental processing. Stephen Schiffer (1987) has argued that compositional semantics, and public language semantics altogether, is superfluous in the account of linguistic communication. All that is needed is to account for how the hearer maps his mental representation of an uttered sentence on a mental representation of meaning, and that is a matter of a syntactic transformation, i.e. a translation, rather than interpretation. In Schiffer's example (1987: 192–200), the hearer Harvey is to infer from his belief that
(9) Carmen uttered the sentence 'Some snow is white'.
the conclusion that

(10) Carmen said that some snow is white.

Schiffer argues that this can be achieved by means of transformations between sentences in Harvey's neural language M. M contains a counterpart α to (9), such that α gets tokened in Harvey's so-called belief box when he has the belief expressed by (9). By an inner mechanism the tokening of α leads to the tokening of β, which is Harvey's M counterpart to (10). For this to be possible for any sentence of the language in question, Harvey needs a translation mechanism that implements a recursive translation function f from sentence representations to meaning representations. Once such a mechanism is in place, we have all we need for the account, according to Schiffer. The problem with the argument is that the translation function f by itself tells us nothing about communicative success. By itself it just correlates neural sentences of which we know nothing except for their internal correlation. We need another recursive function g that maps the uttered sentence Some snow is white on α, and a third recursive function h that maps β on the proposition that some snow is white, in order to have a complete account. But then the composed function h(f(g(. . .))) seems to be a recursive function that maps sentences on meanings (cf. Pagin 2003b). Pragmatic composition. According to François Recanati (2004), word meanings are put together in a process of pragmatic composition. That is, the hearer takes word meanings, syntax and contextual features as his input, and forms the interpretation that best corresponds to them. As a consequence, semantic compositionality is not needed for interpretation to take place. A main motivation for Recanati's view is the ubiquity of those pragmatic operations that Recanati calls modulations, which intuitively contribute to "what is said", i.e. to communicated content before any conversational implicatures. (Under varying terms and conceptions, these phenomena have been described e.g. by Sperber & Wilson 1995, Bach 1994, Carston 2002 and by Recanati himself.) To take an example from Recanati, in reply to an offer of something to eat, the speaker says

(11) I have had breakfast.

thereby saying that she has had breakfast in the morning of the day of utterance, which involves a modulation of the more specific kind Recanati calls free enrichment, and implicating by means of what she says that she is not hungry. On Recanati's view, communicated contents are always or virtually always pragmatically modulated. Moreover, modulations in general do not operate on a complete semantically derived proposition, but on conceptual constituents. For instance, in (11) it is the property of having breakfast that is modulated into having breakfast this day, not the proposition as a whole or even the property of having had breakfast. Hence, it seems that what the semantics delivers does not feed into the pragmatics. However, if meanings, i.e. the outputs of the semantic function, are structured entities, in the sense specified by (Rev) and InvFunct(μ) of section 3.4, then the last objection is met, for then semantics is able to deliver the arguments to the pragmatic
operations, e.g. properties associated with VPs. Moreover, the modulations that are in fact made appear to be controlled by a given semantic structure: as in (11), the modulated part is of the same category and occupies the same slot in the overall structure as the semantically given argument that it replaces. This provides a reason for thinking that modulations operate on a given (syntactically induced) semantic structure, rather than on pragmatically composed material (this line of reasoning is elaborated in Pagin & Pelletier 2007).
5.3. Unsuitability arguments

According to a view that has come to be called radical contextualism, truth evaluable content is radically underdetermined by semantics, i.e. by literal meaning. That is, no matter how much a sentence is elaborated, something needs to be added to its semantic content in order to get a proposition that can be evaluated as true or false. Since there will always be indefinitely many different ways of adding, the proposition expressed by means of the sentence will vary from context to context. Well-known proponents of radical contextualism include John Searle (e.g. 1978), Charles Travis (e.g. 1985), and Sperber & Wilson (1995). A characteristic example from Charles Travis (1985: 197) is the sentence

(12) Smith weighs 80 kg.

Although it sounds determinate enough at first blush, Travis points out that it can be taken as true or as false in various contexts, depending on what counts as important in those contexts. For example, it can be further interpreted as being true in case Smith weighs
a. b. c. d. e.
80 kg when stripped in the morning. 80 kg when dressed normally after lunch. 80 kg after being force fed 4 liters of water. 80 kg four hours after having ingested powerful diuretic. 80 kg after lunch adorned in heavy outer clothing.
Although the importance of such examples is not to be denied, their significance for semantics is less clear. It is in the spirit of radical contextualism to minimize the contribution of semantics (literal meaning) for determining expressed content, and thereby the importance of compositionality. However, strictly speaking, the truth or falsity of the compositionality principle for natural language is orthogonal to the truth or falsity of radical contextualism. For whether the meaning of a sentence s is a proposition or not is irrelevant to the question whether that meaning is determined by the meaning of the constituents of s and their mode of composition. The meaning of s may be unimportant but still compositionally determined. In an even more extreme version, the (semantic) meaning of sentence s in a context c is what the speaker uses s to express in c. In that case meaning itself varies from context to context, and there is no such thing as an invariant literal meaning. Not even the extreme version need be in conflict with compositionality (extended to context dependence), since the substitution properties may hold within each context by itself. Context
115
116
I. Foundations of semantics shift failure, in the sense of section 3.7, may occur, if e.g. word meanings are invariant but the meanings of complex expressions vary between contexts. It is a further question whether radical contextualism itself, in either version, is a plausible view. It appears that the examples of contextualism can be handled by other methods, e.g. by appeal to pragmatic modulations mentioned in section 5.2 (cf. Pagin & Pelletier 2007), which does allow propositions to be semantically expressed. Hence, the case for radical contextualism is not as strong as it may prima facie appear. On top, radical contextualism tends to make a mystery out of communicative success.
6. Problem cases A number of natural language constructions present apparent problems for compositional semantics. In this concluding section we shall briefly discuss a few of them, and mention some others.
6.1. Belief sentences Belief sentences offer diffculties for compositional semantics, both real and merely apparent. At first blush, the case for a counterexample against compositionality seems very strong. For in the pair (13)
a. b.
John believes that Fred is a child doctor. John believes that Fred is a pediatrician.
(13a) may be true and (13b) false, despite the fact that child doctor and pediatrician are synonymous. If truth value is taken to depend only on meaning and on extra-semantic facts, and the extra-semantic facts as well as the meanings of the parts and the modes of composition are the same between the sentences, then the meaning of the sentences must nonetheless be different, and hence compositionality fails. This conclusion has been drawn by Jeff Pelletier (1994). What would be the reason for this difference in truth value? When cases such as these come up, the reason is usually that there is some kind of discrepancy in the understanding of the attributee (John) between synonyms. John may e.g. erroneously believe that pediatrician only denotes a special kind of child doctors, and so would be disposed to assent to (13a) but dissent from (13b) (cf. Mates 1950 and Burge 1978; Mates took such cases as a reason to be skeptical about synonymy). This is not a decisive reason, however, since it is what the words mean in the sentences, e.g. depending on what the speaker means, that is relevant, not what the attributee means by those words. The speaker contributes with words and their meanings, and the attributee contributes with his belief contents. If John’s belief content matches the meaning of the embedded sentence Fred is a pediatrician, then (13b) is true as well, and the problem for compositionality is disposed of. A problem still arises, however, if belief contents are more fine-grained than sentence meanings, and words in belief contexts are somehow tied to these finer differences in grain. For instance, as a number of authors have suggested, perhaps belief contents are propositions under modes of presentation (see e.g. Burdick 1982, Salmon 1986. Salmon, however, existentially quantifies over modes of presentations, which preserves substitutivity).
6. Compositionality It may then be that different but synonymous expressions are associated with different modes of presentation. In our example, John may believe a certain proposition under a mode of presentation associated with child doctor but not under any mode of presentation associated with pediatrician, and that accounts for the change in truth value. In that case, however, there is good reason to say that the underlying form of a belief sentence such as (13a) is something like (14) Bel(John, the proposition that Fred is a child doctor, M(‘Fred is a child doctor’)) where M(-) is a function from a sentence to a mode of presentation or a set of modes of presentation. In this form, the sentence Fred is a pediatrician occurs both used and mentioned (quoted), and in its used occurrence, child doctor may be replaced by pediatrician without change of truth value. Failure of substitutivity is explained by the fact that the surface form fuses a used and a mentioned occurrence. In the underlying form, there is no problem for compositionality, unless caused by quotation. Of course, this analysis is not obviously the right one, but it is enough to show that the claim that compositionality fails for belief sentences is not so easy to establish.
6.2. Quotation Often quotation is set aside for special treatment as an exception to ordinary semantics, which is supposed to concern used occurrences of expressions rather than mentioned ones. Sometimes, this is regarded as cheating, and quotation is proposed as a clear counterexample to compositionality: brother and male sibling are synonymous, but ‘brother’ and ‘male sibling’ are not (i.e. the expressions that include the opening and closing quote). Since enclosing an expression in quotes is a syntactic operation, we have a counterexample. If quoting is a genuine syntactic operation, the syntactic rules include a total unary operator κ such that, for any simple or complex expression t, (15) κ (t) = ‘t’ The semantics of quoted expressions is given simply by (Q)
(Q) μ(κ(t)) = t
Then, since t ≡µ u does not imply t = u, substitution of u for t in κ(t) may violate compositionality. However, such a non-compositional semantics for quotation can be transformed into a compositional one, by adapting Frege’s view in (1892) that quotation provides a special context type in which expressions refer to themselves, and using the notion of linguistically context-dependent compositionality from section 3.7 above. We shall not give the details here, only indicate the main steps. Start with a grammar E = (E, A, Σ) (for a fragment of English, say) and a compositional semantics µ for E. First, extend E to a grammar containing the quotation operator
κ, allowing not only quote-strings of the form 'John', 'likes', "Mary", etc., but also things like John likes 'Mary' (meaning that he likes the name), whereas we disallow things like John 'likes' Mary or 'John likes' Mary as ungrammatical. Let E' be the closure of E under the thus extended operations and κ, and let Σ' = {α' : α ∈ Σ} ∪ {κ}. Then we have a new grammar E' = (E', A, Σ') that incorporates quotation. Next, extend μ to a semantics μ' for E', using the semantic composition operations that exist by Funct(μ), and letting (Q) above take care of κ. As indicated, the semantics μ' is not compositional: even if Mary is the same person as Sue, John likes 'Mary' doesn't mean the same as John likes 'Sue'. However, we can extend μ' to a semantics μ'' for E' which is compositional in the sense of LC-Funct(μ)c in section 3.7. In the simplest case, there are two context types: cu, the use context type, which is the default type (the null context), and the quotation context type cq. The function ψ from context types and operators to context types is given by

ψ(c, β) = c, if β ≠ κ
ψ(c, β) = cq, if β = κ
for β ∈ Σ' and c equal to cu or cq. μ'' is obtained by redefining the given composition operations in a fairly straightforward way, so that LC-Funct(μ'')c is automatically ensured. μ'' then extends μ in the sense that if t ∈ E is meaningful, μ''(t, cu) = μ(t), and furthermore μ''(κ(t), cu) = μ''(t, cq) = t. So μ'' is compositional in the contextually extended sense. That t ≡μ u holds does not license substitution of u for t in κ(t), since t there occurs in a quotation context, and we may have μ''(t, cq) ≠ μ''(u, cq). This approach is further developed in Pagin & Westerståhl (2009).
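A small sketch of this two-context-type construction (our simplification, with an invented lexicon and a single toy operation) makes the mechanism explicit: ψ sends every argument position under κ to the quotation type, where an expression simply denotes itself:

```python
# Two context types for quotation: 'use' (c_u) and 'quote' (c_q); psi shifts
# any argument of kappa into c_q, where expressions are autonymous.
# Lexicon and the 'likes' operation are invented for illustration.

C_U, C_Q = "c_u", "c_q"
LEX = {"John": "JOHN", "Mary": "MARY", "Sue": "MARY"}   # Mary, Sue synonyms

def psi(c, op):
    return C_Q if op == "kappa" else c

def render(t):                           # the surface string of a term
    return t if isinstance(t, str) else " ".join(render(s) for s in t[1:])

def mu2(t, c=C_U):
    if c == C_Q:                         # quotation context: t means itself
        return render(t)
    if isinstance(t, str):
        return LEX[t]
    op, *subs = t
    vals = [mu2(s, psi(c, op)) for s in subs]
    return vals[0] if op == "kappa" else ("LIKE", *vals)

print(mu2(("likes", "John", ("kappa", "Mary"))))  # ('LIKE', 'JOHN', 'Mary')
print(mu2(("likes", "John", ("kappa", "Sue"))))   # ('LIKE', 'JOHN', 'Sue')
# mu2 agrees on 'Mary'/'Sue' in c_u but not in c_q: substitution inside
# kappa is not licensed, yet the contextual compositionality scheme holds.
```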
6.3. Idioms

Idioms are almost universally thought to constitute a problem for compositionality. For example, the VP kick the bucket can also mean 'die', but the semantic operation corresponding to the standard syntax of, say, fetch the bucket, giving its meaning in terms of the meanings of its immediate constituents fetch and the bucket, cannot be applied to give the idiomatic meaning of kick the bucket. This is no doubt a problem of some sort, but not necessarily for compositionality. First, that a particular semantic operation fails doesn't mean that no other operation works. Second, note that kick the bucket is ambiguous between its literal and its idiomatic meaning, but compositionality presupposes non-ambiguous meaning bearers. Unless we take the ambiguity itself to be a problem for compositionality (see the next subsection), we should first find a suitable way to disambiguate the phrase, and only then raise the issue of compositionality. Such disambiguation may be achieved in various ways. We could treat the whole phrase as a lexical item (an atom), in view of the fact that its meaning has to be learnt separately. Or, given that it does seem to have syntactic structure, we could treat it as formed by a different rule than the usual one. In neither case is it clear that compositionality would be a problem.
To see what idioms really have to do with compositionality, think of the following situation. Given a grammar and a compositional semantics for it, suppose we decide to give some already meaningful phrase a non-standard, idiomatic meaning. Can we then extend the given syntax (in particular, to disambiguate) and semantics in a natural way that preserves compositionality? Note that it is not just a matter of accounting for one particular phrase, but rather for all the phrases in which the idiom may occur. This requires an account of how the syntactic rules apply to the idiom, and to its parts if it has structure, as well as a corresponding semantic account.

But not all idioms behave the same. While the idiomatic kick the bucket is fine in John kicked the bucket yesterday, or Everyone kicks the bucket at some point, it is not good in

(16) The bucket was kicked by John yesterday.
(17) Andrew kicked the bucket a week ago, and two days later, Jane kicked it too.

By contrast, pull strings preserves its idiomatic meaning in passive form, and strings is available for anaphoric reference with the same meaning:

(18) Strings were pulled to secure Henry his position.
(19) Kim's family pulled some strings on her behalf, but they weren't enough to get her the job.

This suggests that these two idioms should be analyzed differently; indeed the latter kind is called "compositional" in Nunberg, Sag & Wasow (1994) (from which (19) is taken), and is analyzed there using the ordinary syntactic and semantic rules for phrases of this form but introducing instead idiomatic meanings of its parts (pull and string), whereas kick the bucket is called "non-compositional". In principle, nothing prevents a semantics that deals differently with the two kinds of idioms from being compositional in our sense.

Incorporating idioms in syntax and semantics is an interesting task. For example, in addition to explaining the facts noted above one has to prevent kick the pail from meaning 'die' even if bucket and pail are synonymous, and likewise to prevent the idiomatic versions of pull and string from combining illegitimately with other phrases. For an overview of the semantics of idioms, see Nunberg, Sag & Wasow (1994). Westerståhl (2002) is an abstract discussion of various ways to incorporate idioms while preserving compositionality.
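Both disambiguation strategies can be made concrete in a toy fragment. In the following sketch (an invented illustration, not a proposal from the literature cited above), the literal VP, the atomic idiom, and the part-based idiom receive three different syntactic analyses, so each meaning bearer is unambiguous and meaning assignment stays functional:

```haskell
-- Toy disambiguation of idioms: one meaning per analysis, so the string
-- "kick the bucket" is ambiguous but no meaning bearer is. All names and
-- 'meanings' are invented stand-ins.
data VP
  = LiteralVP String String   -- ordinary rule: verb + object, literal senses
  | AtomIdiom String          -- whole phrase listed as a lexical atom
  | PartIdiom String String   -- ordinary rule over idiomatic senses of the parts

meaning :: VP -> String
meaning (LiteralVP v o) = v ++ "'(" ++ o ++ "')"      -- e.g. kick'(the-bucket')
meaning (AtomIdiom _)   = "die'"                      -- meaning learnt as a whole
meaning (PartIdiom v o) = v ++ "_i'(" ++ o ++ "_i')"  -- e.g. pull_i'(strings_i')

-- meaning (LiteralVP "kick" "the-bucket") == "kick'(the-bucket')"
-- meaning (AtomIdiom  "kick the bucket")  == "die'"
-- meaning (PartIdiom  "pull" "strings")   == "pull_i'(strings_i')"
-- Because PartIdiom keeps its parts, strings_i' remains available for
-- passivization and anaphora, unlike on the AtomIdiom analysis.
```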
6.4. Ambiguity

Even though the usual formulation of compositionality requires non-ambiguous meaning bearers, the occurrence of ambiguity in language is usually not seen as a problem for compositionality. This is because lexical ambiguity seems easily dealt with by introducing different lexical items for different meanings of the same word, whereas structural ambiguity corresponds to different analyses of the same surface string.
However, it is possible to argue that even though there are clear cases of structural ambiguity in language, as in Old men and women were released first from the occupied building, in other cases the additional structure is just an ad hoc way to avoid ambiguity. In particular, quantifier scope ambiguities could be taken to be of this kind. For example, while semanticists since Montague have had no trouble inventing different underlying structures to account for the two readings of

(20) Every critic reviewed four films.

it may be argued that this sentence in fact has just one structural analysis, a simple constituent structure tree, and that meaning should be assigned to that one structure. A consequence is that meaning assignment is no longer functional, but relational, and hence compositionality either fails or is just not applicable. Pelletier (1999) draws precisely this conclusion.

But even if one agrees with such an account of the syntax of (20), abandonment of compositionality is not the only option. One possibility is to give up the idea that the meaning of (20) is a proposition, i.e. something with a truth value (in the actual world), and opt instead for underspecified meanings of some kind. Such meanings can be uniquely, and perhaps compositionally, assigned to simple structures like constituent structure trees, and one can suppose that some further process of interpretation of particular utterances leads to one of the possible specifications, depending on various circumstantial facts. This is a form of context-dependence, and we saw in section 3.7 how similar phenomena can be dealt with compositionally. What was there called standing meaning is one kind of underspecified meaning, represented as a function from indices to 'ordinary' meanings. In the present case, where several meanings are available, one might try to use the set of those meanings instead. A similar but more sophisticated way of dealing with quantifier scope is so-called Cooper storage (see Cooper 1983). It should be noted, however, that while such strategies restore a functional meaning assignment, the compositionality of the resulting semantics is by no means automatic; it is an issue that has to be addressed anew.

Another option might be to accept that meaning assignment becomes relational and attempt instead to reformulate compositionality for such semantics. Although this line has hardly been tried in the literature, it may be an option worth exploring (for some first attempts in this direction, see Westerståhl 2007).
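The set-of-readings idea can likewise be sketched directly; the readings and the selection predicate below are invented stand-ins, and, as just noted, the compositionality of such a set-valued semantics still has to be verified separately:

```haskell
-- Sketch of the "set of readings" strategy: the single constituent
-- structure of (20) receives the set of its scope readings, so meaning
-- assignment is again a function.
type Reading = String

-- underspecified meaning of (20): both scopings, assigned to one tree
sentence20 :: [Reading]
sentence20 =
  [ "every > four  (each critic reviewed four possibly different films)"
  , "four > every  (the same four films were reviewed by every critic)"
  ]

-- composition can be lifted pointwise to set-meanings, so the set-valued
-- semantics remains a candidate for compositionality
liftOp :: (a -> b -> c) -> [a] -> [b] -> [c]
liftOp f ms ns = [ f m n | m <- ms, n <- ns ]

-- utterance interpretation then selects one reading by contextual criteria
specify :: (Reading -> Bool) -> [Reading] -> Maybe Reading
specify ok rs = case filter ok rs of
  (r:_) -> Just r
  []    -> Nothing
```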
6.5. Other problems

Problems other than those above, some with proposed solutions, include possessives (cf. Partee 1997; Peters & Westerståhl 2006), the context-sensitive use of adjectives (cf. Lahav 1989; Szabó 2001; Reimer 2002), noun-noun compounds (cf. Weiskopf 2007), unless + quantifiers (cf. Higginbotham 1986; Pelletier 1994), any embeddings (cf. Hintikka 1984), and indicative conditionals (e.g. Lewis 1976). All in all, it seems that the issue of compositionality in natural language will remain live, important and controversial for a long time to come.
7. References

Abelard, Peter 2008. Logica 'Ingredientibus'. 3. Commentary on Aristotle's De Interpretatione. Corpus Christianorum Continuatio Medievalis. Turnhout: Brepols Publishers.
Ailly, Peter of 1980. Concepts and Insolubles. Dordrecht: Reidel. Originally published as Conceptus et Insolubilia, Paris, ca. 1500.
Bach, Kent 1994. Conversational impliciture. Mind & Language 9, 124–162.
Barker, Chris & Pauline Jacobson (eds.) 2007. Direct Compositionality. Oxford: Oxford University Press.
Burdick, Howard 1982. A logical form for the propositional attitudes. Synthese 52, 185–230.
Burge, Tyler 1978. Belief and synonymy. Journal of Philosophy 75, 119–138.
Buridan, John 1998. Summulae de Dialectica 4. Summulae de Suppositionibus. Volume 10–4 of Artistarium. Nijmegen: Ingenium.
Carnap, Rudolf 1956. Meaning and Necessity. 2nd edn. Chicago, IL: The University of Chicago Press.
Carston, Robyn 2002. Thoughts and Utterances. The Pragmatics of Explicit Communication. Oxford: Oxford University Press.
Chomsky, Noam 1971. Topics in the theory of Generative Grammar. In: J. Searle (ed.). Philosophy of Language. Oxford: Oxford University Press, 71–100.
Chomsky, Noam 1980. Rules and Representations. Oxford: Blackwell.
Cooper, Robin 1983. Quantification and Syntactic Theory. Dordrecht: Reidel.
Davidson, Donald 1967. Truth and meaning. Synthese 17, 304–323. Reprinted in: D. Davidson. Inquiries into Truth and Interpretation. Oxford: Clarendon Press, 1984, 17–36. Page reference to the reprint.
Dowty, David 2007. Compositionality as an empirical problem. In: C. Barker & P. Jacobson (eds.). Direct Compositionality. Oxford: Oxford University Press, 23–101.
Dummett, Michael 1973. Frege. Philosophy of Language. London: Duckworth.
Dummett, Michael 1981. The Interpretation of Frege's Philosophy. London: Duckworth.
Fernando, Tim 2005. Compositionality inductively, co-inductively and contextually. In: E. Machery, M. Werning & G. Schurz (eds.). The Compositionality of Meaning and Content: Foundational Issues, vol. I. Frankfurt/M.: Ontos, 87–96.
Fodor, Jerry 1987. Psychosemantics. Cambridge, MA: The MIT Press.
Fodor, Jerry 2000. Reply to critics. Mind & Language 15, 350–374.
Fodor, Jerry & Jerrold Katz 1964. The structure of a semantic theory. In: J. Fodor & J. Katz (eds.). The Structure of Language. Englewood Cliffs, NJ: Prentice Hall, 479–518.
Frege, Gottlob 1884. Die Grundlagen der Arithmetik: eine logisch-mathematische Untersuchung über den Begriff der Zahl. Breslau: W. Koebner. English translation in: J. Austin. The Foundations of Arithmetic: A Logico-Mathematical Enquiry into the Concept of Number. 1st edn. Oxford: Blackwell, 1950.
Frege, Gottlob 1892. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25–50. English translation in: P. Geach & M. Black (eds.). Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980, 56–78.
Frege, Gottlob 1923. Logische Untersuchungen. Dritter Teil: Gedankengefüge. Beiträge zur Philosophie des deutschen Idealismus III (1923–1926), 36–51. English translation in: P. Geach (ed.). Logical Investigations. Oxford: Blackwell, 1977, 55–77.
Hendriks, Herman 2001. Compositionality and model-theoretic interpretation. Journal of Logic, Language and Information 10, 29–48.
Higginbotham, James 1986. Linguistic theory and Davidson's program in semantics. In: E. Lepore (ed.). Truth and Interpretation: Perspectives on the Philosophy of Donald Davidson. Oxford: Blackwell, 29–48.
Hintikka, Jaakko 1984. A hundred years later: The rise and fall of Frege's influence in language theory. Synthese 59, 27–49.
Hodges, Wilfrid 2001. Formal features of compositionality. Journal of Logic, Language and Information 10, 7–28.
Horwich, Paul 1998. Meaning. Oxford: Oxford University Press.
Houben, Jan 1997. The Sanskrit tradition. In: W. van Bekkum et al. (eds.). The Emergence of Semantics in Four Linguistic Traditions. Amsterdam: Benjamins, 49–145.
Husserl, Edmund 1900. Logische Untersuchungen II/1. Translated by J. N. Findlay as Logical Investigations. London: Routledge & Kegan Paul, 1970.
Jacobson, Pauline 2002. The (dis)organisation of the grammar: 25 years. Linguistics & Philosophy 25, 601–626.
Janssen, Theo 1986. Foundations and Applications of Montague Grammar. Amsterdam: CWI Tracts 19 and 28.
Janssen, Theo 1997. Compositionality. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 417–473.
Janssen, Theo 2001. Frege, contextuality and compositionality. Journal of Logic, Language and Information 10, 115–136.
Johnson, Kent 2006. On the nature of reverse compositionality. Erkenntnis 64, 37–60.
Kaplan, David 1989. Demonstratives. In: J. Almog, J. Perry & H. Wettstein (eds.). Themes from Kaplan. Oxford: Oxford University Press, 481–563.
Keenan, Edward & Edward Stabler 2004. Bare Grammar: A Study of Language Invariants. Stanford, CA: CSLI Publications.
King, Peter 2001. Between logic and psychology. John Buridan on mental language. Paper presented at the conference John Buridan and Beyond, Copenhagen, September 2001.
King, Peter 2007. Abelard on mental language. American Catholic Philosophical Quarterly 81, 169–187.
Kracht, Marcus 2003. The Mathematics of Language. Berlin: Mouton de Gruyter.
Kracht, Marcus 2007. The emergence of syntactic structure. Linguistics & Philosophy 30, 47–95.
Lahav, Ran 1989. Against compositionality: The case of adjectives. Philosophical Studies 57, 261–279.
Larson, Richard & Gabriel Segal 1995. Knowledge of Meaning. An Introduction to Semantic Theory. Cambridge, MA: The MIT Press.
Lewis, David 1976. Probabilities of conditionals and conditional probabilities. The Philosophical Review 85, 297–315.
Mates, Benson 1950. Synonymity. University of California Publications in Philosophy 25, 201–226. Reprinted in: L. Linsky (ed.). Semantics and the Philosophy of Language. Urbana, IL: University of Illinois Press, 1952, 111–136.
Montague, Richard 1968. Pragmatics. In: R. Klibansky (ed.). Contemporary Philosophy: A Survey I: Logic and Foundations of Mathematics. Florence: La Nuova Italia Editrice, 102–122. Reprinted in: R. Thomason (ed.). Formal Philosophy. Selected Papers of Richard Montague. New Haven, CT: Yale University Press, 1974, 95–118.
Nunberg, Geoffrey, Ivan Sag & Thomas Wasow 1994. Idioms. Language 70, 491–538.
Pagin, Peter 2003a. Communication and strong compositionality. Journal of Philosophical Logic 32, 287–322.
Pagin, Peter 2003b. Schiffer on communication. Facta Philosophica 5, 25–48.
Pagin, Peter 2005. Compositionality and context. In: G. Preyer & G. Peter (eds.). Contextualism in Philosophy. Oxford: Oxford University Press, 303–348.
Pagin, Peter 2011. Communication and the complexity of semantics. In: W. Hinzen, E. Machery & M. Werning (eds.). The Oxford Handbook of Compositionality. Oxford: Oxford University Press.
Pagin, Peter & Francis J. Pelletier 2007. Content, context and communication. In: G. Preyer & G. Peter (eds.). Context-Sensitivity and Semantic Minimalism. New Essays on Semantics and Pragmatics. Oxford: Oxford University Press, 25–62.
Pagin, Peter & Dag Westerståhl 2009. Pure quotation, compositionality, and the semantics of linguistic context. Submitted.
Partee, Barbara H. 1997. The genitive. A case study. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 464–470.
Pelletier, Francis J. 1994. The principle of semantic compositionality. Topoi 13, 11–24. Expanded version reprinted in: S. Davis & B. Gillon (eds.). Semantics: A Reader. Oxford: Oxford University Press, 2004, 133–157.
Pelletier, Francis J. 1999. Semantic compositionality: Free algebras and the argument from ambiguity. In: M. Faller, S. Kaufmann & M. Pauly (eds.). Proceedings of the Seventh CSLI Workshop on Logic, Language and Computation. Stanford, CA: CSLI Publications, 207–218.
Pelletier, Francis J. 2001. Did Frege believe Frege's Principle? Journal of Logic, Language and Information 10, 87–114.
Peters, Stanley & Dag Westerståhl 2006. Quantifiers in Language and Logic. Oxford: Oxford University Press.
Putnam, Hilary 1975a. Do true assertions correspond to reality? In: H. Putnam. Mind, Language and Reality. Philosophical Papers, vol. 2. Cambridge: Cambridge University Press, 70–84.
Putnam, Hilary 1975b. Mind, Language and Reality. Philosophical Papers, vol. 2. Cambridge: Cambridge University Press.
Recanati, François 2004. Literal Meaning. Cambridge: Cambridge University Press.
Reimer, Marga 2002. Do adjectives conform to compositionality? Noûs 16, 183–198.
Salmon, Nathan 1986. Frege's Puzzle. Cambridge, MA: The MIT Press.
Schiffer, Stephen 1987. Remnants of Meaning. Cambridge, MA: The MIT Press.
Searle, John 1978. Literal meaning. Erkenntnis 13, 207–224.
Sperber, Dan & Deirdre Wilson 1995. Relevance. Communication & Cognition. 2nd edn. Oxford: Blackwell.
Staal, J. F. 1969. Sanskrit philosophy of language. In: T. A. Sebeok (ed.). Linguistics in South Asia. The Hague: Mouton, 499–531.
Szabó, Zoltán 2000. Compositionality as supervenience. Linguistics & Philosophy 23, 475–505.
Szabó, Zoltán 2001. Adjectives in context. In: R. M. Harnish & I. Kenesei (eds.). Adjectives in Context. Amsterdam: Benjamins, 119–146.
Travis, Charles 1985. On what is strictly speaking true. Canadian Journal of Philosophy 15, 187–229.
Weiskopf, Daniel 2007. Compound nominals, context, and compositionality. Synthese 156, 161–204.
Westerståhl, Dag 1998. On mathematical proofs of the vacuity of compositionality. Linguistics & Philosophy 21, 635–643.
Westerståhl, Dag 2002. On the compositionality of idioms: An abstract approach. In: J. van Benthem, D. I. Beaver & D. Barker-Plummer (eds.). Words, Proofs, and Diagrams. Stanford, CA: CSLI Publications, 241–271.
Westerståhl, Dag 2004. On the compositional extension problem. Journal of Philosophical Logic 33, 549–582.
Westerståhl, Dag 2007. Remarks on scope ambiguity. In: E. Ahlsén et al. (eds.). Communication – Action – Meaning. A Festschrift to Jens Allwood. Göteborg: Department of Linguistics, University of Gothenburg, 43–55.
Westerståhl, Dag 2008. Decomposing generalized quantifiers. Review of Symbolic Logic 1, 355–371.
Westerståhl, Dag 2011. Compositionality in Kaplan style semantics. In: W. Hinzen, E. Machery & M. Werning (eds.). The Oxford Handbook of Compositionality. Oxford: Oxford University Press.
Zadrozny, Wlodek 1994. From compositional to systematic semantics. Linguistics & Philosophy 17, 329–342.
Peter Pagin, Stockholm (Sweden)
Dag Westerståhl, Gothenburg (Sweden)
7. Lexical decomposition: Foundational issues

1. The purpose of lexical decomposition
2. The early history of lexical decomposition
3. Theoretical aspects of decomposition
4. Conclusion
5. References
Abstract

Theories of lexical decomposition assume that lexical meanings are complex. This complexity is expressed in structured meaning representations that usually consist of predicates, arguments, operators, and other elements of propositional and predicate logic. Lexical decomposition has been used to explain phenomena such as argument linking, selectional restrictions, lexical-semantic relations, scope ambiguities, and the inference behavior of lexical items. The article sketches the early theoretical development from noun-oriented semantic feature theories to verb-oriented complex decompositions. It also deals with a number of theoretical issues, including the controversy between decompositional and atomistic approaches to meaning, the search for semantic primitives, the function of decompositions as definitions, problems concerning the interpretability of decompositions, and the debate about the cognitive status of decompositions.
1. The purpose of lexical decomposition

1.1. Composition and decomposition

The idea that the meaning of single lexical units is represented in the form of lexical decompositions is based on the assumption that lexical meanings are complex. This complexity is expressed as a structured representation often involving predicates, arguments, operators, and other elements known from propositional and predicate logic. For example, the noun woman is represented as a predicate that involves the conjunction of the properties of being human, female, and adult, whereas the verb empty can be thought of as expressing a causal relation between x and the becoming empty of y.

(1) a. woman: λx[human(x) & female(x) & adult(x)]
    b. to empty: λyλx[cause(x, become(empty(y)))]
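For readers who think in implementational terms, (1) can be transcribed directly into a typed setting. The following Haskell sketch uses a made-up three-entity model and treats cause and become as uninterpreted constructors; this is an expository assumption, not a semantic analysis:

```haskell
-- Transcription of (1) into a toy typed model.
type E = String   -- entities

human, female, adult :: E -> Bool
human  x = x `elem` ["ann", "bob", "eve"]
female x = x `elem` ["ann", "eve"]
adult  x = x `elem` ["ann", "bob"]

-- (1a)  woman: λx[human(x) & female(x) & adult(x)]
woman :: E -> Bool
woman x = human x && female x && adult x

-- (1b)  to empty: λyλx[cause(x, become(empty(y)))]
-- cause and become are left as uninterpreted constructors here
data Formula = EmptyOf E | Become Formula | Cause E Formula
  deriving Show

toEmpty :: E -> E -> Formula
toEmpty y x = Cause x (Become (EmptyOf y))

-- woman "ann"         == True
-- toEmpty "tub" "bob" == Cause "bob" (Become (EmptyOf "tub"))
```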
The structures involved in lexical decompositions resemble semantic structures on the phrasal and sentential level. There is of course an important difference between semantic decomposition and semantic composition; semantic complexity on the phrasal and sentential level mirrors the syntactic complexity of the expression while the assumed semantic complexity on the lexical level – at least as far as non-derived words are concerned – need not correspond to any formal complexity of the lexical expression.

Next, we give an overview of the main linguistic phenomena treated within decompositional approaches (section 1.2). Section 2 looks at the origins of the idea of lexical decomposition (section 2.1) and sketches some early formal theories on the lexical decomposition of nouns (sections 2.2, 2.3) and verbs (section 2.4). Section 3 is devoted
to a discussion of some long-standing theoretical issues of lexical decomposition, the controversy between decompositional and non-decompositional approaches to lexical meaning (section 3.1), the location of decompositions within a language theory (section 3.2), the status of semantic primitives (section 3.3), the putative role of decompositions as definitions (section 3.4), the semantic interpretation of decompositions (section 3.5), and their cognitive plausibility (section 3.6). The discussion relies heavily on the overview of frameworks of lexical decomposition of verbs given in article 17 (Engelberg) Frameworks of decomposition, which can be consulted for a more detailed description of the theories mentioned in the present article.
1.2. The empirical coverage of lexical decompositions

Which phenomena lexical decompositions are supposed to explain varies from approach to approach. The following have been tackled fairly often in decompositional theories:

(i) Argument linking: One of the main purposes of decomposing verbs has been the attempt to form generalizations about the relationship between semantic arguments and their syntactic realization. In causal structures like the one given for empty in (1b), the first argument of a cause relation becomes the subject of the sentence and is marked with nominative in nominative-accusative languages or absolutive in ergative-absolutive languages. Depending on the linking theory pursued, this can be expressed in different kinds of generalizations, for example, by simply claiming that the first argument of cause always becomes the subject in active sentences or – more generally – that the least deeply embedded argument of the decomposition is associated with the highest function in a corresponding syntactic hierarchy.

(ii) Selectional restrictions: Lexical decompositions account for semantic co-occurrence restrictions. The arguments selected by a lexical item are usually restricted to particular semantic classes. If the item filling the argument position is not of the required class, the resulting expression is semantically deviant. For instance, the verb preach selects an argument filler denoting a human being for its first argument slot. The decompositional features of woman (1a) account for the fact that the woman preached is semantically unobtrusive while the hope / chair / tree preached is not.

(iii) Ambiguity resolution: Adverbs often lead to a kind of sentential ambiguity that semanticists have attempted to resolve by reference to lexical decompositions. In a scenario where Rebecca is pointing a gun at Jamaal, sentence (2a) may describe three possible outcomes.

(2) a. Rebecca almost killed Jamaal.
    b. kill: λyλx[do(x, cause(become(dead(y))))]

Assuming a lexical decomposition for kill as in (2b), ambiguity resolution is achieved by attaching almost to different predicates within the decomposition (see the sketch after this list), yielding a scenario where Rebecca almost pulled the trigger (almost do …), a scenario where she pulled the trigger but missed Jamaal (almost cause …), and a scenario where she pulled the trigger, hit him but did not wound him fatally (almost become dead …).
(iv) Lexical relations: Lexical decompositions have also been employed in the analysis of semantic relations like hyperonymy, complementarity, synonymy, etc. (cf. Bierwisch 1970: 170). For example, assuming that a lexeme A is a hyperonym of a lexeme B iff the set of properties conjoined in the lexical decomposition of lexeme A is a proper part of the set of properties conjoined in the lexical decomposition of lexeme B, we derive that child (3a) is a hyperonym of girl (3b).

(3) a. child: λx[human(x) & ¬adult(x)]
    b. girl: λx[human(x) & ¬adult(x) & female(x)]

(v) Lexical field structure: Additionally, lexical decompositions have been used in order to uncover the structure of lexical fields (cf. section 2.2).

(vi) Inferences: Furthermore, lexical decompositions allow for semantic inferences that can be derived from the semantics of primitive predicates. For example, a predicate like become, properly defined, allows for the inference that Jamaal in (2a) was not dead immediately before the event.
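The attachment idea under (iii) can be made explicit by representing the decomposition as a term tree and enumerating the possible sites for almost. The following sketch is an invented illustration (the literature does not present such an algorithm in this form); note that it also generates attachment to dead, which the three-way description above does not distinguish from attachment to become:

```haskell
-- Enumerate the readings of 'almost' over the decomposition tree in (2b).
data Dec = P String [Dec]   -- predicate applied to arguments
         | V String         -- variable
         | Almost Dec       -- 'almost' modifying a subpredicate
  deriving Show

-- kill: λyλx[do(x, cause(become(dead(y))))]
kill :: Dec
kill = P "do" [V "x", P "cause" [P "become" [P "dead" [V "y"]]]]

-- all ways of wrapping exactly one predicate node in Almost
attach :: Dec -> [Dec]
attach (V _)      = []
attach (Almost _) = []
attach d@(P f as) =
  Almost d : [ P f (pre ++ a' : post)
             | (pre, a:post) <- [ splitAt i as | i <- [0 .. length as - 1] ]
             , a' <- attach a ]

-- attach kill yields attachment at do, cause, become, and dead; the first
-- three correspond to the readings of (2a) described under (iii) above.
```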
2. The early history of lexical decomposition

2.1. The roots of lexical decomposition

The idea that the meaning of a word can be explained by identifying it with the meaning of a more complex expression is deeply rooted not only in linguistics but also in our common sense understanding of language. When asked to explain to a non-native speaker what the German word Junggeselle means, one would probably say that a Junggeselle is an unmarried man. A decompositional way of meaning explanation is also at the core of the Aristotelian conception of word meaning in which the meaning of a noun is sufficiently explained by its genus proximum (here man) and its differentia specifica (here unmarried). Like the decompositions in (1), this conception attempts to define the meaning of a word. However, the distinction between genus proximum and differentia specifica is not explicitly expressed in lexical decompositions: From a logical point of view, each predicate in a conjunction as in (1a) qualifies as a genus proximum.

The Aristotelian distinction is also an important device in lexicographic meaning explanations as in (4a), where the next superordinate concept (donkey) of the lexical item in question (jackass) and one or more distinctive features (male) are given (cf. e.g., Svensén 1993: 120ff). Interestingly, meaning explanations based on genus proximum and differentia specifica have provoked some criticism within lexicography (Wiegand 1989) as well, and a closer look into standard monolingual dictionaries reveals that many meaning explanations are not of the Aristotelian kind represented in (4a): They involve near-synonyms (4b), integration of encyclopaedic (4c) and pragmatic information (4d), extensional listings of members of the class denoted by the lexeme (4e), pictorial illustrations (cf. numerous examples, e.g., in Harris 1923), or any combinations thereof.

(4) a. jackass […] 1. male donkey […] (Thorndike 1941: 501)
    b. grumpy […] surly; ill-humoured; gruff. […] (Thorndike 1941: 413)
    c. scimitar […] A saber with a much-curved blade with the edge on the convex side, used chiefly by Mohammedans, esp. Arabs and Persians. […] (Webster's, Harris 1923: 1895)
    d. Majesty […] title used in speaking to or of a king, queen, emperor, empress, etc.; as, Your Majesty, His Majesty, Her Majesty. […] (Thorndike 1941: 562)
    e. cat […] 2. Any species of the family Felidae, of which the domestic cat is the type, including the lion, tiger, leopard, puma, and various species of tiger cats, and lynxes, also the cheetah. […] (Webster's, Harris 1923: 343)

This foreshadows some persistent problems of later approaches to lexical decomposition.
2.2. Semantic feature theories and the semantic structure of nouns

As we have seen, the concept of some kind of decomposition has been around ever since people began to systematically think about word meanings. Yet, it was not until the advent of Structural Semantics that lexical decompositions became part of more restrictive semantic theories. Structural Semantics emerged in the late 1920s as a reaction to the semantic mainstream, which, at the time, was oriented towards psychological explanations of idiolectal variation and the diachronic change of single word meanings. It conceived of lexical semantics as a discipline that revealed the synchronic structure of the lexicon from a non-psychological perspective. The main tenet was that the meaning of a word can only be captured in its relation to the meaning of other words.

Within Structural Semantics, lexical decompositions developed in the form of breaking down word meanings into semantic features (depending on the particular approach also called 'semantic components', 'semantic markers', or 'sememes'). An early analysis of this sort can be found in Hjelmslev's Prolegomena from 1943 (Hjelmslev 1963: 70), where he observed that systematic semantic relationships can be traced back to shared semantic components (cf. Tab. 7.1). He favored a strict decompositional approach in that (i) he explicitly conceived of decompositions like the ones in Tab. 7.1 as definitions of words and (ii) assumed that content-entities like 'ram', 'woman', 'boy' have to be eliminated from the inventory of content-entities if they can be defined by decompositions (Hjelmslev 1963: 72ff).

Tab. 7.1: Semantic components (after Hjelmslev 1963: 70)

                 'he'         'she'
'sheep'          'ram'        'ewe'
'human being'    'man'        'woman'
'child'          'boy'        'girl'
'horse'          'stallion'   'mare'
Following the Prague School’s feature-based approach to phonology, it was later assumed that semantic analyses should be based on a set of functional oppositions like [±human], [±male], etc. (cf. also article 16 (Bierwisch) Semantic features and primes). Semantic feature theories developed along two major lines. In Europe, structuralists like Pottier (1963, 1964), Coseriu (1964), and Greimas (1966) employed semantic features to reveal
the semantic structure of lexical fields. A typical example of a semantic feature analysis in the European structuralist tradition is Pottier's (1963) analysis of the lexical field of sitting furniture with legs (French siège), which consists of the lexemes chaise, fauteuil, tabouret, canapé, and pouf (cf. Tab. 7.2). Six binary features serve to define and structure the field: s1 = avec dossier 'with back', s2 = sur pied 'on feet', s3 = pour 1 personne 'for one person', s4 = pour s'asseoir 'for sitting', s5 = avec bras 'with armrest', and s6 = avec matériau rigide 'with rigid material'.

Tab. 7.2: Semantic feature analysis of the lexical field siège ('seat with legs') in French (Pottier 1963: 16)

            s1   s2   s3   s4   s5   s6
chaise      +    +    +    +    –    +
fauteuil    +    +    +    +    +    +
tabouret    –    +    +    +    –    +
canapé      +    +    –    +    +    +
pouf        –    +    +    +    –    –
In North America, Katz, Fodor, and others tried to develop a formal theory of the lexicon as a part of so-called Interpretive Semantics, which constituted the semantic component of the Standard Theory of Generative Grammar (Chomsky 1965). In this tradition, semantic features served, for example, as targets for selectional restrictions (Katz & Fodor 1963). The semantic description of lexical items consists of two types of features, 'semantic markers' and 'distinguishers', by which the meaning of a lexeme is decomposed exhaustively into its atomic concepts: "The semantic markers assigned to a lexical item in a dictionary entry are intended to reflect whatever systematic semantic relations hold between that item and the rest of the vocabulary of the language. On the other hand, the distinguishers assigned to a lexical item are intended to reflect what is idiosyncratic about its meaning." (Katz & Fodor 1963: 187). An example entry is given in Tab. 7.3.

Tab. 7.3: Readings of the English noun bachelor distinguished by semantic markers (in parentheses) and distinguishers (in square brackets) (Katz, after Fodor 1977: 65)

bachelor, [+N, …],
– (Human), (Male), [who has never married]
– (Human), (Male), [young knight serving under the standard of another knight]
– (Human), [who has the first or lowest academic degree]
– (Animal), (Male), [young fur seal when without a mate during the breeding time]
Besides this feature-based specification of the meaning of lexical items, Interpretive Semantics assumed recursive rules that operate over syntactic deep structures and build up meaning specifications for phrases and sentences out of lexical meaning specifications (Katz & Postal 1964).

As we have seen, semantic feature theories make it possible to tackle phenomena in the area of lexical fields and selectional restrictions (cf. also article 21 (Cann) Sense relations). They can also be used in formal accounts of lexical-semantic relations (see the sketch below). For example, expression A is incompatible with expression B iff A and B have different values for at least one of their semantic features: boy [+human, -adult, -female] is incompatible with woman [+human, +adult, +female]. Expression A is complementary to expression B iff A and B have different values for exactly one of their semantic features: for instance, girl [+human, -adult, +female] is complementary to boy. Expression A is hyperonymous to expression B iff the set of feature-value assignments for A is included in the set of feature-value assignments for B: thus, child [+human, -adult] is hyperonymous to boy.

In European structuralism, the status of semantic features was a matter of debate. They were usually conceived of as part of a descriptive, language-independent semantic metalanguage, but were also treated as cognitive entities. In Generative Grammar, Katz conceived of semantic features as derived from a universal conceptual structure: "Semantic markers must […] be thought of as theoretical constructs introduced into semantic theory to designate language invariant but language linked components of a conceptual system that is part of the cognitive structure of the human mind." (Katz 1967: 129). In a similar vein, Bierwisch (1970: 183) assumed that the basic semantic components "are not learned in any reasonable sense of the term, but are rather an innate predisposition for language acquisition." Thus, language-specific semantic structures come about by the particular combination of semantic features that yield a lexical item.
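The three relations defined above translate directly into operations on feature bundles. The sketch below encodes the entries used in the text as maps from feature names to +/- values; the encoding itself is an expository assumption:

```haskell
import qualified Data.Map as M

type Features = M.Map String Bool

boy, girl, woman, child :: Features
boy   = M.fromList [("human", True), ("adult", False), ("female", False)]
girl  = M.fromList [("human", True), ("adult", False), ("female", True)]
woman = M.fromList [("human", True), ("adult", True),  ("female", True)]
child = M.fromList [("human", True), ("adult", False)]

-- number of shared features on which the values differ
conflicts :: Features -> Features -> Int
conflicts a b = M.size (M.filter id (M.intersectionWith (/=) a b))

incompatible, complementary, hyperonymOf :: Features -> Features -> Bool
incompatible  a b = conflicts a b >= 1    -- differ on at least one feature
complementary a b = conflicts a b == 1    -- differ on exactly one feature
hyperonymOf   a b = a `M.isSubmapOf` b    -- A's feature-value pairs included in B's

-- incompatible boy woman == True
-- complementary girl boy == True
-- hyperonymOf child boy  == True
```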
2.3. Some inadequacies of semantic feature analyses

Semantic feature theories considerably stimulated research in lexical semantics. Beyond that, semantic features had found their way into contemporary generative syntax as a target for selectional restrictions (cf. Chomsky 1965). Yet, a number of empirical weaknesses quickly became evident.

(i) Relational lexemes: The typical cases of semantic feature analyses seem to presuppose that all predicates are one-place predicates. Associating woman with the feature bundle [+human, +adult, +female] means that the referent of the sole argument of woman(x) has the three properties of being human, adult, and female. Simple feature bundles cannot account for relational predicates like mother(x,y) or devour(x,y), because the argument to which a semantic feature attaches has to be specified. With mother(x,y), the feature [+female] applies to the first, but not to the second argument.

(ii) Structure of verb decompositions: Semantic feature analyses usually represent word meanings as unordered sets of features. However, in particular with verbal predicates, the decomposition cannot be adequately formulated as a flat structure (cf. section 2.4).

(iii) Undecomposable words: It has been criticized that cohyponyms in larger taxonomies, such as lion, tiger, puma, etc. as cohyponyms of cat (cf. 4e) or rose, tulip, daffodil, carnation, etc. as cohyponyms of flower, cannot be differentiated by semantic features in a non-trivial way. If one of the semantic features of rose is [+flower], then what is its distinguishing feature? This feature should abstract from a rose being a flower since [+flower] has already been added to the feature list. Moreover, it should not be unique to the entry of rose or else the number of features
threatens to surpass the number of lexical entries. In other words, there does not seem to be any plausible candidate for P that would make ∀x[rose(x) ↔ (P(x) & flower(x))] a valid definition (cf. Fodor 1977: 150; Roelofs 1997: 46ff for arguments of this sort). Besides cohyponymy in large taxonomies, there are other lexical relations as well that cannot be adequately captured by binary feature descriptions, for instance the scalar nature of antonymy and lexical rows like hot > warm > tepid > cool > cold. In general, it is simply unclear for many lexical items what features might be used to distinguish them from near-synonyms (cf. grumpy in (4b)).

(iv) Exhaustiveness: For most lexical items, it seems to be impossible to give an exhaustive lexical analysis, that is, one that provides the features that are necessary and sufficient to distinguish the item from all other lexical items of the language without introducing features that are used solely for the description of this one particular item. Katz's distinction between markers and distinguishers does not solve the problem. Apart from the fact that the difference between 'markers' as targets for selectional restrictions and 'distinguishers' as lexeme-specific idiosyncratic features is not supported by the data (cf. Bierwisch 1969: 177ff; Fodor 1977: 144ff), this concession to semantic diversity weakens the explanatory value of semantic feature theories considerably since no restrictions are provided for what can occur as a distinguisher.

(v) Finiteness: Only rarely have large inventories of features been assembled (e.g., by Lorenz & Wotjak 1977). Moreover, semantic feature theory has not succeeded in developing operational procedures by which semantic features can be discovered. Thus, it has not become evident that there is a finite set of semantic features that allows a description of the entire vocabulary of a language, in particular that this set is smaller than the set of lexical items.

(vi) Universality: Another point of criticism has been that the alleged universality of a set of features has not been convincingly demonstrated. As Lyons (1968: 473) stated, cross-linguistic comparisons of semantic structures rather point to the contrary. However, the search for a universal semantic metalanguage has continued and has become a major topic in particular among the proponents of Natural Semantic Metalanguage (cf. article 17 (Engelberg) Frameworks of decomposition, section 8).

(vii) Theoretical status: The often unclear theoretical status of semantic features has drawn some criticism as well. Among other things, it has been argued that in order to express that a mare is a female horse it is not necessary to enrich the metalanguage by numerous features. The relation can equally well be expressed on the level of the object language by assuming a meaning postulate in the form of a biconditional: ◻∀x[mare(x) ↔ horse(x) & female(x)].
2.4. Lexical decomposition and the semantic structure of verbs

The rather complex semantic structure of many verbs could not be adequately captured by semantic feature approaches for two reasons: They focused on one-place lexemes, and they expressed lexical meaning in flat structures, that is, by simply conjoining semantic features. Instead, hierarchical structures were needed. Katz (1971) tackled this problem in the form of decompositions that also included aspectually relevant features such as 'activity' (cf. Tab. 7.4).
Tab. 7.4: Decomposition of chase (after Katz 1971: 304)

chase → Verb, Verb transitive, …; (((Activity) (Nature: (Physical)) of X), ((Movement) (Rate: Fast)) (Character: Following)), (Intention of X: (Trying to catch ((Y) ((Movement) (Rate: Fast))))); (SR).
While this form of decomposition never caught on, other early verb decompositions (cf. Bendix 1966, Fillmore 1968, Bierwisch 1970) look more similar to the semantic representations still employed in many theories of verb semantics:

(5) a. give(x,y,z): x cause (y have z) (after Bendix 1966: 69)
    b. persuade(x,y,z): x cause (y believe z) (after Fillmore 1968: 377)
It was the rise of Generative Semantics in the late 1960s that caused a shift in interest from decompositional structures of nouns to lexical decompositions of verbs. The history of lexical decompositions of verbs that emerged from these early approaches is reviewed in article 17 (Engelberg) Frameworks of decomposition. Starting from early generative approaches, verb decompositions have been employed in theories as different as Conceptual Semantics (cf. also article 30 (Jackendoff) Conceptual Semantics), Natural Semantic Metalanguage, and Distributed Morphology (cf. also article 81 (Harley) Semantics in Distributed Morphology).
3. Theoretical aspects of decomposition

3.1. Decompositionalism versus atomism

Directly opposed to decompositional approaches to lexical meaning stands a theory of lexical meaning that is known as lexical atomism or holism and whose main proponent is Jerry A. Fodor (1970, 1998, Fodor et al. 1980). According to a decompositional concept of word meaning, knowing the meaning of a word involves knowing its decomposition, that is, the linguistic or conceptual entities and relations it consists of. Fodor's atomistic conception of word meaning rejects this view and assumes instead that there is a direct correspondence between a word and the mental particular it stands for. A lexical meaning does not have constituents, and – in a strict formulation of atomism – knowing it does not involve knowing the meaning of other lexical units.

Fodor (1998: 45) observes in favor of his atomistic, anti-definitional approach that there are practically no words whose definition is generally agreed upon – an argument that cannot be easily dismissed. Atomists are also skeptical about the claim that decompositions/definitions are simpler than the words they are attached to: "Does anybody present really think that thinking BACHELOR is harder than thinking UNMARRIED? Or that thinking FATHER is harder than thinking PARENT?" (Fodor 1998: 46). Discussing Jackendoff's (1992) decompositional representation of keep, Fodor (1998: 55) comments on the relation that is expressed in examples as different as someone kept the money and someone kept the crowd happy: "I would have thought, saying what relation they both instance is precisely what the word 'keep' is for; why on earth do you suppose that you can say it 'in other words'?" And he adds: "I can't think of a better way to say what 'keep' means than to say that it means KEEP. If, as I suppose, the concept KEEP is
an atom, it's hardly surprising that there's no better way to say what 'keep' means than to say that it means KEEP." More detailed arguments for and against atomistic positions will appear throughout this article.

The controversy between decompositionalists and atomists is often connected to the question whether decompositions or meaning postulates should be employed to characterize lexical meaning. Meaning postulates are used to express analytic knowledge concerning particular semantic expressions (Carnap 1952: 67). Lexical meaning postulates are necessarily true. They consist of entailments where the antecedent is an open lexical proposition (6a).

(6) a. ◻∀x[bachelor(x) → man(x)]
       ◻∀x[bachelor(x) → ¬married(x)]
    b. ◻∀x[bachelor(x) ↔ (man(x) & ¬married(x))]
    c. bachelor: λx[man(x) & unmarried(x)]
Meaning postulates can also express bidirectional entailments, as in (6b), where the biconditional expresses a definition-like equivalence between a word and its decomposition. I will assume that, in the typical case, on a purely lexical level of meaning description, decompositional approaches like (6c) conceive of word meanings as bidirectional entailments as in (6b), while atomistic approaches involve monodirectional entailments as in (6a) (cf. similarly Chierchia & McConnell-Ginet 1990: 360ff). Thus, meaning postulates do not per se characterize atomistic approaches to meaning; it is rather the kind of meaning postulate that serves to distinguish the two basic stances on word meaning. Informally, one might say that bidirectional meaning postulates provide definitions, monodirectional ones single aspects of word meaning in the form of relations to other semantic elements.

Three caveats are in order here: (i) Semantic reconstruction on the basis of meaning postulates is not uniformly accepted either in the decompositional or in the atomistic camp. Some proponents of decompositional approaches do not adhere to a definitional view of decompositions; they claim that their decompositions do not cover the whole meaning of the lexical item (cf. section 3.4). At the same time, some radical approaches to atomism reject lexical meaning postulates completely (Fodor 1998). (ii) Decompositions and bidirectional meaning postulates are only equivalent on the level of meaning explanation (cf. Chierchia & McConnell-Ginet 1990: 362). They differ, however, in that in decompositional approaches the word defined (bachelor in (6c)) is not accessible within the semantic derivation while the elements of the decomposition (man(x) & unmarried(x)) are. This can have an effect, for example, on the explanation of scope phenomena. (iii) Furthermore, decompositions as in (6c) and bidirectional meaning postulates as in (6b) can give rise to different predictions with respect to language processing (cf. section 3.6).
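The difference between (6a) and (6b) can be illustrated with a toy model; the domain and extensions below are invented, and the point is only that the biconditional, unlike the pair of one-way postulates, fixes the extension of bachelor:

```haskell
-- Toy model contrasting the one-way postulates (6a) with the
-- biconditional (6b).
type E = String

domain :: [E]
domain = ["al", "bo", "cy"]

man, married, bachelor :: E -> Bool
man      x = x `elem` ["al", "bo", "cy"]
married  x = x `elem` ["cy"]
bachelor x = x `elem` ["al"]   -- note: "bo" is left out on purpose

-- (6a): monodirectional postulates, mere constraints on the model
monoOK :: Bool
monoOK = all (\x -> not (bachelor x) || (man x && not (married x))) domain

-- (6b): the bidirectional postulate, a definition-like equivalence
biOK :: Bool
biOK = all (\x -> bachelor x == (man x && not (married x))) domain

-- monoOK == True : the model respects both one-way entailments;
-- biOK   == False: "bo" is an unmarried man outside bachelor's extension,
-- which only the biconditional rules out.
```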
3.2. Decompositions and the lexicon

One of the most interesting differences in the way verb decompositions are used in different language theories concerns their location within the theory. Some approaches locate decompositions and the principles and rules that build them up in syntax (e.g.,
Generative Semantics, Distributed Morphology), some in semantics (e.g., Dowty's Montague-based theory, Lexical Decomposition Grammar, Natural Semantic Metalanguage), and others in conceptual structure (e.g., Conceptual Semantics) (cf. article 17 (Engelberg) Frameworks of decomposition). Decompositions are sometimes constrained by interface conditions as well. These interface relations between linguistic levels of representation are specified to a different degree in different theories. Lexical Decomposition Grammar has put some effort into establishing interface conditions between syntactic, semantic, and conceptual structure. In syntactic approaches to decomposition (Lexical Relational Structures, Distributed Morphology), however, the relation between syntactic decompositions and semantic representations often remains obscure – one of the few exceptions being von Stechow's (1995) analysis of the scope properties of German wieder 'again' in syntactic decompositions (cf. article 17 (Engelberg) Frameworks of decomposition).

The way decompositions are conceived has an impact on the structure of the lexicon and its role within the architecture of the language theory pursued. While some approaches advocate rather rich meaning representations (e.g., Natural Semantic Metalanguage, Conceptual Semantics), others downplay semantic representation and reduce it to a rather unstructured domain of encyclopaedic knowledge (Distributed Morphology) (cf. the overview in Ramchand 2008). Meaning representation itself can occur on more than one level. Sometimes the distinction is between semantics proper and some sort of conceptual representation (e.g., Lexical Decomposition Grammar); sometimes different levels of conceptual representation are distinguished such as Jackendoff's (2002) Conceptual Structure and Spatial Structure. Theories also differ in how much the lexicon is structured by rules and principles. While syntactic approaches often conceive of the lexicon as a mere inventory, other approaches (e.g. Levin & Rappaport Hovav's Lexical Conceptual Structures, Wunderlich's Lexical Decomposition Grammar, Pustejovsky's Event Structures) assume different kinds of linking principles, interface conditions and structure rules for decompositions (for references, cf. article 17 (Engelberg) Frameworks of decomposition).
3.3. Decompositions and primitives

Decompositional approaches to lexical meaning usually claim that all lexical items can be completely reduced to their component parts; that is, they can be defined. This requires certain conditions on decompositions in order to avoid infinite regress: (i) The predicates used in the decompositions are either semantic primitives or can be reduced to semantic primitives by definitions. It is necessary that the primitives are not reduced to other elements within the vocabulary, but are grounded elsewhere. (ii) Another condition is that the set of primitives be notably smaller than the lexicon (cf. Fodor et al. 1980: 268). (iii) Apart from their primitivity, it is often required that predicates within decompositions be general, that is, distinctive for a large number of lexemes, and universal, that is, relevant to the description of lexemes in all or most languages (cf. Löbner 2002: 132ff) (cf. also the discussion in article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure).

The status of these primitives has been a constant topic within decompositional semantics. Depending on the particular theories, the vocabulary of semantic primitives is located on different levels of linguistic structure. Theories differ as to whether these predicates are elements of the object language (e.g., Natural Semantic Metalanguage),
of a semantic metalanguage (e.g., Montague Semantics) or of some set of conceptual entities (e.g., Lexical Conceptual Semantics). A finite set of primitives is rarely given, the notable exception being Natural Semantic Metalanguage. Most theories obviously assume a core of the decompositional vocabulary, including such items as cause, become, do, etc., but they also include many other predicates like alive, believe, in, mouth, write. Since they are typically not concerned with all subtleties of meaning, most theories often do not bother about the status of these elements. They might be conceived of as definable or not. While in Dowty's (1979) approach cause gets a counterfactual interpretation in the vein of Lewis (1973), Lakoff (1972: 615f) treats cause as a primitive universal and, similarly, in Natural Semantic Metalanguage because is taken as a primitive.

However, no matter how many primitives a theory assumes, something has to be said about how these elements can be grounded. Among the possible answers are the following: (i) Semantic primitives are innate (or acquired before language acquisition) (e.g., Bierwisch 1970). To my knowledge, no evidence from psycholinguistics or neurolinguistics has been obtained for this claim. (ii) Semantic primitives can be reduced to perceptual features. Considering the abstract nature of some semantic features, a complete reduction to perception seems unlikely (cf. Jackendoff 2002: 339). (iii) Semantic primitives are conceptually grounded (e.g., Natural Semantic Metalanguage, Conceptual Semantics). This is often claimed but rarely pursued empirically (but cf. Jackendoff 1983, Engelberg 2006).
3.4. Decompositions and definitions

If lexical decompositions are semantically identified with biconditional meaning postulates, they can be regarded as definitions: They provide necessary and sufficient conditions. It has been questioned whether word meaning can be captured this way. The non-equivalence of kill and cause to die had been a major argument against Generative Semantics (cf. article 17 (Engelberg) Frameworks of decomposition, section 2). It has been observed that the definitional approach simply fails on a word-by-word basis: "There are practically no defensible examples of definitions; for all the examples we've got, practically all words (/concepts) are undefinable. And, of course, if a word (/concept) doesn't have a definition, then its definition can't be its meaning." (Fodor 1998: 45) Given what was said about lexicography in section 2.1, we must concede that even on a less strict view of semantic equivalence, the definitional approach can only be applied to a subgroup of lexical items. Atomistic and prototype-based approaches to lexical meaning thrive on these observations.

Decompositionalist approaches to word meaning have reacted differently to these problems. Some have just denied them: Natural Semantic Metalanguage claims that based on a limited set of semantic primitives and their combinatorial potential a complete decomposition is possible. Other approaches, in particular Conceptual Semantics, point to the particular descriptive level of their decompositions. They claim that CAUSE does not have the same meaning as cause, the former being a conceptual entity. This allows Jackendoff (2002: 335f) to state that decompositions are not definitions since no equation between a word and a synonymous phrasal expression is attempted. However, the meanings of the decompositional predicates employed are then all the more in need of explanation, since our intuitions about the meaning of natural language lexemes no longer account for them. As Pulman (2005) puts it in a discussion of Jackendoff's approach: "[…] if your intuition is that part of the meaning
7. Lexical decomposition: Foundational issues of ‘drink’ is that liquid should enter a mouth, then unless there is some explicit connection between the construct mouth and the meaning of the English word ‘mouth’, that intuition is not accounted for.” Finally, some researchers assume that decompositions only capture the core meaning of a word (e.g., Kornfilt & Correra 1993: 83). Thus, decompositions do not exhaust a word’s meaning, and they are not definitions. This, of course, poses the questions what they are decompositions of and by what semantic criteria core aspects of lexical meaning can be identified. In any case, a conception of decompositions as incomplete meaning descriptions weakens the approach considerably. “It is, after all, not in dispute that some aspects of lexical meanings can be represented in quite an exiguous vocabulary; some aspects of anything can be represented in quite an exiguous vocabulary,” Fodor (1998: 48) remarks and adds: “It is supposed to be the main virtue of definitions that, in all sorts of cases, they reduce problems about the defined concepts to corresponding problems about its primitive parts. But that won’t happen unless each definition has the very same content as the concept it defines.” (Fodor 1998: 49) In some approaches the partial specification of meaning in semantic decompositions results from a distinction between semantic and conceptual representation (e.g., Bierwisch 1997). Semantic decompositions are underspecified and are supplemented on the utterance level with information from a conceptual representation. The completeness question is sometimes tied to the attempt to distinguish those aspects of meaning that are grammatically relevant from those that are not. This is done in different ways. Some assume that decompositions are incomplete and represent only what is grammatically relevant. Others differentiate between levels of representation; Lexical Decomposition Grammar distinguishes Semantic Form, which includes the grammatically relevant information, from Conceptual Structure. Finally, some assume one level of representation in which only particular parts are grammatically relevant; in Rappaport Hovav & Levin (1998), general decompositional templates contain the grammatically relevant information whereas idiosyncratic aspects are reflected by lexeme-specific constants that are inserted into these templates. However, the distinction between grammatically relevant and irrelevant properties within a semantic representation raises a serious theoretical question. It is a truism that not all subtleties of lexical meaning show grammatical effects. With to eat, the implied aspect of intentional agentivity is grammatically relevant in determining which argument becomes the subject while the implied aspect of biological food processing is not. However, distinguishing the grammatically relevant from the irrelevant by assigning them different locations in representations is not more than a descriptive convention unless one is able to show that grammatically relevant meaning is a particular type of meaning that can be distinguished on semantic grounds from grammatically irrelevant meaning. But do all the semantic properties that have grammatical effects (intentional agentivity and the like) form one natural semantic class and those that do not (biological food processing and the like) another? As it stands, it seems doubtful that such classes will emerge. As Jackendoff (2002: 290) notes, features with grammatical effects form a heterogeneous set. 
They include well-known distinctions such as those between agent and experiencer or causation and non-causation, but also many idiosyncratic properties, such as the distinction between emission verbs where the sound can be associated with an action of moving, which therefore allow a motion construction (the car squealed around the corner), and those where this is not the case (*the car honked around the corner) (cf. Levin & Rappaport Hovav 1996).
3.5. Decompositions and their interpretation

It is evident that in order to give empirical content to theoretical claims on the basis of lexical semantic representations, the meaning of the entities and configurations of entities in these semantic representations must be clear. This is a major problem not only for decompositional approaches to word meaning. To give an example from Distributed Morphology (DM), Harley & Noyer (2000: 368) notice that cheese is a mass noun and sentences like I had three cheeses for breakfast are unacceptable. The way DM is set up requires deriving the mass noun restriction from encyclopaedic knowledge (cf. article 17 (Engelberg) Frameworks of decomposition, section 10). Thus, it is listed in the encyclopaedia that "cheese does not typically come in discrete countable chunks or types". However, to provide a claim like that with any empirical content, we need a methodology for determining what the encyclopaedic knowledge for cheese looks like.

As we will see, lexical-semantic representations often exhibit a conflict when it comes to determining what a word actually means. If we use its syntactic behaviour as a guideline for its meaning, the semantic representation is not independently motivated but circularly determined by the very structures it purports to determine. If we use our naive everyday conception of what a word means, we lack an objective methodology of determining lexical meaning. Moreover, the two paths lead to different results. If we take a naive, syntax-independent look at the encyclopaedic semantics of cheese, a visit to the nearest supermarket will tell us that – contrary to what Harley & Noyer claim – all cheese comes in chunks or slices, as does all sausage and some of the fruit. Thus, it seems that the encyclopaedic knowledge in DM is not arrived at by naively watching the world around you. If it were, the whole architecture of Distributed Morphology would probably break down since, as corpus evidence shows, the German words for 'cheese' Käse, 'sausage' Wurst, and 'fruit' Obst are different with respect to the count/mass distinction although their supermarket appearance with respect to chunks and slices is very similar: Wurst can be freely used as a count and a mass noun; Käse is a mass noun that can be used as a count noun, especially, but not only, when referring to types of cheese; and Obst is obligatorily a mass noun. Thus, the encyclopaedic knowledge about cheese in Distributed Morphology seems to be forced by the fact that cheese is grammatically a mass noun. This two-way dependency between grammar and encyclopaedia immunizes DM against falsification and, thus, renders a central claim of DM empirically void.

The neglect of semantic methodology and theory particularly in those approaches that advocate a semantically constrained but otherwise free syntactic generation of argument structures has of course been pointed out before: "Although lip service is often paid to the idea that a verb's meaning must be compatible with syntactically determined meaning […], it is the free projection of arguments that is stressed and put to work, while the explication of compatibility is taken to be trivial." (Rappaport Hovav & Levin 2005: 275).
It has to be emphasized that it is one thing to observe differences in the syntactic behaviour of words in order to form hypotheses about hitherto unnoticed semantic differences between them, but a completely different matter to justify the existence of a particular semantic property of a word on the basis of a syntactic construction it occurs in and then use this property to predict its occurrence in this very construction. The former is a useful heuristic method for tracing semantic properties that would then need to be justified independently; the latter is a circular construction of explanations that can rob a theory of most of its empirical value (cf. Engelberg 2006). This becomes particularly
obvious in approaches that distinguish grammatically relevant from grammatically irrelevant meaning. For example, Grimshaw (2005: 75f) claims that "some meaning components have a grammatical life" ('semantic structure') while "some are linguistically inert" ('semantic content'). Only the former have to be linguistically represented. She then discusses Jackendoff's (1990: 253) representation of eat as a causative verb, which according to Jackendoff means that x causes y to go into x's mouth. While she agrees that Jackendoff's representation captures what eat "pretheoretically" means, she does not consider causation to be part of the representation of eat, since eat differs from other causatives (e.g., melt) in lacking an inchoative variant (Grimshaw 2005: 85f). Thus, we are confronted with a concept of grammatically relevant causation and a concept of grammatically irrelevant causation, which apart from their grammatical relevance seem to be identical. The result is a theory that purports to explain syntactic phenomena on the basis of lexical meaning but actually just maps syntactic distinctions onto distinctions on a putatively semantic level, which, however, is not semantically motivated. Further problems arise if decompositions are adapted to syntactic structures: There is consensus among semanticists and philosophers that causation is a binary relation with both relata belonging to the same type. Depending on the theory, the relation holds either between events or between proposition-like entities. In decompositional approaches, however, the causing argument is often represented as an individual argument, cause(x,p), since causative verbs purportedly only allow agent-denoting NPs in subject position. When this conflict is discussed, it is usually suggested that the first argument of cause be reinterpreted as 'x does something' or 'that x does something'. Although such a reinterpretation can be formally implemented, it raises the question of why decomposition is done at all. One could just as well stay with simple predicate-argument structures like dry(x,y) and reinterpret them as 'something that x does causes y to become dry'. Furthermore, the decision to represent the first argument of cause as an individual argument is not motivated by the meaning of the lexical item but, in a circular way, by the same syntactic structure that it claims to explain. Finally, the assumption that all causative verbs require an agentive NP in subject position is wrong. While it holds for verbs like German trocknen 'to dry', it does not hold for verbs like vergrößern 'enlarge' that allow sentential subjects. In any case, the asymmetrical representation of causation raises the problem that two cause predicates need to be introduced where one is reinterpreted in terms of the other. The problem is even bigger in approaches like Distributed Morphology that have done away with a mediating lexicon. If we assume that the bi-propositional (or bi-eventive) nature of causation is part of the encyclopaedic knowledge of vocabulary items that express causation, then causative verbs that only allow agentive subjects should be excluded from transitive verb frames completely.
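The type conflict and the cost of resolving it can be made explicit in a minimal sketch (the formulas are illustrative only; the labels do, become, and the primed cause' are ad hoc and not quoted from any of the theories discussed here):

\[ \mathrm{cause}(e_1, e_2) \qquad \text{(causation as a relation between entities of the same type)} \]
\[ \mathit{dry}_{\mathrm{tr}}(x, y) \;=\; \mathrm{cause}(x, \mathrm{become}(\mathrm{dry}(y))) \qquad \text{(the asymmetric variant with an individual first argument)} \]
\[ \mathrm{cause}(x, p) \;\Rightarrow\; \mathrm{cause}'(\mathrm{do}(x, \mathit{something}), p) \qquad \text{(the reinterpretation)} \]

The last line makes the problem noted above concrete: a second, properly binary predicate cause' has to be introduced, and the asymmetric cause is then explained in terms of it.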
Even if the predicates used in decompositions are characterized in semantic terms, the criteria are not always sufficiently precise to decide whether or not a verb has the property expressed. Levin & Rappaport Hovav (1996) observe that there are counterexamples to the widespread assumption that telic intransitive verbs are unaccusative and agentive intransitive verbs are unergative, namely verbs of sound, which are unergative but neither necessarily agentive nor telic (beep, buzz, creak, gurgle). Therefore they propose that verbs that refer to "internally caused eventualities" are unergative, which is the case if "[…] some property of the entity denoted by the argument of the verb is responsible for the eventuality" (Levin & Rappaport Hovav 1996: 501). If we want to apply this idea to the unaccusative German zerbrechen 'break' and the unergative
knacken 'creak', which they do not discuss, we have to check whether it is true that some property of the twig is responsible for the creaking in der Zweig hat geknackt 'the twig creaked' while there is no property of the twig that is responsible for the breaking in der Zweig ist zerbrochen 'the twig broke'. In order to do that, we must know what 'internal causation' is; that is, we have to answer questions like: What is 'causation'? What is 'responsibility'? What is an 'eventuality'? Is 'responsibility', contrary to all assumptions of theories of action, a predicate that applies to properties of twigs? What property of twigs are we talking about? Is (internal) 'causation', contrary to all theories of causation, a relation between properties and eventualities? As long as these questions are not answered, proponents of the theory will agree that the creaking of the twig but not the breaking is internally caused, while opponents will deny it. And there is no way to resolve this dispute (cf. Engelberg 2001). Similar circularities have been noticed by others. Fodor (1998: 51, 60ff), citing work by Pinker (1989) and Higginbotham (1994), argues that claims about linking which are based on interminably vague semantic properties elude any kind of evaluation. This is all the worse since the predicates involved in decompositions are not only expressions in the linguist's metalanguage but concepts that are attributed to the speaker and his or her knowledge about language (Fodor 1998: 59). Stipulations about the structure of decompositions can diminish the empirical value of the theory, too. Structural aspects of decompositions concern the way embedded predicates are combined as well as the argument structure of these predicates. For example, predicates like cause or poss are binary because we conceive of causation and possession as binary relations. Their binarity is a structural aspect that is deeply rooted in our understanding of these concepts. However, it is just by convention that most approaches represent the causing entity and the possessor as the first argument of cause and poss, respectively. Which argument of a multi-place predicate stands for which entity is determined by the truth conditions for this predicate. Thus, the difference between poss(x_possessor, y_entity-possessed) and poss(x_entity-possessed, y_possessor) is just a notational one. Whenever explanations rely on this difference, they are not grounded in semantics but in notational conventions. This is, for example, the case in Lexical Decomposition Grammar, where the first argument of poss falls out as the higher one – with all its consequences for Theta Structure and linking principles (Wunderlich 1997: 39). In summary, the problems with interpreting the predicates used in decompositions and with structural stipulations severely limit the empirical content of predictions based on these decompositions. Two ways out of this situation have been pursued only rarely: Decompositional predicates can be given precise truth conditions, as is done in Dowty (1979), or they can be linked to cognitive concepts that are independently motivated (e.g., Jackendoff 1983, Engelberg 2006).
3.6. Decompositions and cognition
Starting from the 1970s, psycholinguistic evidence has been used to argue for or against lexical decompositions. While some approaches to decomposition accepted psycholinguistic data as relevant evidence, proponents of particular lexical theories denied that their theories are about lexical processing at all. Dowty (1979: 391) emphasized that what the main decompositional operators determine is not what the speaker/listener
must compute but what he or she can infer. Similarly, Goddard (1998: 135) states that "there is no claim that people, in the normal course of linguistic thinking, compose their thoughts directly in terms of semantic primitives; or, conversely, that normal processes of comprehension involve real-time decomposition down to the level of semantic primitives." It has also been suggested that in theorizing about decompositional versus atomistic theories one should distinguish whether lexical concepts are definitionally primitive, computationally primitive (pertaining to language processing), and/or developmentally primitive (pertaining to language acquisition) (cf. Fodor et al. 1980: 313; Carey 1982: 350f). When lexical decompositions are interpreted in psycholinguistic terms, the typical assumption is that the components of a decomposition are processed each time the lexical item is processed. Lexical processing effort should then be a function of the complexity of the lexical decomposition: the more complex the decomposition, the longer the processing time. Most early psycholinguistic studies did not produce evidence for lexical decomposition (cf. Fodor et al. 1980; Johnson-Laird 1983). Employing a forced choice task and a rating test, Fodor et al. (1980) failed to find processing differences between causative verbs like kill, which are putatively decompositionally complex, and non-causative verbs like bite. Fodor, Fodor & Garrett (1975: 522) reported a significant difference between explicit negatives (e.g., not married) and putatively implicit negatives (e.g., an unmarried-feature in bachelor) in complex conditional sentences like (7).
(7)
a. If practically all men in the room are not married, then few of the men in the room have wives.
b. If practically all men in the room are bachelors, then few of the men in the room have wives.
Sentences like (7a) that contain explicit negatives gave rise to longer processing times, thus suggesting that bachelor does not contain hidden negatives. Measuring fixation time during reading, Rayner & Duffy (1986) did not find any differences between putatively complex words like causatives and non-causatives. Similarly, Roelofs (1997: 48ff) discussed several models for word retrieval and argued for a non-decompositional spreading-activation model. More recent studies display a more varied picture. Gennari & Poeppel (2003) compared eventive verbs like build, distort, show, which denote causally structured events, with stative verbs like resemble, lack, love, which do not involve complex cause/become structures. Controlling for differences in argument structure and thematic roles, they carried out a self-paced reading study and a visual lexical decision task. They found that semantic complexity was reflected in processing time and that elements of decomposition-like structures were activated during processing. McKoon & MacFarland (2002) adopted Rappaport Hovav & Levin's (1998) template-based approach to decomposition and their distinction between verbs denoting internal causation (bloom) and external causation (break) (Levin & Rappaport Hovav 1995, 1996). They reported longer processing times for break-type verbs than for bloom-type verbs in grammaticality judgments, reading time experiments, and lexical decision tasks. They interpreted the results as confirmation that break-type verbs involve more complex decompositions than bloom-type verbs.
Different conclusions were drawn from other experiments. Applying a "release from proactive interference" technique, Mobayyen & de Almeida (2005) investigated the processing times for lexical causatives (bend, crack, grow), morphological causatives (thicken, darken, fertilize), perception verbs (see, hear, smell), and repetitive perception verbs with morphological markers (e.g., re-smell). If verbs were represented in the form of decompositions, the semantically more complex lexical and morphological causatives should pattern together and evoke longer processing times than perception verbs. However, this did not turn out to be the case. Morphological causatives and the morphologically complex re-verbs required longer processing than lexical causatives and perception verbs. Similar results have been obtained in action-naming tasks carried out with Alzheimer patients (cf. de Almeida 2007). That led Mobayyen and de Almeida to the conclusion that the latter two verb types are both semantically simple and refer to non-complex mental particulars. Another line of psycholinguistic/neurolinguistic research concerns speakers with category-specific semantic deficits due to brain damage. Data obtained from these speakers have been used to argue for semantic feature approaches as well as for approaches employing meaning postulates in the nominal domain (cf. the discussion in de Almeida 1999). However, de Almeida (2001: 483) emphasizes that so far no one has found evidence for category-specific verb concept deficits, for example, deficits concerning features like cause or go. Evidence for decomposition theories has also been sought in data from language acquisition. If the meaning of a word is its decomposition, then learning a word means learning its decomposition. The most explicit early theory of decomposition-based learning is Clark's (1973) semantic-feature based theory of word learning. In her view, only some of the features that make up a lexical representation are present when a word is first acquired, whereas the remaining features are learned while the word is already in use. The assumption that these features are acquired only successively predicts that children overgeneralize heavily when acquiring a new word. In subsequent research, it turned out that Clark's theory did not conform to the data: (i) overgeneralization does not occur as often as predicted; (ii) with recently acquired words, undergeneralization is more typical than overgeneralization; and (iii) at some stages of acquisition, the referents a word is applied to do not have any features in common (Barrett 1995: 375ff, cf. also the review in Carey 1982: 361ff). It has been repeatedly argued that meaning postulates are better suited to explain acquisition processes (cf. Chierchia & McConnell-Ginet 1990: 363f, Bartsch & Vennemann 1972: 22). However, even if Clark's theory of successive feature acquisition is not tenable, related data are cited in favor of decompositions to show that some kind of access to semantic features is involved in acquisition. For example, it has been argued that a meaning component cause is extracted from verbs by children and used in overgeneralizations like he falled it (cf. the overview in Clark 2003: 233ff).
Some research on the acquisition of argument structure alternations has been used to argue for particular decompositional approaches, such as Pinker (1989) for Lexical Conceptual Structures in the vein of Levin & Rappaport Hovav, and Brinkmann (1997) for Lexical Decomposition Grammar. In summary, the question whether decompositions are involved in language processing or language acquisition remains open. Although processing differences between different classes of verbs have to be acknowledged, it is often difficult to determine which forms of lexical representation are compatible with these data.
4. Conclusion
From a heuristic and descriptive point of view, lexical decomposition has proven to be a very successful device that has made it possible to discover and to tackle numerous lexical phenomena, in particular at the syntax-semantics interface. Yet, from a theoretical point of view, lexical decompositions have remained a problematic concept that is not always well grounded in theories of semantics and cognition:
– The basic predicates within decompositions are often elusive and lack truth conditions, definitions, or an empirically grounded link to basic cognitive concepts.
– The lack of semantic grounding of decompositions often leads to circular argumentation in linking theories.
– The cognitive status of decompositions is by and large unclear; it is not known whether and how decompositions are involved in lexical processing and language acquisition.
Thus, decompositions still raise many questions: "But even if the ultimate answers are not in sight, there is certainly a sense of progress since the primitive approaches of the 1960s." (Jackendoff 2002: 377)
5. References
de Almeida, Roberto G. 1999. What do category-specific semantic deficits tell us about the representation of lexical concepts? Brain & Language 68, 241–248.
de Almeida, Roberto G. 2001. Conceptual deficits without features: A view from atomism. Behavioral and Brain Sciences 24, 482–483.
de Almeida, Roberto G. 2007. Cognitive science as paradigm of interdisciplinarity: the case of lexical concepts. In: J. L. Audy & M. Morosini (eds.). Interdisciplinarity in Science and at the University. Porto Alegre: EdiPUCRS, 221–276.
Barrett, Martyn 1995. Early lexical development. In: P. Fletcher & B. MacWhinney (eds.). The Handbook of Child Language. Oxford: Blackwell, 362–392.
Bartsch, Renate & Theo Vennemann 1972. Semantic Structures. A Study in the Relation between Semantics and Syntax. Frankfurt/M.: Athenäum.
Bendix, Edward H. 1966. Componential Analysis of General Vocabulary. The Semantic Structure of a Set of Verbs in English, Hindi, and Japanese. Bloomington, IN: Indiana University.
Bierwisch, Manfred 1969. Certain problems of semantic representations. Foundations of Language 5, 153–184.
Bierwisch, Manfred 1970. Semantics. In: J. Lyons (ed.). New Horizons in Linguistics. Harmondsworth: Penguin, 166–184.
Bierwisch, Manfred 1997. Lexical information from a minimalist point of view. In: C. Wilder, H.-M. Gärtner & M. Bierwisch (eds.). The Role of Economy Principles in Linguistic Theory. Berlin: Akademie Verlag, 227–266.
Brinkmann, Ursula 1997. The Locative Alternation in German. Its Structure and Acquisition. Amsterdam: Benjamins.
Carey, Susan 1982. Semantic development: The state of the art. In: E. Wanner & L. R. Gleitman (eds.). Language Acquisition. The State of the Art. Cambridge: Cambridge University Press, 347–389.
Carnap, Rudolf 1952. Meaning postulates. Philosophical Studies 3, 65–73.
Chierchia, Gennaro & Sally McConnell-Ginet 1990. Meaning and Grammar. An Introduction to Semantics. Cambridge, MA: The MIT Press.
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press.
Clark, Eve V. 1973. What's in a word? On the child's acquisition of semantics in his first language. In: T. E. Moore (ed.). Cognitive Development and the Acquisition of Language. New York: Academic Press, 65–110.
Clark, Eve V. 2003. First Language Acquisition. Cambridge: Cambridge University Press.
Coseriu, Eugenio 1964. Pour une sémantique diachronique structurale. Travaux de Linguistique et de Littérature 2, 139–186.
Dowty, David R. 1979. Word Meaning and Montague Grammar. The Semantics of Verbs and Times in Generative Semantics and in Montague's PTQ. Dordrecht: Reidel.
Engelberg, Stefan 2001. Immunisierungsstrategien in der lexikalischen Ereignissemantik. In: J. Dölling & T. Zybatov (eds.). Ereignisstrukturen. Leipzig: Institut für Linguistik der Universität Leipzig, 9–33.
Engelberg, Stefan 2006. A theory of lexical event structures and its cognitive motivation. In: D. Wunderlich (ed.). Advances in the Theory of the Lexicon. Berlin: de Gruyter, 235–285.
Fillmore, Charles J. 1968. Lexical entries for verbs. Foundations of Language 4, 373–393.
Fodor, Janet D. 1977. Semantics: Theories of Meaning in Generative Grammar. New York: Crowell.
Fodor, Janet D., Jerry A. Fodor & Merrill F. Garrett 1975. The psychological unreality of semantic representations. Linguistic Inquiry 6, 515–531.
Fodor, Jerry A. 1970. Three reasons for not deriving 'kill' from 'cause to die'. Linguistic Inquiry 1, 429–438.
Fodor, Jerry A. 1998. Concepts. Where Cognitive Science Went Wrong. Oxford: Clarendon Press.
Fodor, Jerry A., Merrill F. Garrett, Edward C. T. Walker & Cornelia H. Parkes 1980. Against definitions. Cognition 8, 263–367.
Gennari, Silvia & David Poeppel 2003. Processing correlates of lexical semantic complexity. Cognition 89, B27–B41.
Goddard, Cliff 1998. Semantic Analysis. A Practical Introduction. Oxford: Oxford University Press.
Greimas, Algirdas Julien 1966. Sémantique structurale. Paris: Larousse.
Grimshaw, Jane 2005. Words and Structure. Stanford, CA: CSLI Publications.
Harley, Heidi & Rolf Noyer 2000. Formal versus encyclopedic properties of vocabulary: Evidence from nominalizations. In: B. Peeters (ed.). The Lexicon-Encyclopedia Interface. Amsterdam: Elsevier, 349–375.
Harris, William T. (ed.) 1923. Webster's New International Dictionary of the English Language. Springfield, MA: Merriam.
Higginbotham, James 1994. Priorities of thought. Supplementary Proceedings of the Aristotelian Society 68, 85–106.
Hjelmslev, Louis 1963. Omkring sprogteoriens grundlæggelse. Festskrift udgivet af Københavns Universitet i anledning af Universitetets Aarsfest. København: E. Munksgaard, 1943, 3–113. English translation: Prolegomena to a Theory of Language. Madison, WI: The University of Wisconsin Press, 1963.
Jackendoff, Ray 1983. Semantics and Cognition. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1990. Semantic Structures. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1992. Languages of the Mind. Essays on Mental Representation. Cambridge, MA: The MIT Press.
Jackendoff, Ray 2002. Foundations of Language. Brain, Meaning, Grammar, Evolution. Oxford: Oxford University Press.
Johnson-Laird, Philip N. 1983. Mental Models. Towards a Cognitive Science of Language, Inference, and Consciousness. Cambridge, MA: Harvard University Press.
Katz, Jerrold J. 1967. Recent issues in semantic theory. Foundations of Language 3, 124–194.
Katz, Jerrold J. 1971. Semantic theory. In: D. D. Steinberg & L. A. Jakobovits (eds.). Semantics. An Interdisciplinary Reader in Philosophy, Linguistics, and Psychology. Cambridge: Cambridge University Press, 297–307.
Katz, Jerrold J. & Jerry A. Fodor 1963. The structure of a semantic theory. Language 39, 170–210.
Katz, Jerrold J. & Paul M. Postal 1964. An Integrated Theory of Linguistic Descriptions. Cambridge, MA: The MIT Press.
Kornfilt, Jaklin & Nelson Correa 1993. Conceptual structure and its relation to the structure of lexical entries. In: E. Reuland & W. Abraham (eds.). Knowledge and Language, vol. II: Lexical and Conceptual Structure. Dordrecht: Kluwer, 79–118.
Lakoff, George 1972. Linguistics and natural logic. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 545–655.
Levin, Beth 1993. English Verb Classes and Alternations. A Preliminary Investigation. Chicago, IL: The University of Chicago Press.
Levin, Beth & Malka Rappaport Hovav 1995. Unaccusativity. At the Syntax-Lexical Semantics Interface. Cambridge, MA: The MIT Press.
Levin, Beth & Malka Rappaport Hovav 1996. Lexical semantics and syntactic structure. In: S. Lappin (ed.). The Handbook of Contemporary Semantic Theory. Oxford: Blackwell, 487–507.
Lewis, David 1973. Causation. The Journal of Philosophy 70, 556–567.
Löbner, Sebastian 2002. Understanding Semantics. London: Arnold.
Lorenz, Wolfgang & Gerd Wotjak 1977. Zum Verhältnis von Abbild und Bedeutung. Überlegungen im Grenzfeld zwischen Erkenntnistheorie und Semantik. Berlin: Akademie Verlag.
Lyons, John 1968. Introduction to Theoretical Linguistics. Cambridge: Cambridge University Press.
McKoon, Gail & Talke MacFarland 2002. Event templates in the lexical representations of verbs. Cognitive Psychology 45, 1–44.
Mobayyen, Forouzan & Roberto G. de Almeida 2005. The influence of semantic and morphological complexity of verbs on sentence recall: Implications for the nature of conceptual representation and category-specific deficits. Brain and Cognition 57, 168–175.
Pinker, Steven 1989. Learnability and Cognition. The Acquisition of Argument Structure. Cambridge, MA: The MIT Press.
Pottier, Bernard 1963. Recherches sur l'analyse sémantique en linguistique et en traduction mécanique. Nancy: Publications Linguistiques de la Faculté de Lettres.
Pottier, Bernard 1964. Vers une sémantique moderne. Travaux de Linguistique et de Littérature 2, 107–137.
Pulman, Stephen G. 2005. Lexical decomposition: For and against. In: J. I. Tait (ed.). Charting a New Course: Natural Language Processing and Information Retrieval. Essays in Honour of Karen Sparck Jones. Dordrecht: Kluwer, 155–174.
Ramchand, Gillian C. 2008. Verb Meaning and the Lexicon. A First-Phase Syntax. Cambridge: Cambridge University Press.
Rappaport Hovav, Malka & Beth Levin 1998. Building verb meanings. In: M. Butt & W. Geuder (eds.). The Projection of Arguments: Lexical and Compositional Factors. Stanford, CA: CSLI Publications, 97–134.
Rappaport Hovav, Malka & Beth Levin 2005. Change-of-state verbs: Implications for theories of argument projection. In: N. Erteschik-Shir & T. Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 274–286.
Rayner, Keith & Susan A. Duffy 1986. Lexical complexity and fixation times in reading: Effects of word frequency, verb complexity, and lexical ambiguity. Memory & Cognition 14, 191–201.
Roelofs, Ardi 1997. A case for nondecomposition in conceptually driven word retrieval. Journal of Psycholinguistic Research 26, 33–67.
von Stechow, Arnim 1995. Lexical decomposition in syntax. In: U. Egli et al. (eds.). Lexical Knowledge in the Organisation of Language. Amsterdam: Benjamins, 81–177.
Svensén, Bo 1993. Practical Lexicography. Principles and Methods of Dictionary-Making. Oxford: Oxford University Press.
Thorndike, Edward L. 1941. Thorndike Century Senior Dictionary. Chicago, IL: Scott, Foresman and Company.
Wiegand, Herbert E. 1989. Die lexikographische Definition im allgemeinen einsprachigen Wörterbuch. In: F. J. Hausmann et al. (eds.). Wörterbücher. Dictionaries. Dictionnaires. Ein internationales Handbuch zur Lexikographie. An International Encyclopedia of Lexicography. Encyclopédie internationale de lexicographie (HSK 5.1). Berlin: de Gruyter, 530–588.
Wunderlich, Dieter 1997. Cause and the structure of verbs. Linguistic Inquiry 28, 27–68.
Stefan Engelberg, Mannheim (Germany)
II. History of semantics
8. Meaning in pre-19th century thought
1. Introduction
2. Semantic theories in classical antiquity
3. Hellenistic theories of meaning (ca. 300 B.C.–200 A.D.)
4. Late classical sources of medieval semantics
5. Concepts of meaning in the scholastic tradition
6. Concepts of meaning in modern philosophy
7. References
Abstract
The article provides a broad survey of the historical development of western theories of meaning from antiquity to the late 18th century. Although it is chronological and structured by the names of the most important authors, schools, and traditions, the focus is mainly on the theoretical content relevant to the issue of linguistic meaning, or on doctrines that are directly related to it. I attempt to show that the history of semantic thought does not have the structure of a continuous ascent or progress; it is rather a complex, multilayered process characterized by several ruptures, such as the decline of ancient learning or the expulsion of medieval learning by Renaissance Humanism, each connected with substantial losses. Quite a number of the discoveries of modern semantics are therefore in fact rediscoveries of much older insights.
1. Introduction
Although it is commonly agreed that semantics as a discipline emerged in the 19th and 20th centuries, the history of semantic theories is both long and rich. In fact semantics started with a rather narrow thematic scope, since it focused, like Reisig, on the "development of the meaning of certain words, as well as the study of their use", or, like Bréal, on the "laws which govern changes in meaning, the choice of new expressions, the birth and death of phrases" (see article 9 (Nerlich) Emergence of semantics), but many of the theoretical issues that were opened up by this flourishing discipline during the 20th century were already traditional issues of philosophy, especially of logic. The article aims to give an overview of the historical development of western theories of meaning from antiquity to the late 18th century. The attempt to condense more than 2000 years of intense intellectual labor into 25 pages necessarily leads to omissions and selections which are, of course, to a certain degree subjective. The article should not therefore be read with the expectation that it will tell the whole story. Quite a lot of what could or should have been told is actually not even mentioned. Still it is, I think, a fair sketch of the overall process of western semantic thought. The oldest known philosophical texts explicitly concerned with the issue of linguistic meaning evolved out of more ancient reflections on naming and on the question whether, and in what sense, the relation between words and things is conventional or natural. Plato
and Aristotle were fairly aware that word meaning needs to be discussed together with sentence meaning (see sections 2.2., 2.3.). The Stoic logicians later on were even more aware of this (see section 3.), but the approach of Aristotle's Peri hermeneias proved the more influential one. His account of the relation between linguistic expressions (written and spoken), thoughts, and things, combining linguistics (avant la lettre), epistemology, and ontology (see section 4.2.), brought about a long-lasting tendency within the scholastic tradition of Aristotle commentaries to focus on word meaning and to center semantics on the question whether words signify mental concepts or things. The numerous theories that have been designed to answer this question (see section 5.2.) have strengthened the insight, shared by modern conceptual semantics, that linguistic signs do not refer to the world per se, but rather to the world as conceptualized by language users (see article 30 (Jackendoff) Conceptual Semantics). The way in which the question was put and answered in the scholastic tradition, however, in some respects fell short of its own standards. For ever since Abelard (see section 5.3.) there existed a propositional approach to meaning in scholastic logic, fleshed out especially in terminist logic (see section 5.4.), whose elaborate theories of syncategorematic terms and supposition suggested that by no means every word is related to external things, and that the reference of terms is essentially determined by the propositional context. Medieval philosophy gave birth to a number of innovative doctrines such as Roger Bacon's 'use theory' of meaning (see section 5.5.), the speculative grammar, functionally distinguishing the word classes according to their different modes of signifying (see section 5.6.), or the late medieval mentalist approach, exploring the relations between spoken utterances and the underlying mental structures of thought (see section 5.7.). Compared to the abundance of semantic theories produced particularly within the framework of scholastic logic, the early modern contribution to semantics in the narrow sense is fairly modest (see section 6.1.). The philosophy of the 17th and 18th centuries, however, disconnecting semantics from logic and centering language theory on epistemology, opened up important new areas of interest, and reopened old ones. Thus, the idea of a universal grammar underlying all natural languages was forcefully reintroduced by the rationalists in the second half of the 17th century, independently of the medieval grammatica speculativa (see section 6.4.). One of the new issues, or issues that were discussed in a new way, was the question of the function of language, or of signs in general, in the process of thinking. The fundamental influence of language on thought and knowledge became widely accepted, particularly, though not exclusively, within the empiricist movement (see sections 6.2., 6.3., 6.5., 6.6.). Another innovation: from the late 17th century onwards, the philosophy of language showed a growing consciousness of and interest in the historical dimension of language (see section 6.7.).
2. Semantic theories in classical antiquity
2.1. The preplatonic view of language
Due to the fragmentary survival of textual relics from the preplatonic period, the very beginnings of semantic thought are hidden from view. Though some passages in the preserved texts as well as in several later sources might prompt speculations about the earliest semantic reflections, the textual material available seems to be insufficient for a reliable reconstruction of a presocratic semantic theory in the proper sense. The case
of the so-called Sophists (literally: 'wise-makers') of the 5th century B.C. is somewhat different: they earned their living mainly by teaching knowledge and skills that were thought to be helpful for gaining private and political success. Their teachings largely consisted of grammar and rhetoric as resources for the command of language and argumentation. One of their standard topics was the 'correctness of names' (orthotes onomaton), which was seemingly treated in separate books by all the main Sophists, such as Protagoras (ca. 490–420 B.C.), Hippias (ca. 435 B.C.), and Prodicus (ca. 465–415 B.C.), for a time the teacher of Socrates (Schmitter 2001).
2.2. Plato (427–347 B.C.)
The problem of the correctness of names, spelled out as the question whether the relation between names and things is natural or conventional, also makes up the main topic of Plato's dialogue Cratylus (ca. 388 B.C.), the earliest known philosophical text on language theory. The position of naturalism is represented by Cratylus, claiming that "everything has a right name of its own, which comes by nature, and that a name is not whatever people call a thing by agreement, … but that there is a kind of inherent correctness in names, which is the same for all men, both Greeks and barbarians." (Cratylus 383a–b). The opposite view of conventionalism is advanced by Hermogenes, who denies that "there is any correctness of name other than convention (syntheke) and agreement (homologia)" and holds that "no name has been generated by nature for any particular thing, but rather by the custom and usage (thesei) of those who use the name and call things by it" (384c–d). Democritus (ca. 420 B.C.) had previously argued against naturalism by pointing to semantic phenomena such as homonymy, synonymy, and name changes. The position represented by Hermogenes, however, goes beyond the conventionalism of Democritus insofar as he does not make any difference between the various natural languages and an arbitrary individual name-giving (Cratylus 385d–e) resulting in autonomous idiolects or a private language. Plato's dialogue expounds with great subtlety the problems connected to both positions. Plato advocates an organon theory of language, according to which language is a praxis and each name functions as "an instrument (organon) of teaching and of separating (i.e. classifying) reality" (388b–c). He therefore shows, on the one hand, contrary to Hermogenes's conventionalism, that names, insofar as they stand for concepts of things, have to be well defined and to take into account the nature of things in order to carve reality at the joints. On the other hand he demonstrates, against Cratylus's overdrawn naturalism, the absurdities one encounters in trying to substantiate the correctness of names etymologically, by tracing all words to a core set of primordial or 'first names' (prota onomata) linked to the objects they denote on grounds of their phonological qualities. Even if Plato's own position is more on the side of a refined naturalism (Sedley 2003), the dialogue, marking the strengths and shortcomings of both positions, leaves the question undecided. Its positive results rather consist (1) in underlining the difference between names and things and (2) in the language-critical warning that "no man of sense should put himself or the education of his mind in the power of names" (440c). Plato in his Cratylus (421d–e, 424e, 431b–c) draws, for the first time, a distinction between the word classes of noun (onoma) and verb (rhema). In his later dialogue Sophistes he
shows that the essential function of language does not consist in naming things but rather in forming meaningful propositions endowed with a truth-value (Soph. 261c–262c; Baxter 1992). The question of truth is thus moved from the level of individual words and their relation to things to the more adequate level of propositions. No less important, however, is the fact that Plato uses this as the basis for what can be seen as the first account of propositional attitudes (see article 60 (Swanson) Propositional attitudes). For in order to claim that "thought, opinion, and imagination … exist in our minds both as true and false" he considers it necessary to ascribe a propositional structure to them. Hence Plato defines "forming opinion as talking and opinion as talk which has been held, not with someone else, nor yet aloud, but in silence with oneself" (Theaetetus 190a), and claims, more generally, that "thought and speech (are) the same, with this exception, that what is called thought is the unuttered conversation of the soul with herself", whereas "the stream of thought which flows through the lips and is audible is called speech" (Soph. 263e). Even if this view is suggested already by the Greek language itself, in which logos means (inter alia) both speech or discourse and thought, it is in the Sophistes that the idea that thought is a kind of internal speech is explicitly stated for the first time. Thus, the Sophistes marks the point of departure of the long tradition of theories on the issue of mental speech or the language of thought.
2.3. Aristotle (384–322 B.C.)
Aristotle's book Peri hermeneias (De interpretatione, On Interpretation), especially its short introductory remarks (De int. 16a 3–8), can be seen as "the most influential text in the history of semantics" (Kretzmann 1974: 3). In order to lay the ground for his attempt to explain the logico-semantic core notions of name, verb, negation, affirmation, statement and sentence, Aristotle delineates the basic coordinate system of semantics, comprising the four elements he considers to be relevant for giving a full account of linguistic signification: (1) written marks, (2) spoken words, (3) mental concepts, and (4) things. This system, which under labels like 'order of speech' (ordo orandi) or 'order of signification' (ordo significationis; see 5.2.) became the basic framework for most later semantic theories (see Tab. 8.1.), does not only determine the interrelation of these elements but also gives some hints about the connection between semantics and epistemology. According to Aristotle (De int. 16a 3–8), spoken words (ta en te phone; literally: 'those which are in the voice') are symbols (symbola) of affections (pathemata) in the soul (i.e. mental concepts (noemata) according to De int. 16a 10), and written marks symbols of spoken words. And just as written marks are not the same for all men, neither are spoken sounds. But what these are in the first place signs (semeia) of – affections of the soul – are the same for all; and what these affections are likenesses (homoiomata) of – actual things – are also the same.
Though the interpretation of this passage is highly controversial, the most commonly accepted view is that Aristotle claims that the four elements mentioned are interrelated such that spoken words, which are signified by written symbols, are signs of mental concepts, which in turn are likenesses (or images) of things (Weidemann 1982). This passage presents at least four assumptions that are fundamental to Aristotle's semantic theory:
1. Mental concepts are natural and therefore intersubjectively the same for all human beings.
2. Written or spoken words are, in contrast, conventional signs, i.e. sounds significant by agreement (kata syntheken, De int. 16a 19–27).
3. Mental concepts, which are both likenesses (homoiomata) and natural effects of external things, are directly related to reality and essentially independent from language.
4. Words and speech are at the same time sharply distinguished from mental concepts and closely related to them, insofar as they refer to external things only through the mediation of concepts. Aristotle's description of the relation between words, thoughts, and things can therefore be seen as an anticipation of the so-called 'semantic triangle' (Lieb 1981).
All four tenets played a prominent role in philosophy of language for a long time, and yet each of them, at one time or another, came under severe attack. The tenet that vocal expressions signify things through the mediation of concepts, which the late ancient commentators saw as the core thesis of Aristotle's semantics (Ebbesen 1983), leads, if spelled out, to the distinction of intension and extension (Graeser 1996: 39). For it seems to be just an abbreviated way of expressing that a word's signifying something (semainei ti) depends upon its being connected to an understanding or a formula (logos), which can be a definition, a description or a finite set of descriptions (Metaphysics 1006a 32–1006b 5) picking out a certain thing or a certain kind of things by determining what it is to be a such and such thing (Met. 1030a 14–17). Due to the prominence of the order of signification it may seem as if Aristotle were focusing on word semantics. There is, however, good reason to maintain that Aristotle's primary intention was a semantics of propositions. Sedley (1996: 87) has made the case that for Aristotle, just as for other teleologists like Plato and the Stoics, "who regard the whole as ontologically prior to the part … the primary signifier is the sentence, and individual words are considered only secondarily, in so far as they contribute to the sentence's function." In the face of this he characterizes De interpretatione, which is usually seen as a specimen of word semantics, as "the most seriously misunderstood text in ancient semantics" (Sedley 1996: 88). This may hold for most of the modern interpretations but not for the medieval commentators. For quite a number of them were well aware that the somewhat cryptic formulation "those which are in the voice" does not necessarily stand for nouns and verbs exclusively but was likely meant to include also propositions and speech in general. That the treatment of words is tailored to that of propositions is indicated already by the way in which Aristotle distinguishes onomata (nouns) and rhemata (predicative expressions) not as word classes as such but rather as word classes regarding their function in sentences. For, as Aristotle remarks, when predicative expressions "are spoken alone, as such, they are nouns (onomata) that signify something – for the one who utters them forms the notion [or: arrests the thought], and the hearer pauses" (De int. 16a 19–21). The two classes of nouns and predicative expressions, the members of which are described as the smallest conventionally significant units, are complemented by the 'conjunctions' (syndesmoi), which include conjunctions, the article, and pronouns, and are the direct ancestors of Priscian's syncategoremata that became a major topic in medieval logical semantics (see section 4.5.).
Although the natural/conventional distinction separates mental concepts from spoken language, Aristotle describes the performance of thinking, just like Plato, in terms of an
internal speaking. He distinguishes between an externally spoken logos and a logos or "speech in the soul" (Met. 1009a 20), the latter of which provides the foundation of signification, logical argumentation and demonstration (Posterior Analytics 76b 24; Panaccio 1999: 34–41), and is at the same time the place where truth and falsehood are properly located (Meier-Oeser 2004: 314). Thus, when in De interpretatione (17a 2–24) it is said that the fundamental logical function of declarative sentences – to assert or deny something of something – presupposes a combination of a name and a verb or an inflection of a verb, this interconnection of truth or falsity and propositionality only seemingly refers primarily to spoken language. For in other passages, such as Met. 1027b 25–30, or in the last chapter of De interpretatione, Aristotle emphasizes that combination and separation, and thus truth, "are in thought" properly and primarily. How this internal discourse, which should be 'the same for all', precisely relates to spoken language remains unclear in Aristotle. The close connection Aristotle saw between linguistic signification and thought becomes evident in his insistence on the importance of an unambiguous and well-defined use of language, the neglect of which must result in a destruction of both communication and rational discourse; for, as he claimed (Met. 1006b 7–12): not to signify one thing is to signify nothing, and if words do not signify anything there is an end of discourse with others, and even, strictly speaking, with oneself; because it is impossible to think of anything if we do not think of one thing. Noticing that the normal way of speaking does not conform to the ideal of unambiguity, Aristotle is well aware of the importance of a detailed analysis of the semantic issues of homonymy, synonymy, paronymy etc. (Categories 1a; Sophistical Refutations 165f) and devotes the first two books of his Topics to the presentation of rules for detecting, and strategies for avoiding, ambiguity.
3. Hellenistic theories of meaning (ca. 300 B.C.–200 A.D.)
It is uncontroversial that a fully fledged theory of language, considering all the different phonetic, syntactic, semantic, and pragmatic aspects of language as its essential subject matter, is the invention of the Stoic philosophers (Pohlenz 1939). However, since hardly any authentic Stoic text has come down to us, it is still true that "the nature of the Stoics' philosophy of language is the most tantalizing problem in the history of semantics" (Kretzmann 1967: 363). After a long period of neglect and contempt, the Stoic account of semantics is today generally seen as superior even to Aristotle's, which was historically more influential (Graeser 1978: 77). For it has become a common conviction that central points of Stoic semantics are astonishingly close to some fundamental tenets that have paved the way for modern analytical philosophy. According to Sextus Empiricus (Adv. math. 8.11–12; Long & Sedley 1987 [=L&S] 33B), the principal elements of Stoic semantics are:
1. the semainon, i.e. that which signifies, or the signifier, which is a phoneme or grapheme, i.e. the material configuration that makes up a spoken word or rather – because Stoic semantics is primarily a sentence semantics – a spoken or written sentence;
2. the tynchanon (or: 'name-bearer'), i.e. the external material object or event referred to; and
3. the semainomenon, i.e. that which is signified. This is tantamount to the core concept and the most characteristic feature of the Stoic propositionalist semantics, the lekton (which can be translated both as 'that which is said' and 'that which can be said', i.e. the 'sayable').
The lekton proper or the 'complete lekton' (lekton autoteles) is the meaning of a sentence, whereas the meaning of a word is characterized as an 'incomplete lekton' (lekton ellipes). Even though questions and demands may have a certain kind of lekton, the prototype of the Stoic lekton corresponds to what in modern terminology would be classified as the propositional content of a declarative sentence. Whereas the semainon and the tynchanon are corporeal things or events, the lekton is held to be incorporeal. This puts it in an exceptional position within the materialist Stoic ontology, which considered almost everything – even god, soul, wisdom, truth, or thought – as material entities. Hence the lekton "is not to be identified with any thoughts or with any of what goes on in one's head when one says something" (Annas 1992: 76). This is evident also from the fact that the lekton corresponds to the Stoic notion of pragma which, however, in Stoic terminology does not stand for an external thing, as it does for Aristotle, but rather for a fact or something which is the case. The Stoic complement of the Aristotelian notion of mental concept (noema) is the material phantasia logike, i.e. a rational or linguistically expressible presentation in the soul. The phantasia logike is a key feature in the explanation of how incorporeal meaning is connected to the physical world, since the Stoics maintained "that a lekton is what subsists in accordance with a phantasia logike" (Adv. math. 8.70; L&S 33C). It should be clearly distinguished from the lekton or meaning as such. The phantasia logike makes up the 'internal discourse' (logos endiathetos) and is thus part of the subjective act of thinking, in contrast to the lekton, which is the "objective content of acts of thinking (noeseis)" (Long 1971: 82). The semainon and the tynchanon more or less correspond to the written or spoken signs and the things in the Aristotelian order of signification, but the latter has no equivalent of the lekton (see Tab. 8.1.). Meaning, as the Stoics take it, is neither some thing nor something in the head. Nor is the Stoic lekton a quasi-Platonic entity that would exist in some sense independently of whether or not one thinks of it, unlike Frege's notion of Gedanke (thought) (Graeser 1978: 95; Barnes 1993: 61). Within the context of logical inference and demonstration a lekton may function as a sign (semeion, not to be confused with semainon!), i.e. as a "leading proposition in a sound conditional, revelatory of the consequent" (Sextus Empiricus, Outlines of Pyrrhonism, L&S 35C). Besides the notion of propositional content, one of the most interesting and innovative elements of Stoic semantics is the discovery that taking into account propositional content alone might not in every case be sufficient to make explicit what a sentence means. This discovery, pointing in the direction of modern speech act theory, is indicated when Plutarch (with critical intention) reports that the Stoics maintain "that those who forbid say one thing, forbid another, and command yet another.
For he who says 'do not steal' says just this, 'do not steal', forbids stealing and commands not stealing" (Plutarch, On Stoic Self-contradictions 1037d–e). In this case there are three different lekta associated with a single sentence, corresponding to the distinct linguistic functions this sentence can perform. Thus, the Stoics seem to have been well aware that the explication of meaning "involves not only the things we talk about and the thoughts we express but also the jobs we do by means of language alone" (Kretzmann 1967: 365a).
A semantic theory which is significantly different from both the Aristotelian and the Stoic one is to be found in the Epicurean philosophers who, as Plutarch (Against Colotes; L&S 19K) reports, "completely abolish the class of sayables …, leaving only words and name-bearers" (i.e. external things), so that words refer directly to the objects of sensory perception. This referential relation still presupposes a mental 'preconception' (prolepsis) or a schematic presentation (typos) of the thing signified, for we would not "have named something if we had not previously learnt its delineation (typos) by means of preconception (prolepsis)" (Diogenes Laërtius, Lives and Opinions of Eminent Philosophers; L&S 17E), but signification itself remains formally a two-term relation. It is true that the semantic relation between words and things is founded on a third element; but this intermediate third, the prolepsis or typos, unlike the passiones or concepts in the Aristotelian account, does not enter the class of what is signified (see Tab. 8.1.).
Tab. 8.1: The ordo significationis
[Table relating the four elements of signification – I: the written (e.g. graphomena, litterae, propositio scripta), II: the spoken (e.g. ta en te phone, voces, dictiones, propositio vocalis, phone, semainon), III: thought (e.g. pathemata, passiones, dicibile/verbum mentis, conceptus, propositio in mente, prolepsis, phantasia logike), IV: things (e.g. pragmata, res, propositio in re (= state of affairs), tynchanon, semainomenon = lekton = pragma (= fact)) – through the semantic relations 1–3 (symbola, semeia, homoiomata; notae, similitudines; signa, signum, similitudo; semainei) for Aristotle, Boethius, Augustinus, the traditions of ∼1250+ and ∼1270+, Ockham, the Epicureans, and the Stoa.]
Labeled grey fields stand for elements (I, II, III) being signifiers and/or signified. Labeled white fields stand for semantic relations (1, 2, 3) characterizing the element on the left in regard to the one on the right. Labeled light grey fields stand for elements involved in the process of signification though being neither signifiers nor signified.
4. Late classical sources of medieval semantics
4.1. Augustinus (354–430)
Whereas Aristotelian logic and semantics were transmitted to the Middle Ages via Boethius, the relevant works of Augustinus provide a complex compound of modified Stoic, Skeptic and Neoplatonic elements, together with some genuinely new ideas. Probably his most important and influential contribution to the history of semantics and semiotics consists of (1) the explicit definition of words as signs, and (2) his definition of the sign which, for the first time, included both the natural indexical sign and the conventional linguistic sign as species of an all-embracing generic notion of sign. Augustinus
thus opened a long tradition in which the theory of language is viewed as a special branch of the more comprehensive theory of signs. The sign, in general, is defined as "something which, offering itself to the senses, conveys something other to the intellect" (Augustinus 1963: 33). This triadic sign conception provides the general basis for Augustinus's communicative approach to language, which holds that a word is a "sign of something, which can be understood by the hearer when pronounced by the speaker" (Augustinus 1975: 86). In contrast to natural signs, which "apart from any intention or desire of using them as signs, do yet lead to the knowledge of something else", words are conventional signs which "living beings mutually exchange in order to show … the feelings of their minds, or their perceptions, or their thoughts" (Augustinus 1963: 34). The preeminent position of spoken words among the different sorts of signs used in human communication, some of which "relate to the sense of sight, some to that of hearing, a very few to the other senses", does not result from their quantitative preponderance but rather from their significative universality, i.e. from the fact that, as Augustinus sees it, everything which can be indicated by nonverbal signs can be put into words but not vice versa (Augustinus 1963: 35). The full account of linguistic meaning involves four elements (Augustinus 1975: 88f): (1) the word itself, as an articulate vocal sound, (2) the dicibile, i.e. the sayable or "whatever is sensed in the word by the mind rather than by the ear and is retained in the mind", (3) the dictio, i.e. the word in its ordinary significative use, in contrast to the same word just being mentioned – it "involves both the word itself and that which occurs in a mind as the result of the word" (i.e. the dicibile or meaning) – and (4) the thing signified (res) in the broadest sense, comprising anything "understood, or sensed, or inapprehensible". Even if the notion of dicibile obviously refers to the Stoic lekton, Augustinus has either missed or modified the essential point. For describing it as "what happens in the mind by means of the word" implies nothing less than the rejection of the Stoic expulsion of meaning (lekton) from the mind. It is as if he mixed up the lekton with the phantasia logike, which was characterized as a representation whose content "can be expressed in language" (Sextus Empiricus, Adv. math. 8.70; L&S 33C). Augustinus's emphasis on the communicative and pragmatic aspects of language is manifest in his concept of the 'force of the word' (vis verbi), which he describes as the "efficacy to the extent of which it can affect the hearer" (Augustinus 1975: 100). The impact spoken words have on the hearer is not confined to their bare signification but includes a certain value and some emotive moments resulting from their sound, their familiarity to the hearer or their common use in language – aspects which Frege called Färbungen (colorations). In his later theory of the verbum mentis (mental word), especially in De trinitate, Augustinus advocates the devaluation of the spoken word in favor of the internal sphere of mental cognition. It is now the mental or 'interior word' (verbum interius), i.e. the mental concept, that is considered the word in its most proper sense, whereas the spoken word appears as a mere sign or voice of the word (signum verbi, vox verbi; Augustinus 1968: 486).
In line with the old concept of internal speech, Augustinus claims that thoughts (cogitationes) are performed in mental words. The verbum mentis, however, corresponding to what was later called the conceptus mentis or intellectus, is by no means a linguistic entity in the proper sense, for it is “nullius linguae”, i.e. it does not belong to any spoken language like Latin or Greek.
Between mental and spoken words there is a further level of speech, consisting of imaginative representations of spoken words or, as he calls them, imagines sonorum (images of sounds), closely corresponding to Saussure’s notion of the image acoustique. In Augustinus’s theory of language this internalized version of uttered words does not seem to play a major role, but it will gain importance in late medieval and early modern reflections on the influence of language on thought.
4.2. Boethius (480–528)

Boethius’ translations of and commentaries on parts of the Aristotelian Organon (especially De interpretatione) were for a long time the medieval world’s only available source texts for the semantics of Aristotle and his late ancient Neoplatonic commentators. The medieval philosophers thus first viewed Aristotle’s logic through the eyes of Boethius, who made some influential decisions on semantic terminology as well as on the interpretation of the Aristotelian text. What they learned through his writings included, inter alia, the conventional character of language, the view that meaning is established by an act of ‘imposition’, i.e. name-giving or reference-setting, and the influential idea that to ‘signify’ (significare) is to “establish an understanding” (intellectum constituere). Especially in his more elaborate second commentary on De interpretatione, Boethius discusses at length Aristotle’s four elements of linguistic semeiosis (scripta, voces, intellectus, res), which he calls the ‘order of speaking’ (ordo orandi) (Magee 1989: 64–92). The ordo orandi determines the direction of linguistic signification: written characters signify spoken words, whereas spoken words primarily signify mental concepts and, by means of the latter, secondarily denote the things; in short: words signify things by means of concepts (Boethius 1880: 24, 33). The first step of the process of homogenizing or systematizing the semantic relations between these four elements, later continued in the Middle Ages (see 5.2.), is to be seen in Boethius’s translation of Aristotle’s De interpretatione (De int. 16a 3–8) (Meier-Oeser 2009). For whereas Aristotle characterizes these relations with the three terms symbola, semeia, and homoiomata, Boethius translates both symbola and semeia as notae (signs; see Tab. 8.1.). A further distinction Boethius takes up from the late classical Aristotle commentators is that between three levels of speech: besides – or rather, at the basis of – written and spoken discourse there is a mental speech (oratio mentis) in which thinking is performed. This mental speech is, just like Augustinus’s mental word, not made up of words of any national language but rather of transidiomatic or even non-linguistic mental concepts (Boethius 1880: 36) which are, as Aristotle had claimed, “the same for all”.
5. Concepts of meaning in the scholastic tradition

The view that semantic issues are not only a subject matter of logic but its primary and most fundamental one is characteristic of the scholastic tradition. This tradition was not confined to the Middle Ages but continued, after a partial interruption in the mid-16th century, during the 17th and 18th centuries. Because it is the largest and most elaborate tradition in the history of semantics, it seems advisable to begin with an overview of some basic aspects of medieval semantics, such as the use and the definition of
the pivotal term significatio (see 5.1.), and the most fundamental medieval debate on that subject (see 5.2.).
5.1. The use and definition of significatio in the scholastic tradition

The problematic vagueness or ambiguity that has frequently been pointed out regarding the notion of meaning holds, at least partly, for the Latin term significatio as well. There is, however, evidence that some medieval authors were aware of this terminological difficulty. Robert Kilwardby (1215–1279), for instance, noted that significatio can designate either the ‘act or form of the signifier’ (actus et forma significantis), the ‘signified’ (significatum), or the relation between the two (comparatio signi ad significatum; Lewry 1981: 379). A look at the common scholastic use of the term significatio not only verifies this diagnosis but reveals a number of further variants (Meier-Oeser 1996: 763–765), which mostly resulted from debates on the appropriate ontological description of significatio as some sort of quality, form, relation, act etc. The scholastic definitions of significatio and significare (to signify), however, primarily took up the question of what signification is in the sense of ‘what it is for a word or sign to signify’. There is a current misconception: the medieval notions of sign and signification are often described in terms of what Bühler (1934: 40) called the “famous formula of aliquid stat pro aliquo” (something stands for something). This description, confusing signification with supposition (see 5.4.), falls short by reducing signification to a two-term relation between a sign and its significate, whereas in scholastic logic the relation to the cognitive faculty of a sign recipient is always a constitutive element of signification. In this vein, the most widespread scholastic definition (Meier-Oeser 1996: 765–768), based on Boethius’s translation of Aristotle’s De interpretatione (De int. 16b 20), characterizes signification or the act of signifying as “to establish an understanding” (constituere intellectum) of some thing, or “to evoke a concept in the intellect of the hearer”. The relation to the sign recipient remains pivotal when, in the later Middle Ages, the act of signifying is primarily defined as “to represent something to an intellect” (aliquid intellectui repraesentare). The common trait of the definitions mentioned (and of all the others not mentioned) is to describe signification – in contrast to meaning – not as something a word has, but rather as an act of signifying which – in contrast to the stare pro – is involved in a triadic relation including both the significate and a cognitive faculty.
5.2. The order of signification and the great altercation about whether words are signs of things or concepts

The introductory remarks of Aristotle’s De interpretatione have been described as “the common starting point for virtually all medieval theories of semantics” (Magee 1989: 8). They did at least play a most important role in medieval semantics. In the late 13th and early 14th centuries the order of the four principal elements of linguistic signification (i.e. written and spoken words, mental concepts, and things) is characterized as ordo significationis (order of signification; Aquinas 1989: 9a), or ordo in significando (order in signifying; Ockham 1978: 347). The coinage of these expressions is the terminological outcome of the second step in the above-mentioned process of homogenizing
the relations between these four elements. It took place in the mid-13th century, when mental concepts began to be described as signs. The Boethian pair of notae and similitudines was thus further reduced to the single notion of sign (signum, see Tab. 8.1.), so that the entire order of signification was then uniformly described in terms of sign relations, or, as Antonius Andreas (ca. 1280–1320) said: “written words, vocal expressions, concepts in the soul and things are coordinated according to the notion of sign and significate” (Andreas 1508: fol. 63va). From the second half of the 13th century on, most scholastic logicians shared the view that mental concepts were signs, which provided new options for solving the “difficult question of whether a spoken word signifies the mental concept or the thing” (Roger Bacon 1978: 132). This question made up the subject matter of the most fundamental scholastic debate on semantic issues; John Duns Scotus (1265/66–1308) labeled it “the great altercation” (magna altercatio). The simple alternative offered in the formulation of the question is, however, by no means exhaustive, but only marks the extreme positions within a highly differentiated spectrum of possible answers (Ashworth 1987; Meier-Oeser 1996: 770–777). The various theories of word meaning turn out to be most inventive in producing variants of the coordination of linguistic signs, concepts and things. For example, (1) while especially some early authors held the mental concept to be the only proper significate of a spoken word, (2) Roger Bacon (ca. 1214 or 1220 – ca. 1292) as well as most of the so-called late medieval nominalists favored an extensionalist reference semantics according to which words signify things. (3) Especially Thomist authors took up the formula that words signify things by the mediation of concepts (voces significant res mediantibus conceptibus) and answered the question along the lines of the semantic triangle (Aquinas 1989: 11a). Some later authors put it the other way round and maintained (4) that words signify concepts only by the mediation of their signification of things (voces significant conceptus mediante significatione rerum), for “if I do not know which things the words signify I shall never learn by them which concepts the speaker has in his mind” (Smiglecius 1634: 437). (5) Sharing the view that concepts were signs of things, Scotus referred to the principle of semantic transitivity, claiming that “the sign of a sign is [also] the sign of the significate” (signum signi est signum signati), and held that words signify both concepts and things by one and the same act of signifying (1891: 451f). Others, in contrast, maintained (6) that there had to be two simultaneous but distinguishable acts of signification (Conimbricenses 1607: 2.39f). And still others tried to solve the problem by introducing further differentiations. Some of these were related to the mediantibus conceptibus formula, by the claim (7) that this formula does not imply that concepts are the immediate significates of words but rather that they are a prerequisite condition for words to signify things (Henry of Ghent 1520: 2.272v). Others were related to the notion of things, taking (8) the ‘thing conceived’ (res concepta; res ut intelligitur) as the proper significate of words (Scotus 1891: 543).
Further approaches tried to decide the question either (9) by distinguishing senses of the term significare (significare suppositive – manifestative), claiming that spoken words stand for the thing but manifest the concepts (Rubius 1605: 21), or (10) by differentiating between things being signified and thoughts being expressed (Soto 1554: fol. 3 rb–va), or again (11) by taking into account the different roles of the discourse participants, so that words signify concepts for the speaker and things for the hearer (Versor 1572: fol. 8 r), or lastly (12) by distinguishing between different types of discourse, maintaining that in familiar speech words primarily
refer to concepts, whereas in doctrinal discourse they refer to things (Nicolaus a S. Iohanne Baptista 1687: 40, 43ff). No matter how subtle the semantic doctrines behind these positions may have been in detail, it is still true that most of the contributions to the ‘great altercation’ focused primarily on word semantics. Scholastic semantics, however, was by no means confined to this approach.
5.3. Peter Abelard (1079–1142) and the meaning of the proposition

As early as Peter Abelard, a shift in the primary interest of scholastic logic became apparent. His treatment of logic and semantics is determined by a decidedly propositional approach; all distinctions he draws and all discussions he conducts are guided by his concentration on propositions (Jacobi 1983: 91). In a conceptual move comparable to the one in Frege’s famous Über Sinn und Bedeutung, Abelard transposes the distinction between the signification of things (which is akin to Frege’s Bedeutung) and the signification of concepts (Frege’s Sinn) to the level of propositions. On the one hand, and in line with Frege’s principle of compositionality, the signification of a proposition is the complex comprehension of the sentence as it is integrated from the meanings of its components (see article 6 (Pagin & Westerståhl) Compositionality). On the other hand, it corresponds to Frege’s Gedanke, i.e. to the propositional content of a sentence. Abelard calls this, in accordance with the Stoic lekton, dictum propositionis (‘what is said by the proposition’) or res propositionis (‘thing of the proposition’, i.e. the Stoic pragma). These similarities concern not only the terminology but also the ontological interpretation. For the res propositionis is characterized as being essentially nothing (nullae penitus essentiae; Abelard 1927: 332, 24) or as entirely nothing (nil omnino; 366: 1). And yet the truth value or the modal state of a proposition depends on the res propositionis being either true or false, necessary or possible etc. (367: 9–16). In the logical textbooks of the late 12th and 13th centuries Abelard’s notion of dictum propositionis is present under the name of enuntiabile (‘what can be stated’) (de Rijk 1967: 2/2.208, 15ff). In the 14th century it has its analog in the theory of the ‘complexly signifiable’ (complexe significabile) developed by Adam Wodeham (ca. 1295–1358), Gregory of Rimini (ca. 1300–1358), and others in the context of intense discussions on the immediate object of knowledge and belief (Tachau 1987). These conceptions of the significate of propositions correspond in important points to Frege’s notion of ‘thought’ (Gedanke) or Bolzano’s ‘sentences as such’ (Sätze an sich), and, of course, to the Stoic lekton. In contrast, Walter Burley (ca. 1275–1344) answered the question about the ultimate significate of vocal and mental propositions by advocating the notion of a propositio in re (proposition in reality), i.e. a proposition composed of things, which points more in the direction of Wittgenstein’s notion of Sachverhalte (states of affairs) or Tatsachen (facts) as described in his Tractatus logico-philosophicus (“1. The world is all that is the case. 1.1. The world is the totality of facts, not of things.”). Whereas the advocates of the complexe significabile project propositionality onto a Fregean ‘third realm’ of propositional content, Burley and some other authors project it onto the real world, maintaining that what makes our propositions true (or false) are not the things as such but rather the states of affairs, i.e. the things relating (or not relating) to each other in the way our propositions say they do (Meier-Oeser 2009: 503f).
5.4. The theory of supposition and the propositional approach to meaning

The propositional approach to meaning is also characteristic of so-called ‘terminist’ or ‘modern logic’ (logica moderna), which emerged in the late 12th and 13th centuries and had a rich and increasingly sophisticated continuation from the 14th to the early 16th century. Most of what is genuinely novel in medieval logic and semantics is to be found in this tradition, whose two most important theoretical contributions are (1) the theory of syncategorematic terms and (2) the theory of the properties of terms (proprietates terminorum).

1. The theory of syncategorematic terms is concerned with the semantic and logical functions of those parts of speech that had been left out of the ordo significationis since they are neither nouns nor verbs and thus have neither a proper meaning nor a direct relation to any part of reality (Meier-Oeser 1998). Even if syncategorematic terms (i.e. quantifiers, prepositions, adverbs, conjunctions etc. like ‘some’, ‘every’, ‘besides’, ‘necessarily’, or the copula ‘est’) do not signify ‘something’ (aliquid) but only, as was later said, ‘in some way’ (aliqualiter), they perform semantic functions that are determinative for the meaning and the truth-value of propositions. From the late 12th century on, the syncategorematic terms became the subject matter of a special genre of logical textbooks, the syncategoremata tracts. They also played an important role in the vast literature on sophismata, i.e. on propositions like “every man is every man” or “Socrates twice sees every man besides Plato”, which, due to a syncategorematic term contained in them, need further analysis in order to make explicit their underlying logical form as well as the conditions under which they can be called true or false (Kretzmann 1982).

2. The second branch of terminist semantics was concerned with those properties of terms that are relevant for explaining truth, inference and fallacy. While signification was seen as the most fundamental property of terms, the one to which the terminists devoted most attention was suppositio. Whereas any term, due to its imposition, has signification or lexical meaning on its own, it is only within the context of a proposition that it acquires the property of supposition, i.e. the function of standing for (supponere pro) a certain object or a certain number or kind of objects. Thus it is the propositional context that determines the reference of terms. The main feature of supposition theory is the distinction of different kinds of supposition. The major distinction is that between ‘material supposition’ (suppositio materialis: when a term stands for itself as a token, e.g. ‘I write donkey’, or as a type, e.g. ‘donkey is a noun’), ‘simple supposition’ (suppositio simplex: when a term stands for the universal form or a concept, e.g. ‘donkey is a species’), and ‘personal supposition’ (suppositio personalis: when a term stands for ordinary objects, e.g. ‘some donkey is running’). This last and most important type of supposition is further divided and subdivided depending on whether the truth conditions of the proposition in which the term appears require a particular quantification of the term (all x, every x, this x, some x, a certain x, etc.).
While supposition theory in general provides a set of rules to determine how the terms in a given propositional context have to be understood in order to render the proposition true or an inference valid, the treatment of suppositio personalis and its subclasses, which are at the center of this logico-semantic approach, focuses on the extension of the terms in a given proposition.
The characteristic feature of terminist logic, as it is exemplified both in the theory of syncategorematic terms and in the theory of supposition, is commonly described as a contextual approach (de Rijk 1967: 123–125), or, more precisely, as a propositional approach to meaning (de Rijk 1967: 552).
5.5. Roger Bacon’s theory of the foundation and the change of reference

Roger Bacon’s theory of linguistic meaning is structured around two dominant features: (1) his semiotic approach, according to which linguistic signification is considered in connection with both conventional and natural sign processes, and (2) his original and inventive interpretation of the doctrine of the ‘imposition of names’ (impositio nominum) as the basis of word meaning. Bacon accentuates the arbitrariness of meaning (Fredborg 1981: 87ff). But even though the first name-giver is free to impose a word or sign on anything whatsoever, he or she performs the act of imposition according to the paradigm of baptizing a child, so that Bacon in this respect might be seen as an early advocate of what today is known as the causal theory of reference (see article 4 (Abbott) Reference). This approach, if taken seriously, has important consequences for the concept of signification. For: “all names which we impose on things we impose inasmuch as they are present to us, as in the case of names of people in baptism” (Bacon 1988: 90). Contrary to the tradition of Aristotelian or Boethian semantics (Ebbesen 1983), Bacon favors the view that words, according to their imposition, immediately and properly signify things rather than mental concepts of things. Thus, his account of linguistic signification abandons the model of the semantic triangle and marks an important turning point on the way from the traditional intensionalist semantics to the extensionalist reference semantics that became increasingly accepted in the 14th century (Pinborg 1972: 58f). With regard to mental concepts, spoken words function merely as natural signs: the use of a word indicates that the speaker possesses some concept of the object the word refers to, for this is a prerequisite for any meaningful use of language (Bacon 1978: 85f, 1988: 64). When Bacon treats the issue of linguistic meaning as a special case of sign relations, he considers the sign relation after the model of real relations, which presuppose both the distinctness of the terms related (so that nothing can be a sign of itself) and their actual existence (so that there can be no relation to a non-existent object). As a consequence of this account, words lose their meaning or, as Bacon says, “fall away from their signification” (cadunt a significatione) if their significate ceases to exist (1978: 128). But even if the disappearance of the thing signified annihilates the sign relation and therefore must result in the corruption of the sign itself, Bacon is well aware that the use of names and words in general is not restricted to the meaning endowed during the first act of imposition (the term homo does not only denote those men who were present when the original act of its imposition took place); nor do words cease to be used when their name-bearers no longer physically exist (Bacon 1978: 128). As a theoretical device for solving the resulting difficulties regarding the continuity of reference, Bacon introduced a distinction between two modes of imposition that can be seen as “his most original contribution to grammar and semantics” (Fredborg 1981: 168). Besides the ‘formal mode of imposition’, conducted by an illocutionary expression like “I call this …” (modus imponendi sub forma impositionis vocaliter expressa), there is a kind of ‘secondary imposition’, taking place tacitly (sine forma imponendi vocaliter expressa) whenever a term is applied (transumitur) to any object other than that which the first
name-giver ‘baptized’ (Bacon 1978: 130). Whereas the formal mode of imposition refers to acts of explicitly coining a new word, the second mode describes what happens in the everyday use of language. Bacon (1978: 130) states:

We notice that infinitely many expressions are transposed in this way; for when a man sees for the first time the image of a depicted man, he does not say that this image shall be called ‘man’ in the way names are given to children; he rather transposes the name of man to the picture. In the same way, he who for the first time says that god is just does not say beforehand ‘the divine essence shall be called justice’, but transposes the name of human justice to the divine one because of the similitude. In this way we act the whole day long and renew the things signified by vocal expressions without an explicit formula of vocal imposition.
In fact this modification of the meaning of words is constantly taking place, even without the speaker or anyone else being actually aware of it. For by simply using language we “all day long impose names without being conscious of when and how” (nos tota die imponimus nomina et non advertimus quando et quomodo; Bacon 1978: 100, 130f). Thus, according to Roger Bacon, who in this respect is playing off a use theory of meaning against the causal approach, the common mode of language use is too complex and irregular to be sufficiently described solely by the two features of a primary reference-setting and a subsequent causal chain of reference-borrowing.
5.6. Speculative grammar and its critics

The idea, fundamental already for Bacon, that grammar is a formal science rather than a propaedeutic art was shared by the school of the so-called ‘modist’ grammarians (modistae), which emerged around 1270 in the faculty of arts of the University of Paris and culminated in the Grammatica Speculativa of Thomas of Erfurt (ca. 1300). The members of this school, who took it for granted that the objective of any formal science was to explain the facts by giving reasons for them rather than to simply describe them, made it their business to deduce the ‘modes of signifying’ (modi significandi), i.e. grammatical features common to all languages, from universal ‘modes of being’ (modi essendi) by means of corresponding ‘modes of understanding’ (modi intelligendi). Thus the tradition of ‘speculative grammar’ (grammatica speculativa) adopted Aristotle’s commonly accepted claim that mental concepts, just like things, are the same for all men, and developed it further into the thesis of a universal grammar based on the structural analogy between the ‘modes of being’ (modi essendi), the ‘modes of understanding’ (modi intelligendi), and the ‘modes of signifying’ (modi significandi) that are the same for all languages (Bursill-Hall 1971). Thus Boethius Dacus (1969: 12), one of the most important theoreticians of speculative grammar, states that

… all national languages are grammatically identical. The reason for this is that the whole grammar is borrowed from the things … and just as the natures of things are similar for those who speak different languages, so are the modes of being and the modes of understanding; and consequently the modes of signifying are similar, whence, so are the modes of grammatical construction or speech. And therefore the whole grammar which is in one language is similar to the one which is in another language.
Even though words are arbitrarily imposed (whence arise the differences between languages), the modes of signifying are uniformly related to the modes of being by means of the modes of understanding (whence arise the grammatical similarities among all languages). Soon after 1300 the modistic approach came under substantial criticism. The main point that critics like Ockham opposed is not the assumption of a basic universal grammar, for such a claim is implied in Ockham’s concept of mental grammar too, but rather two other aspects of modism: (1) the assertion of a close structural analogy between spoken or mental language and external reality (William of Ockham 1978: 158), and (2) the inadmissible reification of the modus significandi, which is involved in its description as some quality or form added to the articulate voice (dictioni superadditum) through the act of imposition. To say that vocal expressions ‘have’ different modes of signifying is, as Ockham points out, just a metaphorical manner of speaking; for what is meant is simply the fact that different words signify whatever they signify in different ways (Ockham 1974: 798). According to John Aurifaber (ca. 1330), a vocal term is significative, or is a sign, solely by being used significatively, not by virtue of something inherent in the sound. In order to assign signification a proper place in reality, it must be ascribed to the intellect rather than to the vocal sound (Pinborg 1967: 226). This criticism of modist grammar is connected to a process that might be described as a progressive ‘mentalization’ of signification.
5.7. The late medieval mentalist approach to signification

The idea behind this process is the contention that without some sort of ‘intentionality’ the phenomena of sign, signification, and semiosis in general must remain inconceivable. The tendency to relocate the notions of sign and signification from the sphere of spoken words to the sphere of the mind is characteristic of mentalist logic, which emerged in the early 14th century and remained dominant throughout the later Middle Ages. The signification of spoken words, and of external signs in general, is founded on the natural signification instantiated in mental concepts. The cognitive mental act, as that which makes any signification possible, is now conceived as a sign or an act of signification in the most proper sense. The introduction of the notion of formal signification (significatio formalis), identical with the mental concept (Raulin 1500: d3vb), is the result of a fundamental change in the conception of signification: the mental concept does not have but rather is signification. This, however, does not imply that it is the significate of a spoken word but quite the contrary: the mental concept, as Ockham (1974: 7f) claimed, is the primary signifier, in subordination to which a spoken word (to which again the corresponding written term is subordinate) takes on the character of a sign (see Tab. 8.1.). Thus the significative force of mental concepts is seen as the point at which the analysis of signification must necessarily find its end. It is an ultimate fact for which no further rationale can be given (Meier-Oeser 1997: 141–143).
6. Concepts of meaning in modern philosophy

Whereas in Western Europe, under the growing influence of humanism, the scholastic tradition of terminist logic and semantics came to an end in the third decade of the 16th century, it continued vigorously on the Iberian Peninsula until the 18th century. It was
then reimported from there into central Europe in the late 16th and early 17th centuries and dominated, though in a somewhat simplified form, academic teaching in Catholic areas for more than a century. In what is commonly labeled ‘modern philosophy’, however, logic, the former center of semantic theory, lost many of its medieval attainments and subsided into inactivity until the middle of the 19th century. In early modern philosophy of language the logico-semantic approach of the scholastic tradition was displaced by an epistemological approach, so that in this context the focus is not on the issue of meaning but rather on the cognitive function of language.
6.1. The modern redefinition of meaning

In early modern philosophy (outside the scholastic discourse) the “difficult question of whether a spoken word signifies the mental concept or the thing” (see 5.2.), which once had stimulated a rich variety of distinctly elaborated semantic theories, was unanimously considered definitively answered. Due to the prevalent persuasion that the primary function of speech was to express one’s thoughts, most of the non-scholastic early modern authors took up the view that words signify concepts rather than things – the very view which, from a scholastic point of view, had been classified as the “more antiquated” one (Rubius 1605: 2.18). Given, however, that concepts, ideas, or thoughts are the primary meaning of words, the thesis that language has a formative influence on thought, which became increasingly widely accepted during the 18th century, turns out to be of fundamental importance to semantics.
6.2. The influence of conventional language on thought processes

Peter of Ailly (1330–1421) claimed that there is such a habitually close connection between the concept of a thing and the concept of its verbal expression that by stimulating one of these concepts the other is always immediately stimulated as well (Kaczmarek 1988: 403ff). Still closer is the correlation of language and thought in Giovanni Battista Giattini’s (1601–1672) account of language acquisition. Upon hearing certain words frequently and in combination with the sensory perception of their significata, a ‘complex species’ is generated, and this species comprises, just like the Saussurean sign, the sound-image as well as the concept of its correlated object (“… generantur … species complexae talium vocum simul et talium obiectorum ex ipsa consuetudine”; Giattini 1651: 431). In this vein, Jean de Raey (1622–1702) sees the “union of the external vocal sound and the inner sense” as the “immediate fundament of signification” and holds that sound (sonus) and meaning or sense (sensus) make up “one and the same thing rather than two things” (Raey 1692: 29). Thinking, therefore, “seems to be just some kind of internal speech or logos endiathetos without which there would be no reasoning” (30). Even if de Raey refers to the old tradition of describing the process of thinking in terms of internal speech (see 2.2.), a fundamental difference becomes apparent when he claims that “both the speaker and the hearer have in mind primarily the sound rather than the meaning and often the sound without meaning but never the meaning without sound” (29). Until then, internal speech had generally been conceived as being performed in what Augustinus had called verba nullius linguae (see 4.1.); from de Raey (and other authors of that time) onwards, inner speech is intimately linked to spoken language.
The habitual connection of language and thought was, in the 17th century, the theoretical foundation for the thesis of an influence of language on thinking. As the introduction to the Port-Royal Logic notes: “this custom is so strong, that even when we think alone, things present themselves to our minds only in connection with the words to which we have been accustomed to recourse in speaking to others” (Arnauld & Nicole 1662: 30). Because most 17th century authors adhered to the priority of thought over language, they considered this custom just a bad habit. While this critical attitude remained a constant factor in the philosophical view of language during the following centuries, a new and alternative perspective, which also took its positive effects into account, was opened with Hobbes, Locke and Leibniz.
6.3. Thomas Hobbes (1588–1679)

In his Logic (Computatio sive logica), which makes up the first part of De Corpore (1655, English 1656), the first section of his Elementa Philosophiae, Thomas Hobbes draws a parallel between reasoning and a mathematical calculus, the basic operations of which can be described as an addition or subtraction of ideas, thoughts, or concepts (Hobbes 1839a: 3–5). Because thoughts are volatile and fleeting (fluxae et caducae), they have to be fixed by means of marks (notae) which, in principle, everyone can arbitrarily choose for himself (1839a: 11f). Because the progress of science, however, can be obtained only in the form of a collective accumulation of knowledge, it is necessary that “the same notes be made common to many” (Hobbes 1839b: 14). So, in addition to these marks, signs (signa) as a means of communication are required. In fact, both functions are fulfilled by words: “The nature of a name consists principally in this, that it is a mark taken for memory’s sake; but it serves also by accident to signify and make known to others what we remember ourselves” (Hobbes 1839b: 15). Hobbes adopts the scholastic practice of viewing linguistic signs in light of the general notion of sign. His own concept of sign, however, according to which signs in general can be described as “the antecedents of their consequents, and the consequents of their antecedents”, is from the outset confined to the class of indexical signs (Hobbes 1839b: 14). Words and names ordered in speech must therefore be indexical signs rather than expressions of conceptions, just as they cannot be “signs of the things themselves; for that the sound of this word stone should be the sign of a stone, cannot be understood in any sense but this, that he that hears it collects that he that pronounces it thinks of a stone” (Hobbes 1839b: 15). It is true that from Roger Bacon onwards we find in scholastic logic the position that words are indexical signs of the speaker’s concepts. Connected to this, however, was always the assumption, explicitly denied by Hobbes, that the proper significate of words are the things talked about. Names, according to Hobbes, “though standing singly by themselves, are marks, because they serve to recall our own thoughts to mind, … [but] cannot be signs, otherwise than by being disposed and ordered in speech as parts of the same” (Hobbes 1839b: 15). As a result of the distinction between marks and signs, any significative function of words can be realized only in the framework of communication. These rudiments of a sentence semantics (Hungerland & Vick 1973), however, were not elaborated any further by Hobbes.
6.4. The ‘Port-Royal Logic’ and ‘Port-Royal Grammar’

The so-called Port-Royal Logic (Logique ou l’art de penser), published in 1662 by Antoine Arnauld (1612–1694) and Pierre Nicole (1625–1695), is not only one of the most influential early modern books on the issue of language but in some respects also the most symptomatic one. For “it marks, better than any other, the abandonment of the medieval doctrine of an essential connection between logic and semantics”, and treats the “most fundamental questions … with the kind of inattention to detail that came to characterize most of the many semantic theories of the Enlightenment” (Kretzmann 1967: 378a). The most influential, though actually quite modest, semantic doctrine of this text is the distinction between comprehension and extension, which is commonly seen as a direct ancestor of the later distinction between intension and extension. And yet it is different: whereas the comprehension of an idea is tantamount to “the attributes it comprises in itself that cannot be removed from it without destroying it”, extension is described as “the subjects with which that idea agrees, which are also called the inferiors of the general term, which in relation to them, is called superior; as the idea of triangle is extended to all the various species of triangle” (Arnauld & Nicole 1662: 61f). It is manifest that Arnauld in this passage unfolds his doctrine along the lines of the relation of genus and species, so that the idea of a certain species would be part of the extension of the idea of the superior genus – which does not match the current use of the intension/extension distinction. The extension of a universal idea, however, does not consist of species alone; for Arnauld also notes that the extension of a universal idea can be restricted in two different modes: either (1) “by joining another distinct or determinate idea to it” (e.g. “right-angled” to “triangle”), which makes it the idea of a certain subclass of the universal idea (= extension 1), or (2) by “joining to it merely an indistinct and indeterminate idea of a part” (e.g. the quantifying syncategoreme “some”), which makes it the idea of an undetermined number of individuals (= extension 2) (Arnauld & Nicole 1662: 62). What Arnauld intends to convey is simply that the restriction of a general idea can be achieved either by specification or by quantification – which, however, results in two different and hardly combinable notions of extension.

While empiricism, according to which sense experience is the ultimate source of all our concepts and knowledge, was prominently represented by Hobbes, the Port-Royal Logic, taking a distinctly Cartesian approach, is part of the rationalist philosophy acknowledging the existence of innate ideas or, at least, of predetermined structures of rational thought. This also holds for the Port-Royal Grammar (1660) (Arnauld & Lancelot 1667/1966), which opened the modern tradition of universal grammar that dominated linguistic studies in the 17th and 18th centuries (Schmitter 1996). The universal grammarians aimed to reduce, in a Chomskyan-style analysis, the fundamental structures of language to universally predetermined mental structures. The distinction between deep and surface structure seems to be anticipated when Nicolas Beauzée (1717–1789) claims that since all languages are founded on an identical “méchanisme intellectuel”, the “différences qui se trouvent d’une langue à l’autre ne sont, pour ainsi dire, que superficielles” (Beauzée 1767: viiif).
Whereas the rationalist grammarians took language as “the exposition of the analysis of thought” (Beauzée 1767: xxxiiif) and thus as a means of detecting the rules of thought, empiricists like Locke or Condillac saw language as a means of forming and analyzing
complex ideas, thus showing a pronounced tendency to ascribe to language a certain influence on thought.
6.5. John Locke (1632–1704)

Locke’s Essay Concerning Human Understanding (1690/1975) is the most influential early modern text on language, even if the third book, which is devoted to the issue “of words”, hardly offers more than a semantics of names, differentiating between names of simple ideas, of mixed modes, and of natural substances. In this context Locke focuses on the question of how names and the ideas they stand for are related to external reality. With regard to simple ideas the answer is simple as well. For their causation by external objects shows such a degree of regularity that our simple ideas can be considered “very near and undiscernably alike” (389). Error and dissent, therefore, turn out to be primarily the result of the inconsiderate use of language: “Men, who well examine the Ideas of their own Minds, cannot much differ in thinking; however, they may perplex themselves with words” (180). Locke places emphasis on the priority of ideas over words in some passages (437; 689) and distinguishes between a language-independent mental discourse and its subsequent expression in words (574ff). However, the thesis that thought is at least in principle independent of language is counterbalanced by Locke’s account of individual language acquisition and the actual use of language. Even if the idea is logically prior to the corresponding word, this relation is inverted in language learning. For in most cases the meaning of words is socially imparted (Lenz 2010), so that we learn a word before being acquainted with the idea customarily connected to it (Locke 1975: 437). This habitual connection of ideas with words effects not only an excitation of ideas by words but quite often a substitution of the former by the latter (408). Thus, the mental proposition made up of ideas actually turns out to be a marginal case, for “most Men, if not all, in their Thinking and Reasoning within themselves, make use of Words instead of Ideas” (575). Even if clear and distinct knowledge were best achieved by “examining and judging of Ideas by themselves, their Names being quite laid aside”, this is, as Locke concedes, “through the prevailing custom of using Sounds for Ideas … very seldom practised” (579).

Locke’s theory of meaning is often characterized as vague or even incoherent (Kretzmann 1968; Landesman 1976). For, on the one hand, Locke states that “Words in their primary or immediate Signification, stand for nothing, but the Ideas in the Mind of him that uses them” (Locke 1975: 405f, 159, 378, 402, 420, 422), so that words “can be Signs of nothing else” (408). On the other hand, like most scholastic authors but unlike the contemporary trend of the 17th and 18th centuries, he considers ideas as signs of things and advocates the view that words “ultimately … represent Things” (Locke 1975: 520), in accordance with the scholastic mediantibus conceptibus thesis (see 5.2.; Ashworth 1984). Whereas in the late 19th century it was considered “one of the glories of Locke’s philosophy that he established the fact that names are not the signs of things but in their origins always the signs of concepts” (Müller 1887: 77), it is precisely for this view that Locke’s semantic theory is often criticized as a paradigm case of private-language philosophy and semantic subjectivism. But if there is something like semantic subjectivism in Locke, then it is more in the sense of a problem that he is pointing to than something that his theory tends to result in. For one of his main points regarding our use of language
is that we should keep it consistent with the use of others (Locke 1975: 471), since words are “no Man’s private possession, but the common measure of Commerce and Communication” (514).

Locke saw sensualism as supported by an interesting observation regarding the meaning change of words (see article 99 (Fritz) Theories of meaning change). Many words, he noticed, “which are made use of to stand for actions and notions quite removed from sense, have their rise from … obvious sensible ideas and are transferred to more abstruse significations” (Locke 1975: 403). Therefore large parts of our vocabulary are “metaphorical” concepts in the sense in which metaphor is defined by the modern cognitive account (see article 26 (Tyler & Takahashi) Metaphors and metonymies), i.e. thinking about a concept from one knowledge domain in terms of another domain, as exemplified by terms like “imagine, apprehend, comprehend, adhere, conceive, instil, disgust, disturbance, tranquillity”. This view was substantiated by Leibniz’s comments on the epistemic function of the “analogie des choses sensibles et insensibles”, as it becomes manifest in language. It would be worthwhile, Leibniz maintained, to consider “l’usage des prepositions, comme à, avec, de, devant, en, hors, par, pour, sur, vers, qui sont toutes prises du lieu, de la distance, et du mouvement, et transferées depuis à toute sorte de changemens, ordres, suites, différences, convenances” (Leibniz 1875–1890: 5.256). Kant, too, agreed in the famous §59 of his Critique of Judgement that this symbolic function of language would be “worthy of a deeper study”. For “… the words ground (support, basis), to depend (to be held up from above), to flow from (instead of to follow), substance … and numberless others, are … symbolic hypotyposes, and express concepts without employing a direct intuition for the purpose, but only drawing upon an analogy with one, i.e., transferring the reflection upon an object of intuition to quite a new concept, and one with which perhaps no intuition could ever directly correspond.” Thus, according to Locke, Leibniz and Kant, metaphor is not simply a case of deviant meaning but rather, as modern semantics has found out anew, a ubiquitous feature of language and thought.
6.6. G. W. Leibniz (1646–1716) and the tradition of symbolic knowledge

While Hobbes and Locke, at least in principle and to a certain degree, still acknowledged the possibility of a non-linguistic mental discourse or mental proposition, Gottfried Wilhelm Leibniz emphasized the dependency of thinking on the use of signs: “thinking can take place without words … but not without other signs” (1875–1890: 7.191). For “all our reasoning is nothing but a process of connecting and substituting characters, which may be words or other signs or images” (7.31). This view became explicit and later extremely influential under the label of cognitio symbolica (symbolic knowledge, symbolic cognition), a term Leibniz coined in his Meditationes de cognitione, veritate et ideis (1684; 1875–1890: 4.423). Symbolic knowledge is opposed to intuitive knowledge (cognitio intuitiva), which is defined as a direct and simultaneous conception of a complex notion together with all its partial notions. Because the limited human intellect cannot comprehend more complex concepts otherwise than successively, the complex concept of the thing itself must be substituted by a sensible sign in the process of reasoning, always supposing that a detailed explication of its meaning could be given if needed. Leibniz, therefore, maintains that the knowledge or
cognition of complex objects or notions is always symbolic, i.e. performed in the medium of signs (1875–1890: 4.422f). The possibility and validity of symbolic knowledge is based on the principle of proportionality, according to which the basic signs used in symbolic knowledge may be chosen arbitrarily, provided that the internal relations between the signs are analogous to the relations between the things signified (7.264). In his Dialogue (1677) Leibniz remarks that “even if the characters are arbitrary, still the use and interconnection of them has something that is not arbitrary – viz. a certain proportionality between the characters and the things, and the relations among different characters expressing the same things. This proportion or relation is the foundation of truth.” (Leibniz 1875–1890: 7.192). The notion of cognitio symbolica provides the epistemological foundation of both his project of a characteristica universalis or universal language of science and his philosophy of language. For the basic principle of analogy is also realized in natural languages to a certain extent. Leibniz therefore holds that “languages are the best mirror of the human mind and that an exact analysis of the signification of words would make known the operations of the understanding better than would anything else” (5.313). Especially through its reception by Christian Wolff (1679–1754) and his school, the doctrine of symbolic knowledge became one of the main epistemological issues of the German Enlightenment. In this tradition it was a common conviction that the use of signs in general, and of language in particular, performs an indispensable function for any higher intellectual operation. Hence Leibniz’s proposal of a characteristica universalis was massively taken up, even though mostly in a somewhat altered form. For what the 18th century authors were generally aiming at was not the invention of a sign system for obtaining general knowledge, but rather a general science of sign systems. The general doctrine of signs, referred to with names like Characteristica, Semiotica, or Semiologia, was considered a most important desideratum. In 1724 Wolff’s disciple Georg Bernhard Bilfinger (1693–1750) suggested the name Ars semantica for this, as he saw it, hitherto neglected discipline, the subject matter of which would be the knowledge of all sorts of signs in general as well as the theory of the invention, right use, and assessment of linguistic signs in particular (Bilfinger 1724: 298f). The first extended attempt to fill this gap was made by Johann Heinrich Lambert (1728–1777) with his Semiotik, oder die Lehre von der Bezeichnung der Gedanken und Dinge (Semiotics, or the doctrine of the signification of thoughts and things), published as the second part of his Neues Organon (1764). The leading idea of this work is Leibniz’s principle of proportionality, which guarantees, as Lambert claims, the interchangeability of “the theory of the signs” and “the theory of the objects” signified (Lambert 1764: 3.23–24). Besides the theory of sign invention, the hermeneutica, the theory of sign interpretation, made up an essential part of the characteristica. Within the framework of ‘general hermeneutics’ (hermeneutica generalis), originally designed by Johann Conrad Dannhauer (1603–1666) as a complement to Aristotelian logic, the reflections on linguistic meaning focused on sentence meaning.
Due to Dannhauer’s influential Idea boni interpretis (Presentation of the good interpreter, 1630), some relics of scholastic semantics were taken up by hermeneutical theory, for which particularly the theory of supposition (see 5.4.), which provided the “knowledge about the modes in which the signification of words may vary according to their connection to others” (Reusch 1734: 266), had to be of pivotal interest. The developing discipline of hermeneutics, as “the science
of the very rules in compliance with which the meanings can be recognized from their signs” (Meier 1757: 1), shows a growing awareness of several important semantic doctrines: for instance, the methodological necessity of the principle of charity in the form of a presumption of consistency and rationality (Scholz 1999: 35–64); the distinction between language meaning and contextual meaning, as it is present in Christian August Crusius’s (1747: 1080) distinction between grammatical meaning (grammatischer Verstand), i.e. “the totality of meanings a word may ever have in one language”, and logical meaning, i.e. the “totality of what a word can mean at a certain place and in a certain context”; or the principle of compositionality (see article 6 (Pagin & Westerståhl) Compositionality), as it appears in Georg Friedrich Meier’s claim that “the sense of a speech is the sum total of the meanings of the words that make up this speech and which are connected to and determining each other” (Meier 1757: 57).
6.7. Condillac (1714–1780)

One of the most decisive features of 18th century science is its historical approach. In linguistics this resulted in a great number of works on the origin and development of language (Gessinger & von Rahden 1989). The way in which the historic-genetic point of view opened new perspectives on the relation between language and thought is most clearly reflected in Etienne Bonnot de Condillac’s Essai sur l’origine des connaissances humaines (1746). Already in the early 18th century it was widely accepted that linguistic signs provide the basis for virtually any intellectual knowledge. Christian Thomasius (1655–1728) had argued that the use of language plays a decisive role in the ontogenetic development of each individual’s cognitive faculties (Meier-Oeser 2008). Condillac goes even further and argues that the same holds for the phylogenetic development of mankind as well. Language, therefore, is not only essentially involved in the formation of thoughts or abstract ideas but also in the formation of the subject of thought, viz. of man as an intellectual being. According to Condillac all higher cognitive operations are nothing but ‘transformed sensation’ (sensation transformée). The formative principle that effects this transformation is language or, more generally, the use of signs (l’usage des signes). Condillac reconstructs the historical process of language development as a process leading from a primordial natural language of actions and gestures (langage d’action), viewed as a language of simultaneous ideas (langage des idées simultanées), to the language of articulate voice (langage des sons articulés), viewed as a language of successive ideas (langage des idées successives). We are so deeply accustomed to spoken language, with its sequential catenation of articulate sounds, he notices, that we believe our ideas would by nature come to our mind one after another, just as we utter words one after another. In fact, however, the discursive structure of thinking is not naturally given but rather the result of our use of linguistic signs. The main effect of articulate language consists in the gradual analysis of complex and indistinct sensations into abstract ideas that are connected to and make up the meaning of words. Since language is a necessary condition of thought and knowledge, Locke was mistaken to claim that the primary purpose of language is to communicate knowledge (Condillac 1947–1951: 1.442a):

The primary purpose of language is to analyze thought. In fact we cannot exhibit the ideas that coexist in our mind successively to others except in so far as we know how to exhibit
them successively to ourselves. That is to say, we know how to speak to others only in so far as we know how to speak to ourselves.
It was therefore Locke’s adherence to the scholastic idea of mental propositions that prevented him from realizing “how necessary the signs are for the operations of the soul” (1.738a). Every language, Condillac claims, “is an analytic method”. Due to his comprehensive notion of language this also holds vice versa: “every analytic method is a language” (2.119a), so that Condillac, maintaining that sciences are by essence analytical methods, comes to his provocative and controversial thesis that “all sciences are nothing but well-made languages” (2.419a). Condillac’s theory of language and its epistemic function became the core topic of the so-called school of ‘ideology’ (idéologie) that dominated the French scene in the early 19th century. Although most authors of this school rejected the absolute necessity of signs and language for thinking, they adhered to the subjectivistic consequences of sensualism and considered it impossible “that one and the same sign should have the same value for all of those who use it and even for each of them at different moments of time” (Destutt de Tracy 1803: 405).
7. References

7.1. Primary Sources

Abelard, Peter 1927. Logica ingredientibus, Glossae super Peri ermenias. Ed. B. Geyer. Münster: Aschendorff.
Andreas, Antonius 1508. Scriptum in arte veteri. Venice.
Aquinas, Thomas 1989. Expositio libri peryermenias. In: Opera omnia I* 1. Ed. Commissio Leonina. 2nd edn. Rome.
Arnauld, Antoine & Pierre Nicole 1662/1965. La logique ou l’art de penser. Paris.
Arnauld, Antoine & Claude Lancelot 1667/1966. Grammaire générale et raisonnée ou La Grammaire de Port-Royal. Ed. H. E. Brekle, t. 1, reprint of the 3rd edn. Stuttgart-Bad Cannstatt: Frommann, 1966.
Augustinus 1963. De doctrina Christiana. In: W. M. Green (ed.). Sancti Augustini Opera (CSEL 80). Vienna: Österreichische Akademie der Wissenschaften.
Augustinus 1968. De trinitate. In: W. J. Mountain & F. Glorie (eds.). Aurelii Augustini Opera (CCSL 50). Turnhout: Brepols.
Augustinus 1975. De dialectica. Ed. Jan Pinborg, translation with introduction and notes by B. Darrel Jackson. Dordrecht: Reidel.
Bacon, Roger 1978. De signis. Ed. K. M. Fredborg, L. Nielsen & J. Pinborg. Traditio 34, 75–136.
Bacon, Roger 1988. Compendium Studii Theologiae. Ed. Th. S. Maloney. Leiden: Brill.
Beauzée, Nicolas 1767. Grammaire générale. Paris.
Bilfinger, Georg Bernhard 1724. De Sermone Sinico. In: Specimen doctrinae veterum Sinarum moralis et politicae. Frankfurt/M.
Boethius, Anicius M. T. S. 1880. In Perihermeneias editio secunda. Ed. C. Meiser. Leipzig: Teubner.
Boethius Dacus 1969. Modi significandi. Ed. J. Pinborg & H. Roos. Copenhagen: C. E. C. Gad.
Condillac, Etienne Bonnot de 1947–1951. Oeuvres philosophiques. Ed. G. Le Roy. Paris: PUF.
Conimbricenses 1607. Commentaria in universam Aristotelis dialectica. Cologne.
Crusius, Christian August 1747. Weg zur Gewißheit und Zuverlässigkeit der menschlichen Erkenntnis. Leipzig.
Dannhauer, Johann Conrad 1630. Idea boni interpretis. Strasburg.
Destutt de Tracy, Antoine 1803. Éléments d’idéologie. Vol. 2. Paris.
Giattini, Johannes Baptista 1651. Logica. Rome.
Henry of Ghent 1520. Summa quaestionum ordinarium. Paris.
Hobbes, Thomas 1839a. Opera philosophica quae scripsit omnia. Ed. W. Molesworth, vol. 1. London.
Hobbes, Thomas 1839b. The English Works. Ed. W. Molesworth, vol. 1. London.
Hoffbauer, Johannes Christoph 1789. Tentamina semiologica sive quaedam generalem theoriam signorum spectantia. Halle.
Lambert, Johann Heinrich 1764. Neues Organon oder Gedanken über die Erforschung und Bezeichnung des Wahren. Vol. 2. Leipzig.
Leibniz, Gottfried Wilhelm 1875–1890. Die philosophischen Schriften. Ed. C. I. Gerhardt. Berlin. Reprinted: Hildesheim: Olms, 1965.
Locke, John 1975. An Essay concerning Human Understanding. Ed. P. H. Nidditch. Oxford: Clarendon Press.
Meier, Georg Friedrich 1757. Versuch einer allgemeinen Auslegungskunst. Halle.
Nicolaus a S. Iohanne Baptista 1687. Philosophia augustiniana. Genova.
Ockham, William of 1974. Summa logicae. Ed. Ph. Boehner, G. Gál & S. Brown. In: Opera philosophica 1. St. Bonaventure, NY: The Franciscan Institute.
Ockham, William of 1978. Expositio in librum Perihermeneias Aristotelis. Ed. A. Gambatese & S. Brown. In: Opera philosophica 2. St. Bonaventure, NY: The Franciscan Institute.
Raey, Jean de 1692. Cogitata de interpretatione. Amsterdam.
Raulin, John 1500. In logicam Aristotelis commentarium. Paris.
Reusch, Johann Peter 1734. Systema logicum. Jena.
Rubius, Antonius 1605. Logica mexicana. Cologne.
Scotus, John Duns 1891. In primum librum perihermenias quaestiones. In: Opera Omnia. Ed. L. Vivès, vol. 1. Paris.
Smiglecius, Martin 1634. Logica. Oxford.
de Soto, Domingo 1554. Summulae. Salamanca.
Versor, Johannes 1572. Summulae logicales. Venice.
7.2. Secondary Sources

Annas, Julia E. 1992. Hellenistic Philosophy of Mind. Berkeley, CA: University of California Press.
Ashworth, E. Jennifer 1984. Locke on language. Canadian Journal of Philosophy 14, 45–74.
Ashworth, E. Jennifer 1987. Jacobus Naveros on the question: ‘Do spoken words signify concepts or things?’. In: L. M. de Rijk & C. A. G. Braakhuis (eds.). Logos and Pragma. Nijmegen: Ingenium Publishers, 189–214.
Barnes, Jonathan 1993. Meaning, saying and thinking. In: K. Döring & T. Ebert (eds.). Dialektiker und Stoiker. Zur Logik der Stoa und ihrer Vorläufer. Stuttgart: Steiner, 47–61.
Baxter, Timothy M. S. 1992. The Cratylus: Plato’s Critique of Naming. Leiden: Brill.
Bühler, Karl 1934. Sprachtheorie. Jena: Fischer.
Bursill-Hall, G. E. 1971. Speculative Grammars of the Middle Ages. The Hague: Mouton.
Ebbesen, Sten 1983. The odyssey of semantics from the Stoa to Buridan. In: A. Eschbach & J. Trabant (eds.). History of Semiotics. Amsterdam: Benjamins, 67–85.
Fredborg, Karen Margareta 1981. Roger Bacon on ‘Impositio vocis ad significandum’. In: H. A. G. Braakhuis, C. H. Kneepkens & L. M. de Rijk (eds.). English Logic and Semantics. Nijmegen: Ingenium Publishers, 167–191.
Gessinger, Joachim & Wolfert von Rahden (eds.) 1989. Theorien vom Ursprung der Sprache. Berlin: de Gruyter.
Graeser, Andreas 1978. The Stoic theory of meaning. In: J. M. Rist (ed.). The Stoics. Berkeley, CA: University of California Press, 77–100.
Graeser, Andreas 1996. Aristoteles. In: T. Borsche (ed.). Klassiker der Sprachphilosophie. München: C. H. Beck, 33–47.
Hungerland, Isabel C. & George R. Vick 1973. Hobbes’s theory of signification. Journal of the History of Philosophy 11, 459–482.
Jacobi, Klaus 1983. Abelard and Frege. The semantics of words and propositions. In: V. M. Abrusci, E. Casari & M. Mugnai (eds.). Atti del Convegno Internazionale di storia della logica, San Gimignano 1982. Bologna: CLUEB, 81–96.
Kaczmarek, Ludger 1988. ‘Notitia’ bei Peter von Ailly. In: O. Pluta (ed.). Die Philosophie im 14. und 15. Jahrhundert. Amsterdam: Grüner, 385–420.
Kretzmann, Norman 1967. Semantics, history of. In: P. Edwards (ed.). The Encyclopedia of Philosophy. Vol. 7. New York: The Macmillan Company, 358b–406a.
Kretzmann, Norman 1968. The main thesis of Locke’s semantic theory. Philosophical Review 78, 175–196.
Kretzmann, Norman 1974. Aristotle on spoken sound significant by convention. In: J. Corcoran (ed.). Ancient Logic and its Modern Interpretation. Dordrecht: Kluwer, 3–21.
Kretzmann, Norman 1982. Syncategoremata, exponibilia, sophismata. In: N. Kretzmann, A. Kenny & J. Pinborg (eds.). The Cambridge History of Later Medieval Philosophy. Cambridge: Cambridge University Press, 211–245.
Landesman, Charles 1976. Locke’s theory of meaning. Journal of the History of Philosophy 14, 23–35.
Lenz, Martin 2010. Lockes Sprachkonzeption. Berlin: de Gruyter.
Lewry, Osmund 1981. Robert Kilwardby on meaning. A Parisian course on the logica vetus. Miscellanea mediaevalia 13, 376–384.
Lieb, Hans-Heinrich 1981. Das ‘semiotische Dreieck’ bei Ogden und Richards: Eine Neuformulierung des Zeichenmodells von Aristoteles. In: H. Geckeler et al. (eds.). Logos Semantikos. Studia Linguistica in Honorem Eugenio Coseriu, vol. 1. Berlin: de Gruyter/Madrid: Gredos, 137–156.
Long, Anthony A. 1971. Language and thought in Stoicism. In: A. A. Long (ed.). Problems in Stoicism. London: The Athlone Press, 75–113.
Long, Anthony A. & David N. Sedley 1987. The Hellenistic Philosophers. Cambridge: Cambridge University Press.
Magee, John 1989. Boethius on Signification and Mind. Leiden: Brill.
Meier-Oeser, Stephan 1996. Signifikation. In: J. Ritter & K. Gründer (eds.). Historisches Wörterbuch der Philosophie 9. Basel: Schwabe, 759–795.
Meier-Oeser, Stephan 1997. Die Spur des Zeichens. Das Zeichen und seine Funktion in der Philosophie des Mittelalters und der frühen Neuzeit. Berlin: de Gruyter.
Meier-Oeser, Stephan 1998. Synkategorem. In: J. Ritter & K. Gründer (eds.). Historisches Wörterbuch der Philosophie 10. Basel: Schwabe, 787–799.
Meier-Oeser, Stephan 2004. Sprache und Bilder im Geist. Philosophisches Jahrbuch 111, 312–342.
Meier-Oeser, Stephan 2008. Das Ende der Metapher von der ‘inneren Rede’. Zum Verhältnis von Sprache und Denken in der deutschen Frühaufklärung. In: H. G. Bödecker (ed.). Strukturen der deutschen Frühaufklärung 1680–1720. Göttingen: Vandenhoeck & Ruprecht, 195–223.
Meier-Oeser, Stephan 2009. Walter Burley’s propositio in re and the systematization of the ordo significationis. In: S. F. Brown, Th. Dewender & Th. Kobusch (eds.). Philosophical Debates at Paris in the Early Fourteenth Century. Leiden: Brill, 483–506.
Müller, Max 1887. The Science of Thought. London: Longmans, Green & Co.
Panaccio, Claude 1999. Le Discours Intérieur de Platon à Guillaume Ockham. Paris: Seuil.
Pinborg, Jan 1967. Die Entwicklung der Sprachtheorie im Mittelalter. Münster: Aschendorff.
Pinborg, Jan 1972. Logik und Semantik im Mittelalter. Ein Überblick. Ed. H. Kohlenberger. Stuttgart-Bad Cannstatt: Frommann-Holzboog.
Pohlenz, Max 1939. Die Begründung der abendländischen Sprachlehre durch die Stoa. (Nachrichten von der Gesellschaft der Wissenschaften, Göttingen.
Philologisch-historische Klasse, N.F. 3). Göttingen: Vandenhoeck & Ruprecht, 151–198.
Rijk, Lambertus Maria de 1967. Logica modernorum, vol. II/1. Assen: Van Gorcum.
Schmitter, Peter (ed.) 1996. Sprachtheorien der Neuzeit II. Von der Grammaire de Port-Royal zur Konstitution moderner linguistischer Disziplinen. Tübingen: Narr.
Schmitter, Peter 2001. The emergence of philosophical semantics in early Greek antiquity. Logos and Language 2, 45–56.
Scholz, Oliver R. 1999. Verstehen und Rationalität. Frankfurt/M.: Klostermann.
Sedley, David 1996. Aristotle’s de interpretatione and ancient semantics. In: G. Manetti (ed.). Knowledge Through Signs. Turnhout: Brepols, 87–108.
Sedley, David 2003. Plato’s Cratylus. Cambridge: Cambridge University Press.
Tachau, Katherine H. 1987. Wodeham, Crathorn and Holcot: The development of the complexe significabile. In: L. M. de Rijk & H. A. G. Braakhuis (eds.). Logos and Pragma. Nijmegen: Ingenium Publishers, 161–189.
Weidemann, Hermann 1982. Ansätze zu einer semantischen Theorie bei Aristoteles. Zeitschrift für Semiotik 4, 241–257.
Stephan Meier-Oeser, Berlin (Germany)
9. The emergence of linguistic semantics in the 19th and early 20th century

1. Introduction: The emergence of linguistic semantics in the context of interdisciplinary research in the 19th century
2. Linguistic semantics in Germany
3. Linguistic semantics in France
4. Linguistic semantics in Britain
5. Conclusion
6. References
Abstract

This chapter deals with the 19th-century roots of current cognitive and pragmatic approaches to the study of meaning and meaning change. It demonstrates that 19th-century linguistic semantics has more to offer than the atomistic historicism for which 19th-century linguistics became known and for which it was often criticised. Semanticists in Germany, France and Britain in particular sought to study meaning and change of meaning from a much more holistic point of view, drawing inspiration from philosophy, biology, geology, psychology, and sociology to investigate how meaning is ‘made’ in the context of social interaction and how it changes over time under pressure from changing linguistic, societal and cognitive needs and influences.
The subject in which I invite the reader to follow me is so new in kind that it has not even been given a name. The fact is that most linguists have directed their attention to the forms of words: the laws which govern changes in meaning, the choice of new expressions, the birth and death of phrases, have been left behind or have been noticed only in passing. Since this subject deserves a name as much as does phonetics or morphology, I shall call it semantics […], the science of meaning. (Bréal 1883/1991: 137)

In memory of Peter Schmitter, who first helped me to explore the history of semantics
1. Introduction: The emergence of linguistic semantics in the context of interdisciplinary research in the 19th century

The history of semantics as a reflection on meaning is potentially infinite, starting probably with Greek or Hindu philosophers of language and embracing more than two thousand years of the history of mankind. It is spread over numerous disciplines, from ancient philosophy (Schmitter 2001a) to modern cognitive science. The history of semantics as a linguistic discipline is somewhat shorter and has been well explored (Nerlich 1992b, 1996a, 1996b). This chapter therefore summarizes results from research dealing with the history of semantics from the 1950s onwards, and I shall follow the standard view that semantics as a linguistic discipline began with Christian Karl Reisig’s lectures on Latin semasiology or Bedeutungslehre, given in the 1820s (Schmitter 1990, 2004). As one can see from the motto cited above, another central figure in the history of linguistic semantics in the 19th century is undoubtedly Michel Bréal, the author of the famous Essai de sémantique, published in 1897, the cumulative product of work started as early as 1866 (Bréal 1883/1991). When this seminal book was translated into English in 1900, the name of the discipline that studied linguistic meaning and changes of meaning became ‘semantics’, and other terms, such as semasiology or sematology, were sidelined in the 20th century.

Although the focus here is on linguistic semantics, it should not be forgotten that the study of ‘meaning’ also preoccupied philosophers and semioticians. Two seminal figures in the 19th century who should be mentioned in this context are Charles Sanders Peirce in the United States, who worked in the tradition of American pragmatism (Nerlich & Clarke 1996), and Gottlob Frege, who helped found mathematical logic and analytic philosophy (see article 10 (Newen & Schröder) Logic and semantics and article 3 (Textor) Sense and reference). Although Peirce exerted enormous influence on the development of semantics, pragmatics and semiotics in the late 19th and early 20th century, his main interest can be said to have been in epistemology. Frege too exerted great influence on the development of formal semantics (see article 11 (Kempson) Formal semantics and representationalism and article 14 (ter Meulen) Formal methods), truth-conditional semantics, feature semantics and so on, especially through his distinction between Sinn and Bedeutung or sense and reference. However, his main interest lay in logic, arithmetic and number theory. Neither Frege nor Peirce was widely discussed in the treatises on linguistic semantics which will be presented below, except in the philosophical and psychological reflections on meaning around the tradition of ‘significs’. Frege was a logician, not a linguist, and he himself pointed out that “[t]o a mind concerned with the beauties of language, what is trivial to the logician may seem to be just what is important” (Frege 1977: 10). The linguists discussed below were all fascinated with the beauty of language.
This chapter focuses on linguistic semantics in Germany, France, and Britain, thus leaving aside Eastern Europe, Russia, Scandinavia, and our closer neighbours, Italy, Spain, the Benelux countries and many more. However, the work carried out in these countries was, as far as I know, strongly influenced by, if not dependent on, the theories developed in Germany, France, and Britain. Terminologically, German linguists initially wrote about Semasiologie, French ones about la sémantique, and some English ones about sematology. In the end the term semantics was universally adopted.

In general terms one can say that linguistic semantics emerged from a dissatisfaction with traditional grammar on the one hand, which could not deal adequately with questions of meaning, and with traditional lexicography and etymology on the other, which did not give a satisfactory account of the evolution of meaning, listing the meanings of words in a rather arbitrary fashion instead of looking for a logical, natural or inner order in the succession of meanings. To redefine grammar, scholars looked for inspiration in the available traditions of philosophy; to redefine lexicography they grasped the tools provided by rhetoric, that is, the figures of speech, especially metaphor and metonymy (Nerlich 1998). The development of the field was, however, not only influenced by internal factors relating to the study of language but also by developments in other fields such as geology and biology, from which semantics imported concepts such as ‘uniformitarianism’, ‘transformation’, ‘evolution’, ‘organism’ and ‘growth’. After initial enthusiasm about ways to give semantics ‘scientific’ credibility in this way, a debate about whether framing the development and change of meaning in such terms was legitimate would preoccupy semanticists in the latter half of the 19th century.

The different traditions in the field of semantics were influenced by different philosophical traditions on the one hand and by different sciences on the other. In the case of German semasiology, the heritage of Kantian philosophy, idealism, the romantic movement, and the new type of philology were of seminal importance – or, to quote some names, the works of Immanuel Kant, especially his focus on the ‘active subject’, Wilhelm von Humboldt and his concepts of ‘ergon’ and ‘energeia’ (Schmitter 2001b), and Franz Bopp and his research into comparative grammar. German semasiology after Reisig was very much attached to the predominant paradigm in linguistic science, that is, to historical-comparative philology. This might be the reason why the term ‘semasiology’, as designating one branch of a prospering and internationally respected discipline, was at first so successful in English-speaking countries, where an autonomous approach to semantics was missing. This was not the case in France, where Bréal, from 1866 onwards, used semantic research as a way to challenge the German supremacy in linguistics. Later on, however, German Semasiologie, just like the French tradition of la sémantique, began to be influenced by psychology, and thus these two traditions moved closer together. French semantics, especially the Bréalian version, was influenced by the French philosophy of language which was rooted in the work of Etienne Bonnot de Condillac and his followers, the Idéologues, on words as signs. But Bréal was also an admirer of Bopp and of Humboldt.
Bréal first expressed his conviction that semantics should be a psychological and historical science in his review of Arsène Darmesteter’s seminal book La Vie des mots, written in 1887 (first published in English, in 1886, based on lectures given in London), and advocated caution in adopting terms and concepts from biology, such as organism and transformation (Bréal 1887). Darmesteter had derived his theory of semantic change from biological models, such as Charles Darwin’s theory of evolution and August Schleicher’s model of language as an evolving organism and of languages as organised into family trees, transforming
themselves independently of the speakers of the language. Darmesteter applied this conception to words themselves. It is therefore not astonishing to find that Darmesteter’s booklet contains a host of biological metaphors about the birth, life and death of words, their struggle for survival, etc. This metaphorical basis of Darmesteter’s theory was noted with skepticism by his colleagues, who agreed, however, that Darmesteter’s book was the first really sound attempt at analysing how and why words change their meanings. To integrate Darmesteter’s insights into his own theoretical framework, Bréal had only to replace the picture of the autonomous change of a language by the axiom that words change their meaning because the speakers and hearers use them in different ways, in different situations (Delesalle 1987, Nerlich 1990). In parallel with Reisig, Darmesteter used figures of speech, such as metaphor, metonymy and synecdoche, to describe the transitions between the meanings of words (Darmesteter 1887), inspired by the achievements of French rhetoric, especially the work of César Chesneau Du Marsais (1757) on tropes as used in ordinary language.

The English tradition of semantics emerged from philosophical discussions about language and mind in the 17th and 18th centuries (John Locke), and about etymology and the search for ‘true’ and original meaning (John Horne Tooke). Philosophical influences here were utilitarianism and a certain type of materialism. Semantics in its broadest sense was also used at first to underpin religious arguments about the divine origin of language. The most famous figure in what one might call religious semantics was Richard Chenevix Trench, an Anglican ecclesiastic, Dean of Westminster and later Archbishop of Dublin, who wrote numerous books on the history of the English language and the history of English words. His new ideas about dictionaries led the Philological Society in London to the creation of the New English Dictionary, later called the Oxford English Dictionary, which is nowadays the richest source-book for those who want to study semantic change in the English language. After the turn of the 19th to the 20th century one can observe in Britain a rapid increase in books on ‘words’ – the trivial literature of semantics, so to speak (Nerlich 1992a) – but also a more thoroughly philosophical reflection on meaning, as well as the start of a new tradition of contextualism in the work of (Sir) Alan Henderson Gardiner and John Rupert Firth, mainly influenced by the German linguist Philipp Wegener and the anthropologist Bronislaw Malinowski (see Nerlich & Clarke 1996).

Many of those interested in semantics tried to establish classifications of types or causes of semantic change, something that became in fact one of the main preoccupations of 19th-century semanticists. The classifications of types of semantic change were mostly based on logical principles, that is, a typology according to the figures of speech, such as metaphor, metonymy, and extension and restriction, specifying the type of relation or transition between the meanings of a word; the typologies of causes of semantic change were mostly based on psychological principles, specifying human drives and instincts; and finally some classifications applied historical or cultural principles; but most frequently these enterprises used a mixture of all three approaches.
Later on in the century, when the issue of phonetic laws reverberated through linguistics, semanticists tried not only to classify semantic changes but to find laws of semantic change that would be as strict as the sound laws were then believed to be; in doing so, some believed they would turn linguistics into a ‘science’ in the sense of natural science. After this preliminary sketch I shall now deal with the three traditions of semantics one by one: the German, the French, and the British. However, the reader of the following sections has to keep in mind that the three traditions of semantics are not as
strictly separable as it might appear. There were numerous links of imitation, influence, cross-fertilization, and collaboration.
2. Linguistic semantics in Germany

It has become customary to distinguish two main periods in German semantics: (1) a logico-historical or historico-descriptive one, and (2) a psychologico-explanatory one. The father of the first tradition is Reisig, the father of the second Steinthal.

Christian Karl Reisig (Schmitter 1987, 2004) was, like many other early semanticists, a classical philologist. In his lectures on Latin grammar given in the 1820s, and first published by his student Friedrich Haase in 1839 (2nd edition 1881–1890), he tried to reform the standard view of traditional grammar by adding to it a new discipline: semasiology or Bedeutungslehre, that is, the study of meaning in language. Grammar was normally considered to consist in the study of syntax and etymology (which then meant approximately the same as morphology or Formenlehre). Reisig claims that the word should not only be studied with regard to its form (etymology) and in its relation to other words (syntax), but as having a certain meaning. He points out that there are words whose meaning is neither determined by their form alone nor by their place in the sentence, and that the meaning of these words has to be studied by semasiology. More specifically, semasiology is the study of the development of the meaning of certain words, as well as the study of their use, both phenomena that were covered by neither etymology nor syntax. Reisig puts forward thirteen principles of semantics, according to which language precedes grammar, languages are the products of nations, not of single human beings, and language comes into being through imagination and enthusiasm in the social interaction of people. We shall see that this dynamic view of language and semantic change became central to historical semantics at the end of the 19th century, when it was also linked to a more contextual approach.

According to Reisig the evolution of a particular language is determined by free language-use within the limits set by the general laws of language. These general laws of language are Kant’s laws of pure intuition (space and time) and his categories. This means that language and language change are brought about by a dynamic interplay of several forces which Reisig derived from his knowledge of German idealism on the one side and the romantic movement on the other. In line with German idealistic philosophy, he wanted to find the general principles of semantic change, assumed to be based on general laws of the human mind. In tune with the romantic movement, he saw that every language has, however, its individuality, based on the history of the nation, and he recognized that speakers too have certain degrees of freedom in their creative use of language. This freedom is, however, constrained by certain habitual associations between ideas which had already been discussed in rhetoric, but which semasiology should, he claims, take account of, namely synecdoche, metonymy and metaphor. Whereas rhetoric focuses on their aesthetic function, semasiology focuses on the way these figures have come to structure language use in a particular language. The recognition of rhetorical figures, such as metaphor and metonymy, as habitual cognitive associations and also as procedures of semantic innovation and change was a very important step in the constitution of semantics as an autonomous discipline.
The figures of speech were reinterpreted in two ways and thus emancipated from their definitions in philosophical and literary discourse: they were no longer regarded as mere abuses of language, but as necessary for the life of language, that is, the continuous use
of it; and they were not mere ornaments of speech, but essential instruments of mind and language to cope with ever new communicative needs – an insight deepened in various contributions to the 19th- and early 20th-century philosophy of metaphor and rediscovered in modern cognitive semantics (Nerlich & Clarke 2000; see also article 100 (Geeraerts) Cognitive approaches to diachronic semantics).

To summarize: Reisig’s approach to semantics is philosophical with regard to the general laws of the mind, as inherited from Kant, but it is also historical, because Reisig stressed the necessity of always studying the Latin texts very closely. One can also claim that his semantics is to some degree influenced by psychology, inasmuch as Reisig adds to Kant’s purely philosophical principles of intuition and reason a third source of human language: sensations or feelings. It is also rhetorical and stylistic, a perspective later rejected by his followers Heerdegen and Hey. The theory of the sign that underlies his semantics is very traditional, that is, representational: the sign signifies an idea/concept (Begriff) or a feeling (Empfindung); language represents thought, a thought that itself represents the external world. For Reisig thoughts and feelings exist independently of the language that represents them, a view that Humboldt sought to overturn in his conception of language as constitutive of thought. The study of semantic change can therefore only be the study of the development of ideas or thoughts as reflected in the words that signify them. This development of thought can run along the following (‘logical’) lines, called metaphor (the interchange of ideas, II: 6), metonymy (the interchange of representations, II: 4), or synecdoche (II: 4). To these he adds the use of originally local prepositions to designate temporal relations, based on the interchange of the forms of intuition, time and space – again an insight that was rediscovered in modern cognitive semantics. Semasiology has to show how the different meanings of a word have emerged from the first meaning (logically and historically, II: 2), that is, how thought has unfolded itself in the meaning of words. This kind of ‘Vorstellungssemantik’ (Knobloch 1988: 271) would dominate German semasiology until approximately 1930. It was then challenged and eventually overthrown by Leo Weisgerber in linguistics and by Karl Bühler in psychology; 19th-century diachronic and psychological semantics was replaced by 20th-century synchronic and structural semantics.

However, Reisig does not leave it at the narrow view of semantics sketched above. He points out that the meaning of a word is not only constituted by its function of representing ideas, but that it is determined as well by the state of the language in general and by the use of a word according to a certain style or register (Schmitter 1987: 126). In fact, in dealing with the ‘stylistic’ problem of choosing a word from a series of words with ‘the same’ meaning, he reshapes his unidimensional definition of the sign as signifying one concept. Reisig deals here with synonyms, either complete or quasi-synonyms; he even indicates the importance of a new kind of study: synonymology. This rather broad conception of semantics, including word semantics but also stylistics and synonymology, is very similar to the one later advocated by Bréal.
However, Reisig’s immediate followers in Germany gradually changed his initial conception of semasiology in the following ways: they dropped the philosophical underpinnings and narrowed the scope of Reisig’s semasiology by abandoning the study of words in their stylistic context, reducing semasiology more and more to a purely atomistic study of changes in word-meaning. In this new shape and form, semasiology flowered in Germany, especially after the new edition of Reisig’s work in the 1880s. Up to the end of the century a host of treatises on the subject were published by philologists but also by a number of ‘schoolmen’ (Nerlich 1992a).
A new impetus to the study of meaning came from the rise of psychological thought in Germany, especially under the influence of Johann Friedrich Herbart, Heymann Steinthal, Moritz Lazarus, and Wilhelm Wundt, the latter three fostering a return to Humboldt’s philosophy of language. At the time when Steinthal tried to reform linguistic thought through the application of psychological principles, even the most hard-nosed linguists, the neogrammarians themselves, were forced to turn to psychology. This was due to the introduction of analogy as the second most important principle of language change, apart from sound laws.

As early as 1855 Steinthal had written a book in which he tried to refute the belief held by many of his fellow linguists that language is based on logical principles and that grammar is based on logic. According to Steinthal, language is plainly based on psychological principles, and these principles are largely of a semantic nature. Criticizing the view inherited from Reisig that grammar has three parts, etymology, semasiology, and syntax, he claims that there is meaning (what he calls, after Humboldt, an ‘inner form’) in etymology as well as in syntax. In short, semasiology should be part of etymology and syntax, not separated from them (1855: xxi–xxii). Using the Humboldtian dichotomy of inner and outer form, he wants to study grammar (etymology and syntax) from two points of view: semasiology and phonetics. For him language is ‘significant sound’, that is, sound and meaning cannot be artificially separated. In 1871 Steinthal wrote his Abriss der Sprachwissenschaft. Volume I was intended to be an introduction to psychology and linguistics. In this work, he wants to explain the origin of language as meaningful sound. His theory of the emergence of language can be compared to modern symbolic interactionism (Nerlich & Clarke 1998). The principal axiom is that when we emit a sound which is understood by the other in a certain way, we understand not only the sound we made, but we understand ourselves, attain consciousness. The origin of language and of consciousness thus lies in understanding. This principle became very important to Philipp Wegener, who fostered a new approach to semantics, no longer a mere word-semantics, but a semantics of communication and understanding (Wegener 1885/1991). Steinthal’s conception of psychology was the basis of an influential book on semantic change in the Greek language by Max Hecht, which appeared in 1888 and was extensively quoted by the classical philologists among the semanticists.

Hermann Paul is often regarded as one of the leading figures in the neogrammarian movement, and his book, the Prinzipien der Sprachgeschichte, first published in 1880, is regarded by some as the bible of the neogrammarians. It is true that Paul intended his book at first to be just that. But already in the second edition (1886) he extensively elaborated his at first rather patchy thoughts on semantic topics, such that Bréal – normally rather critical of neogrammarian thought, especially their phonetic bias – could say in his review of the second edition (Bréal 1887) that Paul’s book constituted a major contribution to semantics (Bréal 1897: 307). How had this change of emphasis from sound change to semantic change come about?
In 1885 Wegener, like Paul a follower of Steinthal, had published his Untersuchungen über die Grundfragen des Sprachlebens (Wegener 1885/1991), in which he devoted a long chapter to semantic change, especially its origin in human communication and interaction. Paul had read (and reviewed) this book (just as Wegener had read and reviewed Paul’s). In doing so, Paul must have discovered many affinities between his ideas and those of Wegener, and he must have been inspired to devote more thought to semantic questions. What were the affinities? The most direct resemblance was their insistence on the
interaction between speaker and hearer; here again their debt to Steinthal is clear, as clear as their opposition to another very influential psychologist of language, namely Wundt. Paul’s intention was to get rid of abstractions or ‘hypostasiations’ such as the soul of a people or a language, ghosts that Wundt, and even Steinthal, still tried to catch. These entities, if indeed they are entities, escape, according to Paul, the grasp of any science that wants to be empirical. What can be observed, from a psychological and historical perspective, are only the psychological activities of individuals, but individuals who interact with others. This social and psychological interaction is a mediated one; it is mediated by physiological factors: the production and reception of sounds. Historical linguistics (and all linguistics should be historical in Paul’s eyes) as a science based on principles is therefore closely related to two other disciplines: physiology and the psychology of the individual. From this perspective, language use does not change autonomously, as had been believed by a previous generation of linguists, but neither can it be changed by an individual act of the will. It evolves through the cumulative changes occurring in the speech activity of individuals. This speech activity normally proceeds unconsciously – we are only conscious of what we want to say, not of how we say it or how we change what we use in our speech activity: the sounds and the meanings. Accordingly, Paul devotes one chapter to sound change, one to semantic change, and one to analogy (more concerned with the changes in word-forms).

The most important dichotomy that Paul introduced into the study of semantics is that of usual and occasional meaning (usuelle und okkasionelle Bedeutung) (Paul 1920/1975: 75), a distinction that exerted lasting influence on semantics and also on the psychology and philosophy of meaning (Stout 1891, for example). The usual signification is the accumulated sedimentation of occasional significations; the occasional signification, based on the usual one, is imbued with the intention of the speaker and reshaped by being embedded in the situation of discourse. This context-dependency of the occasional signification can have three forms: it depends on a certain perceptual background shared by speaker and hearer; it depends on what has preceded the word in the discourse; and finally it depends on the shared location of discourse. “Put the plates in the kitchen” is understood because we know that we are speaking about that kitchen here and now and no other. These contextual clues facilitate especially the understanding of words which are ambiguous or polysemous in their usual signification (but Paul does not use the term ‘polysemy’, which had been introduced by Bréal in 1887; Nerlich & Clarke 1997, 2003). A much more radical view of the situation as a factor in meaning construction was put forward by Wegener in 1885 (Nerlich 1990). Like so many linguists of the 19th century, Paul tries to state the main types of changes of meaning, but he insists that they correspond to the possibilities we have to modify the meaning of words on the level of occasional signification.
The first type is the specialization of meaning (what Reisig and Darmesteter would have called ‘synecdoche’), which he defines as the restriction of the extension of a word (the set of actual things the word describes) and the enrichment of its intension (the set of features that make up its content). This type of semantic change is very common. Paul gives the example of German Schirm, which can be used to designate any object employed as a ‘screen’. In its occasional usage it may signify a ‘fire-screen’, a ‘lamp-screen’, a ‘screen’ for the eyes, an ‘umbrella’, a ‘parasol’, etc. But normally, on hearing the word Schirm we think of a ‘Regenschirm’, an umbrella – what cognitive semanticists would now call its prototypical meaning. This meaning has somehow separated itself from the general meaning of ‘screen’ and become independent. A second basic means to extend (and
restrict) word meaning is metaphor. A third type of semantic change is the transfer of meaning onto that which is connected with the usual meaning spatially, temporally or causally (what would later be called transfer via ‘contiguity’). Astonishingly, Paul does not use the term ‘metonymy’ to refer to this type of semantic change.

One of the most important contributions to linguistics in general and semantics in particular was made by Wegener in his Untersuchungen über die Grundfragen des Sprachlebens, published in 1885. Although Wegener can to some extent be called a neogrammarian, he never accepted their strict distinction between physiology and psychology, as advocated for example by Hermann Osthoff. For Wegener language is a phenomenon based on the whole human being, psyche and body, a human being who is an integral part of a communicative situation. Paul was influenced by Wegener, and so was Gardiner, the Egyptologist and general linguist who dedicated his book The Theory of Speech and Language (Gardiner 1932/1951) to Wegener.

So much for some major contributions to German semantics. As one could easily devote an entire book to each of the three national trends in semantic thought, of which the German tradition was by far the most prolific, I can only indicate very briefly the major lines of development that semantics took in Germany after Paul. Many classical philologists continued the tradition started by Reisig. Others took Paul’s achievements as a starting point for treatises on semantic change that wanted to illustrate Paul’s main types of semantic change with more and more examples. Others still, such as Johan Stöcklein, tried to develop Paul’s core theory further by stressing, for instance, the importance of the context of the sentence for semantic change (Stöcklein 1898).

But most importantly, the influence of psychology on semantics increased strongly. Apart from Herbart’s psychology of mechanical association, which had had a certain influence on Steinthal and hence on Paul and Wegener, and apart from some more incidental influences such as that of Sigmund Freud and Carl Gustav Jung on Hans Sperber, for example, the most important development in the field of psychology was Wundt’s Völkerpsychologie. Two volumes of his monumental work on the psychology of such collective phenomena as language, myth and custom were devoted to language (Wundt 1900), and of these a considerable part was concerned with semantic change (on the psychology of language in Germany see Knobloch 1988). Wundt distinguished between regular semantic change based on social processes or, as he said, the psyche of the people, and singular semantic change, based on the psyche of the individual. He divided the first class into assimilative change and complicative change, the second into name-making according to individual (or singular) associations, individual (or singular) transfer of names, and metaphorically used words. In short, the different types of semantic change were mainly based on different types of association processes (similar to Reisig in this way). However, Wundt’s work attracted a substantial body of criticism, especially from the psychologist Karl Bühler (see Nerlich & Clarke 1998) and the philosopher and general linguist Anton Marty, who developed a descriptive semasiology in opposition to Wundt’s historical approach (Marty 1908), in this comparable to Raoul de La Grasserie in France (de La Grasserie 1908).
At least two other developments in German linguistics have to be mentioned: the new focus on words and things, that is, on designation (Bezeichnung) rather than on meaning (Bedeutung), and the new focus on lexical and semantic fields instead of single words. After the introduction of the first new perspective, the term ‘semasiology’ itself
changed its meaning, standing now in opposition to ‘onomasiology’. The second new perspective led to the flourishing new field of field semantics (Nerlich & Clarke 2000).
3. Linguistic semantics in France

After this sketch of the evolution of semasiology in Germany we now turn to France, where a rather different doctrine was being developed by Michel Bréal, the most famous of French semanticists. But Bréal was by no means the only one interested in semantic questions. Lexicographers, such as Emile Littré (1880) and later Darmesteter and Adolphe Hatzfeld, contributed to the discussion on semantic questions from a ‘naturalist’ point of view, applying insights of Darwin’s theory of evolution, of Lamarckian transformationism and, in the case of Littré, of Auguste Comte’s positivism to the problems of etymology. Littré was one of the first to advocate uniformitarian principles in linguistics, which became so important for Darmesteter and Bréal in France and William Dwight Whitney in the United States (Nerlich 1990). According to the uniformitarian view, inherited from geology (Lyell 1830–1833), the laws of language change now in operation, which can therefore be ‘observed’ in living languages, are the same as those that structured language change in the past. Hence, one can explain past changes by laws now in operation. Darmesteter was indeed the first to put forward a program for semantics which resembles in its broad scope that of Reisig before him and Bréal after him, and in its emphasis on history that of Paul. He contended that the philosophy of language should focus on the history of languages, the transformations of syntax, grammatical forms and word meanings, as a contribution to the history of the human mind, and he also claimed that the figures of speech structure changes in grammatical forms and syntactic constructions as well.

However, as early as the 1840s, before Littré and Darmesteter, the immediate predecessors of Bréal, another group of linguists had started to do ‘semantics’ under the heading of idéologie, or, as one of its members later called it, fonctologie, a term obviously influenced by Schleicher’s distinction between form, function and relation (Schleicher 1860). French semantics of this type focused, like the later German semasiology, on the isolated word, but even more on the idea it incarnates, and excluded from its investigation the sentential or other contexts. Later on de La Grasserie (1908) proposed an ‘integral semantics’ based on this framework. He was (with Marty) the first to point out the difference between synchronic and diachronic semantics, or, as he called them, ‘la sémantique statique’ and ‘la sémantique dynamique’. As we shall see in part 4 of this chapter, as early as 1831 the English philosopher Benjamin Humphrey Smart was aware of the dangers of that type of ‘ideology’ and wanted to replace it by his type of ‘sematology’, stressing heavily the importance of the context in the production and understanding of meaning – that is, replacing mental association by situational embeddedness (Smart 1831: 252).

However, the real winner in this ‘struggle for survival’ between opposing approaches to semantics was the new school surrounding Bréal. Language was no longer regarded as an organism, nor did words ‘live and die’. The focus was now on the language users, their psychological make-up and the process of mutual understanding. It was in this process that ‘words changed their meanings’. Hence the laws of semantic change were no longer regarded as ‘natural’ or ‘logical’ laws, but as intellectual laws (Bréal 1883), or what one would nowadays call cognitive laws. This new psychological approach to semantic
problems resembled that advocated in Germany by Steinthal, Paul and Wegener. Paul, Wegener, Darmesteter, and Bréal all stressed that the meaning of a word is determined not so much by its etymological ancestry as by the value it has in current usage, a point of view that moved 19th-century semantics slowly from a purely diachronic to a more synchronic and functional perspective.

Although Bréal and Wegener seem not to have known each other’s work, their conceptions of language and of semantics are in some ways astonishingly similar (they also overlap with theories of language developed by Whitney in the United States and Johan Nicolai Madvig (1875) in Denmark, see Hauger 1994). Brought up in the framework of traditional German comparative linguistics, both objected to the reification of language as an autonomous self-evolving system, both saw in psychology a valuable aid for gaining insight into how people speak and understand each other and change the language as they go along, both made fruitful use of the form-function distinction, where the function drives the evolution of the linguistic form, both assumed that to understand a sentence the hearer has much more to do than to decode it word by word – s/he has to draw inferences from the context of the sentence as a whole, as well as from the context of its use or its function in discourse – and, finally, they both had a much broader conception of historical semantics than their contemporaries, especially some of the semasiologists in Germany and the ‘ideologists’ in France. In their eyes semantic change is a phenomenon not only of the word or idea, but must be observed at the morphological and syntactic level, too. The evolution of grammar or of syntax is thus an integral part of semantics.

Bréal’s thoughts on semantics, gathered and condensed in the Essai de sémantique (Science des significations) (1897), had evolved over many years. The stages in the maturation of his semantic theory were, briefly stated, the following: 1866 – lecture on the form and function of words; 1868 – lecture on latent ideas; 1883 – introduction of the term sémantique for the study of semantic change and more particularly for the search for the intellectual laws of language change in general; 1887 – review of Darmesteter’s book on the life of words, a review called quite intentionally ‘history of words’. Bréal rejected all talk about the life of words. For him words do not live and die; they change according to the use speakers make of them. But postulating the importance of the speaker was not enough for him; he went so far as to proclaim that the will or consciousness of the speaker are the ultimate forces of language change. This made him very unpopular among those French linguists who still adhered to the biological paradigm of language change, based on Schleicher’s views on the transformation of language. But Bréal was also criticised by some of his friends such as Antoine Meillet. Meillet stressed the role of collective forces, such as social groups, over and above the individual will of the speaker, and became as such important for a new trend in 20th-century French semantics: sociosemantics (Meillet 1904–1905). As mentioned before, Bréal was not the only one who wrote a review of Darmesteter’s book. Two of his friends and colleagues had done the same: Gaston Paris and Victor Henry, and they had basically adopted the same stand as Bréal.
Henry should be remembered for his criticism of Bréal’s insistence on consciousness, or at least certain degrees of consciousness, as a factor in language change. He held the view that all changes in language are the result of unconsciously applied procedures, a view he defended in his booklet on linguistic antinomies (1896) and in his study of a case of glossolalia (1901).

How was Bréal received in Germany, a country where a long tradition of ‘semasiology’ already existed? It is not astonishing to find that the psychologically oriented Bréal was
warmly applauded by Steinthal in his 1868 review of Bréal’s lecture on latent ideas. He was also mentioned approvingly by Paul (1920/1975: 78 fn. 2). Bréal’s division of linguistics into phonetics and semantics as the study of meaning at the level of the lexicon, morphology and syntax also corresponds to some extent to Steinthal’s conception outlined above. It did, however, disturb those who, after Ferdinand Heerdegen’s narrowing of the field of semasiology, practiced the study of semantic change almost exclusively on the level of the word, excluding morphology and syntax. This difference between French semantics and German semasiology was noted by Oskar Hey in his review of the Essai (1898: 551). Hey comes to the conclusion that if Bréal had not entirely ignored the German achievements in the field of semasiology, he would have noticed that everything he has to say about semantic change had already been said. He concedes, however, that etymologists, morphologists and syntacticians may have a different view on some parts of Bréal’s work than he has as a classical philologist and semasiologist (see p. 555). From this it is clear that Hey had not really grasped the implications of Bréal’s novel approach to semantics. Bréal tried to open up the field of historical semantics from a narrow study of changes in word-meaning to the analysis of language change in general, based on the assumption that meaning and change of meaning are a function of discourse.

If one had to answer the question where Bréal’s thoughts on semantics came from, if not from German semasiology (but Bréal had read Reisig, whom he mentions in the context of a discussion on pronouns, see 1897: 207 fn. 1), one would have to look more closely at 18th-century philosophy of language, specifically the work of the philosopher Etienne Bonnot de Condillac on words as signs and on the progress of knowledge going hand in hand with the progress of language. From this point of view words are not ‘living beings’ and the fate of language is not mere decay. However, Bréal did not accept Condillac’s use of etymology as the instrument to find the real, original meanings of words and to get insights into the constitution of the human mind (a view also espoused in Britain by Tooke, see below). For Bréal the progress of language is linked to the progressive forgetting of the original etymological meaning; it is based on the liberation of the mind from its etymological burden.

The red thread that runs through Bréal’s semantic work is the following insight: To understand the evolution and the structure of languages we should not focus so much on the forms and sounds but on the functions and meanings of words and constructions, used and built up by human beings under the influence of their will and intelligence, on the one hand, and the influence of the society they live and talk in, on the other. Unlike some of his contemporaries, Bréal therefore looked at how ideas, how our knowledge of a language and our knowledge of the world, shape the words we use. However, he was also acutely aware of the fact that this semantic and cognitive side of language studies was not yet on a par with the advances made in the study of phonetics, the more physiological side of language, and had much to learn from the emerging experimental sciences of the human mind (Bréal 1883/1991: 151).
Bréal’s most famous contribution to semantics as a discipline was probably his discussion of polysemy, a term he invented in 1887. For him, as for Elisabeth Closs Traugott today, all semantic change arises by polysemy, i.e., new meanings coexist with earlier ones, typically in restricted contexts (Traugott 2005). French semantics had its peak between 1870 and 1900. Ironically, when the term ‘semantics’ was created through the translation of Bréal’s Essai, the interest in semantics faded slightly in France. However, there were some continuations of 19th-century
semantics, as for example in the work of Meillet, who focused on the social aspects of semantic change. There was also, just as in Germany, a trend to study affective and emotional meaning, a trend particularly well illustrated by the work of the Swiss linguist and student of Ferdinand de Saussure, Charles Bally, on ‘stylistics’ (1951), followed finally by a period of syntheses, of which the work of the Belgian writer Albert Joseph Carnoy is the best example (1927).
4. Linguistic semantics in Britain

In Britain the study of semantic change was linked for a long time to a kind of etymology that had also prevailed in 18th-century France, that is, the use of etymology as the instrument to find the real, original meanings of words and so to get insights into the constitution of the human mind. Genuinely philological considerations only came to dominate the scene by the middle of the century with the creation of the Philological Society in 1842, and its project to create a New English Dictionary.

The influence of Locke’s Essay Concerning Human Understanding (1689) on English thinking had been immense, strengthened by John Horne Tooke’s widely read Diversions of Purley (Tooke 1786–1805). Tooke’s theory of meaning can be summarised in the slogan “one word – one meaning”. Etymology has to find this meaning, and any use that deviates from it is regarded as ‘wrong’, linguistically and morally, which also has religious implications. Up to the 1830s Tooke was much in vogue. His doctrine was, however, challenged by two philosophers: the Scottish common sense philosopher Dugald Stewart and, following him to some extent, Benjamin Humphrey Smart. In his 1810 essay “On the Tendency of some Late Philological Speculations”, “Stewart attacked”, as Aarsleff points out, “what might be called the atomistic theory of meaning, the notion that each single word has a precise idea affixed to it and that the total meaning of a sentence is, so to speak, the sum of these meanings.” He “went to the heart of the matter, asserting that words gain meaning only in context, that many have none apart from it” (Aarsleff 1967/1983: 103). According to Stewart, words which have multiple meanings in the dictionary, or as Bréal would say, polysemous words, are easily understood in context.

This contextual view of meaning is endorsed by Smart in his anonymously published book entitled An Outline of Sematology or an Essay towards establishing a new theory of grammar, logic and rhetoric (1831), which was followed by a sequel published in 1851, and finally by his 1855 book on thought and language. In both his 1831 and 1855 books Smart quotes the following lines from Stewart:

(…) our words, when examined separately, are often as completely insignificant as the letters of which they are composed, deriving their meaning solely from the connection or relation in which they stand to others. (Stewart 1810: 208–209)
The Outline is based on Locke’s Essay, but goes far beyond it. Smart takes up Locke’s threefold division of knowledge into (1) physicology or the study of nature, (2) practology or the study of human action, and (3) sematology, the study of the use of signs for our knowledge or in short the doctrine of signs (Smart 1831: 1–2). This study deals with signs “which the mind invents and uses to carry on a train of reasoning independently of actual existences” (1831: 2, note carried over from 1). In the first chapter of his book, which is devoted to grammar, Smart tries “to imagine the progress of speech upwards as from its first invention” (Smart 1831: 3). It starts with
natural cries which have ‘the same’ meaning as a ‘real’ sentence composed of different parts, and this because “if equally understood for the actual purpose, [it] is, for this purpose, quite adequate to the artificially compounded sign”, the sentence (Smart 1831: 8). But as it is impossible to have a (natural) sign for every occasion or for every purpose (to signify a perception or conception), it was necessary to find an expedient. This expedient was to put together several signs, which each had served a particular purpose, in such a way that they would modify each other and could, united, serve the new purpose, signify something new (see Smart 1831: 9–10). From these rudest beginnings language developed gradually as an artificial instrument of communication. I cannot go into Smart’s presentation of the evolution of the different parts of speech, but it is important to point out that Smart, like Stewart, rejected the notion that words have meaning in isolation. Words have only meaning in the sentence, the sentence has only meaning inside a paragraph, the paragraph only inside a text (see Smart 1831: 54–55). Signs in isolation signify notions, or what the mind knows on the level of abstraction. Signs in combination signify perceptions, conceptions and passions (see Smart 1831: 10–12). Words thus have a double force “by which they signify at the same time the actual thought, and refer to knowledge necessary perhaps to come at it” (Smart 1831: 16). This knowledge is not God-given. “It is by frequently hearing the same word in context with others, that a full knowledge of its meaning is at length obtained; but this implies that the several occasions on which it is used, are observed and compared; it implies in short, a constant enlargement of our knowledge by the use of language as an instrument to attain it” (Smart 1831: 18–19). And thus, as only words give access to ideas, ideas do not exist antecedently to language.

As language does not represent notions, the understanding of language is not as simple as one might think; it cannot be used to transfer notions from the head of the speaker to the head of the hearer. Instead we use words in such a way that we adapt them to what the hearer already knows (see Smart 1831: 191). It is therefore not astonishing to find praise of tropes and figures of speech in the third chapter of the Outline, devoted to rhetoric. Smart claims that they are “essential parts of the original structure of language; and however they may sometimes serve the purpose of falsehood, they are, on most occasions, indispensable to the effective communication of truth. It is only by [these] expedients that mind can unfold itself to mind; – language is made up of them; there is no such thing as an express and direct image of thought” (Smart 1831: 210). Tropes and figures of speech “are the original texture of language and that from which whatever is now plain at first arose. All words are originally tropes; that is expressions turned (…) from their first purpose, and extended to others” (Smart 1831: 214, Nerlich & Clarke 2000).

In his 1855 book Smart wants to correct and extend Locke’s thought even further, in particular to get rid of the mistake according to which there is a one-to-one relationship between ideas and words. According to Smart, we do not add meaning to meaning to make sense of a sentence or a text; on the contrary, we subtract (Smart 1855: 139).
As an example Smart gives the syntagm old men. Just like the French vieillards, the English old men does not mean the same thing as vieux added to hommes. To understand a whole sentence, we step down from what he calls premise to premise until we reach the conclusion. Unfortunately, Smart's conception of the construction of meaning seems to have had little influence on English linguistic thought in the 19th century. He left an impression, however, on philosophers and psychologists of language, such as Victoria Lady Welby (1911) and George Frederick Stout, who had also read Paul's work, for example. Stout
picked up Paul's distinction between usual and occasional meaning, for example, but pointed out that it "must be noticed, however, that the usual signification is, in a certain sense, a fiction" (Stout 1891: 194) and that: "Permanent change of meaning arises from the gradual shifting of the limits circumscribing the general significations. This shifting is due to the frequent repetition of the same kind of occasional application" (Stout 1891: 196). What James A.H. Murray later called 'sematology' (in a different sense to Smart's use of the term), that is, the use of semantics in lexicography, received its impulses from the progress in philology and dictionary writing in Germany and from the dissatisfaction with English dictionary writing. This dissatisfaction was first expressed forcefully by Richard Garnett in 1835, when he attacked English dictionaries for overlooking the achievements of historical-comparative philology. Garnett even went as far as to call some of his countrymen's lexicographical attempts "etymological trash" (Garnett 1835: 306). The next to point out certain deficiencies in dictionaries was the man who became much more popular for his views on semantic matters: Trench. He used his knowledge of language to argue against the 'Utilitarians', against the biological transformationists and those who held 'uniformitarian' or evolutionary views of language change. For him language is a divine gift, insofar as God has given us the power of reason, and thus the power to name things. His most popular book was On the Study of Words (1851), used here in its 21st edition of 1890. The difference between Trench and the hitherto prevailing philosophy of language is summarized by Aarsleff in the following way:

(…) by making the substance of language – the words – the object of inquiry, Trench placed himself firmly in the English tradition, which had its beginning in Locke. There was one important difference, however. Trench shared with the Lockeian school, Tooke and the Utilitarians, the belief that words contained information about thought, feeling, and experience, but unlike them he did not use this information to seek knowledge of the original, philosophical constitution of the mind, but only as evidence of what had been present to the conscious awareness of the users of words within recent centuries; this interest was not in etymological metaphysics, not in conjectural history; not in material philosophy, but in the spiritual and moral life of the speakers of English. (Aarsleff 1967/1983: 238)
He studied semantic change at one and the same time as historical record and as a lesson in changing morals and history. This is best expressed in the chapter title "On the Morality of Words"; the chapter contains 'records of sin' and 'records of good and evil in language'. Despite these moralizing overtones, Trench's purely historical approach to etymology slowly became dominant in Britain, and it found its ultimate expression in the New English Dictionary. However, Trench's book contains some important insights into the nature of language and semantic change which would later be treated more fully by Darmesteter and Bréal. It is also surprising to find that language is for Trench, as it was for Smart, "a collection of faded metaphors" (Trench 1851/1890: 48), and that words are for him fossilized poetry (for very similar views, see Jean Paul 1962–1977). Trench also writes about what we would nowadays call the amelioration or pejoration of word-meaning, about changes in meaning due to politics, commerce and the influence of the church, about the rise of new words according to the needs and thoughts of the speakers, and finally we find a chapter "On the Distinction of Words", which deals with a phenomenon called by Hey, Bréal and Paul the differentiation of synonyms. The study of synonyms deals with the essential
(but not entire) resemblance between word-meanings (Trench 1851/1890: 248–249). For Trench there can never be perfect synonyms, and this for the following reason:

Men feel, and rightly, that with a boundless world lying around them and demanding to be catalogued and named […], it is a wanton extravagance to expend two or more signs on that which could adequately be set forth by one – an extravagance in one part of their expenditure, which will be almost sure to issue in, and to be punished by, a corresponding scantness and straitness in another. Some thought or feeling or fact will wholly want one adequate sign, because another has two. Hereupon that which has been well called the process of 'desynonymizing' begins – that is, of gradually discriminating in use between words which have hitherto been accounted perfectly equivalent, and, as such, indifferently employed. (…) This may seem at first sight only as a better regulation of old territory; for all practical purposes it is the acquisition of new. (Trench 1851/1890: 258–259)
Trench's books, which became highly popular, must have sharpened every educated Englishman's and Englishwoman's awareness of all kinds of semantic change. On a more scientific level Trench's influence was even more profound. As Murray wrote in the Preface to the first volume of the New English Dictionary, the "scheme [for the NED] originated in a resolution of the Philological Society, passed in 1857, at the suggestion of the late Archbishop Trench, then Dean of Westminster" (Murray 1884: v). In this dictionary the new historical method in philology was for the first time applied to the "life and use of words" (ibid.). The aim was "to furnish an adequate account of the meaning, origin, and history of English words now in general use, or known to have been in use at any time during the last seven hundred years" (ibid.). The dictionary endeavoured (1) to show, with regard to each individual word, when, how, in what shape, and with what signification, it became English; what development of form and meaning it has since received; which of its uses have, in the course of time, become obsolete, and which still survive; what new uses have since arisen, by what processes, and when: (2) to illustrate these facts by a series of quotations ranging from the first known occurrence of the word to the latest, or down to the present day; the word being thus made to exhibit its own history and meaning: and (3) to treat the etymology of each word strictly on the basis of historical fact, and in accordance with the methods and results of modern philological science. (Murray 1884: vi)
Etymology was no longer seen as an instrument used to gain insights into the workings of the human mind, as philosophers in France and Britain had believed at the end of the 18th century, or to discover the truly original meaning of a word. It was now put on a purely scientific, i.e. historical, footing. But, as we have seen, by then a new approach to semantics, fostered by Steinthal and Bréal under the influence of psychological thought, brought back considerations of the human mind, of the speaker, and of communication, opening up semantics from the study of the history of words in and for themselves to the study of semantic change in the context of psychology and sociology. One Cambridge philosopher who knew Scottish common sense philosophy just as well as Kantianism and who studied meaning in the context of communication was John Grote. In the 1860s, he developed a theory of meaning as use and of thinking as a social activity based on communication (Gibbins 2007). His focus on 'living meaning' as opposed to 'fossilised' or historical meaning had parallels with the
theory of meaning developed by Bréal at the same time and can be regarded as a direct precursor of ordinary language philosophy. Influenced by Wegener, by Ferdinand de Saussure (and his distinction between speech and language) and by the anthropologist Malinowski (and his claim that language can only be studied as part of action), Gardiner and then Firth tried to develop a new contextualist approach to semantics and laid the foundation for the London School of Linguistics. However, by the 1960s Britain, like the rest of Europe, began to feel the influence of structuralism (Pottier 1964). Gustav Stern (1931) in Sweden tried to synthesise achievements in semantics, especially with reference to the English language. Stephen Ullmann in Oxford attempted to bridge the gap between the old (French and German) type of historical-psychological semantics and the new type of structural semantics in his most influential books on semantics, written in 1951 and 1962. Although brought up in the tradition of Gardiner and Firth, Sir John Lyons finally swept away the old type of semantics and advocated the new 'structural semantics' in 1963 (Lyons 1963). The focus was now both on how meanings are shaped by their mutual relations in a system, rather than by their evolution over time, and on the way meanings are constituted internally by semantic features, which were by some thought to be invariant (see Matthews 2001; for more information see article 16 (Bierwisch) Semantic features and primes). Variation and change, discourse and society, mind and metaphor, which had so fascinated earlier linguists, were sidelined but later rediscovered inside frame semantics (see article 29 (Gawron) Frame Semantics), prototype semantics (see article 28 (Taylor) Prototype theory), cognitive semantics (see article 27 (Talmy) Cognitive Semantics) and studies of grammaticalisation (see article 101 (Eckhardt) Grammaticalization and semantic reanalysis).
5. Conclusion

One of the pioneers in the history of semantics, Dirk Geeraerts, has in the past distinguished five stages in the history of lexical semantics, namely pre-structuralist diachronic semantics, structuralist semantics, lexical semantics as practised in the context of generative grammar, logical semantics, and cognitive semantics (Geeraerts 1997). Geeraerts himself has now provided a masterly overview of the history of semantics from its earliest beginnings up to the present (Geeraerts 2010), with chapters on historical-philological semantics, structuralist semantics, generativist semantics, neostructuralist semantics and cognitive semantics. His first chapter is at the same time broader and narrower than the history of early (pre-structuralist) semantics provided here. It provides an overview of speculative etymology in antiquity as well as of the rhetorical tradition, which I do not cover in this chapter; and when Geeraerts deals with developments between 1830 and 1930 he concentrates mainly on Bréal and Paul. So I hope that by examining discussions of what words mean and how meaning is achieved as a process between speaker and hearer, author and reader and so on in three national traditions of early semantics – the French, the British and the German – between around 1830 and 1930, and by focusing more on context than on cognition, I have supplemented Geeraerts' work to some extent.
6. References

Aarsleff, Hans 1967/1983. The Study of Language in Britain 1780–1860. 2nd edn. London: Athlone Press, 1983.
Bally, Charles 1951. Traité de Stylistique Française. 3rd edn. Genève: George Klincksieck.
Bréal, Michel 1883. Les lois intellectuelles du langage. Fragment de sémantique. Annuaire de l'Association pour l'encouragement des études grecques en France 17, 132–142.
Bréal, Michel 1883/1991. The Beginnings of Semantics: Essays, Lectures and Reviews. Translated and introduced by George Wolf. Reprinted: Stanford, CA: Stanford University Press, 1991.
Bréal, Michel 1887. L'histoire des mots. Revue des Deux Mondes 1, 187–212.
Bréal, Michel 1897. Essai de Sémantique (Science des Significations). Paris: Hachette.
Carnoy, Albert J. 1927. La Science du Mot. Traité de Sémantique. Louvain: Editions 'Universitas'.
Darmesteter, Arsène 1887. La Vie des Mots Étudiée d'après leurs Significations. Paris: Delagrave.
Delesalle, Simone 1987. Vie des mots et science des significations: Arsène Darmesteter et Michel Bréal (DRLAV). Revue de Linguistique 36–37, 265–314.
Firth, John R. 1935/1957. The technique of semantics. Transactions of the Philological Society for 1935, 36–72. Reprinted in: J.R. Firth, Papers in Linguistics 1934–1951. London: Oxford University Press, 1957, 7–33.
Frege, Gottlob 1977. Thoughts. In: P. Geach (ed.). Logical Investigations. Gottlob Frege. New Haven, CT: Yale University Press, 1–30.
Gardiner, Alan H. 1932/1951. The Theory of Speech and Language. 2nd edn., with additions. Oxford: Clarendon Press, 1951.
Garnett, Richard 1835. English lexicography. Quarterly Review 54, 294–330.
Geeraerts, Dirk 1997. Diachronic Prototype Semantics. A Contribution to Historical Lexicology. Oxford: Oxford University Press.
Geeraerts, Dirk 2010. Theories of Lexical Semantics: A Cognitive Perspective. Oxford: Oxford University Press.
de la Grasserie, Raoul 1908. Essai d'une Sémantique Intégrale. Paris: Leroux.
Gibbins, John R. 2007. John Grote, Cambridge University and the Development of Victorian Thought. Exeter: Imprint Academic.
Hauger, Brigitte 1994. Johan Nicolai Madvig. The Language Theory of a Classical Philologist. Münster: Nodus.
Hecht, Max 1888. Die griechische Bedeutungslehre: Eine Aufgabe der klassischen Philologie. Leipzig: Teubner.
Henry, Victor 1896. Antinomies Linguistiques. Paris: Alcan.
Henry, Victor 1901. Le Langage Martien. Paris: Maisonneuve.
Hey, Oskar 1898. Review of Bréal (1897). Archiv für Lateinische Lexicographie und Grammatik 10, 551–555.
Jean Paul [Johann Paul Friedrich Richter] 1962–1977[1804]. Vorschule der Ästhetik [Introduction to Aesthetics]. In: Werke [Works]. Edited by Norbert Miller. Abt. 1, Vol. 5. Darmstadt: Wissenschaftliche Buchgesellschaft, 7–330.
Knobloch, Clemens 1988. Geschichte der psychologischen Sprachauffassung in Deutschland von 1850 bis 1920. Tübingen: Niemeyer.
Littré, Emile 1880. Pathologie verbale ou lésion de certains mots dans le cours de l'usage. Etudes et Glanures pour Faire Suite à l'Histoire de la Langue Française. Paris: Didier, 1–68.
Locke, John 1975/1689. Essay on Human Understanding. Oxford: Oxford University Press. 1st edn. 1689.
Lyell, Charles 1830–1833. Principles of Geology, Being an Attempt to Explain the Former Changes of the Earth's Surface by Reference to Causes now in Operation. London: Murray.
Lyons, John 1963. Structural Semantics. Oxford: Blackwell.
Madvig, Johann N. 1875. Kleine philologische Schriften. Leipzig: Teubner. Reprinted: Hildesheim: Olms, 1966.
Du Marsais, César de Chesneau 1757.
Des Tropes ou des Différens Sens dans lesquels on Peut Prendre un Même Mot dans une Même Langue. Paris: chez David.
Matthews, Peter 2001. A Short History of Structural Linguistics. Cambridge: Cambridge University Press.
Marty, Anton 1908. Untersuchungen zur Grundlegung der allgemeinen Grammatik und Sprachphilosophie 1. Halle/Saale: Niemeyer.
Meillet, Antoine 1904–1905. Comment les mots changent de sens. Année Sociologique 1904–1905, 230–271. Reprinted in: Linguistique historique et linguistique générale I, 230–271.
Murray, James A. (ed.) 1884. A New English Dictionary. On Historical Principles; Founded Mainly on the Materials Collected by The Philological Society 1(a). Oxford: Clarendon Press.
Nerlich, Brigitte 1990. Change in Language. Whitney, Bréal and Wegener. London: Routledge.
Nerlich, Brigitte 1992a. La sémantique: 'Éducation et Récréation'. Cahiers Ferdinand de Saussure 46, 159–171.
Nerlich, Brigitte 1992b. Semantic Theories in Europe, 1830–1930. From Etymology to Contextuality. Amsterdam: Benjamins.
Nerlich, Brigitte 1996a. Semantics in the XIXth Century. In: P. Schmitter (ed.). Geschichte der Sprachtheorie 5. Tübingen: Narr, 395–426.
Nerlich, Brigitte 1996b. Un chaînon manquant entre la rhétorique et la sémantique: L'oeuvre d'Auguste de Chevallet. Travaux de Linguistique 33, 115–131.
Nerlich, Brigitte 1998. La métaphore et la métonymie: Aux sources rhétoriques des théories sémantiques modernes. Sémiotiques 14, 143–170.
Nerlich, Brigitte & David D. Clarke 1996. Language, Action, and Context: The Early History of Pragmatics in Europe and America, 1780–1930. Amsterdam: Benjamins.
Nerlich, Brigitte & David D. Clarke 1997. Polysemy: Patterns in meaning and patterns in history. Historiographia Linguistica 24, 359–385.
Nerlich, Brigitte & David D. Clarke 1998. The linguistic repudiation of Wundt. History of Psychology 1, 179–204.
Nerlich, Brigitte & David D. Clarke 2000. Semantic fields and frames: Historical explorations of the interface between language, action and cognition. Journal of Pragmatics 32, 125–150.
Nerlich, Brigitte & David D. Clarke 2003. Polysemy and flexibility: Introduction and overview. In: B. Nerlich et al. (eds.). Polysemy: Flexible Patterns of Meaning in Mind and Language. Berlin: de Gruyter, 3–30.
Paul, Hermann 1880. Principien der Sprachgeschichte. 5th edn. Halle/Saale: Niemeyer, 1920.
Paul, Hermann 1920/1975. Prinzipien der Sprachgeschichte. Reprint of the 5th edn. 1920. Tübingen: Niemeyer, 1975.
Pottier, Bernard 1964. Vers une sémantique moderne. Travaux de Linguistique et de Littérature 2, 107–137.
Reisig, Christian K. 1839. Vorlesungen über lateinische Sprachwissenschaft. In: F. Haase (ed.). Ch. K. Reisig: Vorlesungen über lateinische Sprachwissenschaft. Leipzig: Lehnhold.
Schleicher, August 1860. Die Deutsche Sprache. Stuttgart: Cotta.
Schmitter, Peter 1987. Die Zeichen- und Bedeutungstheorie Leo Weisgerbers als Grundlage semantischer Analyse. In: P. Schmitter (ed.). Das sprachliche Zeichen. Münster: Nodus, 176–202.
Schmitter, Peter (ed.) 1990. Essays Towards a History of Semantics. Münster: Nodus.
Schmitter, Peter 2001a. The emergence of philosophical semantics in early Greek Antiquity. Logos and Language 2, 45–56.
Schmitter, Peter 2001b. Zur Rolle der Semantik in Humboldts linguistischem Forschungsprogramm. In: K. Adamzik & H. Christen (eds.). Sprachkontakt, Sprachvergleich, Sprachvariation. Festschrift für Gottfried Kolde zum 65. Geburtstag. Tübingen: Niemeyer, 307–323.
Schmitter, Peter 2004. Die Wortbildungstheorie der frühen Semasiologie. Ein weißer Fleck in den Geschichtsatlanten der Linguistik. Beiträge zur Geschichte der Sprachwissenschaft 14, 107–134.
Smart, Benjamin H. 1831.
An Outline of Sematology: Or an Essay Towards Establishing a New Theory of Grammar, Logic, and Rhetoric. London: Richardson.
Smart, Benjamin H. 1855. Thought and Language: An Essay Having in View the Revival, Correction, and Exclusive Establishment of Locke's Philosophy. London: Longman, Brown, Green & Longmans.
Steinthal, Heymann 1855. Grammatik, Logik und Psychologie: Ihre Prinzipien und ihr Verhältnis zueinander. Berlin: Dümmler.
Steinthal, Heymann 1871. Abriss der Sprachwissenschaft 1. Berlin: Dümmler.
Stern, Gustav 1931. Meaning and Change of Meaning. With Special Reference to the English Language. Göteborg: Elanders Boktryckeri Aktiebolag. Reprinted: Bloomington, IN: Indiana University Press, 1968.
Stewart, Dugald 1810. Philosophical Essays. Edinburgh: Creech.
Stöcklein, Johann 1898. Bedeutungswandel der Wörter. Seine Entstehung und Entwicklung. München: Lindauer.
Stout, George Frederick 1891. Thought and language. Mind 16, 181–197.
Tooke, John H. 1786–1805. Epea pteroenta; or, the Diversions of Purley. 2nd edn. London: printed for the author.
Traugott, Elizabeth C. 2005. Semantic change. In: K. Brown (ed.). Encyclopedia of Language and Linguistics. Amsterdam: Elsevier.
Trench, Richard C. 1851/1890. On the Study of Words. 21st edn. London: Kegan Paul, Trench, Trübner & Co., 1890.
Ullmann, Stephen 1951. The Principles of Semantics. A Linguistic Approach to Meaning. 2nd edn. Glasgow: Jackson, 1957.
Ullmann, Stephen 1962. Semantics. Oxford: Blackwell.
Wegener, Philipp 1885/1991. Untersuchungen über die Grundfragen des Sprachlebens. Halle/Saale: Niemeyer.
Welby, Victoria Lady 1911. Significs and Language. The Articulate Form of our Expressive and Interpretative Resources. London: Macmillan & Co.
Wundt, Wilhelm 1900. Völkerpsychologie I: Die Sprache. Leipzig: Engelmann.
Brigitte Nerlich, Nottingham (United Kingdom)
10. The influence of logic on semantics

1. Overview
2. Pre-Fregean logic
3. Gottlob Frege's progress
4. Bertrand Russell's criticism and his theory of definite descriptions
5. Rudolf Carnap's theory of extension and intension: Relying on possible worlds
6. Willard V. O. Quine: Logic, existence and propositional attitudes
7. Necessity and direct reference: The two-dimensional semantics
8. Montague-Semantics: Compositionality revisited
9. Generalized quantifiers
10. Intensional theory of types
11. Dynamic logic
12. References
Abstract

The aim of this contribution is to investigate the influence of logical tools on the development of semantic theories and vice versa. Pre-19th-century logic was limited to a few sentence forms and their logical interrelations. Modern predicate logic and later type logic, both inspired by investigating the meaning of mathematical sentences, widened
the view for new sentence forms and thereby made logic relevant for a wider range of expressions in natural language. In a parallel course of developments, problems of different levels of meaning, like sense and reference or intension and extension, were studied and initiated a shift to modal contexts in natural language. Montague brought together a great part of these developments in his unified intensional type-theoretical framework, which had a strong impact on formal approaches to natural language semantics. While the logical developments mentioned so far can be seen as direct answers to natural language phenomena, the first approaches to dynamic logic did not get their motivation from natural language, but from the semantics of computer programming. Here, a logical toolset was adapted to specific problems of natural language semantics.
1. Overview

The aim of this contribution is to investigate the influence of logical tools on the development of semantic theories and vice versa. The article starts with an example of pre-Fregean logic, i.e. Aristotelian syllogistic. This traditional frame of logic has severe limits for a theory of meaning (e.g. no possibility for multiple quantification, no variety of scope). These limitations were overcome by Frege's predicate logic, which is the root of standard modern logic and the basis for the first developments in a formal philosophy of language: We will present Frege's analysis of mathematical sentences and his transfer of it to the analysis of natural language sentences by introducing a theory of sense and reference. The insufficient treatment of singular terms in Frege's logic is one reason for Russell's theory of definite descriptions. Furthermore, Russell tries to organize semantics without a difference between sense and reference. Carnap introduces the first logical framework for a semantics of possible worlds and shows how one can keep the Fregean semantic intuitions without using the problematic tool of "senses". Carnap develops a systematic theory of intension and extension, defining the intension of an expression as a function from possible worlds to the relevant extension. An important formal step was then the development of a possible worlds semantics for modal logic by Kripke. This is the framework for the theory of direct reference of proper names and for the so-called two-dimensional semantics, which is needed to arrive at an adequate treatment of names, definite descriptions, and especially indexicals. Tarski's formal theory of truth is used by Davidson to argue that truth-conditions are the adequate tool to characterize the meaning of assertions. Although the idea of a truth-conditional semantics had been in the background since Frege, with Davidson's work it became the leading idea for modern semantics. The second part of the article (starting with section 8) will concentrate on important advances made in this overall framework of truth-conditional semantics. Montague presented a compositional formal semantics including quantifiers, intensional contexts and the phenomenon of deixis. His ideal was to offer an absolute truth-condition for any sentence. In the next step new formal tools were invented to account not only for the extralinguistic environment but also for the discourse as a core feature of the meaning of utterances. Context-dependency in this sense is considered in approaches of dynamic semantics. Formal tools are nowadays not only used to answer the leading question "What is the meaning of a natural language expression?" In recent developments new logic formalisms are used to answer questions like "How is a convention established?" and "How can we account for the pragmatics of the utterance?" It has
become clear that truth-conditional semantics has to be complemented by aspects of the environment, social conventions and speakers' intentions to arrive at an adequate account of meaning. Therefore the latest trend is to offer new formal tools that can account for these features.
2. Pre-Fregean logic

The first system of logic was founded by Aristotle (see Aristotle 1992). He organized inferences according to a syllogistic schema which consists of two premises and a conclusion. Each sentence of such a syllogistic schema contains two predicates (F, G) and a quantifier (some, every) in front of the first predicate; a negation (not) can be added in front of the second predicate. Each syllogistic sentence thus has the structure "Some/every F is/is not G". This yields four possible types of sentences: A sentence is universal if it starts with "every" and particular if it starts with "some". A sentence is affirmative if it does not contain a negation in front of the second predicate; otherwise it is negative. We receive Tab. 10.1.

Tab. 10.1: Syllogistic types of sentences

Name   Form                Title
a      Every F is G        Universal Affirmative
i      Some F is G         Particular Affirmative
e      Every F is not G    Universal Negative
o      Some F is not G     Particular Negative
The names of the affirmative syllogistic sentences derive from the Latin word "affirmo": the first vowel represents the universal sentence, the second the particular. The names of the negative syllogistic sentences derive from the Latin word "nego", again with the same convention concerning the use of the first and second vowel. As we will see, the sequence of vowels is also used to represent the syllogistic inferences. If we introduce the further quantifier "no" we can find equivalent representations but no new propositions. The sentence "No F is G" is equivalent to (e) "Every F is not G", and "No F is not G" is equivalent to (a) "Every F is G". Proposition (e) is intuitively more easily grasped in the form "No F is G", while proposition (a) is more readily understood in the original format "Every F is G". So we continue with these formulations. On the basis of these sentences we can systematically arrange the typical syllogistic inferences, e.g. the inference called "barbara" because it contains three sentences of the form (a).

Tab. 10.2: Barbara

Premise 1 (a):    Every G is H.    (abbreviation: GaH)
Premise 2 (a):    Every F is G.    (abbreviation: FaG)
Conclusion (a):   Every F is H.    (abbreviation: FaH)
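For orientation, and anticipating the modern notation introduced in section 3, Barbara corresponds to the valid predicate-logic inference from ∀x(Gx → Hx) and ∀x(Fx → Gx) to ∀x(Fx → Hx); this rendering is added here for illustration and is not part of the Aristotelian system itself.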
Now we can start with systematic variations of the four types of sentences (a, i, e, o). The aim of Aristotle was to select all and only those inferences which are valid. Given the same structure of the predicates in the premises and the conclusion, and only varying the kind of sentence, we receive e.g. the valid inferences of Tab. 10.3.
Tab. 10.3: Same predicate structure, varying types of sentences

Barbara          Darii            Ferio              Celarent
Every M is H     Every M is H     No M is H          No M is H
Every F is M     Some F are M     Some F is M        Every F is M
Every F is H     Some F are H     Some F is not H    No F is H
To present the complete list of possible syllogistic inferences we have to account for the different possible positions of the predicates in the inference. We can distinguish four general schemata, including the one we have already presented. Our first schema has the general structure (I); the other structures (II to IV) are given in Tab. 10.4.

Tab. 10.4: Predicate structures

I.        II.       III.      IV.
M H       H M       M H       H M
F M       F M       M F       M F
F H       F H       F H       F H
For each general format we can vary the kinds of sentences involved in the way presented above. This leads to all possible syllogistic inferences in Aristotelian logic. In making this claim we are ignoring the fact that Aristotle already worked out a modal logic, cf. Nortmann (1996). Concentrating on nonmodal logic, we have presented the core of the Aristotelian system. Although it was an ingenious discovery in ancient times, the Aristotelian system has strong limitations: Strictly speaking, there is no space in syllogistic inferences for (a) singular terms or (b) existence claims (like "Trees exist"), and there are only very limited possibilities for quantification (see the section on Frege's progress). In particular, there is no possibility for multiple uses of quantifiers in one sentence. Overcoming this limitation is the most important advance due to the predicate logic essentially developed by Gottlob Frege. Before we present this radical step into modern logic, we briefly describe some core ideas of G. W. Leibniz, who already invented some influential ideas on the way to modern logic. Leibniz is well-known for introducing the idea of a calculus of logical inferences. He introduced the idea that the syntax of sentences mirrors the logical structure of the thoughts expressed and that a purely syntactic procedure for proving a sentence can be defined. This leads to the modern understanding of a syntactic notion of proof, which ideally allows one to decide for every sentence, simply on the basis of syntactic transformations, whether it is provable or not. The logical systems developed by Leibniz are essentially more advanced than the Aristotelian syllogistic. It has been shown that his logic is equivalent to Boolean logic, i.e. monadic predicate logic, see Lenzen (1990). Furthermore, Leibniz introduced a calculus of concepts defining concept identity, inclusion, containment and addition, see Zalta (2000) and Lenzen (2000). He reserved a special place for individual concepts. Since his work had almost no influence on the general development of logic, the main ideas are only mentioned here. Ignoring a lot of interesting developments (e.g. modal systems), we can characterize a great deal of the logical systems from Aristotle until the 19th century by the square of opposition (cf. article 8 (Meier-Oeser) Meaning in pre-19th-century thought). The square of opposition (see Fig. 10.1) already involves the essential distinction between different understandings of "opposition": A contradiction of a sentence is an external negation of the sentence ("It is not the case that …"), while the contrary involves
an "internal" negation.

Fig. 10.1: Square of oppositions

What is meant by an "internal" negation can only be illustrated by transforming the syllogistic sentences into modern predicate logic. The most important general features in the traditional understanding are the following: (i) Of two contradictory sentences, one must be true and the other false. (ii) Two contrary sentences cannot both be true (but they can both be false). (iii) Two subcontrary sentences cannot both be false (but they can both be true). A central problem pointed out by Abelard in the Dialectica (1956) is the presupposition of existential import: According to one understanding of the traditional square of opposition, it presupposes that sentences like "Every F is G" or "Some F is G" imply "There is at least one thing which is F". This is the so-called existential import condition, which leads into trouble. The modern transformation of the syllogistic sentences into predicate logic, as included in the figure above, does not involve the existential presupposition. Let us use an example: On the one hand, "Some man is black" implies that at least one thing is a man, namely the man who has to be black if "Some man is black" is true. On the other hand, "Some man is not black" also implies that something is a man, namely the man who is not black if "Some man is not black" is true. But these are two subcontrary sentences, i.e. according to the traditional view they cannot both be false; one has to be true. Therefore (since both imply that there is a thing which is a man) it follows that men exist. In such a logic the use of a predicate F in a sentence "Some F…" presupposes that F is non-empty (simply given the meaning of F and the traditional square of opposition), i.e. there are no empty predicates. But of course (as Abelard points out) men might not exist. This observation leads to the modern square of opposition, which uses the reading of sentences in predicate logic, leaves out the relations of contraries and subcontraries, and only keeps the relation of the contradictories. The leading intuition behind avoiding the problematic consequence mentioned above is the claim that meaningful universal sentences like "Every man is mortal" do not imply that men exist. The existential import is denied for both predicates in universal sentences. Relying on the interpretation of particular sentences like "Some men are mortal" in modern predicate logic, the truth of a particular sentence just means that there is at least one man who is mortal. If we want to allow empty predicates, then we arrive at the modern square of opposition.
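The standard modern renderings of the four sentence types referred to in this transformation are the following:

(a) Every F is G: ∀x(Fx → Gx)
(i) Some F is G: ∃x(Fx ∧ Gx)
(e) Every F is not G (No F is G): ∀x(Fx → ¬Gx)
(o) Some F is not G: ∃x(Fx ∧ ¬Gx)

On this reading, (a) and (e) are vacuously true if nothing is F, so the universal sentences no longer imply their particular counterparts; only the contradictory pairs (a)/(o) and (e)/(i) survive as relations of opposition.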
3. Gottlob Frege's progress

Frege was a main figure from two points of view: He introduced a modern predicate logic and he also developed the first systematic modern philosophy of language by transferring his insights in logic and mathematics into a philosophy of language. His logic was first developed in the "Begriffsschrift" (1879; in the following: BS), see Frege (1977). He used a special notation to characterize the content of a sentence by a horizontal line (the content line) and the act of judging the content by a vertical line (the judgement line), the two together forming the judgement sign ⊢ A. This distinction is a clear precursor of the distinction between illocution (the type of speech act) and proposition (the content of the speech act) in Searle's speech act theory. Frege's main aim was to clarify the status of arithmetic sentences. Dealing with a mathematical expression like "3²", Frege analyzes it into the functional expression "( )²" and the argument expression "3". The functional expression refers to the function of squaring something while the argument expression refers to the number 3. An essential feature of the functional expression is its being unsaturated, i.e. it has an empty space that needs to be filled by an argument expression to constitute a complete expression. The argument expression is saturated, i.e. it has no space for any addition. Frege transferred this observation into the philosophy of language: Predicates are typical expressions which are unsaturated, while proper names and definite descriptions are typical examples of saturated expressions. Predicates refer to concepts while proper names and definite descriptions refer to objects. Since predicates are unsaturated expressions which need to be completed by a saturated expression, Frege defines concepts (as the reference of predicates) as functions which – completed by objects (as the referents of proper names and other singular terms) – always have a truth value (Truth or Falsity) as result. The truth-value is the reference of the sentence composed of the proper name and the predicate. By analogy with mathematical sentences, Frege starts to analyze sentences of natural language and develops a systematic theory of meaning. Before outlining some basic aspects of this project, we first introduce the idea of a modern system of logic. Frege developed the following propositional calculus (for a detailed reconstruction of Frege's logic see von Kutschera 1989, chap. 3):

Axioms:
(1) a. A → (B → A)
    b. (C → (B → A)) → ((C → B) → (C → A))
    c. (D → (B → A)) → (B → (D → A))
    d. (B → A) → (¬A → ¬B)
    e. ¬¬A → A
    f. A → ¬¬A
A rule of inference (modus ponens), which allows one to derive theorems starting from two axioms or theorems:

(2) A → B, A ⊢ B
If we add an axiom and a rule we receive a system of predicate logic that is complete and consistent. Frege suggested the following axiom:

(3)
∀xA[x] → A[a] (BS: 51).
The additional rule of inference was not explicitly stated by Frege but implicitly presupposed:

(4)
A → B[a] ⊢ A → ∀xB[x], if "a" does not occur in the conclusion (BS: 21).
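To illustrate how rule (2) generates theorems from the axioms, here is a standard derivation of the theorem A → A, using only axioms (1a) and (1b); the derivation is given for illustration and is not taken from the Begriffsschrift itself:

1. A → ((A → A) → A)  [axiom (1a), with B := A → A]
2. (A → ((A → A) → A)) → ((A → (A → A)) → (A → A))  [axiom (1b), with C := A, B := A → A]
3. (A → (A → A)) → (A → A)  [from 1 and 2 by rule (2)]
4. A → (A → A)  [axiom (1a), with B := A]
5. A → A  [from 4 and 3 by rule (2)]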
Frege tried to show the semantic consistency of the predicate calculus (which was then intensively debated) but he did not try to prove its completeness, since he lacked the notion of an interpretation needed to develop such a proof (von Kutschera 1989: 34). A formal system of axioms and rules is complete for first-order predicate logic (FOPL) if all sentences logically valid in FOPL are derivable in the formal system. The completeness proof was first worked out by Kurt Gödel (1930). Frege already included second-order predicates in his system of logic. The interesting fact that second-order predicate logic is incomplete was first shown by Kurt Gödel (1931). One of the central advantages of modern predicate logic for the development of semantics is the fact that we can now use as many quantifiers in sequence as we want. We have of course to take care of the meaning of the quantifiers given their sequence. The following sentences, which cannot be expressed in the system of Aristotelian syllogisms, can be nicely expressed in modern predicate logic, using "L" as a shorthand for the two-place predicate "( ) loves ( )".

(5) Everyone loves everyone: ∀x∀yL(x, y)

Using mixed quantifiers, their sequence becomes relevant:

(6)
a. Someone loves everyone: ∃x∀yL(x, y)
b. Everyone loves someone: ∀x∃yL(x, y) [this can be a different person for everyone]
c. Someone is loved by everyone: ∃y∀xL(x, y) [there is (at least) one specific human being who is loved by all human beings]
d. Everyone is loved by someone: ∀y∃xL(x, y)
e. Someone loves someone: ∃x∃yL(x, y) [there is (at least) one human being who loves (at least) one human being]
Frege's philosophy of language is based on a principle of compositionality (cf. article 6 (Pagin & Westerståhl) Compositionality), i.e. the principle that the value of a complex expression is determined by the values of its parts plus its mode of composition. He developed a systematic theory of sense and reference (cf. article 3 (Textor) Sense and reference). The reference of a proper name is the designated object and the reference of a predicate is a concept, while both determine the reference of the sentence, i.e. the truth-value. Given this framework of reference it follows that the sentences
(7)
The morning star is identical with the morning star.
and (8)
The morning star is identical with the evening star.
have the same reference, i.e. the same truth-value: The truth-value is determined by the reference of the name and the predicate. Each token of the predicate refers to the same concept and the two names refer to the same object, the planet Venus. But sentence (7) is uninformative while sentence (8) is informative. Therefore, we need a new aspect of meaning to account for the informativity: the sense of an expression. The sense of a proper name is a mode of presentation of the designated object, i.e. "the evening star" expresses the mode of presentation characterized as the brightest star in the evening sky. Furthermore, the sense of a sentence is a thought. The latter (in the case of simple sentences like "Socrates is a philosopher") is constituted by the sense of the predicate "( ) is a philosopher" and the sense of the proper name "Socrates". Frege defines the sense of an expression in general as the mode of presentation of the reference. To develop a consistent theory of sense and reference Frege introduced different senses for one and the same expression in different linguistic contexts; e.g. indirect speech (propositional attitude ascriptions) or quotations are contexts in which the sense of an expression changes. Frege's philosophy of language has at least two major problems: (1) the necessity of an infinite hierarchy of senses to account for the recursive syntactic structure (John believes that Mary believes that Karl believes …) and (2) the problem of indexical expressions (this is accounted for in two-dimensional semantics and dynamic semantics, see below).
4. Bertrand Russell's criticism and his theory of definite descriptions

Russell (1903, partly in cooperation with Whitehead (Russell & Whitehead 1910–1913)) also developed both a system of logic and a philosophy of language; his contrast with Frege is such that we nowadays speak of Neo-Fregean and Neo-Russellian theories of meaning. Let us first have a look at Russell's logical considerations. Russell developed his famous paradox, which was a serious problem for Frege because Frege presupposed in his system that sets of sets can be produced in an unconstrained manner. But if there are no constraints we run into Russell's paradox: Let R be the set of all sets which are not members of themselves. Then R can consistently be neither a member of itself nor not a member of itself. Symbolically, let R := {x : x ∉ x}. Then R ∈ R iff R ∉ R. To illustrate the consideration: If R is a member of itself it must fulfill the definition of its members, i.e. it must not be a member of itself. If R is not a member of itself then it fulfills the definition of its members, i.e. it must be a member of itself. When Russell communicated his discovery in a letter to Frege, who was just completing the Grundgesetze der Arithmetik, Frege was in despair because the foundations of his system were undermined. Russell himself developed a solution by introducing a theory of types (1908). The leading idea is that we always have to clarify the objects to which a function will apply before the function can be defined exactly. This leads to a strict distinction between object language and meta-language: We can avoid the paradox by avoiding self-reference, and this can be done by arranging all sentences (or, equivalently, all propositional functions) into a hierarchy. The lowest level of this hierarchy will consist of sentences about individuals. The next lowest level
will consist of sentences about sets of individuals. The next lowest level will consist of sentences about sets of sets of individuals, and so on. It is then possible to refer to all objects for which a given condition (or predicate) holds only if they are all at the same level, i.e. of the same type. The theory of types is a central element in the modern theory of truth and thereby also for semantic theories. Russell's contribution to the philosophy of language is essentially connected with his analysis of definite descriptions (Russell 1905). The meaning of the sentence "The present King of France is bald" is analyzed as follows:

1. there is an x such that x is the present King of France (∃x(Fx))
2. for every x that is the present King of France and every y that is the present King of France, x equals y, i.e. there is at most one present King of France (∀x(Fx → ∀y(Fy → y = x)))
3. for every x that is the present King of France, x is bald (∀x(Fx → Bx))

Since France is no longer a kingdom, assertion 1 is plainly false; and since our statement is the conjunction of all three assertions, our statement is false. Russell's analysis of definite descriptions involves a strategy to develop a purely extensional semantics, i.e. a semantic theory that can characterize the meaning of sentences without introducing the distinction between sense and reference or any related distinction of intensional and extensional meanings. Definite descriptions are analyzed such that no singular term remains in the reformulation, and ordinary proper names are, according to Russell's theory, hidden definite descriptions. His strategy eliminates singular terms with only one exception: He needs the basic singular term "this/that" to account for our speech about sense-data (Russell 1910). Since he takes an acquaintance relation with sense-data (and also with universals), including a sense-data ontology, as a basic presupposition of his specific semantic approach, the only core idea that survived in modern semantics is his logical analysis of definite descriptions.
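The three clauses of the analysis are standardly conjoined into a single formula, using F for "present King of France" and B for "bald":

The F is B: ∃x(Fx ∧ ∀y(Fy → y = x) ∧ Bx)

On this analysis the falsity of the sentence follows from the failure of the first conjunct alone.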
5. Rudolf Carnap's theory of extension and intension: Relying on possible worlds

Since Russell's project was idiosyncratically connected with a sense-data theory, it was not acceptable to the great majority of scientists as a purely extensional project. A purely extensional semantics had to wait until Davidson used Tarski's theory of truth as its framework. Meanwhile it was Rudolf Carnap who introduced the logic of extensional and intensional meanings to modernize Frege's twofold distinction of semantics. The central progress was made by introducing the idea of possible worlds into logic and semantics: The actual world is constituted by a combination of states of affairs, which are in turn constituted by objects (properties, relations etc.). If at least one state of affairs is changed, we speak of a new possible world. If the world consists of basic elements which constitute states of affairs, then the possible combinations of these elements allow us to characterize all possible states of affairs. Thereby we can characterize all possible worlds, since a possible world can be characterized by the class of states of affairs that is realized in it. Using this new instrument of possible worlds, Carnap introduces a systematic theory of intension. His notion of intension is meant to replace Frege's notion of sense and thereby account for the informational content of a
sentence. His notion of extension is closely connected to Frege's notion of reference: The extension of a singular term is the object referred to by the use of the term, the extension of a predicate is the property referred to, and the extension of a complete assertive sentence is its truth-value. The intension, which has to account for the informational content, is characterized as a function from possible worlds to the relevant extensions. In the case of a singular term the intension is a function from possible worlds (p.w.) to the object referred to in the relevant possible world. Along the same lines we obtain the intension of predicates (as functions from p.w. to sets or n-tuples) and of sentences (as functions from p.w. to truth-values), as shown in Tab. 10.5.

Tab. 10.5: Carnap's semantics of possible worlds

                  Extension                                   Intension
singular terms    objects                                     individual concepts
predicates        sets of objects and n-tuples of objects     properties
sentences         truth-values                                propositions
A principled limitation of a possible worlds semantics is that one cannot account for so-called hyperintensional phenomena, i.e. one cannot distinguish the meanings of two sentences which are both necessarily true (e.g. two different mathematical claims), because they are characterized by exactly the same intension (mapping each p.w. onto the value "true").
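Carnap's identification of intensions with functions from possible worlds to extensions can be rendered directly in a typed functional language. The following sketch is an illustration only; the three worlds, the entities and the predicate "bright" are invented toy assumptions, not Carnap's own examples.

-- A minimal sketch of Carnap's intension/extension distinction,
-- assuming an invented toy model with three possible worlds.
data World  = W0 | W1 | W2 deriving (Eq, Show)
data Entity = Venus | Mars deriving (Eq, Show)

-- Intensions are functions from possible worlds to extensions:
type TermIntension = World -> Entity    -- individual concept
type Property      = World -> [Entity]  -- property (extension: set of objects)
type Proposition   = World -> Bool      -- proposition (extension: truth-value)

-- The individual concept expressed by "the morning star"
-- (toy assumption: it picks out Venus in every world).
morningStar :: TermIntension
morningStar _ = Venus

-- A toy property: in W0 only Venus is bright, in W1 both planets, in W2 none.
bright :: Property
bright W0 = [Venus]
bright W1 = [Venus, Mars]
bright W2 = []

-- The proposition expressed by "the morning star is bright":
-- applied to a world, it yields that sentence's extension there.
morningStarIsBright :: Proposition
morningStarIsBright w = morningStar w `elem` bright w

Applied to W2, for instance, the proposition yields the truth-value False, while its intension – the function itself – encodes the informational content across all worlds.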
6. Willard V. O. Quine: Logic, existence and propositional attitudes

How should we relate quantifiers to our ontology? Quine (1953) is famous for his slogan "To be is to be the value of a bound variable". The quantifiers "there is (at least) an x (∃x)" and "for all x (∀x)" are the heart of modern predicate logic, which was introduced by Frege (see above). For Quine the structure of the language determines the structure of the world: If the language which is necessary to give the best available complete description of the world contains several existential and universal quantifications, then these quantifications at the same time determine the objects, properties etc. we have to presuppose. Logic, language and ontology are essentially connected according to this view. Although Quine's special views about connecting logic, language and world are very controversial nowadays, the core idea of combining quantificational and ontological claims is widely accepted. Another problem that is essentially inspired by the development of logic is the analysis of propositional attitude ascriptions. Quine established the following standard story:

There is a certain man in a brown hat whom Ralph has glimpsed several times under questionable circumstances on which we need not enter here; suffice it to say that Ralph suspects he is a spy. Also there is a gray-haired man, vaguely known to Ralph as rather a pillar of the community, whom Ralph is not aware of having seen except once at the beach. Now Ralph does not know it, but the men are one and the same. Can we say of this man (Bernard J. Ortcutt, to give him a name) that Ralph believes him to be a spy? If so, we find ourselves accepting a conjunction of the type:

(9) w sincerely denies '…' . w believes that …
as true, with one and the same sentence in both blanks. For, Ralph is ready enough to say, in all sincerity, 'Bernard J. Ortcutt is no spy.' If, on the other hand, with a view to disallowing situations of the type (9), we claim simultaneously that

(10) Ralph believes that the man in the brown hat is a spy.
(11) Ralph does not believe that the man seen at the beach is a spy.

then we cease to affirm any relationship between Ralph and any man at all. […] 'believes that' becomes, in a word, referentially opaque. (Quine 1956: 179, examples renumbered)
In line with Russell, Quine starts to analyze the cognitive situation of Ralph by distinguishing two readings of the sentence

(12) Ralph believes that someone is a spy.

namely:

(13) Ralph believes [∃x(x is a spy)]
(14) ∃x(Ralph believes [x is a spy])

Quine calls (13) the notional and (14) the relational reading of the original sentence, which is at first glance parallel to the traditional distinction between the de dicto (13) and de re (14) reading. But he shows that the difference between these two readings is not sufficient to account for Ralph's epistemic situation as characterized by sentences (10) and (11). Intuitively Ralph has a de re reading in both cases, one of the man on the beach and the other of a person wearing a brown hat. The transformation into de re readings leads to:

(15) ∃x(Ralph believes [x is a spy]) (out of (10))
(16) ∃x(Ralph does not believe [x is a spy]) (out of (11))

Since both extensional quantifications are about the same object, we receive the combined sentence which explicitly attributes contradictory beliefs to Ralph:

(17) ∃x(Ralph believes [x is a spy] ∧ Ralph does not believe [x is a spy])

To avoid this unacceptable consequence Quine suggests that the ambiguity of the belief sentences cannot be accounted for by a distinction in the scope of the quantifier (leading to de re and de dicto readings) but by a systematic ambiguity of the belief predicate: He suggests distinguishing a two-place predicate "believe₂ (subject, proposition)" and a three-place predicate "believe₃ (subject, object-of-belief, property)".

(18) believe₂ (Ralph, that the man with the brown hat is a spy)
(19) believe₃ (Ralph, the man with the brown hat, spy-being)
This distinction is the basis for Quine's further famous claim that we are not allowed to quantify into propositional attitudes (i.e. to infer (14) from (18)): if we have interpreted a sentence such that the 'believe' predicate is used intensionally (as a two-place predicate), then we cannot ignore that, and we are not allowed to change the reading of the sentence into one using a three-place predicate. We are not allowed to change from a notional reading (18) into a relational reading (19) and vice versa. This line of strategy was further improved e.g. by Kaplan (1969) and Loar (1972). It definitely made clear that we cannot always understand the belief expressed by a belief sentence simply as a relation between a subject and a proposition. Sometimes it has to be understood differently. Quine's consequence is a systematic ambiguity of the predicate "believe". This is problematic since it also leads to four-place, five-place predicates etc. (Haas-Spohn 1989: 66): for each singular term which is used in the scope of the belief ascription we have to distinguish a notional and a relational reading. Therefore Cresswell & von Stechow (1982) suggested an alternative view which only needs to presuppose a two-place predicate "believe" but changes the representation of a proposition: A proposition is not completely characterized by a set of possible worlds (those in which the relevant state of affairs is true) but in addition by a structure of the components of the proposition. Structured propositions are the alternative to a simple possible worlds semantics for accounting for propositional attitude ascriptions.
7. Necessity and direct reference: The two-dimensional semantics

The development of modal logic, essentially put forward by Saul A. Kripke, strongly influenced semantic theories. The basic intuition modal logic started with is rather straightforward: Each entity is necessarily identical with itself (and necessarily different from anything else). Kripke (1972) shows that there are sentences which express a necessary truth but nevertheless are a posteriori: "Mark Twain is identical with Samuel Clemens". Since "Samuel Clemens" is the civil name of Mark Twain, the sentence expresses a self-identity, but it is not known a priori, since knowing that both names refer to the same object is not part of standard linguistic knowledge. There are also sentences which express contingent facts but which can be known to be true a priori, e.g. "I am speaking now". If I utter the sentence, it is a priori graspable that it is true, but it is not a necessary truth, since otherwise I would be a necessary speaker at this point in time (but of course I could have been silent). To account for the new distinction between the epistemic dimension of a priori/a posteriori and the metaphysical dimension of necessary/contingent, Kripke introduced the theory of direct reference of proper names and Kaplan (1979/1989) introduced two-dimensional semantics. Since Kaplan's theory of characters is nowadays a standard framework to account for names and indexicals, we briefly introduce the core idea: We have to distinguish the utterance context, which determines the proposition that is expressed by uttering a sentence, and the circumstance of evaluation, which is the relevant possible world according to which this proposition will be evaluated as true or false. We can illustrate this two-step approach as in Fig. 10.2:
character of a sentence + utterance context
  → (1st step) intension = truth-condition (= proposition)
intension + circumstance of evaluation
  → (2nd step) extension = truth-value

Fig. 10.2: Two-dimensional semantics
A character of a sentence is a function from possible utterance contexts to truthconditions (propositions) and these truth-conditions are then evaluated relative to the circumstances of evaluation. Especially in the cases of indexicals we can demonstrate the two-dimensional semantics: Let us investigate the sentence “I am a philosopher” according to three different utterance contexts w0, w1 und w2 while in each world there is a different speaker: in w0 Cicero is the speaker of the utterance, in w1 Caesar and in w2 Augustus. Furthermore it is in w0 the case that Cicero is a philosopher while Caesar and Augustus are not. In w1 Caesar and Augustus are philosophers while Cicero isn’t. In w2 no one is a philosopher (poor world!). The three worlds function both as utterance contexts and as circumstances of evaluation but of course different facts are relevant in the two different functions. Now the utterance contexts are represented veritically while the circumstances of evaluation are represented horizontally. Then the sentence “I am a philosopher” receives the character shown in Tab. 10.6. Tab. 10.6: Character of “I am a philosopher” Utterance Contexts
Tab. 10.6: Character of "I am a philosopher"

                        Circumstances of evaluation
  Utterance contexts    w0    w1    w2    Truth conditions
  w0                    t     f     f     〈Cicero; being a philosopher〉
  w1                    f     t     f     〈Caesar; being a philosopher〉
  w2                    f     t     f     〈Augustus; being a philosopher〉

  (t = true, f = false)
Each line represents the proposition that is determined by the sentence relative to the respective utterance context, and this proposition receives a truth-value for each circumstance of evaluation. In principle, the character of the sentence has to be represented for all possible worlds, not only for the three selected above. The instrument of a "character" is available for all expressions of a natural language. It is thus a principled improvement on and extension of formal semantics which contains Carnap's theory of intensions as a special case.
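The two-step evaluation lends itself to a direct rendering as code; the following toy model (in Haskell, with the worlds and facts stipulated above) reproduces the character of Tab. 10.6:

  -- Character = context -> proposition; proposition = circumstance -> truth value.
  data World = W0 | W1 | W2 deriving (Eq, Show)

  type Context     = World
  type Proposition = World -> Bool
  type Character   = Context -> Proposition

  speaker :: Context -> String
  speaker W0 = "Cicero"
  speaker W1 = "Caesar"
  speaker W2 = "Augustus"

  philosopher :: World -> String -> Bool
  philosopher W0 x = x == "Cicero"
  philosopher W1 x = x `elem` ["Caesar", "Augustus"]
  philosopher W2 _ = False

  -- The context fixes the speaker; the circumstance supplies the facts.
  iAmAPhilosopher :: Character
  iAmAPhilosopher c = \w -> philosopher w (speaker c)

  -- iAmAPhilosopher W0 W0 == True, iAmAPhilosopher W0 W1 == False, etc.,
  -- reproducing the rows of Tab. 10.6.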
8. Montague Semantics: Compositionality revisited

Frege claimed in his principle of compositionality that the meaning of a complex expression is a function of the meanings of its parts and of their mode of composition (see section 3.). He regarded every complex expression as composed of a saturated part and a non-saturated part. The semantic counterparts are objects and functions.
Frege transfers the syntactic notion of an expression which needs some complement to be a complete and well-formed expression to the semantic realm: functions are regarded as incomplete and non-saturated. This leads him to the ontological view that functions are not "objects" ("Gegenstände") – Frege's term for any entity that can be a value of a meaning function mapping expressions to their extensions. Functions, however, can be reified as "Werthverläufe" (courses-of-values), e.g. by combining their expressions with the expression the function, but in Frege's ontology the "Werthverläufe" are distinct from the functions themselves. Frege's successors did not follow him in this ontological aspect of his semantics. In Tarskian semantics, one-place predicates are usually mapped to sets of individuals, two-place predicates to sets of pairs of individuals, etc. N-place functions are regarded as special kinds of (n+1)-place relations. This approach made it possible to make the meaning of Frege's non-saturated expressions explicit and to give a precise compositional account of the meanings of expressions of predicate logic in the form of a recursive definition. But for every type of composition – connecting formulae, applying a predicate to its arguments, or prefixing a quantifier to a formula – a distinct form of meaning composition was needed. The notion of compositionality could be radically simplified by two developments: the application of type theory and lambda abstraction. Type theory goes back to Russell and was developed to avoid paradoxical notions such as the set of all sets not containing themselves, see above sec. 4. There are many versions of type theory, but natural language semantics usually employs a variant which starts with a number of basic types, e.g. the type of objects (entities) e and the type of truth values t for an extensional language, and provides a type of functions 〈T1, T2〉 for every type T1 and every type T2, i.e. the type of functions from entities of type T1 to entities of type T2. Sometimes the set of types is extended to types composed of more than two types, but any such system can easily be reduced to this binary system. Predicates can now be viewed as expressions of type 〈e, t〉, i.e. as functions from objects to truth values: they yield a true sentence if an argument is filled in that refers to an instance of the predicate, and a false sentence otherwise. That just means that we take the characteristic function of the predicate extension as its meaning. The characteristic function of a set maps its members to the truth value true and its non-members to false. In the same vein, the negation operator is of type 〈t, t〉, i.e. a function from a truth-value to a truth-value (namely to the opposite one); binary sentence connectives are of type 〈t, 〈t, t〉〉, i.e. functions taking the first argument and yielding a function which takes the second argument and then results in a truth value. Frege had already recognized that first-order quantifiers (such as the existential quantifier something) are just second-order predicates, i.e. predicates applicable to first-order predicates. The application of an existential quantifier to a one-place first-order predicate is true if the predicate is non-empty. Therefore the existential quantifier can be regarded as a second-order predicate which has non-empty first-order predicates as instances; it is of type 〈〈e, t〉, t〉.
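The type system and the treatment of predicates as characteristic functions can be illustrated with a small sketch (in Haskell; the domain and lexicon are our toy stipulations, with 〈T1, T2〉 rendered as the function type T1 -> T2):

  type E = String      -- entities (basic type e)
  type T = Bool        -- truth values (basic type t)

  domain :: [E]        -- a stipulated domain of discourse
  domain = ["Ann", "Bea", "Carl"]

  smokes :: E -> T     -- a predicate of type <e,t>: the characteristic
  smokes x = x `elem` ["Ann", "Bea"]   -- function of its extension

  neg :: T -> T        -- negation, type <t,t>
  neg = not

  conj :: T -> (T -> T)                -- a binary connective, type <t,<t,t>>
  conj p = \q -> p && q

  -- The existential quantifier as a second-order predicate, type <<e,t>,t>:
  -- true of a first-order predicate iff that predicate is non-empty.
  something :: (E -> T) -> T
  something p = any p domain

  everything :: (E -> T) -> T          -- its universal counterpart
  everything p = all p domain
  -- something smokes == True; everything smokes == False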
The semantics of the universal quantifier is analogous: it yields the truth value true for a first-order predicate which has the whole universe of discourse as its extension. On this view, a type problem arises for predicates of arity greater than one: their type does not fit the quantifier. The predicate love, e.g., is of type 〈e, 〈e, t〉〉, because if we add one object as an argument we receive a one-place predicate as introduced above. In the sentence
(20) Everybody loves someone. one of the two quantifiers has to be composed with love in a compositional approach. Let us assume someone is this quantifier. Then its meaning of type 〈〈e, t〉, t〉 has to be applied to the meaning of love, leading to a type clash, because the quantifier needs an argument of type 〈e, t〉 while the predicate love is of a different type. Analogous problems arise for complex formulae. How can the meaning of the two-place predicate love be transformed into the type required? Type theory needs an extension by a lambda operator, borrowed from Alonzo Church's lambda calculus, developed in Church (1936). The lambda operator is used to transform an expression of type t into an expression of type 〈T1, t〉, depending on the type T1 of the abstracted variable. Let, e.g., x be a variable of type e and P be of type t; then λx[P] has type 〈e, t〉, i.e. is a one-place first-order predicate. If we consider love as a two-place first-order predicate and love(x, y) as an expression of type t with two free variables, then λx[love(x, y)] is an expression of type 〈e, t〉. This is the type required by the quantifier. The variable x is bound by the lambda operator and y is free in this expression. If we now take someone as a first-order quantifier, which has type 〈〈e, t〉, t〉, then someone(λx[love(x, y)]) is an expression of type t, again with free variable y. This can be made a predicate of type 〈e, t〉 by using the same procedure again: λy[someone(λx[love(x, y)])], which can be used as an argument of a further quantifier everybody. We receive the following new analysis of (20): (21) everybody(λy[someone(λx[love(x, y)])]) The semantics of a lambda expression λx[P] is defined as the characteristic function which yields true for all arguments which would make P true if they were taken as assignments to x, and false for the others. With this semantics we get the following logical equivalences, which we used implicitly in the above formalizations: α-conversion (22) λx[P] ≡ λy[P[y/x]] where P[y/x] is the same as P except that all occurrences of x which are free in P are changed into y. α-conversion is just a formal renaming of variables. β-reduction (23) λx[P](a) ≡ P[a/x] where P[a/x] is the same as P except that all occurrences of x which are free in P are changed into a. This equivalence, however, may fail in some non-extensional contexts, if a is a non-rigid designator. If we take (24) Ralph believes that x is a spy. as P, then the de re reading of (10) is λx[P](a): a is interpreted outside the non-extensional context of P, and its factual denotation is taken as its extension. P[a/x], however, is the de dicto reading, because a is interpreted within the belief context.
For intensional contexts these differences are treated in the intensional theory of types; see section 10 below. η-conversion (25) λx[P(x)] ≡ P where x does not occur as a free variable in P. η-conversion is needed for the conversion between atomic predicates and λ-expressions. In this type-theoretical view, many other linguistic expression types can easily be integrated. Adjectives, which are applied to nouns of type 〈e, t〉, can be seen as being of type 〈〈e, t〉, 〈e, t〉〉; e.g. tasty is a modifier which takes predicates expressed by nouns, like apple, and yields new predicates, like tasty apple. Modifiers in general, like adverbs, prepositional phrases and relative and adverbial clauses, are regarded as being of type 〈T1, T1〉, because they modify the meaning of their argument and yield an expression of the same type. This also mirrors the fact that modifiers can be applied in an iterated manner, as in red tasty apple, where red further modifies the predicate tasty apple. The compositionality of meaning received a very strict interpretation in Montague's work. The type-theoretic semantics was accompanied by a syntactic formalism whose expression categories could be directly mapped onto semantic types, called categorial grammar, which was based on ideas by Kazimierz Ajdukiewicz from the mid-1930s and Yehoshua Bar-Hillel (1953). Complex semantic types, i.e. semantic types needing some complementation, like 〈T1, T2〉 for some types T1 and T2, have their counterparts in complex syntactic categories like S2/S1 and S2\S1, which need complementation by an expression of category S1 to yield the syntactic category S2. Given a syntactic category S2/S1, a complement of category S1 has to be added to the right; in the case of S2\S1, to the left. Let, e.g., N be the syntactic category of a noun and DP the category of a determiner phrase; then DP/N is the category of the determiner in the examples above. The interaction of Montague's syntactic and semantic conception results in the requirement that the meaning of a complex expression can be recursively decomposed into function-argument pairs which are always expressed by syntactically immediately adjacent constituents. Categorial grammars describe the same class of languages as context-free grammars, and they are subject to the same problems when applied to natural languages. Although most phenomena in natural languages can in principle be represented by a context-free grammar, and therefore by a categorial grammar, too, both formalisms lead to quite unintuitive descriptions when applied to a realistic fragment of natural language. Especially with regard to semantic compositionality, discontinuous constituents require a quite unintuitive multiplication of categories. The type-theoretic view of nouns and noun phrases also has consequences for the semantic concept of quantifying expressions. Determiners like all or two as parts of determiner phrases will have to be assigned a suitable type of meaning.
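Returning to (20) and (21), the interplay of lambda abstraction and quantifier application can be made concrete over a toy model (a sketch in Haskell; the domain and the facts about love are our stipulations, and we read love(x, y) with the loved one as first argument, so that someone binds the slot abstracted over by λx):

  type E = String
  type T = Bool

  domain :: [E]
  domain = ["a", "b", "c"]

  lovePairs :: [(E, E)]            -- (x, y) such that y loves x
  lovePairs = [("a","b"), ("b","c"), ("c","a")]

  love :: E -> E -> T              -- curried counterpart of love(x, y)
  love x y = (x, y) `elem` lovePairs

  someone, everybody :: (E -> T) -> T   -- type <<e,t>,t>
  someone   p = any p domain
  everybody p = all p domain

  -- (21): Haskell's \ is the lambda operator; the nested abstractions
  -- resolve the type clash described above.
  sentence20 :: T
  sentence20 = everybody (\y -> someone (\x -> love x y))   -- True here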
9. Generalized quantifiers

As mentioned above, Frege already regarded quantifiers as second- or higher-order predicates. But Frege himself did not take the step from this insight to the consideration
of other quantifiers than the universal and the existential ones. Without awareness of Frege's concept of quantifiers, the generalized view of quantifiers developed by Mostowski (1957) and Lindström (1966) paved the way to a generalized theory of quantifiers in linguistics around 1980. This allowed for a proper treatment of syntactically first-order quantifying expressions whose semantics is not expressible in first-order logic, e.g. most. Furthermore, noun phrases which do not have a merely anaphoric function could now be interpreted as quantifiers, consisting of the determiner, e.g. all, which specifies the basic quantifier semantics, and the noun, e.g. women, or a noun-like expression, which restricts the quantification. The determiner is a function of type 〈〈e, t〉, 〈〈e, t〉, t〉〉 taking the restrictor as argument and yielding a generalized quantifier, e.g. all(woman) for the noun phrase all women. This generalized quantifier is (the characteristic function of) a second-order predicate which has as its instances all (characteristic functions of) first-order predicates which are true of all women. As the determiner contributes the principal function in such a noun phrase, some linguists prefer the term determiner phrase. Generalized quantifiers can be studied with respect to their monotonicity properties. Let Q be a generalized quantifier and let Q(P) be true. Then it can be the case – depending on Q's semantics – that Q(P′) is always true if
(A) the extension of P′ is a subset of the extension of P, or
(B) the extension of P is a subset of the extension of P′.
In the first case we call Q monotone decreasing or downward entailing; in case (B) monotone increasing or upward entailing. An example of the first quantifier type is no women, an example of the second type (at least) two women; cf. the entailment relations between the following sentences: (26a) entails (26b), and (27a) entails (27b). (26) a. At least two women worked as extremely successful CEOs. b. At least two women worked as CEOs. (27) a. No women worked as CEOs. b. No women worked as extremely successful CEOs. While quantifiers that are also expressible in first-order logic, like those in the examples above, always show one of the monotonicity properties, there are other generalized quantifiers which do not. Numerical expressions providing both a lower and an upper bound for a quantity, like exact numbers, are examples of non-monotonic quantifiers: if we replace at least two women with exactly two women in the examples above, the entailment relations between the sentences disappear. Monotonicity of quantifiers seems to play an important role in natural language, although not all quantifiers are themselves monotonic. But it is claimed that all simple quantifiers, i.e. one-word quantifiers or quantifiers of the form determiner + noun, are expressible as conjunctions of monotonic quantifiers. E.g. exactly three Ps is equivalent to the conjunction at least three Ps and no more than three Ps, the first conjunct (at least three Ps) being upward monotonic and the second (no more than three Ps) being downward monotonic. There are, of course, quantifiers not bearing this property which are expressible in natural language, cf. an even number of Ps, but there does not seem to be any natural language which reserves a simple lexical item for such a purpose.
The theory of generalized quantifiers therefore raises empirical questions about language universals. Similar considerations concerning entailment conditions can be applied to the first argument of the determiner. For some determiners D it might be the case that D(R)(P) always entails D(R′)(P) if
(A′) the extension of R′ is a subset of the extension of R, or
(B′) the extension of R is a subset of the extension of R′.
In the second case the determiner is called persistent, while in the first it is called antipersistent. Consider the entailment relations between the sentences below: (28a) entails (28b), and (29a) entails (29b). (28) a. Some extremely successful female CEOs smoke. b. Some female CEOs smoke. (29) a. All female CEOs smoke. b. All extremely successful female CEOs smoke. It is easy to see that some is persistent while all is antipersistent. All combinations of monotonicity and persistence/antipersistence are realized in natural languages; Tab. 10.7 shows some examples. Note that the quantifiers of the square of opposition are part of this scheme.
Tab. 10.7: Monotonicity and (anti-)persistence of quantifiers

                      upward monotonic            downward monotonic
  antipersistent      all, every                  no, at most three
  persistent          some, (at least) three      not all
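Such monotonicity and persistence properties can be checked mechanically on finite models; the following brute-force sketch (in Haskell; the four-element domain and the determiner definitions are our toy stipulations) verifies the classifications of Tab. 10.7 for some of its entries:

  import Data.List (subsequences)

  type E    = Int
  type Pred = E -> Bool
  type Det  = Pred -> Pred -> Bool     -- determiner: restrictor -> scope -> t

  domain :: [E]
  domain = [1..4]

  preds :: [Pred]                      -- every predicate over the domain
  preds = map (flip elem) (subsequences domain)

  ext :: Pred -> [E]
  ext p = filter p domain

  subsetOf :: Pred -> Pred -> Bool
  subsetOf p q = all q (ext p)

  every, some, no, exactlyTwo :: Det
  every r p      = all p (ext r)
  some  r p      = any p (ext r)
  no    r p      = not (any p (ext r))
  exactlyTwo r p = length [x | x <- ext r, p x] == 2

  -- Upward monotone: D(R)(P) and P ⊆ P' imply D(R)(P').
  -- Persistent:      D(R)(P) and R ⊆ R' imply D(R')(P).
  upwardMonotone, persistent :: Det -> Bool
  upwardMonotone d =
    and [ d r p' | r <- preds, p <- preds, p' <- preds, subsetOf p p', d r p ]
  persistent d =
    and [ d r' p | r <- preds, r' <- preds, p <- preds, subsetOf r r', d r p ]

  -- upwardMonotone some == True,  persistent some  == True
  -- upwardMonotone no   == False, persistent every == False (antipersistent)
  -- upwardMonotone exactlyTwo == False: non-monotonic, as noted above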
And the relations of being contradictory and (sub-)contrary (see sec. 2) in the square of opposition are mirrored by negations of the whole determiner-governed sentence or of the second argument: ¬D(R, P) is contradictory to D(R, P), while D(R, ¬P) is (sub-)contrary to D(R, P) (we use ¬P as short for λx[¬P(x)]). Analyzing quantified expressions as structures consisting of a determiner, a restrictor and a quantifier scope provides us with a relational view of quantifiers. The determiner type 〈〈e, t〉, 〈〈e, t〉, t〉〉 is that of a second-order two-place relation. Besides the logical properties of the individual argument positions discussed above, there are a number of interesting properties regarding the relation between the two arguments. One of them is conservativity. A determiner is conservative iff it is always the case that D(R, P) ≡ D(R, P ∧ R) (where we use P ∧ R as the conjunction of the predicates P and R, more precisely λx[P(x) ∧ R(x)]). It is quite evident that determiners in general fulfill this condition. E.g. from (30) Most CEOs are incompetent.
follows (31) Most CEOs are incompetent CEOs. But are all determiners really conservative? Only is an apparent counterexample: (32) Only CEOs are incompetent. cannot be paraphrased as (33) Only CEOs are incompetent CEOs. – (32) being contingent, (33) tautological. But besides this observation there are syntactic reasons to doubt the classification of only as a determiner. Other quantifying items whose conservativity has been questioned are e.g. many and few. The foundation for the generalization of quantifiers was in principle laid in Montague's work, but he himself did not refer to quantifiers other than the classical existential and universal ones. The theory of generalized quantifiers was taken up in linguistics in the early 1980s, cf. Barwise & Cooper (1981) and article 43 (Keenan) Quantifiers.
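Conservativity can be tested in the same brute-force manner, extending the sketch given after Tab. 10.7 (the rendering of only as a determiner is for illustration only):

  -- D is conservative iff D(R)(P) always agrees with D(R)(λx[P(x) ∧ R(x)]).
  conservative :: Det -> Bool
  conservative d =
    and [ d r p == d r (\x -> p x && r x) | r <- preds, p <- preds ]

  only :: Det                    -- "only R are P": every P is an R
  only r p = all r (ext p)

  -- conservative every == True, conservative some == True, but
  -- conservative only == False, matching the contrast between (32) and (33).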
10. Intensional theory of types

Montague developed his semantics as an intensional semantics, taking into account non-extensional aspects of natural languages such as alethic-modal, temporal and deontic operators. The intensional theory of types builds on Carnap's concept of intensions as functions from possible worlds to extensions. These functions are built into the type system by adding a new functional type from possible worlds s to the other types. Type s differs from the other types insofar as there are no expressions – constants or variables – directly denoting objects of this type, i.e. no specific possible worlds besides the contextually given current one can be addressed. The difference between an extensional sentential operator like negation and an intensional one like necessarily is reflected by their respective types: the meaning of the negation operator has type 〈t, t〉, while necessarily needs 〈〈s, t〉, t〉, because not only the truth value of an argument p in the current world has an impact on the truth value of necessarily p, but also the truth values in the alternative worlds. In Montague (1973), Montague does not directly define a model-theoretic mapping for expressions of English, although this should be feasible in principle, but gives a translation of English expressions into a logical language. Besides the usual ingredients of modal predicate logic, the lambda operator as well as variables and constants of the various types, he introduces the intensor ^ and the extensor ∨, of type 〈T, 〈s, T〉〉 and 〈〈s, T〉, T〉 respectively, for arbitrary types T. ^ transforms a given meaning into a Carnapian intension, i.e. ^a means the function from possible worlds to a's extensions in those worlds. Conversely, if b means an intension, then ∨b refers to the extension in the current world. The intensor is used if a usually extensionally interpreted expression occurs in an intensional context. E.g. a unicorn and a centaur mean generalized quantifiers, say Q1 and Q2, which extensionally are false for any argument. This may be different in other possible worlds. Therefore the intensions ^Q1 and ^Q2 may differ.
This accounts for the fact that, e.g., the intensional verb seek applied to ^Q1 and ^Q2 may result in different values, as (34) John seeks a unicorn. may be true while (35) John seeks a centaur. may be false at the same time. With the intensional extension of type logic, natural language semantics gains a powerful tool to account for the interaction of intensions and extensions in compositional semantics. The same machinery which is applicable in alethic modal logic – i.e. the logic of possibility and necessity – is transferable to other branches of intensional semantics, e.g. the semantics of tense and temporal expressions, cf. article 57 (Ogihara) Tense.
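The core of this machinery can again be sketched in a few lines (in Haskell; the worlds, the choice of current world, and the 'facts' about unicorns and centaurs are our stipulations):

  type World = Int

  type Intension a = World -> a        -- type <s, a>

  currentWorld :: World                -- the contextually given world
  currentWorld = 0

  extensor :: Intension a -> a         -- ∨b: b's extension in the current world
  extensor f = f currentWorld

  -- Predicates encoded here by their extensions per world: unicorns exist
  -- only in world 1, centaurs only in world 2 (toy facts).
  unicorn, centaur :: Intension [String]
  unicorn w = if w == 1 then ["u1"] else []
  centaur w = if w == 2 then ["c1"] else []

  -- extensor unicorn == extensor centaur == []: identical extensions in the
  -- current world, but unicorn 1 /= centaur 1: distinct intensions, which is
  -- what lets an intensional verb like 'seek' distinguish (34) from (35).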
11. Dynamic logic

Natural language expressions not only refer to time-dependent situations; their interpretation is also dependent on time-dependent contexts. A preceding sentence may introduce referents for later anaphoric expressions (cf. article 38 (Dekker) Dynamic semantics and article 89 (Zimmermann) Context dependency). (36) A man walks in the park. He whistles. Among the anaphoric expressions are – of course – nominal anaphora, like pronouns and definite descriptions. Antecedents are typically indefinite noun phrases. But possible antecedents can also be introduced in a less obvious way, e.g. by propositions expressing events. The event time can then be referenced by a temporal anaphor like at the same time. (37) The CEO lighted her cigarette. At the same time the health manager came in. The anaphor at the same time refers to the time at which the event described in the first proposition happens. Anaphoric relations are not necessarily realized by overt expressions; they can be implicit, too, or they may be indicated by morphological means, e.g. by the choice of a grammatical tense. Anaphora pose a problem for compositional approaches to semantics based on predicate or type logic. (36) can be formalized by an expression headed by an existential quantifier like (38) ∃x[man(x) ∧ w-i-t-p(x) ∧ whistle(x)] But the man mentioned here can be referred to anywhere in the subsequent discourse. Therefore the quantifier scope cannot be closed at any particular position in the discourse.
The semantics of anaphora, however, is not just a matter of the scope of existential quantifiers. Expressions like indefinite noun phrases, which usually denote an existential quantifier, can in certain contexts introduce discourse referents with a universal reading. This fact was described by Geach (1962). (39) If a farmer owns a donkey he feeds it. means (40) ∀x[∀y[farmer(x) ∧ donkey(y) ∧ own(x, y) → feed(x, y)]] This kind of anaphora is a further challenge for compositional semantics, as it has to deal with the fact that an expression which is usually interpreted existentially gets a universal reading here. The challenge is addressed by dynamic semantics. Pre-compositional versions were developed independently by Hans Kamp and Irene Heim, as Discourse Representation Theory and File Change Semantics respectively (cf. article 37 (Kamp & Reyle) Discourse Representation Theory). In both approaches, structures representing the truth-conditional content of the parts of a discourse, as well as the entities which are anaphorically addressable, are manipulated by the meaning of discourse constituents. The answer to the question of how the meaning of a discourse constituent is to be construed is simply: as a function from given discourse-representing structures to new structures of the same type, cf. Muskens (1996). This view is made explicit in dynamic logics. This kind of logic was developed in the 1970s by David Harel and others, cf. Harel (2000), and has primarily been used for the formal interpretation of procedural programming languages. (41) 〈a〉q means that statement a possibly leads to a state where q is true, while (42) [a]q means that statement a necessarily leads to a state where q is true. Regarding states as possible worlds, we arrive at a modal logic with as many modalities as there are (equivalence classes of) statements a – for a recursive language, usually infinitely many. For many purposes, consideration can be constrained to modalities where from each state exactly one successor state is accessible. For such modalities with a functional accessibility relation, the weak operator (〈…〉) and the strong operator ([…]) collapse semantically into one operator. If we further agree that (43) s1 ∧ [a]s2 can be rewritten as (44) s1[a]s2
then this notation has the intuitive reading that state s1 is mapped by the meaning of a into s2. A simplistic application is the following: assume that s1 is a characterization of the knowledge state of a recipient before receiving the information given by assertion a. Then s2 is a characterization of the knowledge state of the recipient after being informed. If we identify knowledge states with those sets of possible worlds which are consistent with the current knowledge, and if we consider s1 and s2 as descriptions of sets W1 and W2 of possible worlds, then they differ in exactly this respect: W2 is the intersection of W1 and the set Wa of possible worlds in which a is true, i.e. W2 = W1 ∩ Wa. The notion of an informative utterance a can be defined by the condition that W2 ≠ W1; and in order for a to be consistent with the previous context, it must hold that W2 ≠ ∅. The treatment of discourse states or contexts in dynamic logics is not limited to truth-conditionally characterizable knowledge. In principle, any kind of linguistic context parameter can be part of the states, among them the anaphorically accessible antecedents of a sentence. In their Dynamic Predicate Logic, developed in Groenendijk & Stokhof (1991), Groenendijk and Stokhof model the anaphoric phenomena accounted for in Kamp's Discourse Representation Theory and Heim's File Change Semantics in a fully compositional fashion. This is mainly achieved by a dynamic interpretation of the existential quantifier: (45) ∃x[P(x)] is semantically characterized by the usual truth conditions but has the additional effect that free occurrences of the variable x have to be kept assigned to the same object in subsequent expressions which are connected appropriately. The dynamic effect is limited by the scopes of certain operators like the universal quantifier, negation, disjunction, and implication. Thus x is bound to the same object outside the syntactic scope of the existential quantifier, as in (46) ∃x[man(x) ∧ w-i-t-p(x)] ∧ whistle(x) although the syntactic scope of the existential quantifier ends after w-i-t-p(x). This proposition is true only if there is an object x which verifies all three predicates man, w-i-t-p, and whistle. With the dynamic dimension thus characterized, the sentences (36) and (39) can be formalized in Dynamic Predicate Logic as (47) ∃x[man(x) ∧ w-i-t-p(x)] ∧ whistle(x) and (48) ∃x[farmer(x) ∧ ∃y[donkey(y) ∧ own(x, y)]] → feed(x, y) respectively. It can easily be seen how the usual meanings of the discourse sentences (49) A man walks in the park.
and (50) A farmer has a donkey. enter into the composed meaning of the discourse without any changes. In order to get the intended truth conditions for implications, it is required as a truth condition that the second clause can be verified for any assignment to x verifying the first clause. Putting together the filtering effect of propositions on possible worlds and their modifying effect on assignment functions, we can consider propositions in Dynamic Predicate Logic as functions on sets of world-assignment pairs. The empty context can be characterized by the Cartesian product of the set of possible worlds and the set of assignments. Each proposition of a discourse filters out certain world-assignment pairs. In some respects Dynamic Predicate Logic deviates from standard dynamic logic approaches, as Groenendijk & Stokhof (1991, sec. 4.3) point out, but it can still be seen as a special case of a logic in this framework. The dynamic view of semantics can be used to model contextual dependencies other than anaphora. Groenendijk (1999) shows another application in the Logic of Interrogation. Questions add felicity conditions for a subsequent answer to the discourse context. To a great extent, these conditions can be characterized semantically. In the Logic of Interrogation the effect of a question is understood as a partitioning of the current set of possible worlds. Each cell of the partition stands for an alternative exhaustive answer; e.g. a yes-no question partitions the set of possible worlds into one subset consistent with the positive answer and a complementary subset compatible with the negative answer. Take, for example, question (51). (51) Does a man walk in the park? According to the formalism of Groenendijk (1999), (51) can be formalized as (52). (52) ?∃x[man(x) ∧ w-i-t-p(x)] (52) partitions the set of possible worlds into two subsets W+ and W–, such that (53) is true for every world in W+ and false for every world in its complement W–. An appropriate answer selects exactly one of these subsets. (53) ∃x[man(x) ∧ w-i-t-p(x)] Wh-questions like (54) partition the set of possible worlds into more cells than yes-no questions. (54) Who walks in the park? Each cell corresponds to an exhaustive answer, which provides the full information about who walks in the park and who does not. An assertion answers the question partially if it eliminates at least one cell; if it moreover eliminates all cells but one, it answers the question exhaustively.
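Both the update view of assertions and the partition view of questions can be rendered in a short sketch (in Haskell; information states are modelled simply as sets of worlds, abstracting away from the assignment functions of Dynamic Predicate Logic):

  import Data.List (nub)

  type World = Int
  type State = [World]                          -- an information state

  update :: (World -> Bool) -> State -> State   -- W2 = W1 ∩ Wa
  update a = filter a

  informative :: (World -> Bool) -> State -> Bool
  informative a s = update a s /= s             -- W2 ≠ W1

  consistent :: (World -> Bool) -> State -> Bool
  consistent a s = not (null (update a s))      -- W2 ≠ ∅

  -- A question partitions the state by the answers it distinguishes; a
  -- yes-no question yields (at most) two cells:
  partitionBy :: Eq b => (World -> b) -> State -> [State]
  partitionBy answer s =
    [ [ w | w <- s, answer w == v ] | v <- nub (map answer s) ]

  -- partitionBy even [1..6] == [[1,3,5],[2,4,6]]: an exhaustive answer
  -- selects exactly one cell; an assertion eliminating a cell gives a
  -- partial answer.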
This chapter has given a short overview of how logic provided the tools for treating linguistic phenomena. Each logical system has its characteristic strengths and limits. The limits of a logical system sometimes inspired the development of new formal tools (e.g. the step from Aristotelian syllogistics to modern predicate logic), which in turn inspired a new semantics. Sometimes a change in the focus on linguistic phenomena inspired a systematic search for new logical tools (e.g. modal logic) or a reinterpretation of already available logical tools (e.g. dynamic logic). We hope to have illustrated the main developments in the bi-directional influence between logical systems and semantics.
12. References

Abaelardus, Petrus 1956. Dialectica. Ed. L.M. de Rijk. Assen: van Gorcum.
Almog, Joseph, John Perry & Howard Wettstein (eds.) 1989. Themes from Kaplan. Oxford: Oxford University Press.
Aristoteles 1992. Analytica priora. Die Lehre vom Schluß oder erste Analytik. Ed. E. Rolfes. Hamburg: Meiner.
Bar-Hillel, Yehoshua 1953. A quasi-arithmetical notation for syntactic description. Language 29, 47–58.
Barwise, Jon & Robin Cooper 1981. Generalized quantifiers and natural language. Linguistics & Philosophy 4, 159–219.
Carnap, Rudolf 1947. Meaning and Necessity. Chicago, IL: The University of Chicago Press.
Church, Alonzo 1936. An unsolvable problem of elementary number theory. American Journal of Mathematics 58, 345–363.
Cresswell, Maxwell & Arnim von Stechow 1982. De re belief generalized. Linguistics & Philosophy 5, 503–535.
Davidson, Donald 1967. Truth and meaning. Synthese 17, 304–323.
Frege, Gottlob 1966. Grundgesetze der Arithmetik. Hildesheim: Olms.
Frege, Gottlob 1977. Begriffsschrift und andere Aufsätze. Ed. I. Angelelli. Darmstadt: Wissenschaftliche Buchgesellschaft.
Frege, Gottlob 1988. Die Grundlagen der Arithmetik. Eine logisch mathematische Untersuchung über den Begriff der Zahl. Ed. Chr. Thiel. Hamburg: Meiner.
Frege, Gottlob 1994. Funktion, Begriff, Bedeutung. Fünf logische Studien. Göttingen: Vandenhoeck & Ruprecht.
Geach, Peter 1962. Reference and Generality: An Examination of Some Medieval and Modern Theories. Ithaca, NY: Cornell University Press.
Gödel, Kurt 1930. Die Vollständigkeit der Axiome des logischen Funktionenkalküls. Monatshefte für Mathematik und Physik 37, 349–360.
Gödel, Kurt 1931. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme. Monatshefte für Mathematik und Physik 38, 173–198.
Groenendijk, Jeroen 1999. The Logic of Interrogation. Research report, ILLC Amsterdam.
Groenendijk, Jeroen & Martin Stokhof 1991. Dynamic Predicate Logic. Linguistics & Philosophy 14, 39–100.
Haas-Spohn, Ulrike 1989. Zur Interpretation der Einstellungszuschreibungen. In: E. Falkenberg (ed.). Wissen, Wahrnehmen, Glauben. Epistemische Ausdrücke und propositionale Einstellungen. Tübingen: Niemeyer, 50–94.
Harel, David 2000. Dynamic logic. In: D. Gabbay & F. Guenthner (eds.). Handbook of Philosophical Logic, vol. II: Extensions of Classical Logic, chap. II.10. Dordrecht: Reidel, 497–604.
Hodges, Wilfrid 1991. Logic. London: Penguin.
Kaplan, David 1969. Quantifying in. In: D. Davidson & J. Hintikka (eds.). Words and Objections. Essays on the Work of W.V.O. Quine. Dordrecht: Reidel, 206–242.
Kaplan, David 1979. On the logic of demonstratives. Journal of Philosophical Logic 8, 81–98.
Kaplan, David 1989. Demonstratives. In: J. Almog, J. Perry & H. Wettstein (eds.). Themes from Kaplan. Oxford: Oxford University Press, 481–563.
Kripke, Saul 1972. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 253–355 and 763–769.
von Kutschera, Franz 1989. Gottlob Frege. Eine Einführung in sein Werk. Berlin: de Gruyter.
Leibniz, Gottfried Wilhelm 1992. Schriften zur Logik und zur philosophischen Grundlegung von Mathematik und Naturwissenschaft. Ed. H. Herring. Darmstadt: Wissenschaftliche Buchgesellschaft.
Lenzen, Wolfgang 1990. Das System der Leibnizschen Logik. Berlin: de Gruyter.
Lenzen, Wolfgang 2000. Guilielmi Pacidii Non plus ultra oder: Eine Rekonstruktion des Leibnizschen Plus-Minus-Kalküls. Philosophiegeschichte und logische Analyse 3, 71–118.
Lindström, Per 1966. First order predicate logic with generalized quantifiers. Theoria 32, 186–195.
Loar, Brian 1972. Reference and propositional attitudes. The Philosophical Review 81, 43–62.
Montague, Richard 1973. The proper treatment of quantification in ordinary English. In: J. Hintikka, J. Moravcsik & P. Suppes (eds.). Approaches to Natural Language. Dordrecht: Reidel, 221–242. Reprinted in: R. Thomason (ed.). Formal Philosophy. Selected Papers of Richard Montague. New Haven, CT: Yale University Press, 1974, 247–270.
Mostowski, Andrzej 1957. On a generalization of quantifiers. Fundamenta Mathematicae 44, 12–36.
Muskens, Reinhard 1996. Combining Montague Semantics and Discourse Representation. Linguistics & Philosophy 19, 143–186.
Nortmann, Ulrich 1996. Modale Syllogismen, mögliche Welten, Essentialismus. Eine Analyse der aristotelischen Modallogik. Berlin: de Gruyter.
Quine, Willard van Orman 1953. From a Logical Point of View. Cambridge, MA: Harvard University Press.
Quine, Willard van Orman 1956. Quantifiers and propositional attitudes. The Journal of Philosophy 53, 177–187.
Russell, Bertrand 1903. The Principles of Mathematics. Cambridge: Cambridge University Press.
Russell, Bertrand 1905. On denoting. Mind 14, 479–493.
Russell, Bertrand 1908. Mathematical logic as based on the theory of types. American Journal of Mathematics 30, 222–262.
Russell, Bertrand 1910. Knowledge by acquaintance and knowledge by description. Proceedings of the Aristotelian Society 11, 108–128.
Russell, Bertrand & Alfred N. Whitehead 1910–1913. Principia Mathematica. Cambridge: Cambridge University Press.
Tarski, Alfred 1935. Der Wahrheitsbegriff in den formalisierten Sprachen. Studia Philosophica 1, 261–405.
Zalta, Edward 2000. Leibnizian theory of concepts. Philosophiegeschichte und logische Analyse 3, 137–183.
Albert Newen, Bochum (Germany)
Bernhard Schröder, Essen (Germany)
11. Formal semantics and representationalism

1. Logic, formal languages and the grounding of linguistic methodologies
2. Natural languages as formal languages: Formal semantics
3. The challenge of context-dependence
4. Dynamic Semantics
5. The shift towards proof-theoretic perspectives
6. Ellipsis: A window on context
7. Summary
8. References
Abstract

This paper shows how formal semantics emerged shortly after the explosion of interest in formally characterising natural language in the fifties, swiftly replacing all advocacy of semantic representations within explanations of natural-language meaning. It then charts how advocacy of such representations has progressively re-emerged in formal semantic characterisations through the need to model the systemic dependency of natural language construal on context. First, the logic concepts on which subsequent debates depend are introduced, as is the formal-semantics (model-theoretic) framework in which meaning in natural language is defined as a reflection of a direct language-world correspondence. The problem of context dependence, which has been the primary motivation for introducing semantic representation, is then set out, with sketched accounts of pronoun construal relative to context. It is also shown how, in addition to such arguments, proof-theoretic (hence structural) concepts have been increasingly used in the semantic modelling of natural languages. Finally, ellipsis is introduced as a novel window on context, providing additional evidence for the need to advocate representations in semantic explanation. The paper concludes with reflections on how the goal of modelling the incremental dynamics of natural language interpretation forges a much closer link between competence and performance models than has hitherto been envisaged.
1. Logic, formal languages, and the grounding of linguistic methodologies

The meaning of natural-language (NL) expressions, of necessity, is wholly invisible; and one of the few supposedly reliable ways to establish the interpretation of expressions is through patterns of inference. Inference is the relation between two pieces of information such that one can be wholly inferred from the other (if, for example, I assert that I discussed my analysis with at least two other linguists, then I imply that I have an analysis, that I have discussed it with more than one linguist, and that I am myself a linguist). Inference alone is not, however, sufficient to pinpoint the interpretation potential of NL expressions. Essential to NL interpretation is the pervasive dependence of expressions on context for how they are to be understood, a problem which has been a driving force in the development of formal semantics. This article surveys the emergence of formal semantics following the growth of interest in the formal characterisation of NL grammars (Chomsky
1955, Lambek 1958). It introduces logic concepts on which subsequent debates depend and the formal-semantics framework (the formal articulation of what has been called truth-conditional semantics), in which NL is claimed to be a logic. It then sets out the problem of context dependence, with its apparent need to add a level of semantic representation over and above whatever is needed for syntax, and shows how semanticists have increasingly turned to the tools of proof theory (the syntactic mechanisms for defining inference in logic) to project compositionality of content in natural language. Ellipsis data are introduced as an additional basis for evaluating the status of representations specific to characterising NL construal; and the paper concludes with reflections on how modelling the incremental dynamics of the step-by-step way in which information is built up in discourse forges a much closer link between competence and performance models than has hitherto been envisaged. During the sixties, with the emergence of the new Chomskian framework (Chomsky 1965), inquiry into the status of NL semantics within the grammar of a language was inevitable. There were two independent developments: the articulation of semantics as part of the broadly Chomskian philosophy (Katz & Fodor 1963, Katz 1972), and Montague's extension of formal-language semantic tools to NL (Thomason (ed.) 1974). The point of departure for both the Chomskian and the formal-semantics paradigm was the inspiration provided by the formal languages of logic, though, as we shall see, they make rather different use of this background. Logics are defined for the formal study of inference irrespective of subject matter, with individual formal languages defined to reflect specific forms of reasoning: modal logic to reflect modal reasoning, temporal logic to reflect temporal reasoning, etc. Predicate logic, as its name implies, is defined to reflect forms of reasoning that turn on subsentential structure, involving quantification, names, and predicates: it is the logic arguably closest to natural languages. In predicate logic, the grammar defines a system for inducing an infinite set of propositional formulae with internal predicate-argument structure, over which semantic operations can be defined to yield a compositional account of meaning for these formulae. Syntactic rules involve mappings from (sub)formulae to (sub)formulae making essential reference to structural properties; semantic rules assign interpretations to elementary parts of such formulae and then compute interpretations by mapping interpretations onto interpretations from bottom to top ('bottom-up') as dictated by the syntactically defined structures. With this co-articulation of syntax and semantics, inference as necessary truth-dependence is then defined both syntactically and semantically, the former making reference solely to structural properties of the formulae in question, the latter solely to truth-values assigned to such formulae (relative to some so-called model). The syntactic characterisation of inference is given by rules which map one propositional structure into another, the interaction of this small set of rules (the proof rules) predicting all and only the infinite set of valid inferences. We can use this pattern to define what we mean by representationalism, as follows: a representationalist account is one that involves essential attribution of structure in the characterisation of the phenomenon under investigation.
Any account of natural language which invokes syntactic structure of natural language strings is providing a representationalist account of language. More controversially, representationalist accounts of meaning are those in which the articulation of structure is an integral part of the account of NL interpretation in addition to whatever characterisation is provided of syntactic properties of sentence-strings.
1.1. The Chomskian methodology

In the Chomskian development of a linguistic philosophy for NL grammar, it was the methodology of grammar-writing for these familiar logics which was adapted to the NL case. By analogy, NL grammars were defined as a small number of rules inducing an infinite set of strings, success in characterising a language residing in whether all and only the well-formed sentences of the language are characterised by the given rule set. Grammars were to be evaluated not by data of language use or corpus analysis, but by whether the grammar induces the set of strings judged by a speaker to be grammatical. In this, Chomsky was universally followed: linguists generally agreed that there should be no grounding of grammars directly in evidence from what is involved in producing or parsing a linguistic string. Models of language were logically prior to the consideration of performance factors; so the data relevant to grammar construction had to be the intuitions of grammaticality of individuals with capacity in the language. This commitment to the complete separation of competence-based grammars from all performance considerations has been the underpinning of almost all linguistic theorising since then, though, as we shall see, this assumption is being called into question. Following the Chomskian methodology, Katz and colleagues set out analogous criteria of adequacy for semantic theories of NL: that they should predict relations between word meaning and sentence meaning as judged by speakers of the language; synonymy for all expressions having the same meaning; entailment (equivalently, inference) for all clausal expressions displaying a (possibly asymmetric) dependence of meaning; and ambiguity for expressions with more than one interpretation. The goal was to devise rule specifications that yield these results, with candidate theories evaluated solely by their relative success in yielding the requisite set of semantic relations/properties. Much of the focus was on exploring appropriate semantic representations in some internalised language of thought to assign to words, so as to secure a basis for predicting such entailment relations as that between John killed Bill and Bill died, or between John is a bachelor and John is an unmarried man (Katz 1972, Fodor 1981, 1983, 1998, Pustejovsky 1995, Jackendoff 2002). There was no detailed mapping defined from such constructs onto the objects/events which the natural language expression might be presumed to depict.
1.2. Language as logic: The formal-semantic methodology

It was Montague and the program of formal semantics that defined a truth-theoretic grounding for natural language interpretation. Montague took as his point of departure both the methodology and the formal tools of logic (cf. article 10 (Newen & Schröder) Logic and semantics). He argued that by extending the syntactic and semantic systems of modal/temporal predicate logic with techniques defined in the lambda calculus, the extra flexibility of natural language could be directly captured, with each individual natural language defined to be a formal language no different in kind from a suitably enriched variant of predicate logic (see Montague 1970, reprinted in Thomason (ed.) 1974). In predicate logic, inference is definable from the strings of the language, with the invocation of syntax as an independent level of representation essentially eliminable, being no more than a way of describing the semantic combinatorics; and, in applying this concept to natural languages, many formal semanticists adopt similar assumptions (in particular categorial grammar: Morrill 1994). Their theoretical assumptions are thus unlike Chomskian NL
grammars, in which syntax, hence representations of structure, is central. Since these two paradigms made such distinct use of these formal languages in grounding their NL grammars, an essential background to appreciating the debate is a grasp of predicate logic as a language.
1.3. Predicate logic: Syntax, semantics and proof theory

The remit of predicate logic is to express inference relations that make essential reference to sub-propositional elements such as quantifiers and names, extending propositional logic (with its connectives ∧ ('and'), → ('if-then', the conditional connective), ∨ ('or'), ¬ ('not')). The language has a lexicon, syntax and semantics (Gamut 1991). There is a finite stock of primitive expressions, and a small set of operators licensed by the grammar: the propositional connectives and the quantifiers ∀ (universal), ∃ (existential). Syntactic rules define the properties of these operators, mapping primitive expressions onto progressively more complex expressions; and for each such step there is a corresponding semantic rule, so that the meanings of individual expressions can be defined as recursively combining to yield a formal specification of necessary and sufficient truth-conditions for the propositional formulae in which they are contained. There are a number of equivalent ways of defining such semantics. In the Montague system, the semantics is defined with respect to a model defined as (i) a set of stipulated individuals as the domain of discourse, and (ii) an appropriate assignment of a denotation (equivalently, extension) for the primitive expressions from that set: individuals from the domain of discourse for names, sets of individuals for one-place predicate expressions, and so on. Semantic rules map these assignments onto denotations for composite expressions. Such denotations are based exclusively on the assignments given to their parts and their mode of combination as defined by the syntax, yielding a truth value (True, False) with respect to the model for each propositional formula. There is a restriction on the remit of such semantics. Logics are defined to provide the formal vehicle over which inference independent of subject matter can be defined. Accordingly, model-theoretic semantics for the expressions of predicate logic takes the denotation of terminal expressions as a primitive, hence without explanation: all it provides is a formal way of expressing compositionality for the language, given an assumption of a stipulated language-denotation relation for the elementary expressions. With such semantics, relationships of inference between propositional formulae are definable. There are two co-extensive characterisations of inference. One, more familiar to linguists, is a semantic characterisation (entailment): a proposition φ entails a distinct proposition ψ if in all models in which φ is true, ψ is true (explicitly a characterisation in terms of truth-dependence). Synonymy, or equivalence, ≡, is when this relation is two-way. The other characterisation of inference is syntactic, i.e. proof-theoretic, defined as the deducibility of one propositional formula from another using proof rules: all such derivations (proofs) involve individual steps that apply strictly in virtue of structural properties of the formulae. Bringing syntax and semantics together, a logic is sound if the proof rules derive only inferences that are true in the intended models, and complete if it can derive all the true statements according to the models. Thus, in logic, syntax and semantics are necessarily in a tight systematic relation. The proof rules constitute some minimal set, and it is the interaction between them which determines all and only the correct inferences expressible in the language.
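The model-theoretic half of this picture can be made concrete with a minimal interpreter for a predicate-logic fragment (a sketch in Haskell; the datatype and the example model are our stipulations, not a fragment from the literature):

  data Term = Name String | Var String

  data Formula
    = Pred String [Term]
    | Not Formula
    | And Formula Formula
    | Impl Formula Formula
    | Forall String Formula
    | Exists String Formula

  type Entity     = String
  type Model      = ([Entity], String -> [[Entity]])  -- domain + denotations
  type Assignment = [(String, Entity)]

  evalTerm :: Assignment -> Term -> Entity
  evalTerm _ (Name n) = n                  -- names denote themselves here
  evalTerm g (Var x)  = maybe (error "unassigned variable") id (lookup x g)

  -- Each clause mirrors a semantic rule; quantifiers recurse through the
  -- stipulated domain of discourse.
  eval :: Model -> Assignment -> Formula -> Bool
  eval m@(dom, den) g f = case f of
    Pred p ts  -> map (evalTerm g) ts `elem` den p
    Not a      -> not (eval m g a)
    And a b    -> eval m g a && eval m g b
    Impl a b   -> not (eval m g a) || eval m g b
    Forall x a -> all (\d -> eval m ((x, d) : g) a) dom
    Exists x a -> any (\d -> eval m ((x, d) : g) a) dom

  -- e.g. with den "F" = [["a"],["b"]] over domain ["a","b"]:
  -- eval (["a","b"], \p -> if p == "F" then [["a"],["b"]] else []) []
  --      (Forall "x" (Pred "F" [Var "x"]))  ==  True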
In natural deduction systems (Fitch 1951, Prawitz 1965), defined to reflect individual local steps
in any such derivation, each operator has an associated introduction and elimination rule. Elimination rules map complex formulae onto simpler formulae; introduction rules map simpler formulae onto a more complex formula. For example, there is Conditional Elimination (Modus Ponendo Ponens), which, given premises of the form φ and φ → ψ, licenses the deduction of ψ; there is Conditional Introduction, which, conversely, from the demonstration of a proof of ψ on the basis of some assumption φ, enables the assumption of φ to be removed and a weaker conclusion φ → ψ to be derived. Of the predicate-logic rules, Universal Elimination licenses the inference from ∀xF(x) to F(a), simplifying the formula by removing the quantifying operator and replacing its variable with a constructed arbitrary name. Universal Introduction enables the universal quantifier to be re-introduced into a formula, replacing a corresponding formula containing such a name, subject to certain restrictions. In a rather different spirit, Existential Elimination involves a move from ∃xF(x) by assumption to F(a) (a an arbitrary name) in order to derive some conclusion φ that crucially does not depend on any properties associated with the particular name a. A sample proof, with annotations as metalevel commentary detailing the rules used, illustrates a characteristic proof pattern. Early steps of the proof involve eliminating the quantificational operators and the structure they impose, revealing the propositional structure simpliciter with names in place of variables; central steps of inference (here just one) involve rules of propositional calculus; late steps of the proof re-introduce the quantificational structure with suitable quantifier-variable binding. (⊢ is the proof sign for valid inference.)
Fig. 11.1: Sample proof by universal elimination/introduction
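A proof of this shape, reconstructed here for concreteness (the choice of premises is ours, guided by the formula ∀x(F(x) → G(x)) cited in section 1.4), runs as follows:

  1. ∀x(F(x) → G(x))     premise
  2. ∀xF(x)              premise
  3. F(a) → G(a)         1, Universal Elimination
  4. F(a)                2, Universal Elimination
  5. G(a)                3, 4, Conditional Elimination
  6. ∀xG(x)              5, Universal Introduction

Steps 3 and 4 eliminate the quantificational structure, step 5 is the single propositional step (Conditional Elimination), and step 6 re-introduces the universal quantifier, yielding ∀x(F(x) → G(x)), ∀xF(x) ⊢ ∀xG(x).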
These rules provide a full characterisation of inference, in that their interaction yields all and only the requisite inference relations as valid proofs of the system, with semantic definitions grounding the syntactic rules appropriately. In sum, inference in logical systems is characterisable both semantically, in terms of denotations, and syntactically, in proof-theoretic terms; and, in these defined languages, the semantic and syntactic characterisations are co-extensive, defined to yield the same results.
1.4. Predicate logic and natural language

Bringing back into the picture the relation between such formal languages and natural languages, what first strikes a student of NL is that predicate logic is really rather unlike natural languages. In NL, quantifying expressions occur in just the same places as other noun phrases, and not, as in predicate logic, in some position adjoined to fully defined sentential units (see ∀x(F(x) → G(x)) in Fig. 11.1). Nonetheless, there is parallelism between predicate logic and NL structure in the mechanisms defined for such quantified
formulae. The arbitrary names involved in proofs for quantified formulae display a pattern similar to NL quantifying expressions. In some sense, then, NL quantifying expressions can be seen as closer to the constructs used to model the dynamics of inferential action than to the predicate-logic language itself. Confirming this, there is a tantalising parallelism between NL quantifiers and the terms of the epsilon calculus (Hilbert & Bernays 1939), the logic defining the properties of these arbitrary names. In this logic, the epsilon term that corresponds to an arbitrary name carries a record of the mode of combination of the propositional formula within which it occurs: (1)
∃xF(x) ≡ F(εxF(x))
The formula on the right-hand side of the equivalence sign is a predicate-argument sequence, and within the argument of this sequence there is a required second token of the predicate F as the restrictor for that argument term (ε is the variable-binding term operator that is the analogue of the existential quantifier). The effect is that the term itself replicates the content of the overall formula. As we shall see, this internal complexity of epsilon terms corresponds directly to so-called E-type pronouns (Geach 1972, Evans 1980), in which the pronoun appears to reconstruct the whole of some previous propositional formula, despite only coreferring with an indefinite term. In (2), that is, it is the woman that was sitting on the steps that the pronoun she is used to refer to: (2)
A woman was sitting on the steps. She was sobbing.
So there is interest in exploring links between the construal of NL quantifiers and epsilon terms (see von Heusinger 1997, Kempson, Meyer-Viol & Gabbay 2001, von Heusinger & Kempson (eds.) 2004). There is also, more generally, as we shall see, growing interest in exploring links between NL interpretation processes and proof-theoretic characterisations of inference. In the meantime, what should be remembered is that predicate logic, with its associated syntax-semantics correspondence, is taken as the starting point for all formal-semantic modelling of a truth-conditional semantics for NL; and this leads to very different assumptions in the conflicting Chomskian and formal-semantic paradigms about the status of representations within the overall NL grammar. For Chomskians, with the parallelism with logic largely residing in the methodology of defining a grammar yielding predictions that are consistent and complete for the phenomenon being modelled, the ontology for NL grammars is essentially representationalist. The grammar is said to comprise rules, an encapsulated body of knowledge acquired by a child (in part innate, and encapsulated from other devices controlled by the cognitive system). Whether or not semantics might depend on some additional system of representations is seen as an empirical matter, and not of great import. To the contrary, in the Montague paradigm no claims about the relation between language and the mind are made, and in particular there is no invocation of any mind-internal language of thought. Language is seen as an observable system of patterns similar in kind to the formal languages of logic (cf. article 10 (Newen & Schröder) Logic and semantics). From this perspective, syntax is only a vehicle over which semantic (model-theoretic) rules project interpretations for NL strings. If natural languages are indeed to be seen as formal languages, as claimed,
II. History of semantics the syntax will do no more in the grammar than yield the appropriate pairing of phonological sequences and denotational contents, so is not the core defining property of a grammar: it is rather the pairing of NL strings and truth-conditionally defined content that is its core. This is the stance of categorial grammar: Lambek (1958), Morrill (1994). Not all grammars incorporating formal semantic insights are this stringent: Montague’s defined grammar for English as a formal language had low-level rules ensuring appropriate morphological forms of strings licensed by the grammar rules. But even though formal semanticists might grant that structural properties of language require an independent system of structure-inducing rules said to constitute NL syntax, the positing of an additional mentalistic level of representation internal to the projection of denotational content for the structured strings of the language is debarred in principle. The core formal-semantics claim is that interpretation for NL strings is definable over the interpretation of the terminal elements of the language and their mode of combination as dictated by the syntax and nothing else: this is the compositionality of meaning principle. Positing any level of representation intermediate between the system projecting syntactic structure of strings and the projection of content for those strings is tantamount to abandoning this claim. The move to postulate a level of semantic representation as part of some supposed semantic component of NL grammar is thus hotly contested. In short, though separation of competence performance considerations as a methodological assumption is shared by all, there is not a great deal else for advocates of the Chomskian and Montague paradigms to agree about.
2. Natural languages as formal languages: Formal semantics

The early advocacy of semantic representations as put forward by Katz and colleagues (cf. Katz & Fodor 1963, Katz 1972) was unsuccessful (see the devastating critique by Lewis 1970); and from then on, research in NL semantics has been largely driven by the Montague program. Inevitably not all linguists followed this direction. Those concerned with lexical specifications tended to resist formal-semantic assumptions, retaining articulations of representationalist forms of analysis despite lack of formal-semantic underpinnings, relying on linguistic, computational, or psycho-linguistic forms of justification (Pustejovsky 1995, Fodor 1998, Jackendoff 2002). However, there were in any case what were taken at the time to be good additional reasons for not seeking to develop a representational alternative to the model-theoretic program along the lines of the syntactic characterisation of inference for predicate logic provided by the rules of proof (though see Hintikka 1974). For any one semantic characterisation, there are a large number of alternative proof systems for predicate and propositional calculus, with no principled means of choosing between them, the only unifying factor being their shared semantic characterisation. Hence it would seem that, if a single choice for explanation has to be made, it has to be a semantic one. Moreover, because of the notorious problems in characterising generalized quantifiers such as most (Barwise & Cooper 1981), it is only model-theoretic characterisations of NL content that have any realistic chance of adequate coverage. In such debates, the putative relevance of processing considerations was not even envisaged. Yet amongst the many variant proof-theoretic methods for predicate-logic proof systems, Fitch-style natural deduction is invariably cited as the closest to the observable procedural nature of natural language reasoning (Fitch 1951, Prawitz 1965). If consideration of external factors such as psycholinguistic plausibility had been taken
as a legitimate criterion for determining selection between alternative candidate proof systems, this scepticism about the feasibility of selecting from amongst various proof-theoretic methodologies to construct a representationalist (proof-theoretic) model of NL inference might not have been so widespread. However, inclusion of performance-related considerations was, and largely still is, deemed to be illegitimate; and a broad range of subsequently established empirical results appeared to confirm the decision to retain the NL-as-formal-language methodology. A corollary has been that vocabularies for syntactic and semantic generalisations had to be disjoint, with only the syntactic component of the grammar involving representations, semantics notably involving no more than a bottom-up characterisation of denotational content defined model-theoretically over the structures determined by the syntax, hence with a non-representationalist account of NL semantics. Nevertheless, as we shall see, representationalist assumptions within semantics have been progressively re-emerging as these formal-semantic assumptions have led to ever increasing postulations of unwarranted ambiguity, suggesting that something is amiss in the assumption that the semantics of NL expressions is directly encapsulated in their assigned denotational content.
2.1. The Lambda calculus and NL semantics

There was one critical tool which Montague utilised to substantiate the claim that natural languages can be treated as having denotational semantics read off their syntactic structure: the lambda calculus (cf. also article 33 (Zimmermann) Model-theoretic semantics). The lambda calculus is a formal language with a function operator λ which binds variables in some open formula to yield an expression that denotes a function from the type of the variable onto the type of the formula. For example F(x), an open predicate-logic formula, can be used as the basis for constructing the expression λx[F(x)], where the lambda-term is identical in content to the one-place predicate expression F (square brackets for lambda-binding visually distinguish lambda binding from quantifier binding). Thus the term λx[F(x)] makes explicit the functional nature of the predicate term F, as does its logical type 〈e, t〉 (equivalently e → t): any such expression is a predicate that explicitly encodes its denotational type, mapping individual-denoting expressions onto propositional formulae. All that is needed to be able to define truth conditions over a syntactically motivated structure is to take the predicate-logic analogue for any NL quantification-containing sentence, and define whatever processes of abstraction are needed over the predicate-expressions in the agreed predicate-logic representation of content to yield a match with the requirements independently needed by the NL expressions making up that sentence. For example, on the assumption that ∀x(Student(x) → Smokes(x)) is an appropriate point of departure for formulating the semantic content of Every student smokes, two steps of abstraction can be applied to that predicate-logic formula, replacing the two predicate constants with appropriately typed variables and lambda operators to yield the term λPλQ[∀x(P(x) → Q(x))] as specifying the lexical content of every. This term can then combine first with the term Student (to form a noun-phrase meaning) and then with the term Smokes (as a verb-phrase meaning) to yield back the predicate logic formula ∀x(Student(x) → Smokes(x)). This derivation can be represented as a tree structure with parallel syntactic and semantic labelling:
(3)  ∀x(Student(x) → Smokes(x)) : S
     ├─ λQ[∀x(Student(x) → Q(x))] : NP
     │   ├─ λPλQ[∀x(P(x) → Q(x))] : DET
     │   └─ Student : N
     └─ Smokes : VP
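The derivation in (3) can be made concrete with a toy executable rendering. The following Haskell sketch is offered only as an illustration: the five-element domain and the extensions assigned to student and smokes are assumptions, not part of the original analysis. It implements every as the term λPλQ[∀x(P(x) → Q(x))] and composes it exactly as the tree dictates:

    type E = Int            -- type e: individuals, here labelled by integers
    type T = Bool           -- type t: truth values

    domain :: [E]
    domain = [1 .. 5]       -- an assumed toy domain

    student, smokes :: E -> T          -- one-place predicates, type e -> t
    student x = x <= 3                 -- stipulated extension: 1-3 are students
    smokes  x = x /= 4                 -- stipulated extension: all but 4 smoke

    -- every = λPλQ[∀x(P(x) → Q(x))], of type (e -> t) -> ((e -> t) -> t)
    every :: (E -> T) -> (E -> T) -> T
    every p q = all (\x -> not (p x) || q x) domain

    -- Functional application mirrors the tree: DET applies to N, then NP to VP.
    everyStudentSmokes :: T
    everyStudentSmokes = every student smokes   -- evaluates to True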
The consequence is that noun phrases are not analysed as individual-denoting expressions of type e (for individual), but as higher-type expressions ((e → t) → t). Since the higher-typed formula is deducible from the lower-typed formula, such a lifting was taken to be fully justified and applied to all noun-phrase contents, thereby in addition reinstating syntactic and semantic parallelism for these expressions. This gave rise to the generalised quantifier theory of natural language quantification (Barwise & Cooper 1981, article 43 (Keenan) Quantifiers). The surprising result is the required assumption that the content attributable to the VP is semantically the argument (despite whatever linguistic arguments there might be that the verb is the syntactic head in its containing phrase), while the subject expresses the functor that applies to it, mapping it into a propositional content, so semantic and syntactic considerations appear no longer to coincide. This methodology of defining suitable lambda terms that express the same content as some appropriate predicate-logic formula was extended across the broad array of NL structures (with the addition of possible world and temporal indices in what was called intensional semantics). The formal-semantic method thus demonstrably captures a concept of compositionality for natural language sentences while retaining predicate-logic insights into the content to be ascribed, a formal analysis which also provides a basis for characterisations of entailment, synonymy, etc. With syntactic and semantic characterisations of formal languages defined in strictly separate vocabulary, albeit in tandem, the Montague methodology for natural languages imposes separation of syntactic and semantic characterisations of natural language strings, the latter being defined exclusively in terms of combinatorial operations on denotational contents, with any intermediate form of representation being for convenience of exegesis only. Montague indeed explicitly demonstrated that the mapping onto intermediate (intensional) logical forms in articulating model-theoretic meanings was eliminable. Following predicate-logic semantics methodology, there was little concern with the concept of meaning for elementary expressions. However, relationships between word meanings were defined by imposing constraints on possible denotations as meaning postulates (following Carnap 1947). For example, the be of identity was defined so that extensions of its arguments were required to be co-extensive across all possible worlds; conversely, the verbs look for and find were defined to ensure that they did not have equivalent denotational contents.
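The uniform type-lifting of names can be given a correspondingly minimal sketch, reusing E, T and smokes from the sketch above; john and its denotation are again illustrative assumptions:

    john :: E
    john = 1

    -- Lifting a type-e name to the quantifier type (e -> t) -> t:
    -- the NP becomes the functor, the VP its argument.
    lift :: E -> ((E -> T) -> T)
    lift x = \p -> p x

    johnSmokes :: T
    johnSmokes = lift john smokes   -- the subject applies to the VP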
3. The challenge of context-dependence

Despite the dismissal by formal semanticists in the seventies of any form of representation within a semantic characterisation of interpretation, it was known right away that the model-theoretic stance as a program for natural language semantics within the grammar is not problem-free. One major problem is the inability to distinguish synonymous
expressions when they occur within propositional attitude reports (John believes groundhogs are woodchucks vs John believes groundhogs are groundhogs), and indeed any necessary truths, a problem which led to invoking structured meanings (Cresswell 1985, Lappin & Fox 2005). Another major problem is the context-dependency of NL construal, a truth-theoretic semantics for NL expressions having to be defined relative to some concept of context (cf. article 89 (Zimmermann) Context dependency). This is a foundational issue which has been a recurrent concern amongst philosophers over many centuries (cf. article 9 (Nerlich) Emergence of semantics for a larger perspective on the same problem). The pervasiveness of the context-dependence of NL interpretation was not taken to be of great significance by some, the only provision in Lewis (1970) being the stipulated addition to the model of an open-ended set of indices indicating objects in the utterance context (speaker, hearer, and some finite list of individuals). Nevertheless, the problem posed by context was recognised as a challenge; and an early attempt to meet it, sustaining a core Montagovian conception of denotational semantics, while nevertheless disagreeing profoundly over details, was proposed by Barwise & Perry (1983). Their proposed enriched semantic ontology, Situation Semantics, included situations and an array of partial semantic constructs (resource situations, infons, etc.) with sentence meanings requiring anchoring in such situations in order to constitute contents with context-determined values. Inference relations were then defined in terms of relations between situations, with speakers being attuned to such relations between situations (see the subsequent exchange between Fodor and Barwise on such direct interpretation of NL strings: Fodor 1988, Barwise 1989).
3.1. Anaphoric dependencies

Recognition of the extent of this problem emerged in attempts to provide principled explanations for how pronouns are understood. Early on, Partee (1973) had pointed out that pronouns can be interpreted anaphorically, indexically or as a bound variable. In (4), the pronoun is subject to apparent indexical construal (functioning like a name as referring to some intended object from the context); but in (5) it is interpreted like a predicate-logic variable with its value determined by some antecedent quantifying expression, hence not from the larger context:

(4) She is tired.

(5) Every woman student is panicking that she is inadequate.
Yet, as Kamp and others showed (Evans 1980, Kamp 1981, Kamp & Reyle 1993), this is just the tip of the iceberg, in that the phenomenon of anaphoric dependence is not restrictable to the domain provided by any one NL sentence in any obvious analogue of predicate-logic semantics. There are for example E-type uses of that same pronoun in which it picks up its interpretation from a quantified expression across a sentential boundary:

(6) A woman student left. She had been panicking about whether she was going to pass.
If natural languages matched predicate-logic patterns, not only would we apparently be forced to posit ambiguity between indexical and bound-variable uses of pronouns, but
one would be confronted with puzzles that do not fall into either classification: this pronoun appears to necessitate positing a term denoting some arbitrary witness of the preceding propositional formula, in (6) a name arbitrarily denoting some randomly picked individual having the properties of being a student, female, and having left. Such NL construal is directly redolent of the epsilon terms underpinning arbitrary names of natural deduction proofs, carrying a history of the compilation of content from the sentence providing the antecedent (von Heusinger 1997). But this just adds to the problem, for it seems there are three different interpretations for one pronoun, hence ambiguity. Moreover, as Partee had pointed out, tense specifications display all the hallmarks of anaphora construal, being interpretable either anaphorically, indexically or as a bound variable, indeed with E-type effects as well, so this threat of proliferating ambiguities is not specific to pronouns.
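The flavour of such an epsilon-term construal of (6) can be suggested with a small sketch. The finite domain, the stipulated extension of the antecedent predicate, and the use of the first witness found as the choice function are all illustrative assumptions:

    import Data.List (find)

    -- An ε-style operator: some witness of p in the domain, if there is one.
    -- Hilbert's epsilon permits any choice of witness; 'find' merely takes the first.
    epsilon :: [a] -> (a -> Bool) -> Maybe a
    epsilon dom p = find p dom

    -- An assumed extension compiled from "A woman student left".
    womanStudentWhoLeft :: Int -> Bool
    womanStudentWhoLeft x = x `elem` [2, 4]

    -- "She" in (6) construed as an epsilon term over that compiled content.
    she :: Maybe Int
    she = epsilon [1 .. 5] womanStudentWhoLeft   -- Just 2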
4. Dynamic Semantics

The responses to this challenge can all be labelled dynamic semantics (Muskens, van Benthem & Visser 1997, Dekker 2000).
4.1. Discourse Representation Theory

Discourse Representation Theory (DRT: Kamp 1981, Kamp & Reyle 1993) was the first formal articulation of a response to the challenge of modelling anaphoric dependence in a way that enables its various uses to be integrated. Sentences of natural language were said to be interpreted by a construction algorithm which takes the syntactic structure of a string as input and maps this by successive constructional steps onto a structured representation called a Discourse Representation Structure (DRS), which was defined to correspond to a partial model for the interpretation of the NL string. A DRS contains named entities (discourse referents) introduced from NL expressions, with predicates taking these as arguments, the sentence relative to which such a partial model is constructed being defined to be true as long as there is at least one embedding of the DRS into the overall model. For example, for a simple sentence-sequence such as (7), the construction algorithm for building discourse representation structure induces a DRS for the interpretation of the first sentence in which one discourse referent is entered into the DRS corresponding to the name and one for the quantifying expression, together with a set of predicates corresponding to the verb and nouns.

(7) John loves a woman. She is French.

(8)   x  y
      ---------------
      John = x
      loves(y)(x)
      woman(y)
The DRS in (8) might then be extended, continuing the construal process for the overall discourse by applying the construction algorithm to the second sentence to yield the expanded DRS:
(9)   x  y  z
      ---------------
      John = x
      loves(y)(x)
      woman(y)
      z = y
      French(y)
To participate in such a process, indefinite NPs are defined as introducing a new discourse referent into the DRS, definite NPs and pronouns require that the referent entered into the DRS be identical to some discourse referent already introduced, and names require a direct embedding into the model providing the interpretation. Once constructed, the DRS is evaluated by its embeddability into the model: any such resulting DRS is true in a model if and only if there is at least one embedding of it within the overall model. Even without investigating further complexities that license the embeddability of one DRS within another and the famous characterisation of If a man owns a donkey, he beats it (cf. article 37 (Kamp & Reyle) Discourse Representation Theory), an immediate bonus for this approach is apparent. The core cases of the so-called E-type pronouns fall into the same characterisation as more obvious cases of co-reference: all that is revised is the domain across which some associated quantifying expression can be seen to bind. It is notable in this account that there is no structural reflex of the syntactic properties of the individual quantifying determiner: indeed this formalism was among the first to come to grips with the name-like properties of such quantified formulae (cf. Fine 1984). It might of course seem that such a construction process is obliterating the difference between names, quantifying expressions, and anaphoric expressions, since all lead to the construction of discourse referents in a DRS. But, as we have seen, these expressions are distinguished by differences in the construction process. The burden of explanation for NL expressions is thus split: some aspect of their content is characterised by the mode of construction of the intervening DRS, some of it by the embeddability conditions of that structure into the overall model. The particular significance of DRT lies in the Janus-faced properties of the DRSs defined. On the one hand, a DRS corresponds to a partial model (or more weakly, is a set of constraints on a model), defined as true if and only if it is embeddable in the overall model (hence is the same type of construct). On the other hand, specific structural properties of the DRS may be invoked in defining antecedent-pronoun relations, hence such a level is an essential intermediary between the NL string and the denotations assigned to its expressions. Nonetheless, this level has a fully defined semantics, constituted by its embeddability into an overall model, so its properties are explicitly defined. There is a second sense in which DRT departs from previous theories. In providing a formal articulation of the incremental process of how interpretation is built up relative to some previously established context, there is implicit rejection of the methodology disallowing reference to performance in articulations of NL competence. Indeed the DRS construction algorithm is a formal reflection of sentence-by-sentence accumulation of content in a discourse (hence the term Discourse Representation Theory). So DRT not only offers a representationalist account of NL meaning, but one reflecting the incrementality of utterance processing.
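A schematic rendering of the DRSs in (8)-(9) may help fix ideas. The encoding below is a toy assumption for exposition only (it is not the DRT construction algorithm itself, and embeddability checking is elided): a DRS is just a set of discourse referents plus a set of conditions, and processing the second sentence of (7) extends the structure monotonically:

    type Referent = String

    data Condition
      = Pred String [Referent]     -- e.g. woman(y); arguments listed left to right
      | Equals Referent Referent   -- e.g. z = y
      deriving Show

    data DRS = DRS { referents :: [Referent], conditions :: [Condition] }
      deriving Show

    -- (8), induced from "John loves a woman" (read loves ["x","y"] as: x loves y).
    drs8 :: DRS
    drs8 = DRS ["x", "y"]
               [Equals "x" "John", Pred "loves" ["x", "y"], Pred "woman" ["y"]]

    -- Extending a DRS: the construal process only ever adds material.
    extend :: DRS -> [Referent] -> [Condition] -> DRS
    extend (DRS rs cs) rs' cs' = DRS (rs ++ rs') (cs ++ cs')

    -- (9): the pronoun adds z, required to be identical to an accessible referent.
    drs9 :: DRS
    drs9 = extend drs8 ["z"] [Equals "z" "y", Pred "French" ["y"]]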
4.2. Dynamic Predicate Logic

This account of anaphoric resolution sparked an immediate response from proponents of the model-theoretic tradition. For example, Groenendijk & Stokhof (1991) argued that the intervening construct of DRT was both unnecessary and illicit in making compositionality of NL expressions definable not directly over the NL string but only via this intermediate structure. Part of their riposte to Kamp involved positing Dynamic Predicate Logic (DPL) with two variables for each quantifier and a new attendant semantics, so that once one of these variables gets closed off in ways familiar from predicate-logic binding, the second remains open, bindable by a quantifying mechanism introduced as part of the semantic combinatorics associated with some preceding string, hence obtaining cross-sentential anaphoric binding without any ancillary level of representation as invoked in DRT (cf. article 38 (Dekker) Dynamic semantics). Both the logic and its attendant semantics were new. Nevertheless, such a view is directly consonant with the stringently model-theoretic view of context-dependent interpretation for natural language sentences provided by e.g. Stalnaker (1970, 1999): in these systems, progressive accumulation of interpretation across sequences of sentences in a discourse is seen exclusively in terms of intersections of sets of possible worlds progressively established, or rather, to reflect the additional complexity of formulae containing unbound variables, intersection of sets of pairs of worlds and assignments of values to variables (see Heim 1982 where this is set out in detail). In the setting out of DPL as a putative competitor to DRT in characterising the same data without any level of representation, there was no attempt to address the challenge which Kamp had brought to the fore in articulating DRT, that of characterising how anaphoric expressions contribute to the progressive accumulation of interpretation: on the DPL account, the pronoun in question was simply presumed to be coindexed with its antecedent. Notwithstanding this lack of take-up of the challenge which DRT was addressing, there has been continuing debate since then as to whether any intervening level of representation is justified over and above whatever syntactic levels are posited to explain syntactic properties of natural language expressions. Examples such as (10)–(11) have been central to the debate (Kamp 1996, Dekker 2000):

(10) Nine of the ten marbles are in the bag. It is under the sofa.

(11) One of the ten marbles isn't in the bag. It is under the sofa.

According to the DRT account, the reason why the pronoun it cannot successfully be used in (10) with the interpretation that it picks up on the one marble not in the bag is that such an entity is only inferable from information given by expressions in the previous sentence: no representation of any term denoting such an entity in (10) has been made available by the construction process projecting a discourse representation structure on the basis of which the truth conditions of the previous sentence are compiled. So though in all models validating the truth of (10) there must be a marble not in the bag described, there cannot be a successful act of reference to such an individual in using the pronoun. By way of contrast, in (11), despite its being true in exactly the same models as (10), it is because the term denoting the marble not in the bag is
specifically introduced that anaphoric resolution is successful. Hence, it is argued, the presence of an intermediate level of representation is essential.
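The dynamic idea that an existential's binding force survives sentence boundaries can itself be given a toy rendering. The sketch below assumes the standard relational view of dynamic meanings (propositions as relations on assignments); the domain, predicates and variable names are illustrative only:

    import Data.Maybe (fromMaybe)

    type Var = String
    type Entity = Int
    type Assignment = [(Var, Entity)]

    entities :: [Entity]
    entities = [1 .. 3]   -- an assumed toy domain

    -- A dynamic proposition maps an input assignment to its output assignments.
    type Dyn = Assignment -> [Assignment]

    -- Existential: random assignment to x plus a test; the binding of x
    -- survives in every output, so a later sentence can still access it.
    exists :: Var -> (Entity -> Bool) -> Dyn
    exists x p g = [ (x, d) : g | d <- entities, p d ]

    -- A test passes the input assignment through unchanged, or fails.
    test :: (Assignment -> Bool) -> Dyn
    test c g = [ g | c g ]

    -- Dynamic conjunction is relational composition: sentence sequencing.
    andD :: Dyn -> Dyn -> Dyn
    andD m n g = concatMap n (m g)

    -- Cross-sentential binding: the second conjunct tests the x the first set up.
    example :: [Assignment]
    example = (exists "x" (< 3) `andD` test xIsOne) []
      where xIsOne g = fromMaybe 0 (lookup "x" g) == 1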
4.3. The pervasiveness of context-dependence

Despite Kamp's early insight (Kamp 1981) that anaphora resolution was part of the construction process for building up interpretation, this formulation of anaphoric dependence was set aside in the face of the charge that DRT did not provide a compositional account of natural language semantics, and alternative generalised-quantifier accounts of natural language quantification were provided, reinstating compositionality of content over the NL string within the DRT framework even while positing an intermediate level of representation (van Eijck & Kamp 1997). Despite the great advances made by DRT, the issue of how to model context dependence continues to raise serious challenges to model-theoretic accounts of NL content. The different modes of interpretation available for pronouns extend far beyond a single syntactic category. Definite NPs, demonstrative NPs, and tense all present a three-way ambiguity between indexical, bound-variable and E-type forms of construal; and this ambiguity, if not reducible to some general principle, requires that the language be presumed to contain more than one such expression, with many discrete expression-denotation pairings. Such ambiguity is a direct consequence of the assumption that an articulation of the meaning of an expression has to be in terms of its systematic contribution to the truth conditions of the sentences in which it occurs. Such distinct uses of pronouns do indeed need to be expressed as contributing different truth conditions, whether as variable, as name, or as some analogue of an epsilon term; but the very fact that this ambiguity occurs in all context-dependent expressions in all languages indicates that something systematic is going on: this pattern is wholly unlike the accidental homonymy typical of lexical ambiguity. Furthermore, it is the absence of context-dependence in the interpretation of classical logic which lies at the heart of the difference between the two types of system. In predicate logic, by definition, there is no articulation of context or of how interpretation is built up relative to it. The phenomenon under study is that of inference, and the formal language is defined to match such patterns (and not the way in which the formulae in question might themselves have been established). Natural languages are however not purpose-built systems; and context-dependence is essential to their success as an economical vehicle for expressing arbitrarily rich pieces of information relative to arbitrarily varying contexts. This perspective is buttressed by work in the neighbouring disciplines of philosophy of language and pragmatics. The gap between the intrinsic content of words and their interpretation in use had been emphasised by the later Wittgenstein (1953), Austin (papers collected in 1961), Grice (papers collected in 1989), Sperber & Wilson (1986/1995), and Carston (2002). The fact that context is essential to NL construal imposes an additional condition of adequacy on accounts of NL content: a formal characterisation of the meaning of NL expressions needs to define both the input which an individual NL expression provides to the interpretation process and the nature of the contexts with which such input interacts. Answers to the problem of context formulation cannot, however, be expected to come from the semantics of logics. Rather, we need some basis for formulating specifications that under-determine any assigned content. Given the
needed emphasis on underspecification, on what it means to be part-way through a process whereby some content is specifiable only as output, it is natural to think in terms of representations, or, at least, in terms of constraints on assignment of content. This is now becoming commonplace amongst linguists (Underspecified Discourse Representation Theory (UDRT): Reyle 1993, van Leusen & Muskens 2003). Indeed Hamm, Kamp & van Lambalgen (2006) have argued explicitly that such linguistically motivated semantic representations have to be construed within a broadly computational, hence representationalist, theory of mind.
5. The shift towards proof-theoretic perspectives

It might seem as though, with representationalism in semantics such a threat to the fruitfulness of the Montague paradigm, any such claim would be deluged with counterarguments from the formal-semantics community (Dekker 2000, Muskens 1996). But, to the contrary, use of representationalist tools is continuing apace in both orthodox and less orthodox frameworks; and the shift towards more representational modes of explanation is not restricted to anaphora resolution.
5.1. The Curry-Howard isomorphism

An important proof-theoretic result pertaining to the compositionality of NL content came from the proof that the lambda calculus and type deduction in intuitionistic logic are isomorphic (intuitionistic logic is weaker than classical logic in that several classical tautologies do not hold). This is the so-called Curry-Howard isomorphism. Its relevance to linguists, which I convey ostensively by illustration, is that the fine structure of how compositionality of content for NL expressions is built up can be represented proof-theoretically, making use of this isomorphism (Morrill 1994, Carpenter 1997). The isomorphism is displayed in proofs of type deduction in which propositions are types, with a label demonstrating how that type, as conclusion, was derived. The language of the labels is none other than the lambda calculus, with functional application in the labels corresponding to type deduction on the formula side. So in the label we might have a lambda term, e.g. λx[Sneeze(x)], and in the formula its corresponding type e → t. The compositionality of content expressible through functional application defined over lambda terms can thus be represented as a step of natural deduction over labelled propositional formulae, with functional application on the labels and modus ponens on the typed formulae. For example, a two-place predicate representable as λxλy[See(x)(y)] can be stated as a label to a typed formula:

(12) λxλy[See(x)(y)] : e → (e → t)

This, when combined with a formula

(13) Mary : e

yields as output:

(14) λy[See(Mary)(y)] : e → t
And this in its turn, when paired with

(15) John : e

yields as output, by one further step of simultaneous functional application and Conditional Elimination:

(16) See(Mary)(John) : t

So compilation of content for the string John sees Mary can be expressed as a labelled deduction proof (12)–(16), reflecting the bottom-up compilation of content for the NL expression.
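A typed programming language makes the isomorphism palpable: the type checker performs the deduction (Conditional Elimination on e → (e → t)), while the terms are the lambda-calculus labels. A minimal sketch follows, with an assumed two-element domain and a stipulated extension for see:

    data Ent = John | Mary deriving (Eq, Show)   -- type e

    -- (12): the label 'see' has type e -> (e -> t); object argument first,
    -- so that 'see Mary John' corresponds to See(Mary)(John).
    see :: Ent -> Ent -> Bool
    see obj subj = (obj, subj) == (Mary, John)   -- stipulated: John sees Mary

    -- (14): one application step, i.e. one use of modus ponens on the types.
    step14 :: Ent -> Bool
    step14 = see Mary

    -- (16): a second step yields a proposition of type t.
    step16 :: Bool
    step16 = step14 John   -- See(Mary)(John) : t, here True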
5.2. Proof theory as syntax: Type Logical Grammar

This method of using the fine structure of natural deduction as a means of representing NL compositionality has had very wide applicability, in categorial grammar and elsewhere (Dalrymple, Shieber & Pereira 1991, Dalrymple (ed.) 1999). In categorial grammar in particular (following Lambek 1958, who defined a simple extension of the lambda calculus incorporating two order-sensitive operators indicating functional application with respect to some left- or right-placed argument), the Curry-Howard isomorphism is central, and with the later postulation of modal operators to define syntactic domains (Morrill 1990, 1994), such systems have been shown to have expressive power sufficient to match a broad array of variation expressible in natural languages (leaving to one side the issue of context dependence). Hence the categorial-grammar claim that NL grammars are logics with attendant proof systems. These are characteristically presented either in natural-deduction format or in its meta-level analogue, the sequent calculus. The advantage of such systems is the fine level of granularity they provide for reflecting bottom-up compositionality of content, given suitable (higher) type assignments to NL expressions. In being logics, natural languages are presumed to have syntax and semantics defined in tandem, with the proof display being no more than an elegant display of how prosodic sequences (words) are mapped by steps of labelled type deduction onto denotational contents. In particular, no representationalist assumptions are made vis-à-vis either syntax or logical representations (see Morrill 1994 for a particularly clear statement of this strictly denotationalist commitment). Nonetheless, in more recent work, Morrill departs from at least one aspect of this stringent categorial-grammar proof-theoretic ideology. In all categorial grammar formalisms, as indeed in many other frameworks, there is strict separation of competence and performance considerations, with grammar formalisms evaluated only in terms of their empirical predictive success; yet Morrill & Gavarró (2004) and Morrill (2010) argue that an advantage of the particular categorial-grammar characterisation adopted (with linear-logic proof derivations) is the step-wise correspondence of individual steps in the linear-logic derivation to measures of complexity in processing the NL string, this being a bonus for the account. Thus representational properties of grammar-defined derivations would seem at least evidenced by performance data, even if such proof-theoretically defined derivations are taken to be eliminable as a core property of the grammar formalism itself.
5.3. Type Theory with Records

A distinct proof-theoretic equivalent of Montague's PTQ grammar was defined by Ranta (1994), who adopted as a basis for his framework the methodology of Martin-Löf's (1984) proof theory for intuitionistic type theory. By way of introducing the Martin-Löf ontology, remember how in natural deduction proofs there are metalevel annotations (section 1.4), but these are only partially explicit: many do not record dependencies between arbitrary names as they are set up. The Martin-Löf methodology, to the contrary, requires that all such dependencies be recorded and duly labelled; and Ranta used this rich attendant labelling system to formulate analyses of anaphora resolution and quantification, and from these he established a structural concept of context, notably going beyond what is achievable in categorial grammar formalisms. Furthermore, such explicit representations of dependency have been used to establish a fully explicit proof-theoretic grounding for generalized quantifiers, from which an account of anaphora resolution follows incorporating E-type pronouns as a subtype (Ranta 1994, Piwek 1998, Fernando 2002). In Cooper (2006), the Ranta framework is taken a step further, yielding the Type Theory with Records framework (TTR). Cooper uses a concept of record and record-type to set out a general framework for modelling both context-dependent interpretation and the intrinsic underspecification that NL expressions themselves contribute to the interpretation process. Echoing the DRT formulation, the interpretation to be assigned the sentence A man owns a donkey is set out as taking the form of the record-type in (17) (Cooper 2005); the variables in these formulations, as labels to proof terms, are like arbitrary names, expressing dependencies between one term and another:

(17)  [ x  : Ind
        c1 : man(x)
        y  : Ind
        c2 : donkey(y)
        c3 : own(y)(x) ]

x, y are variables of individual type; c1 is of the type of proofs that x is a man (hence a proof dependent on some proof of x); c2 is of the type of proofs that y is a donkey, and so on. A record of that record-type would be some instantiation of the variables, e.g.:
(18)  [ x  = a
        c1 = p1
        y  = b
        c2 = p2
        c3 = p3 ]

with p1 a proof of 'man(a)', p2 a proof of 'donkey(b)', and so on. This is a proof-theoretic reformulation of the situation-theory concepts of infon (= situation-type) and situation. A record represents some situation that provides values making some record-type true. The concept of record-type corresponds to sentence meanings in abstraction from any given context/record: a sentence meaning is a mapping from records to record-types. It is no coincidence that such dependency labelling has properties like those of epsilon terms, since, like them, the terms that constitute the labels to the derived types express the history of the mode of combination. The difference between this and DRT lies primarily in the grounding of records and record-types in proof-theoretic rather than model-theoretic underpinnings (Cooper 2006). Like DRT, this articulation of record theory with types as a basis for NL semantics leaves open the link with syntax. However, in recent work, Ginzburg and Cooper have extended the concept of records and record-types yet further to incorporate full details of linguistic signs (with phonological, syntactic and semantic information, a multi-level representational system: Ginzburg & Cooper 2004; cf. article 36 (Ginzburg) Situation Semantics).
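A rough, flat approximation of (17)-(18) can be sketched as follows; this toy encoding is an assumption for exposition (genuine TTR types are dependent, and proof objects here are mere opaque tokens):

    -- A field either requires an individual or a proof of pred(args).
    data FieldType = IndT | ProofT String [String] deriving Show

    type RecordType = [(String, FieldType)]   -- record-types, as in (17)
    type Record     = [(String, String)]      -- records pair labels with witnesses, as in (18)

    rt17 :: RecordType
    rt17 = [ ("x",  IndT)
           , ("c1", ProofT "man"    ["x"])
           , ("y",  IndT)
           , ("c2", ProofT "donkey" ["y"])
           , ("c3", ProofT "own"    ["y", "x"]) ]

    r18 :: Record
    r18 = [("x", "a"), ("c1", "p1"), ("y", "b"), ("c2", "p2"), ("c3", "p3")]

    -- A record is of a record-type if it supplies a witness for every label;
    -- checking that each witness really proves its condition is elided here.
    ofType :: Record -> RecordType -> Bool
    ofType r rt = all (\(label, _) -> label `elem` map fst r) rt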
6. Ellipsis: A window on context

So far, the main focus has been anaphora construal, but ellipsis presents another window on context. Elliptical fragments are utterances in which only a fragment of a clause is overt: words and whole phrases can simply be omitted when the context fully determines what is needed to complete the interpretation. The intrinsic interest of such elliptical fragments is that they provide evidence that very considerable richness of labelling is required in modelling NL construal. Like pronominal anaphora, ellipsis displays a huge diversity. Superficially, the phenomenon might seem to be amenable to some more sophisticated variant of anaphora construal. There are, that is, what look like indexical construals, and also analogues of coreference and bound-variable anaphora:

(19) (Mother to a toddler stretching up to the stove above their head) Johnny, don't.

(20) John stopped in time but I didn't.

(21) Everyone who submitted their thesis without checking it wished that the others had too.

To capture the array of effects illustrated in (20)–(21), there is a model-theoretic account of ellipsis (Dalrymple, Shieber & Pereira 1991) whose starting point is to take
the model-theoretic methodology, and from the propositional content provided by the antecedent clause, to define some lambda term to isolate an appropriate predicate for combining with the NP term provided by the fragment. For (20), one might define the term λx[stop-in-time(x)]; for (21) the term λx[∃y(thesis(y) ∧ submit(y)(x) ∧ ¬check(y)(x))]. However, even allowing for complexities of possible sequences of quantificational dependencies replaced at the ellipsis site as in (21), the rebinding mechanism has to be yet more complex, as what is rebound may be not merely some subject value and whatever quantified expressions are dependent on that, but also cases where the subject expression is itself dependent on some expression within its restrictor, so that an even higher-order form of abstraction is required:

(22) The man who arrested Joe failed to read him his rights, as did the man who arrested Sue.

This might seem to be an echo of the debate between DRT and DPL, with the higher-order account having the formal tools to express the parallelism of construal carried over from the antecedent clause to the ellipsis site as in (20)–(21), without requiring any invocation of a representation of content. However, there are many cases of ellipsis, as syntacticians have demonstrated, where only an account which invokes details of the structure assigned to the string can explain the available interpretations (Fiengo & May 1994, Merchant 2004). For example, there are morphological idiosyncrasies displayed by individual languages that surface as restrictions on licensed elliptical forms. In a case-rich language such as German, for example, the fragment has to occur with the case form it would be expected to have in a fully explicit follow-on to the antecedent clause (Greek displays exactly the same type of requirement). English, where case is very atrophied, imposes no such requirement (Morgan 1973, Ginzburg & Cooper 2004):

(23) Hans will nach London gehen. Ich/*mich auch.

(24) Hans wants to go to London. Me too.

This is not the kind of information which a higher-order unification account can express, as its operations are defined on the model-theoretic construal of the antecedent conjunct, not over morphological sequences. There are many more such structure-particular variations indicating that ellipsis would seem to have to be defined over representations of structure, hence to be analysed as a syntactic phenomenon, in each case defined over some suitable approximate correspondent to the surface strings (the caveat is needed because there are well-known cases of vehicle-change where what is reconstructed is not the linguistic form, Fiengo & May 1994):

(25) John has checked his thesis notes carefully, but I haven't.

However, there are puzzles for both types of account. Fragments may occur in dialogue for which arguably only a pragmatic enrichment process can capture the effects (based neither on structured strings nor on their model-theoretic contents, Stainton 2006):

(26) A (leaving hotel): The station? B (receptionist): Left out of the door. Then second on the right.
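Returning for a moment to the higher-order strategy applied to (20): its core move, abstracting a predicate from the antecedent content and reapplying it at the ellipsis site, can be suggested with a toy sketch. Here the abstracted predicate is simply handed over, whereas on the actual account it is solved for by higher-order unification; all names and extensions below are illustrative assumptions:

    type Subject = String

    -- Stipulated content recovered from the antecedent "John stopped in time":
    -- the abstract λx[stop-in-time(x)].
    stopInTime :: Subject -> Bool
    stopInTime x = x == "John"

    -- The ellipsis site "but I didn't": negated reapplication to a new subject.
    ellipsisSite :: (Subject -> Bool) -> Subject -> Bool
    ellipsisSite p subj = not (p subj)

    resolved :: Bool
    resolved = ellipsisSite stopInTime "speaker"   -- True: the speaker didn't stop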
For reasons such as the dialogue fragments in (26), ellipsis continues to receive a great deal of attention, with no single account apparently able single-handedly to match the fine-grained nature of the cross-conjunct/speaker patterning that ellipsis construal can achieve. The conclusion generally drawn is that the folk intuition about ellipsis, namely that it is a unitary phenomenon, is simply not expressible. Indeed the added richness in TTR of combining Ranta's type-logical formalism with the feature-matrix vocabulary of Head-Driven Phrase Structure Grammar (HPSG: Sag, Wasow & Bender 2002) to obtain a system combining independent semantic, syntactic and morphological paradigms was at least in part driven by the observed 'fractal heterogeneity' of ellipsis (Ginzburg & Cooper 2004). However, there is one last chance to capture such effects in an integrated way, and this is to posit a system of labelling recording individual steps of information build-up with whatever level of granularity is needed to preserve the idiosyncrasies in the individual steps. Ellipsis construal can then be defined by making reference to such records. This type of account is provided in Dynamic Syntax (Kempson, Meyer-Viol & Gabbay 2001, Cann, Kempson & Marten 2005, Purver, Cann & Kempson 2006, Cann, Kempson & Purver 2007).
6.1. Dynamic Syntax: A reflection of parsing dynamics as a basis for syntax

Dynamic Syntax (DS) models how interpretation is incrementally built up following the dynamics of parsing, with progressively richer representations of content constructed as words are processed relative to context. This articulation of the stepwise way in which interpretation is built up is claimed to be all that is needed for explaining NL syntax: the representations constructed are simultaneously the vehicle for explaining syntactic distributions and a vehicle for representing how interpretation is built up. The methodology adopts a representationalist stance vis-à-vis content (Fodor 1983). Predicate-argument structures are represented in a tree format with the assumption of progressive update of partial tree-representations of content. Context, too, is represented in the same terms, evolving in tandem with each update. This concept of structural growth totally replaces the semantically blind syntactic specifications characteristic of such formalisms as Minimalism (Chomsky 1995) and HPSG. The process starts from an initial one-node tree stating the goal of the interpretation process to establish some propositional formula (the tree representation to the left of the ↦ in (27)). Then, using both parse input and information from context, some propositional formula is progressively built up (the tree representation to the right of the ↦ in (27)).

(27) Parsing John upset Mary

     ?t, ◊   ↦   Upset′(Mary′)(John′) : t, ◊
                 ├─ John′ : e
                 └─ Upset′(Mary′) : e → t
                     ├─ Mary′ : e
                     └─ Upset′ : e → (e → t)
The output is a fully decorated tree whose topnode is a representation of some proposition expressed, with its associated type specification, and each dominated node has a concept formula, e.g. John′ representing some individual John, and an indication of what semantic type that concept is. The primitive types are e and t, as in formal semantics, but construed syntactically as in the Curry-Howard isomorphism, labelled type deduction determining the decorations on non-terminal nodes once all terminal nodes are fixed and suitably decorated with formula values. There is invariably one node under development in any partial tree, as indicated by the pointer ◊. So a parse process for (27) would constitute a transition across partial trees, the substance of this transition turning on how the growth relation ↦ is to be determined by the word sequence. The concept of requirement ?X, for any decoration X, is central. Decorations on nodes such as ?t, ?e, ?(e → t), etc. express requirements to construct formulae of the appropriate type on the nodes so decorated, and these requirements drive the subsequent tree-construction process. A string can then be said to be well-formed if and only if there is at least one derivation involving monotonic growth of partial trees, licensed by computational (general), lexical and pragmatic actions following the sequence of words, that yields a complete tree with no requirements outstanding. Just as the concept of tree growth is central, so too is the concept of procedure for mapping one partial tree to another. Individual transitions from partial tree to partial tree are all defined as procedures for tree growth. The formal system underpinning the partial trees that are constructed is a logic of finite trees (LOFT: Blackburn & Meyer-Viol 1994). There are two basic modalities, 〈↓〉 and 〈↑〉, and Kleene * operators defined over these relations, e.g. 〈↑∗〉Tn(a) indicating that somewhere dominating this node is the tree-node Tn(a) (a standard tree-theoretic characterisation of 'dominate'). The procedures in terms of which the tree growth processes are defined then involve such actions as make(〈↓〉), go(〈↓〉), make(〈↓∗〉), put(X) (for any decoration X), etc. This applies both to general constraints on tree growth (hence the syntactic rules of the system) and to specific tree-update actions constituting the lexical content of words. So the contribution which a word makes to utterance interpretation is expressed in the same vocabulary as general structural growth processes, and is invariably more than just a concept specification: it is a sequence of actions developing a sub-part of a tree, possibly building new nodes, and assigning them decorations such as formula and type specifications. Of the various concepts of underspecification, two are central. On the one hand, there is underspecification of conceptual content, with anaphoric expressions being defined as adding to a node in a tree a place-holding metavariable of a given type as a provisional formula value, to be replaced by some fixed value which the immediate context makes available. This is a relatively uncontroversial approach to anaphora construal, equivalent to formulations in many other frameworks (notably DRT and TTR).
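A bare-bones sketch may convey the flavour of such requirement-driven tree growth. The encoding below is an assumption for illustration only: real DS actions are the make/go/put procedures over LOFT descriptions, whereas here a single composite action expands a type requirement into argument and functor daughters:

    data Ty = E | T | Fn Ty Ty deriving (Eq, Show)   -- types e, t, X -> Y

    data Node = Req Ty             -- an outstanding requirement, e.g. ?t
              | Formula String Ty  -- a decorated node, e.g. John' : e
      deriving Show

    data Tree = Leaf Node | Branch Node Tree Tree deriving Show

    -- Expand a requirement ?Y into daughters ?X and ?(X -> Y):
    -- a composite standing in for sequences of make/go/put actions.
    expand :: Ty -> Ty -> Tree -> Tree
    expand x y (Leaf (Req y')) | y == y' =
      Branch (Req y) (Leaf (Req x)) (Leaf (Req (Fn x y)))
    expand _ _ t = t   -- action inapplicable: leave the tree unchanged

    -- The axiom of the parse in (27): a one-node tree requiring type t,
    -- grown towards subject (?e) and predicate (?(e -> t)) requirements.
    start, afterExpansion :: Tree
    start = Leaf (Req T)
    afterExpansion = expand E T start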
On the other hand, there is underspecification and update of structural relations, in particular replacing all movement or feature-passing accounts of discontinuity effects with the introduction of an unfixed node, one whose structural relation is not specified at the point of construction, and whose value must be provided from the construction process. Formally, the construction of a new node within a partial tree is licensed from some node requiring a propositional type, with that relation being characterised only as that of domination (weakly specified tree relations are indicated by a dashed line, with Tn(0) as the rootnode): this is step (i) of (28). The update to this relatively weak tree-relation for English, lacking as it does any case specifications,
becomes possible only once the verb has been parsed. This is the unification step (ii) of (28), an action which satisfies both type and structure update requirements:

(28) Parsing Mary, John upset

     step (i):    ?t, Tn(0)
                  ┆
                  Mary′ : e, 〈↑∗〉Tn(0), ◊

     step (ii):   ?t, Tn(0)
                  ├─ John′ : e
                  └─ ?(e → t)
                      ├─ ?e, ◊   (unified with Mary′ : e, 〈↑∗〉Tn(0))
                      └─ Upset′ : e → (e → t)
This process, like the substitution operation associated with pronoun construal, feeds into the ongoing process of creating a completed tree, in this case by steps of labelled type deduction. It might seem that such (partial) trees as representations of content could not in principle simultaneously serve as both a syntactic explanation and a basis for semantic interpretation, because of the problems posed by quantification, known to necessitate a globally defined process expressing scope dependencies between quantifying expressions, and generally agreed to involve mismatch between logical and syntactic category (see section 2.1). However, quantified expressions are taken to map onto epsilon terms, hence of type e (Kempson, Meyer-Viol & Gabbay 2001, ch. 7); and since they have a restrictor which reflects the content of the proposition in which they are contained, the defining property of epsilon terms is that they grow incrementally, as additional predicates are added to the term under construction (see section 1.4). And these epsilon terms, metavariables and tree relations, all under development, interact to yield the complex patterns more familiarly seen as scope and anaphoric sensitivities to long-distance and other discontinuity effects. The bonus of this framework for modelling ellipsis is that by taking as first-class citizens not merely structured representations of content but also the procedures used to incrementally build up such structure, ellipsis construal is expressible in an integrated way: it is indeed definable as determined directly by context. What context comprises is the richer notion that includes some sequence of words, their assigned content, and its attendant propositional tree structure, plus the sequence of actions whereby that structure and its content were established. Any one of these may be used to build up interpretation of the fragment, and together they determine the range of interpretations available: recovery of some predicate content from context (19)–(20), recovery of some actions in order to create a parallel but distinct construal (21)–(22). This approach to ellipsis as building up structure by reiterating content or actions can thus replace the higher-order unification account with a more general, intuitive, and yet essentially representationalist account. A unique property of the DS grammar mechanisms is that production is expressed in the same terms as parsing, as progressive build-up of semantic structure, differing from parsing only in having also a richer representation of what the speaker is trying to express as a filter on whether appropriate semantic structure is being induced by the selected words. This provides a basis from which to expect fluent exchange of
speaker/hearer roles in dialogue, in which a speaker sets out a partial structure which their interlocutor, parsing what they say, takes over:

(29) Q: Who did John upset? A: Himself.

(30) A: I saw John. B: With his mother? A: Yes, with Sue. In the park.

Since both lexical and syntactic actions are defined as tree-growth processes, construal of elliptical fragments both within and across speakers is expected to allow replication of any such actions following their use to build interpretation for some antecedent, predicting the mixture of semantic and syntactic factors in ellipsis. Scope dependencies are unproblematic. These are not expressed on the trees, but formulated as an incrementally collected set of constraints on the evaluation of the emergent epsilon terms. These are applied once the propositional formula is constructed, determining the resulting epsilon term. So despite the apparently global nature of scope, parallel scope dependencies can be expressed as the re-use of a sequence of actions that had earlier been used in the construal of the antecedent string, replicating scope actions but relative to the terms constructed in interpreting the fragment. Even case, defined as an output filter on the resultant tree, is unproblematic, unlike for semantic characterisations of ellipsis, as in (23). So, it is claimed, the interaction of morphology, syntax, and even pragmatics in ellipsis construal is predictable given DS assumptions. To flesh this out would involve a full account of how DS mechanisms interact to yield appropriate results. All this sketch provides is a point of departure for addressing ellipsis in an integrated manner (see also Asher & Lascarides 2002 for a variant of DRT). More significantly, the DS account of context and the emergent concept of content intrinsic to NL words themselves are essentially representationalist. The ellipsis account succeeds in virtue of the fact that lexical specifications are procedures that interact with context to induce the building of representations of denotational content. Natural languages are accordingly denotationally interpretable only via a mapping onto an intermediate logical system. And, since compositionality of denotational content is defined over the resulting trees, it is the incrementality of projection of word meaning and the attendant monotonicity of the tree growth process which constitute compositionality as definable over the words making up sentences.
7. Summary

Inevitably, issues raised by anaphora and ellipsis remain open. But, even without resolving these, research goals have strikingly shifted since early work in formal semantics and the representationalism debate it engendered. The dispute remains; but the answers to the questions are very different. On the view espoused within TTR and DS formulations, a natural language is not definable as a logic in the mould of predicate logic. Rather, it is a set of mechanisms out of which truth-denoting objects can be built relative to what is available in context. To use Cooper's apt turn of phrase, 'natural languages are tools for formal language construction' (Cooper & Ranta 2008); so semantic representations are central to the form of explanation.
Whatever answers individual researchers might reach to individual questions, one thing is certain: new puzzles are taking centre-stage. The data of semantic investigation are no longer restricted to judgements of entailment relations between sentences. The remit includes modelling the human capacity to interpret fragments in context in conjunction with other participants in dialogue, and defining appropriate concepts of information update. Each of these challenges involves an assumption that the human capacity for natural language is a capacity for language processing in context. With this narrowing of the gap between competence and performance considerations, assumptions about the nature of such semantic representations of content can be re-evaluated. We can again pose the question of the status of representations in semantic modelling; but we can now pose this as a question of the relation of such representations to those required for modelling cognitive inference more broadly. Furthermore, such questions can be posed from a number of frameworks as starting point: categorial grammar, Type Theory with Records, DRT and its variants, Dynamic Syntax, to name but a few. And it is these new avenues of research which are creating new ways of understanding the nature of natural language and linguistic competence.

Eleni Gregoromichelaki, Ronnie Cann and two readers provided most helpful comments leading to significant improvement of this paper. However, normal disclaimers apply.
8. References

Asher, Nicholas & Alex Lascarides 2002. Logics of Conversation. Cambridge, MA: The MIT Press.
Austin, John 1961. Philosophical Papers. Oxford: Clarendon Press.
Barwise, Jon 1989. The Situation in Logic. Stanford, CA: CSLI Publications.
Barwise, Jon & Robin Cooper 1981. Generalized quantifiers and natural language. Linguistics & Philosophy 4, 159–219.
Barwise, Jon & John Perry 1983. Situations and Attitudes. Cambridge, MA: The MIT Press.
Blackburn, Patrick & Wilfried Meyer-Viol 1994. Linguistic logic and finite trees. Bulletin of the Interest Group of Pure and Applied Logics 2, 2–39.
Cann, Ronnie, Ruth Kempson & Lutz Marten 2005. The Dynamics of Language. Amsterdam: Elsevier.
Cann, Ronnie, Ruth Kempson & Matthew Purver 2007. Context, wellformedness and the dynamics of dialogue. Research on Language and Computation 5, 333–358.
Carnap, Rudolf 1947. Meaning and Necessity: A Study in Semantics and Modal Logic. Chicago, IL: The University of Chicago Press.
Carpenter, Robert 1997. Type-Logical Semantics. Cambridge, MA: The MIT Press.
Carston, Robyn 2002. Thoughts and Utterances: The Pragmatics of Explicit Communication. Oxford: Blackwell.
Chomsky, Noam 1957. Syntactic Structures. The Hague: Mouton.
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press.
Chomsky, Noam 1995. The Minimalist Program. Cambridge, MA: The MIT Press.
Cooper, Robin 2005. Austinian truth, attitudes and type theory. Research on Language and Computation 3, 333–362.
Cooper, Robin 2006. Records and record types in semantic theory. Journal of Logic and Computation 15, 99–112.
Cooper, Robin & Aarne Ranta 2008. Natural languages as collections of resources. In: R. Cooper & R. Kempson (eds.). Language in Flux: Dialogue Coordination, Language Variation, Change, and Evolution. London: College Publications, 109–120.
Cresswell, Max 1985. Structured Meanings. Cambridge, MA: The MIT Press.
Dalrymple, Mary, Stuart Shieber & Fernando Pereira 1991. Ellipsis and higher-order unification. Linguistics & Philosophy 14, 399–452.
Dalrymple, Mary (ed.) 1999. Semantics and Syntax in Lexical Functional Grammar: The Resource Logic Approach. Cambridge, MA: The MIT Press.
Dekker, Paul 2000. Coreference and representationalism. In: K. von Heusinger & U. Egli (eds.). Reference and Anaphoric Relations. Dordrecht: Kluwer, 287–310.
Evans, Gareth 1980. Pronouns. Linguistic Inquiry 11, 337–362.
Fernando, Timothy 2002. Three processes in natural language interpretation. In: W. Sieg, R. Sommer & C. Talcott (eds.). Reflections on the Foundations of Mathematics: Essays in Honor of Solomon Feferman. Natick, MA: Association for Symbolic Logic, 208–227.
Fiengo, Robert & Robert May 1994. Indices and Identity. Cambridge, MA: The MIT Press.
Fine, Kit 1984. Reasoning with Arbitrary Objects. Oxford: Blackwell.
Fitch, Frederic 1951. Symbolic Logic. New York: The Ronald Press Company.
Fodor, Jerry 1981. Re-Presentations. Cambridge, MA: The MIT Press.
Fodor, Jerry 1983. The Modularity of Mind. Cambridge, MA: The MIT Press.
Fodor, Jerry 1988. A situated grandmother. Mind & Language 2, 64–81.
Fodor, Jerry 1998. Concepts. Oxford: Oxford University Press.
Gamut, L. T. F. 1991. Logic, Language and Meaning. Chicago, IL: The University of Chicago Press.
Geach, Peter 1972. Logic Matters. Oxford: Oxford University Press.
Ginzburg, Jonathan & Robin Cooper 2004. Clarification, ellipsis, and the nature of contextual updates. Linguistics & Philosophy 27, 297–365.
Grice, Paul 1989. Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Groenendijk, Jeroen & Martin Stokhof 1991. Dynamic predicate logic. Linguistics & Philosophy 14, 39–100.
Hamm, Fritz, Hans Kamp & Michiel van Lambalgen 2006. There is no opposition between formal and cognitive semantics. Theoretical Linguistics 32, 1–40.
Heim, Irene 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation. University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms.
Hilbert, David & Paul Bernays 1939. Grundlagen der Mathematik, vol. II. 2nd edn. Berlin: Springer.
Hintikka, Jaakko 1974. Quantifiers vs. quantification theory. Linguistic Inquiry 5, 153–177.
Jackendoff, Ray 2002. Foundations of Language: Brain, Meaning, Grammar, Evolution. Oxford: Oxford University Press.
Kamp, Hans 1981. A theory of truth and semantic representation. In: J. Groenendijk, T. Janssen & M. Stokhof (eds.). Formal Methods in the Study of Language. Amsterdam: Mathematical Centre, 277–322.
Kamp, Hans 1996. Discourse Representation Theory and dynamic semantics: Representational and nonrepresentational accounts of anaphora. French translation in: F. Corblin & C. Gardent (eds.). Interpréter en contexte. Paris: Hermes, 2005.
Kamp, Hans & Uwe Reyle 1993. From Discourse to Logic. Dordrecht: Kluwer.
Katz, Jerrold 1972. Semantic Theory. New York: Harper & Row.
Katz, Jerrold & Jerry Fodor 1963. The structure of a semantic theory. Language 39, 170–210.
Kempson, Ruth, Wilfried Meyer-Viol & Dov Gabbay 2001. Dynamic Syntax: The Flow of Language Understanding. Oxford: Blackwell.
Lambek, Joachim 1958. The mathematics of sentence structure. American Mathematical Monthly 65, 154–170.
Lappin, Shalom & Christopher Fox 2005. Foundations of Intensional Semantics. Oxford: Blackwell.
van Leusen, Noor & Reinhard Muskens 2003. Construction by description in discourse representation. In: J. Peregrin (ed.). Meaning: The Dynamic Turn. Amsterdam: Elsevier, 33–65.
Lewis, David 1970. General semantics. Synthese 22, 18–67. Reprinted in: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 1972, 169–218.
Martin-Löf, Per 1984. Intuitionistic Type Theory. Naples: Bibliopolis.
Merchant, Jason 2004. Fragments and ellipsis. Linguistics & Philosophy 27, 661–738.
Montague, Richard 1970. English as a formal language. In: Bruno Visentini et al. (eds.). Linguaggi nella Società e nella Tecnica. Milan: Edizioni di Comunità, 189–224. Reprinted in: R. Thomason (ed.). Formal Philosophy: Selected Papers of Richard Montague. New Haven, CT: Yale University Press, 1974, 188–221.
Morgan, Jerry 1973. Sentence fragments and the notion 'sentence'. In: H. R. Kahane, R. Kahane & B. Kachru (eds.). Issues in Linguistics. Urbana, IL: University of Illinois Press, 719–751.
Morrill, Glyn 1990. Intensionality and boundedness. Linguistics & Philosophy 13, 699–726.
Morrill, Glyn 1994. Type Logical Grammar. Dordrecht: Foris.
Morrill, Glyn 2010. Categorial Grammar: Logical Syntax, Semantics, and Processing. Oxford: Oxford University Press.
Morrill, Glyn & Anna Gavarró 2004. On aphasic comprehension and working memory load. In: Proceedings of Categorial Grammars: An Efficient Tool for Natural Language Processing. Montpellier, 259–287.
Muskens, Reinhard 1996. Combining Montague semantics and discourse representation. Linguistics & Philosophy 19, 143–186.
Muskens, Reinhard, Johan van Benthem & Albert Visser 1997. Dynamics. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 587–648.
Partee, Barbara 1973. Some structural analogies between tenses and pronouns in English. The Journal of Philosophy 70, 601–609.
Piwek, Paul 1998. Logic, Information and Conversation. Ph.D. dissertation. Eindhoven University of Technology.
Prawitz, Dag 1965. Natural Deduction: A Proof-Theoretical Study. Uppsala: Almqvist & Wiksell.
Purver, Matthew, Ronnie Cann & Ruth Kempson 2006. Grammars as parsers: The dialogue challenge. Research on Language and Computation 4, 289–326.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press.
Ranta, Aarne 1994. Type-Theoretical Grammar. Oxford: Clarendon Press.
Reyle, Uwe 1993. Dealing with ambiguities by underspecification: Construction, representation and deduction. Journal of Semantics 10, 123–179.
Sag, Ivan, Tom Wasow & Emily Bender 2002. Head-Driven Phrase Structure Grammar: An Introduction. 2nd edn. Stanford, CA: CSLI Publications.
Sperber, Dan & Deirdre Wilson 1986/1995. Relevance: Communication and Cognition. Oxford: Blackwell.
Stainton, Robert 2006. Words and Thoughts. Oxford: Oxford University Press.
Stalnaker, Robert 1970. Pragmatics. Synthese 22, 272–289.
Stalnaker, Robert 1999. Context and Content. Oxford: Oxford University Press.
Thomason, Richmond (ed.) 1974. Formal Philosophy: Selected Papers of Richard Montague. New Haven, CT: Yale University Press.
van Eijck, Jan & Hans Kamp 1997. Representing discourse in context. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 179–239.
von Heusinger, Klaus 1997. Definite descriptions and choice functions. In: S. Akama (ed.). Logic, Language and Computation. Dordrecht: Kluwer, 61–92.
von Heusinger, Klaus & Ruth Kempson (eds.) 2004. Choice Functions in Linguistic Theory. Special issue of Research on Language and Computation 2.
Wittgenstein, Ludwig 1953. Philosophical Investigations. Translated by G. E. M. Anscombe. 3rd edn. Oxford: Blackwell, 1967.
III. Methods in semantic research

12. Varieties of semantic evidence

1. Introduction: Aspects of meaning and possible sources of evidence
2. Fieldwork techniques in semantics
3. Communicative behavior
4. Behavioral effects of semantic processing
5. Physiological effects of semantic processing
6. Corpus-linguistic methods
7. Conclusion
8. References
Abstract

Meanings are the most elusive objects of linguistic research. This article summarizes the types of evidence we have for them: various kinds of metalinguistic activity such as paraphrasing and translating, the ability to name entities and to judge sentences true or false, as well as various behavioral and physiological measures such as reaction time studies, eye tracking, and electromagnetic brain potentials. It furthermore discusses the specific types of evidence we have for different kinds of meaning, such as truth-conditional aspects, presuppositions, implicatures, and connotations.
1. Introduction: Aspects of meaning and possible sources of evidence

1.1. Why meaning is a special research topic

If we ask an astronomer for evidence of phosphorus on Sirius, she will point out that spectral analysis of the light from this star reveals bands that are characteristic of this element, bands that also show up when phosphorus is burned in the lab. If we ask a linguist the more pedestrian question of what evidence there is that a certain linguistic expression – say, the sentence The quick brown fox jumps over the lazy dog – has meaning, the answers are probably less straightforward and predictable. He might point out that speakers of English generally agree that it has meaning – but how do they know? So it is perhaps not an accident that the study of meaning is the subfield of linguistics that developed only very late in the 2500-year history of the discipline, namely in the 19th century (cf. article 9 (Nerlich) Emergence of semantics).

The reason why it is difficult to imagine what evidence for meaning could be is that it is difficult to say what meaning is. According to a common assumption, communication consists in putting meaning into a form, a form that is then sent from the speaker to the addressee (the conduit metaphor of communication, see Lakoff & Johnson 1980). Aspects concerned with the form of linguistic expressions and their material realization are studied in syntax, morphology, phonology and phonetics; they are generally
more tangible than aspects concerned with their content. But semanticists generally hold that semantics, the study of linguistic meaning, has an object of study that is related to, but distinct from, the forms in which it is encoded, from the communicative intentions of the speaker, and from the resulting understanding of the addressee.
1.2. Aspects of meaning

The English noun meaning is multiply ambiguous, and there are several readings that are relevant for semantics. One branch of investigation starts out with meaning as a notion rooted in communication. Grice (1957) pointed out that we can ask what a speaker meant by uttering something, and what the utterance means that the speaker uttered. Take John F. Kennedy's utterance of the sentence Ich bin ein Berliner on June 26, 1963. What JFK meant was that in spite of the cold war, the USA would not surrender West Berlin – which was probably true. What the utterance meant was that JFK is a citizen of Berlin, which was clearly false. Obviously, the speaker's meaning is derived from the utterance meaning and the communicative situation in which it was uttered. How it is derived, however, is less obvious – cf. article 2 (Jacob) Meaning, intentionality and communication, especially on particularized conversational implicatures.

A complementary approach is concerned with the meaning of linguistic forms, sometimes called literal meanings, like the meaning of the German sentence Ich bin ein Berliner which was uttered by JFK to convey the intended utterance meaning. With forms, one can distinguish the following aspects of meaning (cf. also article 3 (Textor) Sense and reference, and article 4 (Abbott) Reference). The character is the meaning independent of the situation of utterance (such as speaker, addressee, time and location – see Kaplan 1978). The character of the sentence used by JFK is that the speaker of the utterance is a citizen of Berlin at the time of utterance. If we find a sticky note in a garbage can, reading I am back in five minutes – where we don't know the speaker, the time, or the location of the utterance – we just know the character. A character, supplied with the situation of utterance, yields the content or intension (Frege's Sinn) of a linguistic form. In the proper historical context, JFK's utterance has the content that JFK is a citizen of Berlin on June 26, 1963. (We gloss over the fact here that this first has to be decoded as a particular speech act, like an assertion.) This is a proposition, which can be true or false in particular circumstances. The extension or reference of an expression (Frege's Bedeutung) is its content when applied to the situation of utterance. In the case of a proposition, this is a truth value; in the case of a name or a referring expression, this is an entity. Sometimes meaning is used in a narrower sense, as opposed to reference; here I have used meaning in an encompassing way.

Arguably, the communicative notion of meaning is the primary one. Meaning is rooted in the intention to communicate. But human communication crucially relies on linguistic forms, which are endowed with meaning as outlined, and for which speakers can construct meanings in a compositional way (see article 6 (Pagin & Westerståhl) Compositionality). Semantics is concerned with the meaning of linguistic forms, a secondary and derived notion. But the use of these forms in communication provides crucial data for re-engineering the underlying meanings of the forms. The ways in which literal meanings are used in acts of communication, and their effects on the participants, are in general the subject matter of pragmatics (cf. article 88 (Jaszczolt) Semantics and pragmatics).
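The character/content/extension distinction can be rendered as a small functional pipeline. The following sketch is my own illustration, not part of the article's apparatus; the toy context and world are invented. The character maps a context of utterance to a content (intension), and the content maps a circumstance of evaluation to an extension – here, a truth value.

```python
# A minimal sketch of the two-stage picture described above (after Kaplan):
# character: context of utterance -> content; content: world -> extension.

def character_ich_bin_ein_berliner(context):
    """Character of 'Ich bin ein Berliner': fixes the speaker from the context."""
    speaker = context["speaker"]

    def content(world):
        # The content (intension): a function from worlds to truth values.
        return speaker in world["citizens_of_berlin"]

    return content

# Context of utterance: JFK speaking on June 26, 1963 (toy representation).
context = {"speaker": "JFK", "time": "1963-06-26"}

# Circumstance of evaluation: a toy world in which JFK is not a Berlin citizen.
world = {"citizens_of_berlin": {"Willy Brandt"}}

content = character_ich_bin_ein_berliner(context)  # content fixed by context
print(content(world))  # extension: False, matching the judgment in the text
```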
1.3. Types of access to meaning

Grounding the meaning of linguistic expressions in communication suggests that there are various kinds of empirical evidence for meaning. First, we can observe the external behavior of the participants before, during, and after the act of communication. Some kinds of behavior can be related more directly to linguistic meaning than others, and hence will play a more central role in discovering underlying meaning. For example, commands often lead to a visible non-linguistic reaction, and simple yes/no-questions will lead to linguistic reactions that are easily decodable. Second, we can measure aspects of the external behavior in detail, like the reaction times to questions, or the speed with which passages of text are read (cf. article 102 (Frazier) Meaning in psycholinguistics). Third, we can observe physiological reactions of participants in communication, like the changing size of their pupils, the saccades of the eyes reading a text, the eye gaze when presented with a visual input together with a spoken comment, or the electromagnetic field generated by their cortex (cf. article 15 (Bott, Featherston, Radó & Stolterfoht) Experimental methods, and article 105 (Pancheva) Meaning in neurolinguistics). Fourth, we can test hypotheses concerning meaning on the linguistic output itself, using statistical techniques applied to corpora (cf. article 109 (Katz) Semantics in corpus linguistics).
1.4. Is semantics possible?

The reader should be warned that correlations between meanings and observable phenomena like non-linguistic patterns of behavior or brain scans do not guarantee that the study of meaning can be carried out successfully. Leonard Bloomfield, a behaviorist, considered the observable effects of meaning so complex and so interwoven with other causal chains that he deemed a science of semantics impossible:

We have defined the meaning of a linguistic form as the situation in which the speaker utters it and the response which it calls forth in the hearer. […] In order to give a scientifically accurate definition of meaning for every form of a language, we should have to have a scientifically accurate knowledge of everything in the speaker's world. The actual extent of human knowledge is very small compared to this. […] The statement of meanings is therefore the weak point in language-study, and will remain so until human knowledge advances very far beyond its present state. (Bloomfield 1933: 139f)
We could imagine similar skepticism concerning the science of semantics from a neuroscientist who believes that meanings are activation patterns in our heads. The huge number of such patterns, and the variation across individuals that we certainly have to expect, seems to preclude that they will provide the foundation for the study of meaning. Despite Bloomfield's qualms, the field of semantics has flourished. Where he went wrong was in believing that we have to consider the whole world of the speaker, or the speaker's whole brain. There are ways to single out phenomena that stand in relation to, and bear evidence for, meanings in much more specific ways. For example, we can investigate whether a specific sentence, in a particular context and describing a particular situation, is considered true or false, and derive from that hypotheses about the meaning of the
sentence and the meaning of the words involved in that sentence. The usual methods of science – forming hypotheses and models, deriving predictions, making observations and constructing experiments that support or falsify the hypotheses – have turned out to be applicable to linguistic semantics as well.
1.5. Native semantic activities

There are many native activities that directly address aspects of meaning. When Adam named the animals of paradise he assigned expressions to meanings, as we do today when naming things or persons or defining technical terms. We explain the meaning of words or idioms by paraphrasing them – that is, by offering different expressions with the same or at least similar meanings. We can refer to aspects of meaning: We say that one expression means the same as another one, or its opposite; we say that one expression refers to a subcase of another. As for speaker's meanings, we can elaborate on what someone meant by such-and-such words, and can point out differences between that and what the words actually meant or how they were understood by the addressee. Furthermore, for human communication to work it is crucial that untruthful use of language can be detected, and that liars can be identified and punished. For this, a notion of what it means for a sentence or text to be true or false is essential. Giving a statement in court requires knowing what it means to speak the truth, and the whole truth. Hence, it seems that meanings are firmly established in the pre-scientific ways we talk about language. We can translate, that is, rephrase an expression in one language by an expression in another while keeping the meaning largely constant. We can teach the meaning of words or expressions to second language learners or to children acquiring their native language – even though both groups, in particular children in first language acquisition, will acquire meanings to a large part implicitly, by contextual clues. The sheer possibility of translation has been enormously important for the development of humankind. We find records of associated practices, like the making of dictionaries, dating back to Sumerian-Akkadian glossaries of 2300 BC.

These linguistic activities show that meaning is a natural notion, not a theoretical concept. They also provide an important source of evidence for meaning. For example, it would be nearly impossible to construct a dictionary in linguistic fieldwork without being able to ask what a particular word means, or what a particular object is called. As another example, it would be foolish to dismiss the monumental achievements of the art of dictionary writing as evidence for the meaning of words. But there are problems with this kind of evidence that one must be aware of. Take dictionary writing. Traditional dictionaries are often unsystematic and imprecise in their description of meaning. They do not distinguish systematically between contextual (or "occasional") meaning and systematic meaning, nor do they keep ambiguity and polysemy apart in a rigorous way. They often do not distinguish between linguistic aspects and more general cultural aspects of the meaning and use of words. Weinreich (1964) famously criticized the 115 meanings of the verb to turn that can be found in Webster's Third Dictionary. Lexicography has greatly improved since then, with efforts to define lexical entries by a set of basic words and by recognizing regularities like systematic variations between word meanings (e.g. the intransitive use of transitive verbs, or the polysemy triggered in particular contexts of use).
1.6. Talking about meanings

Pre-scientific ways to address meanings rely on an important feature of human language, its self-reference – we can use language to talk about language. This feature is so entrenched in language that it went unnoticed until logicians like Frege, Russell and Tarski, working with much more restricted languages, pointed out the importance of the metalanguage / object language distinction. It is only quite recently that we distinguish between regular language, reference to expressions, and reference to meanings by typographical conventions and write things like "XXX means 'YYY'".

The possibility of describing meanings may be considered circular – as when Tarski states that 'Snow is white' is true if and only if snow is white. However, it does work under certain conditions. First, the meaning of an unknown word can be described, or at least delimited, by an expression that uses known words; this is the classical case of a definition. If we had only this procedure available as evidence for meaning, things would be hopeless, because we have to start somewhere with a few expressions whose meanings are known; but once we have those, they can act as bootstraps for the whole lexicon of a language. The theory of Natural Semantic Metalanguage even claims that a small set of concepts (around 200) and a few modes of combining them are sufficient to achieve access to the meanings of all words of a language (Goddard 1998). Second, the meanings of an ambiguous word or expression can be paraphrased by expressions that have only one or the other meaning. This is common practice in linguistic semantics, e.g. when describing the meaning of He saw that gasoline can explode as (a) 'He saw an explosion of a can of gasoline' and (b) 'He recognized the fact that gasoline is explosive'. Speakers will generally agree that the original sentence has the two meanings teased apart by the paraphrases. There are variations on this access to meaning. For example, we might consider a sentence in different linguistic contexts and observe differences in the meaning of the sentence by recognizing that it has to be paraphrased differently. For the paraphrases, we can use a language that has specific devices that help to clarify meanings, like variables. For example, we can state that a sentence like Every man likes a woman that likes him has a reading 'Every man x likes a woman y that likes x', but not 'There is a woman y that every man x likes and that likes x'. The disadvantage of this is that the paraphrases cannot be easily grasped by naïve native speakers. In the extreme case, we can use a fully specified formal language to specify such meanings, such as first-order predicate logic; the existing reading of our example then could be specified as ∀x[man(x) → ∃y[woman(y) ∧ likes(x, y) ∧ likes(y, x)]].

Talking about meanings is a very important source of evidence for meanings. However, it is limited not only by the problem mentioned above, namely that it describes meanings with the help of other meanings. There are many cases where speakers cannot describe the meanings of expressions because this task is too complex – think of children acquiring their first language, or aphasics losing their capacity for language. And there are cases in which the description of meanings would be too complex for the linguist. We may think of first fieldwork sessions in a research project on an unknown language.
Somewhat closer to home, we may also think of the astonishingly complex meanings of natural language determiners such as a, some, a certain, a particular, a given, or indefinite this in there was this man standing at the door, whose meanings had to be teased apart by careful consideration of their acceptability in particular contexts.
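One advantage of fully formal paraphrases like the first-order formula above is that they can be checked mechanically against small models. The following sketch is merely an illustration of this point – the toy model and all names are invented, and nothing here is part of the original article: it verifies that the attested ∀∃ reading of Every man likes a woman that likes him and the unattested ∃∀ paraphrase come apart in a model where every man has his own mutual admirer.

```python
# A minimal model-checking sketch for the two candidate readings of
# 'Every man likes a woman that likes him'. The toy model is invented.

MEN = {"al", "bo"}
WOMEN = {"cy", "di"}
LIKES = {("al", "cy"), ("cy", "al"), ("bo", "di"), ("di", "bo")}

def likes(x, y):
    return (x, y) in LIKES

# Attested reading: for every man x there is a (possibly different) woman y
# with likes(x, y) and likes(y, x).
forall_exists = all(
    any(likes(x, y) and likes(y, x) for y in WOMEN) for x in MEN
)

# Unattested paraphrase: a single woman y such that every man x likes y
# and y likes x.
exists_forall = any(
    all(likes(x, y) and likes(y, x) for x in MEN) for y in WOMEN
)

print(forall_exists)  # True: each man has his own mutual admirer
print(exists_forall)  # False: no single woman works for all men
```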
2. Fieldwork techniques in semantics

In this section we will discuss various techniques that have been used in linguistic fieldwork, understood in a wide sense so as to include, for example, work on one's own language and on language acquisition. There are a variety of sources that reflect on possible procedures; for example, the authors in McDaniel, McKee & Cairns (eds.) (1996) discuss techniques for the investigation of syntax in child language, many of which also apply to semantic investigations, and Matthewson (2004) is concerned with techniques for semantic research in American languages which, of course, are applicable to work on other languages as well (cf. also article 13 (Matthewson) Methods in cross-linguistic semantics, and article 103 (Crain) Meaning in first language acquisition).
2.1. Observation, transcription and translation

The classical linguistic fieldwork method is to record conversations and texts in natural settings, transcribe them, and assign translations, ideally with the help of speakers who are competent in a language that they share with the investigator. In classical American structuralism, this was the method de rigueur, and it is certainly of great importance when we want to investigate the natural use of language. However, this technique is also severely limited. First, even large text collections may not provide the evidence that distinguishes between different hypotheses. Consider superlatives in English: Is John is the tallest student true if John and Mary are both students of the same height, each taller than any other student? Competent English speakers say no, superlatives must be unique – but it might be impossible to find this out on the basis of a corpus of non-elicited text. Second, there is the problem of translation. Even when we grant that the translation is competent according to usual standards, it is not clear how we should deal with distinctions in the object language that are not easily made in the metalanguage. For example, Matthewson (2004) shows that in Menominee (Algonquian, Northern Central United States of America), inalienable nouns can have a prefix me- indicating an arbitrary owner, as contrasted with a prefix o- indicating a specific 3rd person owner. This difference could not be derived from simple translations of Menominee texts into English, as English does not make this distinction. There is also the opposite problem of distinctions that are forced on us by the metalanguage; for example, pronouns in English referring to humans distinguish two genders, which may not be a feature of the object language. Hence, as Matthewson puts it, translations should be seen as clues for semantic analysis, rather than as its result. Translations, or more generally paraphrases, are also problematic as evidence for meaning for a more fundamental reason: They explain the meaning of an expression α by way of the meaning of an expression β, and hence presuppose the existence and knowledge of meanings, and a judgment of similarity of meaning. However, it appears that without accepting this type of hermeneutic circle the study of semantics could not get off the ground. And there are methods to test hypotheses that have first been generated with the help of translations and paraphrases by independent means.
2.2. Pointing

Pointing is a universal non-linguistic human behavior that aligns with aspects of the meanings of certain types of linguistic expressions (cf. also article 90 (Diessel) Deixis and
demonstratives). Actually, pointing may be as characteristic of humans as language, as humans appear to be the only apes that point (cf. Tomasello 2008). Pointing is most relevant for referring expressions, with names as the prototypical example (cf. article 4 (Abbott) Reference). These expressions denote a particular entity that is also identified by the pointing gesture, and hence pointing is independent evidence for the meaning of such expressions. For example, if in a linguistic fieldwork situation an informant points to a person and utters Max, this might be taken to be the name of that person. We can conclude that Max denotes that person; in other words, that the meaning of Max is the person pointed at.

Simple as this scenario is, there are certain prerequisites for it to work. For example, the pointing gesture must be recognized as such; in different cultures, the index finger, the stretched-out hand, or an upward movement of the chin may be used, and in some cultures there may be a taboo against pointing gestures when directed at humans. Furthermore, there must be one most salient object in the pointing cone (cf. Kranstedt et al. 2006) that will then be identified. This presupposes a pre-linguistic notion of objects, and of saliency. This might work well when persons or animals are pointed at, who are cognitively highly salient. But mistakes can occur when there are several equally salient objects in the pointing cone. When Captain Cook on his second voyage visited an island in the New Hebrides with friendly natives and tried to communicate with them, he pointed to the ground. What he heard was tanna, which he took as the name of the island, which is still known under this name. Yet the meaning of tana in all Melanesian languages is simply "earth". The native name for Tanna is reported to be parei (Gregory 2003); it is not in use anymore.

Pointing gestures may also help to identify the meaning of common nouns, adjectives, or verbs – expressions that denote sets of entities or events. The pointing is directed towards a specimen, but reference is to entities of the same type as the one pointed at. There is an added source of ambiguity or vagueness here: What counts as "the same type as"? On his first voyage, Captain Cook made landfall in Australia and observed creatures with rabbit-like ears hopping on their hind legs. When the naturalist Joseph Banks asked the local Guugu Yimidhirr people what they were called, presumably with the help of some pointing gesture, he was the first to record the word kangaroo. But the word gangurru actually just refers to a large species of black kangaroo, not to the marsupial family in general (cf. Haviland 1974). Quine (1960: ch. II), in an argument to discount the possibility of true translation, famously described the problems that even a simple act like pointing and naming might involve. Assume a linguist points to a white rabbit and gets the response gavagai. Quine asks whether this may mean 'rabbit', or perhaps 'animal', or perhaps 'white', or perhaps even 'non-detached rabbit parts'. It also might mean 'rabbit stage', in which case repeated pointing will identify different reference objects. All these options are theoretical possibilities under the assumption that words can refer to arbitrary aspects of reality. However, it is now commonly assumed that language is built on broad cognitive commonalities about entities and classes.
There is evidence that pre-linguistic babies and higher animals have concepts of objects (as contrasted with substances) and animals (as contrasted with lifeless beings) that preclude a conceptualization of a rabbit as a set of rabbit legs, a rabbit body, a rabbit head and a pair of rabbit ears moving in unison. Furthermore, there is evidence that objects are named with terms from a middle layer of a
taxonomic hierarchy, the so-called "generic level", avoiding terms that are too general or too specific (cf. Berlin, Breedlove & Raven 1973). Hence a rabbit will not be called thing in English, or animal, and it will not be called English angora either, except perhaps by rabbit breeders who work with a different taxonomy. This was the reason for Captain Cook's misunderstanding of gangurru: the native Guugu Yimidhirr people had a different, and more refined, taxonomic hierarchy for Australian animals, where species of kangaroo formed the generic level; for the British visitors the family itself belonged to that level.

Pointing and related gestures have been used to identify the meaning of words. For example, in the original study of Berlin & Kay (1969) on color terms, subjects were presented with a two-dimensional chart of 320 colors varying according to spectral color and saturation. The task was to identify the best specimen for a particular color word (the focal color) and the extent to which colors fall under a particular color word. Similar techniques have been used for other lexical fields, for example for the classification of vessels using terms like cup, mug or pitcher (cf. Kempton 1981; see Fig. 12.1).
Fig. 12.1: Vessel categories after Kempton (1981: 103). Bold lines: regions of >80% agreement between subjects for mug and coffee cup. Dotted lines: hypothetical concept that would violate connectedness and convexity (see below)
Tests of this type have been carried out in two ways: Either subjects were presented with a field of reference objects ordered along certain dimensions – e.g., Berlin & Kay (1969) presented colors ordered by their wavelength (the order in which they present themselves in a rainbow) and by their saturation (with white and black as the extremes). Kempton's
vessels were presented as varying in two dimensions: the relation between the upper and lower diameters, and the relation between height and width. When judging whether certain items fall under a term or not, the neighboring items that have already been classified might influence the decision. Another technique, which was carried out in the World Color Survey (see Kay et al. 2008), presented color chips in random order to avoid this kind of influence.

The pointing test can be used in two ways: Either we point at an entity in order to get the term that is applicable to that entity, or we have a term and point to various objects to find out whether the term is applicable. The first approach asks an onomasiological question; it is concerned with the question: What is this thing called? The second approach asks the complementary semasiological question: What does this expression mean? Within a Fregean theory of meaning, a distinction is made between reference and sense (cf. article 3 (Textor) Sense and reference). By pointing to concrete entities we gain access to the reference of expressions, and not to the sense, the concept that allows us to identify the reference. But by varying potential reference objects we can form hypotheses about the underlying concept, even though we can never be certain that such variation will uncover all aspects of the underlying concept. Goodman (1955) illustrated this with the hypothetical adjective grue that, say, refers to green objects when used before the year 2100 and to blue objects when used after that time; no pointing experiment executed before 2100 could differentiate grue from green. Meaning shifts like that do happen historically: The term Scotia referred to Ireland before the 11th century, and afterwards to Scotland; the German term gelb was reduced in extension when the term orange entered the language (cf. the traditional local term Gelbe Rüben 'yellow turnips' for carrots). But these are language changes, not meanings of items within a language. A meaning like the hypothetical grue appears as strange as reference to non-detached rabbit parts. We work under the hypothesis that meanings of lexical items are restricted by general principles of uniformity over time. There are other such principles that restrict possible meanings, for example connectedness and, more specifically, convexity (Gärdenfors 2000). In the vessel example above, where potential reference objects were presented following certain dimensions, we expect that concepts do not apply to discontinuous areas and have the general property that when x is an α and y is an α, then everything in between x and y is an α as well (a computational rendering of this criterion is sketched at the end of this section). The dotted lines in Fig. 12.1 represent an extension of a concept that would violate connectedness and convexity.

In spite of all its problems, pointing is the most elementary kind of evidence for meaning, without which linguistic fieldwork, everyday communication and language acquisition would be impossible. Yet it seems that little research has been done on pointing and language acquisition, be it first or second. Its importance, however, was recognized as early as in St. Augustine's Confessions (5th century AD), where he writes about his own learning of language:

When they [the elders] called some thing by name and pointed it out while they spoke, I saw it and realized that the thing they wished to indicate was called by the name they then uttered.
And what they meant was made plain by the gestures of their bodies, by a kind of natural language, common to all nations […] (Confessions, Book I: 8)
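As announced above, the convexity constraint can be stated as a simple executable check. The following sketch is only an illustration under invented assumptions: stimuli are positions on a single ordered dimension (say, hue), and a concept's extension is the set of positions that speakers judge the term to apply to.

```python
# A minimal check of the convexity constraint on concepts: if two stimuli
# fall under a term, everything ordered between them should as well.
# The judgment data below are invented for illustration.

def is_convex(extension):
    """extension: set of positions (ints) judged to fall under the term."""
    if not extension:
        return True
    lo, hi = min(extension), max(extension)
    return all(pos in extension for pos in range(lo, hi + 1))

plausible_term = {4, 5, 6, 7}  # a connected, convex region
gappy_term = {4, 5, 11, 12}    # discontinuous, like the dotted lines
                               # in Fig. 12.1: ruled out by convexity

print(is_convex(plausible_term))  # True
print(is_convex(gappy_term))      # False
```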
2.3. Truth value judgments (TVJ)

Truth value judgments do the same job for the meaning of sentences as pointing does for referring expressions. In the classical setup, a situation is presented with non-linguistic means together with a declarative sentence, and the speaker has to indicate whether this sentence is true or false with respect to the situation. This judgment is a linguistic act in itself, so it can be doubted that this provides a way to base the study of meaning wholly outside of language. But arguably, agreeing or disagreeing are more primitive linguistic acts that may even rely on simple gestures, just as in the case of pointing.

The similarity between referential expressions – which identify objects – and declarative sentences – which identify states of affairs in which they are true – is related to Frege's identification of the reference of sentences with their truth value with respect to a particular situation (even though this was not the original motivation for this identification, cf. Frege 1892). This is reflected in the two basic extensional types assumed in sentence semantics: type e for entities referred to by names, and type t for truth values referred to by sentences. But there is an important difference here: There are many distinct objects – De, the universe of discourse, is typically large – but there are just two (basic) truth values – Dt, the set of truth values, standardly is {0, 1}, falsity and truth (for a toy computational rendering of sentence evaluation, see the sketch at the end of this section). Hence we can distinguish referring expressions more easily by their reference than we can distinguish declarative sentences. One consequence of this is that onomasiological tests do not work: We cannot present a "truth value" and expect a declarative sentence that is true. Also, when presenting a situation in a picture or a little movie we cannot expect that the linguistic reactions are as uniform as when we, say, present the picture of an apple. But the semasiological direction works fine: We can present speakers with a declarative sentence and a situation or a set of situations and ask whether the sentence is true in those situations.

Truth value judgments are not just an ingenious idea of philosophers of language for reducing the meaning of declarative sentences to judgments of whether a sentence is true or false in given situations. They are also used pre-scientifically, e.g. in court procedures. Within linguistics, they are used to investigate the meaning of sentences in experiments and in linguistic fieldwork. They have been particularly popular in the study of language acquisition because they require a rather simple reaction by the child that can be expected even from two-year-olds. The TVJ task comes in two flavors. In both, the subjects are presented with a sentence and a situation, specified by a picture, an acted-out scene with hand puppets or a movie, or by the actual world, provided that the subjects have the necessary information about it. In the first version, the subjects should simply state whether the sentence is true or false. This can be done by a linguistic reaction, by a gesture, by pressing one of two buttons, or by ticking off one of two boxes. We may also record the speed of these reactions in order to get data about the processing of expressions. In the second version, there is a character, e.g. a hand puppet, that utters the sentence in question, and the subjects should reward or punish the character if the sentence is true or false with respect to the situation presented (see e.g. Crain 1991).
A reward could be, for example, feeding the hand puppet a cookie. Interestingly, the second procedure taps into cognitive resources of children that are otherwise not as easily accessible. Gordon (1998), in a description of TVJ in language acquisition, points out that this task is quite natural and easy. This is presumably so because truth value judgment is an elementary linguistic activity, in contrast to, say, grammaticality judgments.
TVJ also puts fewer demands on answers than wh-questions (e.g. Who chased the zebra? vs. Did the lion chase the zebra?). This makes it the test of choice for children and for language-impaired persons. But there are potential problems in carrying out TVJ tasks. For example, Crain et al. (1998) have investigated the phenomenon that children seem to consider a sentence like Every farmer is feeding a donkey false if there is a donkey that is not fed. They argue that children are confused by the extra donkey and try to reinterpret the sentence in a way that seems to make sense. A setup in which attention is not drawn to a single object might be better; even adding a second unfed donkey makes the judgments more adult-like. Also, children respond better to scenes that are acted out than to static pictures. In designing TVJ experiments, one should consider the fact that positive answers are given more quickly and more easily than negative ones. Furthermore, one should be aware that unconscious reactions of the experimenter may provide subtle clues for the "right" answer (the "Clever Hans" effect, named after the horse that supposedly could solve arithmetic problems). For example, when acting out and describing a scene, the experimenter may be more hesitant when uttering a false statement.
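As announced above, here is a minimal computational rendering of the evaluation idea behind TVJ – a toy model with invented names and situations, not an experimental procedure: names denote entities in De, a transitive verb denotes a relation over the domain, and a declarative sentence denotes a value in Dt = {0, 1} relative to a situation.

```python
# A toy rendering of evaluation with the basic extensional types:
# names denote entities (type e), a transitive verb denotes a relation
# over the domain, and a sentence denotes a truth value in {0, 1} (type t)
# relative to a situation. All names and situations are invented.

D_e = {"lion", "zebra", "giraffe"}  # the universe of discourse

def denotation_chase(situation):
    """'chase' as a function from pairs of entities to truth values."""
    return lambda chaser, chasee: int((chaser, chasee) in situation["chasings"])

def eval_the_lion_chased_the_zebra(situation):
    """Truth value of 'The lion chased the zebra' in a situation."""
    chase = denotation_chase(situation)
    return chase("lion", "zebra")

situation_1 = {"chasings": {("lion", "zebra")}}
situation_2 = {"chasings": {("zebra", "lion")}}

print(eval_the_lion_chased_the_zebra(situation_1))  # 1: judged true
print(eval_the_lion_chased_the_zebra(situation_2))  # 0: judged false
```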
2.4. TVJ and presuppositions/implicatures

There are different aspects of meaning beyond the literal meaning, such as presuppositions, conventional implicatures, conversational implicatures and the like, and it would be interesting to know how such meaning components fare in TVJ tasks. Take presuppositions (cf. also article 91 (Beaver & Geurts) Presupposition). Theories such as Stalnaker (1974) that treat them as preconditions of interpretation predict that sentences cannot be interpreted with respect to situations that violate their presuppositions. The TVJ test does not seem to support this view. The sentence The dog is eating the bone will most likely be judged true with respect to a picture showing two dogs, where one of the dogs is eating a bone. This may be considered evidence for the ease of accommodation, which consists in restricting the context to the one dog that is eating a bone. Including a third option or truth value like "don't know" might reveal the specific meaning contribution of presuppositions.

As for conversational implicatures (cf. article 92 (Simons) Implicature), we appear to get the opposite picture. TVJ tests have been used to check the relevance of scalar implicatures. For example, Noveck (2001), building on work of Smith (1980), argued that children are "more logical" than adults because they can dissociate literal meanings from scalar implicatures. Children up to 11 years of age react to statements like some giraffes have long necks (where the picture shows that all giraffes have long necks) with an affirmative answer, while most adults find them inappropriate.
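A third truth value of the kind just mentioned can be built into such a toy evaluation directly. The following sketch is again my own illustration with invented situations; it treats the definite article as presupposing a unique dog, returning '#' (no truth value) when the presupposition fails – exactly the case that a "don't know" option would make visible in a TVJ task.

```python
# A trivalent toy evaluation: 'The dog is eating the bone' is taken to
# presuppose that the situation contains exactly one dog. On presupposition
# failure the sentence receives '#' instead of a truth value. Invented data.

def eval_the_dog_is_eating_the_bone(situation):
    dogs = [x for x in situation["entities"] if x in situation["dogs"]]
    if len(dogs) != 1:
        return "#"  # presupposition failure: neither true nor false
    return 1 if dogs[0] in situation["eating_bone"] else 0

one_dog = {"entities": {"fido", "tree"},
           "dogs": {"fido"}, "eating_bone": {"fido"}}
two_dogs = {"entities": {"fido", "rex"},
            "dogs": {"fido", "rex"}, "eating_bone": {"fido"}}

print(eval_the_dog_is_eating_the_bone(one_dog))   # 1
print(eval_the_dog_is_eating_the_bone(two_dogs))  # '#'
```

On this picture, the observed "true" responses to the two-dog picture would reflect accommodation: subjects silently restrict the entities under consideration to a sub-situation containing just one dog.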
2.5. TVJ variants: Picture selection and acting out

The picture selection task has been applied for a variety of purposes beyond truth values (cf. Gerken & Shady 1998). But for the purpose of investigating sentence meanings, it can be seen as a variant of the TVJ task: The subject is exposed to a declarative sentence and two or more pictures and has to identify the picture for which the sentence is true. It is good to include irrelevant pictures as filler items, which can test the attention of
the subjects. The task can be used to identify the situations that best fit a sentence. For example, for sentences with presuppositions it is expected that a picture will be chosen that satisfies not only the assertion but also the presupposition. So, if the sentence is The dog is eating a bone, and pictures with one or with two dogs are shown, then presumably the picture with one dog will be preferred. Also, sentences whose scalar implicature is satisfied will be preferred over those for which this is not the case. For example, if the sentence is some giraffes have long necks, a picture in which some but not all giraffes have long necks will be preferred over a picture in which all giraffes are long-necked. Another relative of the TVJ task is the Act Out task, in which the subject has to "act out" a sentence with a scene such that the sentence is true. Again, we should expect that sentences are acted out in a way that satisfies all meaning components – assertion, presupposition, and implicature – of a sentence.
2.6. Restrictions of the TVJ methodology

One restriction of the various TVJ methodologies appears to be that they only target expressions that have a truth value, that is, sentences. However, they allow us to investigate the meaning of subsentential expressions, under the assumption that the meaning of sentences is computed in a compositional way from the meanings of their syntactic parts (cf. article 6 (Pagin & Westerståhl) Compositionality). For example, the meaning of spatial prepositions like on, on top of, above or over can be investigated with scenes in which objects are arranged in particular ways.

Another potential restriction of TVJ as discussed so far is that we assumed that the situations are presented by pictures. Language is not restricted to encoding information that can be represented by visual stimuli. But we can also present sounds, movie scenes or comic strips that represent temporal developments, or even olfactory and tactile stimuli, to judge the range of meanings of words (cf. e.g. Majid et al. 2006 for verbs of cutting and breaking). TVJ is also difficult to apply when deictic expressions are involved, as they often require reference to the speaker, who is typically not part of the picture. For example, in English the sentence The ball is in front of the tree means that the ball is between the tree and the speaker who faces it; the superficially corresponding sentence in Hausa means that the ball is behind the tree (cf. Hill 1982). In English, the tree is seen as facing the speaker, whereas in Hausa the speaker aligns with the tree (cf. article 98 (Pederson) The expression of space). Such differences are not normally represented in pictures, but it can be done: One could either present the picture from a particular angle, or represent a speaker with a particular position and orientation in the picture itself and ask the subject to identify with that figure.

The TVJ technique is systematically limited for sentences that do not have truth values, such as questions, commands, or exclamatives. But we can generalize it to a judgment of the appropriateness of sentences given a situation, which is sometimes done to investigate politeness phenomena and the like. There are also subtypes of declarative sentences that are difficult to investigate with TVJ, namely modal statements, e.g. Mary must be at home, or habituals and generics that allow for exceptions, like Delmer walks to school, or Birds fly (cf. article 47 (Carlson) Genericity). This is arguably so because those sentences require the consideration of different possible worlds, which cannot easily be represented graphically.
2.7. TVJ with linguistic presentation of situation

The TVJ technique can be applied to modal or generic statements if we present the situation linguistically, by describing it. For example, we could ask whether Delmer walks to school is true if Delmer walks every day except Fridays, when his father gives him a ride. Of course, this kind of linguistic elicitation technique can be used in nearly all the cases described so far. It has clear advantages: Linguistic descriptions are easy and cheap to produce and can focus the attention of the subject on aspects that are of particular relevance for the task. For this reason it is very popular for quickly eliciting whether a sentence can mean such-and-such. Matthewson (2004) argues that elicitation is virtually the only way to get at more subtle semantic phenomena. She also argues that it can be combined with other techniques, like TVJ and grammaticality judgments. For example, in investigating aspect marking in St'át'imcets Salish (Salishan, Southwestern Canada), the sentence Have you been to Seattle? is translated using an adverb lán that otherwise occurs with the meaning 'already'; a follow-up question could be whether it is possible to drop lán in this context while retaining roughly the same meaning. The linguistic presentation of scenes comes with its own limitations. There is the foundational problem that we get at the meaning of an expression α by way of the meaning of an expression β. And it cannot be applied in cases of insufficient linguistic competence, as with young children or language-impaired persons.
2.8. Acceptability tests

In this type of test, speakers are given an expression and a linguistic context and/or a description of an extralinguistic situation, and are asked whether the expression is acceptable with respect to this context or situation. With this test, we can explore the felicity conditions of an expression, which are often closely related to certain aspects of its meaning. Acceptability tests are the natural way to investigate presuppositions and conventional implicatures of expressions. For example, additive focus particles like also presuppose that the predication holds for an alternative to the focus item. Hence in a context like John went to Paris, the sentence John also went to PRAGUE is felicitous, but the sentence Mary also went to PRAGUE is not. Acceptability tests can also be used to investigate information-structural distinctions. For example, in English, different accent patterns indicate different focus structures; this can be seen when judging sentences like JOHN went to Paris vs. John went to PARIS in the context of questions like Who went to Paris? and John went where? (cf. article 66 (Krifka) Questions). As another example, Portner & Yabushita (1998) discussed the acceptability of sentences with a topic-comment structure in Japanese where the topic was identified by a noun phrase with a restrictive relative clause, and found that such structures are better if the relative clause corresponds to a comment on the topic in the preceding discourse. Acceptability tests can also be used to test the appropriateness of terms with honorific meaning, or various shades of expressive meaning, which have been analyzed as conventional implicatures by Potts (2005). When applying acceptability judgments, it is natural to present the context first, to prevent the subject from first coming up with other contexts, which may influence
the interpretation. Another issue is whether the contexts should be specified in the object language, or can also be given in the metalanguage that is used to carry out the investigation. Matthewson (2004) discusses the various advantages and disadvantages – especially if the investigator has a less-than-perfect command of the object language – and argues that using a metalanguage is acceptable, as language informants generally can resist the possible influence of the metalanguage on their responses.
2.9. Elicited production

We can turn the TVJ test on its head and ask subjects to describe given situations in their own words. In language acquisition research, this technique is known as "elicited production", and it encompasses all linguistic reactions to planned stimuli (cf. Thornton 1998). In this technique the presumed meaning is fixed and controls the linguistic production; we can then form hypotheses about how this meaning can be represented in language. The best-known example probably is the retelling of a little movie called the Pear Story, which has unearthed interesting differences in the use of tense and aspect distinctions in different languages (cf. Chafe 1980 for the original publication). Another example, which allows one to study the use of meanings in interaction, is the "map task", where one person explains the configuration of objects or a route on a map to another person without visual contact. The main problem of elicited production is that the number of possible reactions by speakers is, in principle, unlimited. It might well be that the types of utterances one expects do not occur at all. For example, we could set up a situation in which person A thinks that person B thinks that person C thinks that it is raining, to test the recursivity of propositional attitude expressions, but we may have to wait a long time until such utterances are actually produced. So it is crucial to select cues that constrain the linguistic production in a way that ensures that the expected utterances will indeed occur.
2.10. From sentence meanings to word meanings

The TVJ technique and its variants test the meaning of sentences, not of words or subsentential expressions. Also, with elicitation techniques, we will often get sentence-like reactions. With elicited translations, it is also advisable to use whole sentences instead of single words or simpler expressions, as Matthewson (2004) argues. It is possible to elicit the basic meaning of nouns or certain verbs directly, but this is impossible for many other words. The most frequent words in English are often cited as being the, of, and, a, to, in, is, you and that; it would be impossible to ask a naïve speaker of English what they mean, or to discover their meanings in other, more direct ways, with the possible exception of you. We can derive hypotheses about the meaning of such words by using them in sentences and judging the truth value of the sentences with respect to certain situations, and their acceptability in certain contexts. For example, we can unearth the basic uses of the definite article by presenting pictures containing one or two barking dogs and asking subjects to pick out the best picture for the dog is barking. The underlying idea is that the assignment of meanings to expressions is compositional, that is, that the meaning of a complex expression is a function of the meanings of its parts and the way they are combined.
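To illustrate this compositional logic, the toy evaluation from above can be decomposed into word meanings that combine by function application. The sketch below is an invented illustration, not an analysis from the literature cited here: a uniqueness hypothesis about the is tested indirectly, through whole-sentence judgments on the dog is barking against one-dog and two-dog situations.

```python
# Word meanings combined by function application, so that a hypothesis
# about 'the' can be tested via whole-sentence judgments. Invented data.

def dog(situation):
    """[[dog]]: a property, i.e. a function from entities to truth values."""
    return lambda x: x in situation["dogs"]

def is_barking(situation):
    """[[is barking]]: likewise a property."""
    return lambda x: x in situation["barking"]

def the(prop, domain):
    """Hypothesis: [[the]] maps a property to its unique satisfier."""
    satisfiers = [x for x in domain if prop(x)]
    return satisfiers[0] if len(satisfiers) == 1 else None

def eval_the_dog_is_barking(situation):
    referent = the(dog(situation), situation["domain"])  # [[the dog]]
    if referent is None:
        return "#"  # uniqueness hypothesis: no clear judgment expected
    return int(is_barking(situation)(referent))

one_dog = {"domain": {"fido", "tree"}, "dogs": {"fido"}, "barking": {"fido"}}
two_dogs = {"domain": {"fido", "rex"}, "dogs": {"fido", "rex"},
            "barking": {"fido"}}

print(eval_the_dog_is_barking(one_dog))   # 1: predicted preferred picture
print(eval_the_dog_is_barking(two_dogs))  # '#': predicted degraded choice
```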
3. Communicative behavior

Perhaps the most important function of language is to communicate, that is, to transfer meanings from one mind to another. So we should be able to find evidence for meaning by investigating communicative acts. This is obvious in a trivial sense: If A tells B something, B will often act in certain ways that betray that B understood what A meant. More specifically, we can investigate particular aspects of communication and relate them to particular aspects of meaning. We will look at three examples here: presuppositions, conversational implicatures, and focus-induced alternatives.

Presuppositions (cf. article 91 (Beaver & Geurts) Presupposition) are meaning components that are taken for granted, and hence appear to be downtoned. This shows up in possible communicative reactions. For example, consider the following dialogue:

A: Unfortunately, it is raining.
B: No, it isn't.

Here, B denies that it is raining; the meaning component of unfortunate expressing regret by the speaker is presupposed or conventionally implicated. Compare:

A: It is unfortunate that it is raining.
B: No, it isn't.

Here, B's denial targets the assertion that this is unfortunate, while the presupposition that it is raining is left intact. In order to deny the presupposed or downtoned parts, other conversational reactions are necessary, like But that's not unfortunate in the first dialogue, or But it doesn't rain in the second. Simple and more elaborate denials are a fairly consistent test to distinguish between presupposed and proffered content (cf. van der Sandt 1988).

For conversational implicatures (cf. article 92 (Simons) Implicature) the most distinctive property is that they are cancelable without leading to contradiction. For example, John has three children triggers the scalar implicature that John has exactly three children. But this meaning component can be explicitly suspended: John has three children, if not more. It can be explicitly cancelled: John has three children, in fact he has four. And it does not arise in particular contexts, e.g. in the context of People get a tax reduction if they have three children. This distinguishes conversational implicatures from presuppositions and semantic entailments: John has three children, {if not two / in fact, two} is judged contradictory.

Our last example concerns the introduction of alternatives that are indicated by focus, which in turn can be marked in various ways, e.g. by sentence accent. A typical procedure to investigate the role of focus is the question-answer test (cf. article 66 (Krifka) Questions). Of the following potential question-answer pairs, (A1-B1) and (A2-B2) are well-formed, but (A1-B2) and (A2-B1) are odd.

A1: Who ate the cake?
A2: What did Mary eat?
B1: MARY ate the cake.
B2: Mary ate the CAKE.
This has been interpreted as saying that the alternatives of the answer have to correspond to the alternatives of the question. To sum up, using communicative behavior as evidence for meaning consists in evaluating the appropriateness of certain conversational interactions. Competent speakers generally agree on such judgments. The technique has been used in particular to identify, and to differentiate between, meaning components having to do with the presentation of meanings, in particular with information structure.
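The congruence condition behind the question-answer test can also be spelled out as a small computation over alternative sets, in the spirit of alternative semantics. The sketch below is my own illustration over an invented toy domain, not an analysis from the sources cited here.

```python
# Question-answer congruence via focus alternatives (toy domain):
# the focus-induced alternatives of the answer must match the
# alternatives introduced by the question.

PEOPLE = {"Mary", "John"}
FOODS = {"the cake", "the soup"}

def q_who_ate_the_cake():
    return {(x, "the cake") for x in PEOPLE}  # alternatives of A1

def q_what_did_mary_eat():
    return {("Mary", y) for y in FOODS}       # alternatives of A2

def answer_alternatives(focus):
    """Alternatives of 'Mary ate the cake' with subject or object focus."""
    if focus == "subject":                    # B1: MARY ate the cake
        return {(x, "the cake") for x in PEOPLE}
    return {("Mary", y) for y in FOODS}       # B2: Mary ate the CAKE

def congruent(question_alts, answer_alts):
    return question_alts == answer_alts

print(congruent(q_who_ate_the_cake(), answer_alternatives("subject")))  # True
print(congruent(q_who_ate_the_cake(), answer_alternatives("object")))   # False
print(congruent(q_what_did_mary_eat(), answer_alternatives("object")))  # True
```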
4. Behavioral effects of semantic processing

When discussing evidence for the meaning of expressions, we have so far focused on the meanings themselves. We can also investigate how semantic information is processed, and thus get a handle on how the human mind computes meanings. To get information on semantic processing, judgment tasks are often not helpful, and might even be misleading. We need other types of evidence that arguably stand in a more direct relation to semantic processing. It is customary to distinguish between behavioral data on the one hand, and neurophysiological data, which directly tap brain phenomena, on the other. In this section we will focus on behavioral approaches (cf. also article 15 (Bott, Featherston, Radó & Stolterfoht) Experimental methods).
4.1. Reaction times

The judgment tasks for meanings described so far can also tap into the processing of semantic information if the timing of the judgments is considered. The basic assumption is that longer reaction times, everything else being equal, are a sign of semantic processing load. For example, Clark & Lucy (1975) have shown that indirect speech acts take longer to process than direct ones, and attribute this to the additional inferences that they require. Noveck (2004) has shown that the computation of scalar implicatures takes time; people who reacted to sentences like Some elephants are mammals with a denial (because all elephants, and not just some, are) took considerably longer. Kim (2008) has investigated the processing of only-sentences, showing that the affirmative content is evaluated first, and the presupposition is taken into account only afterwards.

Reaction times are relevant for many other psycholinguistic paradigms beyond tasks like TVJ, and can provide hints about semantic processing. One notable example is the semantic phenomenon of coercion, changes of meaning that are triggered by the particular context in which meaning-bearing expressions occur (cf. article 25 (de Swart) Mismatches and coercion). One well-known example is aspectual coercion: Temporal adverbials of the type until dawn select for atelic verbal predicates, hence The horse slept until dawn is fine. But The horse jumped until dawn is acceptable as well, under an iterative interpretation of jump that is not reflected overtly. This adaptation of the basic meaning to fit the requirements of the context should be cognitively costly, and there is indeed evidence for the additional semantic processing involved. Piñango et al. (2006) report on various studies and their own experiments that made use of the dual task interference paradigm: Subjects listen to sentences and, at particular points, deal with an unrelated written lexical decision task.
task just after an expression that triggered coercion (e.g. until in the second example, as compared to the first). This can be taken as evidence for the cognitive effort involved in coercion; notice that there is no syntactic difference between the sentences to which such a reaction time difference could be attributed.
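The statistical logic of such comparisons can be made concrete with a small sketch. The latencies and condition labels below are invented for illustration; a paired test asks whether the same subjects are reliably slower at a coercion point than at a matched control point:

```python
# Minimal sketch: comparing lexical decision latencies (in ms) at a
# coercion point vs. a matched control point. The data are invented;
# position i in each list holds the latencies of the same subject.
from statistics import mean
from scipy import stats

control = [612, 598, 640, 587, 605, 629, 594, 611]    # e.g. after "slept until"
coercion = [655, 671, 648, 702, 663, 690, 645, 668]   # e.g. after "jumped until"

t, p = stats.ttest_rel(coercion, control)  # paired t-test over subjects
print(f"mean control:  {mean(control):.0f} ms")
print(f"mean coercion: {mean(coercion):.0f} ms")
print(f"paired t = {t:.2f}, p = {p:.4f}")
```

A paired design is standard here because reaction times vary far more between subjects than the condition effect itself.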
4.2. Reading process: Self-paced reading and eye tracking Another window into semantic processing is the observation of the reading process. There are two techniques that have been used: (i) Self-paced reading, where subjects are presented with a text in a word-by-word or phrase-by-phrase fashion; the subject has control over the speed of presentation, which is recorded. (ii) Eye tracking, where the reading movements of the subject are recorded by cameras and matched with the text being read. While self-paced reading is easier to handle as a research paradigm, it has the disadvantage that it might not yield fine-grained data, as subjects tend to fall into a rhythmical tapping habit. Investigations of reading have provided many insights into semantic processing; however, it should be kept in mind that by their nature they only investigate one particular mode of language use, one that lacks many features of spoken language. For example, reading speed has been used to determine how speakers deal with semantic ambiguity: Do they try to resolve it early on, which would mean that they slow down when reading triggers of ambiguity, or do they entertain an underspecified interpretation? Frazier & Rayner (1990) have shown that reading slows down after ambiguous words, as e.g. in The records were carefully guarded {after they were scratched / after the political takeover}, providing evidence for an early commitment to a particular reading. However, with polysemous words, no such slowing could be detected; an example is Unfortunately the newspaper was destroyed, {lying in the rain / managing advertising so poorly}. The newspaper example is a case of coercion, which does show effects of semantic processing under the dual task paradigm (see the discussion of Piñango et al. 2006 above). Indeed, Pickering, McElree & Frisson (2006) have shown that the aspectual coercion cases do not result in increased reading times; thus different kinds of tests seem to differ in their sensitivity. Another area for which reading behavior has been investigated is the time course of pronoun resolution: Are pronouns resolved as early as possible, at the place where they occur, or does the semantic processor postpone this decision? According to Ehrlich & Rayner (1983), the latter is the case. They manipulated the distance between an antecedent and its pronoun and showed that distance had an effect on reading times, but only well after the pronoun itself was encountered.
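The logic of word-by-word self-paced presentation can be illustrated with a minimal sketch. Real experiments use dedicated presentation software and a masked, non-cumulative moving-window display, so this only shows where the recorded times come from; the stimulus sentence is taken from the Frazier & Rayner example above:

```python
# Minimal sketch of a self-paced reading trial: each press of Enter
# reveals the next word, and the time spent on the previous word is
# logged. (Real moving-window designs mask all other words.)
import time

sentence = "The records were carefully guarded after they were scratched".split()
times = []
for word in sentence:
    t0 = time.perf_counter()
    input(word + " ")                  # subject presses Enter to continue
    times.append(time.perf_counter() - t0)

for word, rt in zip(sentence, times):
    print(f"{word:12s} {rt * 1000:6.0f} ms")
```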
4.3. Preferential looking and the visual world paradigm Visual gaze and eye movements can be used in other ways as windows on meaning and semantic processing. One technique to investigate language understanding is the preferential looking paradigm, a version of the picture selection task that can be administered to young infants. Preferential looking has been used for the investigation of stimulus discrimination, as infants look longer at new stimuli than at stimuli to which they are already accustomed. For
the investigation of semantic abilities, so-called "Intermodal Preferential Looking" is used: Infants hear an expression and are presented at the same time with two pictures or movie scenes side by side; they preferentially look at the one that fits the description best. Hirsh-Pasek & Golinkoff (1996) have used this technique to investigate the understanding of sentences by young children who produce only single-word utterances. A second procedure that uses eye gaze is known as the "Visual World Paradigm". The general setup is as follows: Subjects are presented with a scene and a sentence or text, and have to judge whether the sentence is true with respect to the scene. In order to perform this verification, subjects have to glance at particular aspects of the scene, which yields clues about the way the sentence is verified or falsified, that is, how it is semantically processed. In an early study, Eberhard et al. (1995) have shown that eye gaze tracks semantic interpretation quite closely. Listeners use information on a word-by-word basis to reduce the set of possible visual referents to the intended one. For example, when instructed to Touch the starred yellow square, subjects were quick to look at the target in the left-hand situation of Fig. 12.2, slower in the middle situation, and slowest in the right-hand situation. Sedivy et al. (1999) have shown that there are similar effects of incremental interpretation even with non-intersective adjectives, like tall.
Fig. 12.2: Stimulus of eye gaze test (from Eberhard et al. 1995)
Altmann & Kamide (1999) have shown that eye gaze is not just cotemporaneous with interpretation, but may jump ahead; subjects listening to The boy will eat the… looked preferentially at the picture of a cake rather than at the picture of something non-edible. The effect of contrastive accent has been studied in a number of works, including Weber, Braun & Crocker (2006). When listeners had already fixated one object – say, the purple scissors – and are then asked to touch the RED scissors (where there is a competing red vase), they gaze at the red scissors more quickly, presumably because the scissors property is given. This effect is also present, though weaker, without contrastive accent, presumably because the use of modifying adjectives is inherently contrastive. For another example of this technique, see article 15 (Bott, Featherston, Radó & Stolterfoht) Experimental methods.
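The standard dependent measure in such studies is the proportion of looks to each object in small time bins, aligned to the onset of the critical word. A minimal sketch of this computation, with invented gaze samples and labels:

```python
# Minimal sketch: proportion of looks to the target object in 50 ms
# bins, time-locked to the onset of the critical word. Each trial is
# a list of gaze samples (one per 50 ms); the data are invented.
from collections import Counter

trials = [
    ["other", "other", "target", "target", "target", "target"],
    ["competitor", "other", "competitor", "target", "target", "target"],
    ["other", "target", "target", "target", "target", "target"],
]

n_bins = len(trials[0])
for i in range(n_bins):
    looks = Counter(trial[i] for trial in trials)
    prop = looks["target"] / len(trials)
    print(f"{i * 50:4d}-{(i + 1) * 50:4d} ms: p(target) = {prop:.2f}")
```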
5. Physiological effects of semantic processing There is no clear-cut way to distinguish physiological effects from behavioral effects. But the physiological phenomena discussed in this section are evidently beyond conscious control, and thus may provide more immediate access to semantic processing.
Physiological evidence can be gained in a number of ways: From lesions of the brain and how they affect linguistic performance, from stimulations of brain areas during surgery, from the observable metabolic processes related to brain activities, and from the electromagnetic brain potentials that accompany the firing of bundles of neurons. There are other techniques that have been used occasionally, such as pupillary dilation, which correlates with cognitive load. For example, Krüger, Nuthmann & van der Meer (2001) show with this measure that representations of event sequences following their natural order are cognitively less demanding than representations that do not follow the time line.
5.1. Brain lesions and stimulations Since the early discoveries of Broca and Wernicke, it has been assumed that specific brain lesions affect the relation between expressions and meanings. The classical picture – Broca's area being responsible for production, Wernicke's area for comprehension – is now known to be incomplete (cf. Damasio et al. 2004), but it is still assumed that Broca's aphasia impedes the ability to use complex syntactic forms to encode and also to decode meanings. From lesion studies it became clear that areas outside the classical Broca/Wernicke area and the connecting Geschwind area are relevant for language production and understanding. Brain regions have been identified where lesions lead to semantic dementia (also known as anomic aphasia) that selectively affects the recognition of names of persons, nouns for manipulable objects such as tools, or nouns for natural objects such as animals. These regions are typically situated in the left temporal lobe, but the studies reported by Damasio et al. also indicate that regions of the right hemisphere play an important role. It remains unclear, however, whether these lesions affect particular linguistic abilities or rather reflect more general problems with the pre-linguistic categorization of objects. A serious problem with the use of brain lesions as a source of evidence is that they are often not sufficiently locally constrained to allow for specific inferences. Stimulation techniques allow for more directed manipulations, and hence for more specific testing of hypotheses. There are deep stimulation techniques that can be applied during brain surgery. There is also a newer technique, Transcranial Magnetic Stimulation (TMS), which affects the functioning of particular brain regions by electromagnetic fields applied from outside the skull.
5.2. Brain imaging of metabolic effects The last decades have seen a lively development of methods that help to locate brain activity by identifying correlated metabolic effects. Neuronal activity in certain brain regions stimulates the flow of oxygen-rich blood, which in turn can be localized by various means. While early methods like PET (Positron Emission Tomography) required the use of radioactive markers, the method of fMRI (functional Magnetic Resonance Imaging) is less invasive; it is based on measuring the magnetic resonance signals of water molecules excited by strong magnetic fields. A more recent method, NIRS (Near Infrared Spectroscopy), applies near-infrared light from outside the skull; it is currently the least invasive technique. All the procedures mentioned have a low temporal resolution, as metabolic changes are slow, within the range of a second or so. However, their spatial resolution is quite good, especially for fMRI using strong magnetic fields.
Results of metabolic brain-imaging techniques often support and refine findings derived from brain lesions (cf. Damasio et al. 2004). As an example of a recent study, Tyler, Randall & Stamatakis (2008) challenge the view that nouns and verbs are represented in different brain regions; they argue instead that inflected nouns and verbs and minimal noun and verb phrases – that is, specific syntactic uses of nouns and verbs – are spatially differentiated. An ongoing discussion concerns how general findings about the localization of brain activity are, given the enormous plasticity of the brain.
5.3. Event-related potentials This family of procedures investigates the electromagnetic fields generated by cortical activity. They are observed by sensors placed on the scalp that track minute variations of either the electric field (EEG) or the magnetic field (MEG). The limitations of this technique are that only fields generated by the neocortex directly under the cranium can be detected. As the neocortex is deeply folded, this applies only to a small part of it. Furthermore, the number of electrodes that can be applied to the scalp is limited (typically 16 to 64, sometimes up to 256), hence the spatial resolution is weak even for the accessible parts of the cortex. Spatial resolution is better for MEG, but the required techniques are considerably more complex and expensive. On the positive side, the temporal resolution of the technique is very high, as it does not measure slow metabolic effects of brain activity, but the electric fields generated by the neurons themselves (more specifically, the postsynaptic potentials that arise when neurotransmitters are released at the synapses). EEG electrodes can record these fields if they are generated by a large number of neurons in the pyramidal bundles in which the cortex is organized, on the order of at least 1000 neurons. ERP (event-related potentials), the correlation of EEG signals with stimulus events, has been used for thirty years in psycholinguistic research, and specifically for semantic processing since the discovery by Kutas & Hillyard (1980) of a specific brain potential, the N400. This is a frequently observed change in the potential towards higher negativity roughly 400ms after the onset of a relevant stimulus. See Kutas, van Petten & Kluender (2006) for a review of the vast literature, and Lau, Phillips & Poeppel (2008) for a partially critical view of standard interpretations. The N400 effect is seen when subjects are presented in an incremental way with sentences like I like my coffee with cream and {sugar / socks}, and the EEG signals of the first and the second variant are compared. In the second variant, with a semantically incongruous word, a negativity around 400ms after the onset of the anomalous word (here: socks) appears when the brain potential development is averaged over a number of trials.
Fig. 12.3: Averaged EEG over sentences with no semantic violation (solid line) and with semantic violation (dotted line); vertical axis at the onset of the anomalous word (from Lau, Phillips & Poeppel 2008)
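The averaging step behind Fig. 12.3 can be sketched in a few lines. The epochs below are simulated, with an N400-like negativity added to the noisy trials of the incongruous condition, so the numbers are purely illustrative; the point is that the component only becomes visible once single trials are averaged:

```python
# Minimal sketch of ERP averaging: single-trial EEG epochs (trials x
# time samples, time-locked to word onset) are averaged per condition,
# and the N400 is assessed as the mean amplitude difference in a
# 300-500 ms window. All data are simulated for illustration.
import numpy as np

rng = np.random.default_rng(1)
sr = 250                                    # sampling rate in Hz
t = np.arange(-0.1, 0.8, 1 / sr)            # -100 ms .. 800 ms
n400 = -4e-6 * np.exp(-((t - 0.4) ** 2) / (2 * 0.05 ** 2))  # dip at 400 ms

congruous = rng.normal(0, 5e-6, (40, t.size))               # 40 noisy trials
incongruous = rng.normal(0, 5e-6, (40, t.size)) + n400      # noise + N400

erp_con = congruous.mean(axis=0)            # trial averages per condition
erp_inc = incongruous.mean(axis=0)
win = (t >= 0.3) & (t <= 0.5)
diff = (erp_inc - erp_con)[win].mean()
print(f"mean 300-500 ms amplitude difference: {diff * 1e6:.2f} microvolts")
```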
There are at least two interpretations of the N400 effect: Most researchers see it as a reflex of the attempt to integrate the meaning of a subexpression into the meaning of the larger expression, as constructed so far. With incongruous words, this task is hard or even fails, which is reflected in a stronger N400. The alternative view is that the N400 reflects the effort of lexical access. This is facilitated when the word is predictable from the context, but also when the word is frequent in general. There is evidence that highly frequent words lead to a smaller N400 effect. Also, the N400 can be triggered by simple word priming tasks; e.g. in coffee – {tea / chair}, the non-primed word chair leads to an N400. See Lau, Phillips & Poeppel (2008) for consequences of the integration view and the lexical access view of the N400. The spatial location of the N400 is also a matter of dispute. While Kutas, van Petten & Kluender (2006) claim that its origins lie in the left temporal lobe and hence can be related to established language areas, the main electromagnetic field is observed rather in the centroparietal region, and often over the right hemisphere. Lau, Phillips & Poeppel (2008) discuss various possible interpretations of these findings. There are a number of other reproducible electrophysiological effects that point to additional aspects of language processing. In particular, Early Left Anterior Negativity (ELAN) has been implicated in phrase structure violations (150ms), Left Anterior Negativity (LAN) appears with morphosyntactic agreement violations (300–500ms), and the P600, a positivity after 600ms, has been seen as evidence for difficulties of syntactic integration, perhaps as evidence for attempts at syntactic restructuring. It is a matter of debate how specific the N400 is to semantics; while it is triggered by phenomena that are clearly related to the meaning aspects of language, it can also be found when subjects perform certain non-linguistic tasks, as in melody recognition. Interestingly, the N400 can be masked by syntactic inappropriateness, as Hahne & Friederici (2002) have shown. This can be explained by the plausible assumption that structures first have to make syntactic sense before semantic integration can even start to take place. There are a number of interesting specific findings around the N400 and related brain potentials (cf. Kutas, van Petten & Kluender 2006 for an overview). Closed-class words generally trigger smaller N400 effects than open-class words, and the shape of their negativity is different as well – it is more drawn out, up to about 700ms. As already mentioned, low-frequency words trigger greater N400 effects, which may be seen as a point in favor of the lexical access theory; however, we can also assume that low frequency is a general factor that impedes semantic integration. It has been observed that the N400 is greater for inappropriate concrete nouns than for inappropriate abstract nouns. With auditory presentation of linguistic structures, it was surprising to learn that N400 effects can appear even before the end of the triggering word; this is evidence that word recognition and semantic integration set in very early, after the first phonemes of a word. The larger context of an expression can modulate the N400 effect, that is, the preceding text of a sentence can determine whether a particular word fits and is easy to integrate, or does not fit and leads to integration problems.
For example, in a context in which piercing was mentioned, earring triggers a smaller N400 than necklace. This has been seen as evidence that semantic integration does not differentiate between lexical access, the local syntactic fit and the more global semantic plausibility; rather, all factors play a role at roughly the same time. N400 has been used as evidence for semantic features. For example, in the triple The pizza was too hot to {eat / drink / kill}, the item drink elicits a smaller N400 than kill, which
can be interpreted as showing that the expected item eat and the test item drink have semantic features in common (ingestion), in contrast to eat and kill. Brain potentials have also been used to investigate the semantic processing of negative polarity items (cf. article 64 (Giannakidou) Polarity items). Saddy, Drenhaus & Frisch (2004) and Drenhaus et al. (2006) observe that negative polarity items in inappropriate contexts trigger an N400 effect (as in {A / no} man was ever happy). With NPIs and with positive polarity items, a P600 could be observed as well, which is indicative of an attempt to achieve a syntactic structure in which there is a suitable licensing operator in the right syntactic configuration. Incidentally, these findings favor the semantic integration view of the N400 over the lexical access view. There are text types that require special efforts for semantic integration – riddles and jokes. With jokes based on the reinterpretation of words, it has been found that better comprehenders of jokes show a slightly higher N400 effect on critical words, and a larger P600 effect for overall integration. Additional effort for semantic integration has also been shown for metaphorical interpretations. A negativity around 320ms has been identified by Fischler et al. (1985) for statements known to the subjects to be false, even if they were not asked to judge the truth value. But semantic anomaly clearly overrides falsity; as Kounios & Holcomb (1992) have shown, in examples like No dogs are {animals / fruits}, the latter triggers an N400 effect. More recent experiments using MEG have discovered a brain potential called AMF (Anterior Midline Field), situated in the frontal lobe, an area that is not normally implicated in language understanding. The effect shows up with coercion phenomena (cf. article 25 (de Swart) Mismatches and coercion). Coercion does not lead to an N400 effect; there is no anomaly with John began the book (which has to be coerced to mean something like begin to read or write the book). But Pylkkänen & McElree (2007) found an AMF effect about 350ms after onset. This effect is absent with semantically incongruous words, as well as with words that do not require coercion. Interestingly, the same brain area has been implicated in the understanding of sarcastic and ironic language in lesion studies (Shamay-Tsoory et al. 2005).
6. Corpus-linguistic methods Linguistic corpora, the records of past linguistic production, are a valuable source of evidence for linguistic phenomena in general, and in the case of extinct languages, the only kind of source (cf. article 109 (Katz) Semantics in corpus linguistics). This includes the study of semantic phenomena. For the case of extinct languages we would like to mention, in particular, the task of deciphering, which consists in finding a mapping between expressions and meanings. Linguistic corpora can provide evidence of meaning in many different ways. An important philosophical research tradition is hermeneutics, originally the art of understanding sacred texts. Perhaps the most important concept in the modern hermeneutic tradition is the explication of the so-called hermeneutic circle (cf. Gadamer 1960): The interpreter necessarily approaches the text with a certain kind of knowledge that is necessary for an initial understanding, but the understanding of the text in a first reading will influence and deepen the understanding in subsequent readings. With large corpora that are available electronically, new statistical techniques have been developed that can tap into aspects of meaning that might otherwise be difficult
to recognize. In linguistic corpora, the analysis of word co-occurrences and in particular collocations can yield evidence for meaning relations between words. For example, large corpora have been investigated for verb-NP collocations using the so-called Expectation Maximization (EM) algorithm (Rooth et al. 1999). This algorithm leads to a classification of verbs and nouns into clusters such that verbs of class X frequently occur with nouns of class Y. The initial part of one such cluster, developed from the British National Corpus, is shown in Fig. 12.4. The verbs can be characterized as verbs that involve scalar changes, and the nouns as denoting entities that can move along such scales.
Fig. 12.4: Clustering analysis of nouns and verbs; dots represent pairs that occur in the corpus. “as:s” stands for subjects of intransitive verbs, “aso:s” and “aso:o” for subjects and objects of transitive verbs, respectively (from Rooth et al. 1999)
We can also look at the frequency of particular collocations within this cluster, as illustrated in the following table for the verb increase.

Tab. 12.1: Frequency of nouns occurring with INCREASE (from Rooth et al. 1999)

number        134.147      proportion    23.8699
demand         30.7322     size          22.8108
pressure       30.5844     rate          20.9593
temperature    25.9691     level         20.7651
cost           23.9431     price         17.9996
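The core of such EM-based clustering can be sketched as follows. This is not Rooth et al.'s actual implementation, and the verbs, nouns and counts are invented toy data; the sketch fits a latent-class model p(v,n) = Σ_c p(c)·p(v|c)·p(n|c) to a verb-noun co-occurrence matrix:

```python
# Minimal sketch of EM clustering over verb-noun co-occurrence counts,
# in the spirit of Rooth et al. (1999). Toy data; class assignments,
# not the authors' code or figures, are what this illustrates.
import numpy as np

def em_cluster(F, n_classes, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    V, N = F.shape
    pc = np.full(n_classes, 1.0 / n_classes)          # p(c)
    pv = rng.dirichlet(np.ones(V), size=n_classes)    # p(v|c)
    pn = rng.dirichlet(np.ones(N), size=n_classes)    # p(n|c)
    for _ in range(iters):
        # E-step: posterior p(c|v,n) for every verb-noun pair
        joint = pc[:, None, None] * pv[:, :, None] * pn[:, None, :]
        post = joint / joint.sum(axis=0, keepdims=True)
        # M-step: re-estimate parameters from expected counts
        ec = post * F[None, :, :]
        pc = ec.sum(axis=(1, 2)); pc /= pc.sum()
        pv = ec.sum(axis=2); pv /= pv.sum(axis=1, keepdims=True)
        pn = ec.sum(axis=1); pn /= pn.sum(axis=1, keepdims=True)
    return pc, pv, pn

verbs = ["increase", "decrease", "eat", "drink"]
nouns = ["number", "price", "cake", "coffee"]
F = np.array([[9, 7, 0, 0],     # invented counts f(v, n)
              [8, 6, 0, 0],
              [0, 0, 9, 1],
              [0, 0, 1, 8]], dtype=float)
pc, pv, pn = em_cluster(F, n_classes=2)
for c in range(2):
    print(f"class {c}: p(v|c) = {pv[c].round(2)}, p(n|c) = {pn[c].round(2)}")
```

On these toy counts, the two latent classes separate the scalar-change verbs with their scalar nouns from the ingestion verbs with their food nouns, mirroring the kind of cluster shown in Fig. 12.4.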
While purely statistical approaches such as Rooth et al. (1999) are of considerable interest, most applications of large-scale corpus-based research are based on a mix of hand-coding and automated procedures. The best-known project that has turned into an important application is WordNet (Fellbaum (ed.) 1998). A good example of the mixed procedure is Gildea & Jurafsky (2002), a project that attempted semi-automatic assignment of thematic roles. In a first step, thematic roles were hand-coded for a large number of verbs, where a large corpus provided a wide variety of examples. These initial examples, together with their coding, were used to train a statistical system, which then was able to assign thematic roles to new instances of known predicates, and even to new, unseen predicates, with reasonable accuracy.
Yet another application of corpus-linguistic methods involves parallel corpora, collections of texts and their translations into one or more other languages. It is presupposed that the meanings of the texts are reasonably similar (but recall the problems with translations mentioned above). Refined statistical methods can be used to train automatic translation devices on a certain corpus, which can then be extended to new texts to be translated automatically, a method known as example-based machine translation. For linguistic research, parallel corpora have been used in other ways as well. If a language α marks a certain distinction overtly and regularly, whereas language β marks that distinction only rarely and in irregular ways, good translation pairs of texts from α into β can be used to investigate the ways in which, and how frequently, the distinction is marked in β. This method is used, for example, by von Heusinger (2002) for specificity, using Umberto Eco's Il nome della rosa, and by Behrens (2005) for genericity, using Saint-Exupéry's Le petit prince. The articles in Cysouw & Wälchli (2007) discuss the potential of the technique, and its problems, for typological research.
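The basic search step of this method is easy to sketch. Assuming a sentence-aligned bitext and an invented overt marker "MARK" in language α, one collects the β-language renderings of all α-sentences containing the marker, so that the translations can then be inspected for whether and how the distinction surfaces:

```python
# Minimal sketch: collect target-language renderings of source
# sentences containing an overt marker. The bitext and the marker
# "MARK" are invented for illustration.
alpha = ["the man MARK arrived", "a man arrived", "I saw the woman MARK"]
beta = ["der bestimmte Mann kam an", "ein Mann kam an", "ich sah die Frau"]

pairs = [(a, b) for a, b in zip(alpha, beta) if "MARK" in a.split()]
for a, b in pairs:
    print(f"{a!r}  ->  {b!r}")
```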
7. Conclusion This article, hopefully, has shown that the elusive concept of meaning has many observable reflexes, and that semantics actually stands on as firm ground as the other disciplines of linguistics. The kinds of evidence for semantic phenomena are very diverse, and not always as convergent as semanticists might wish them to be. But they provide a very rich and interconnected area of study that has seen considerable development since the first edition of the Handbook Semantics in 1991. In particular, a wide variety of experimental evidence has been adduced to investigate the processing of meaning. It is to be hoped that the next edition will show an even richer and more convergent picture.
8. References Altmann, Gerry & Yuki Kamide 1999. Incremental interpretation at verbs: Restricting the domain of subsequent reference. Cognition 73, 247–264. Behrens, Leila 2005. Genericity from a cross-linguistic perspective. Linguistics 43, 257–344. Berlin, Brent, Dennis E. Breedlove & Peter H. Raven 1973. General principles of classification and nomenclature in folk biology. American Anthropologist 75, 214–242. Berlin, Brent & Paul Kay 1969. Basic Color Terms. Their Universality and Evolution. Berkeley, CA: University of California Press. Bloomfield, Leonard 1933. Language. New York: Henry Holt and Co. Chafe, Wallace (ed.) 1980. The Pear Stories: Cognitive, Cultural, and Linguistic Aspects of Narrative Production. Norwood, NJ: Ablex. Clark, Herbert H. & Peter Lucy 1975. Understanding what is meant from what is said: A study in conversationally conveyed requests. Journal of Verbal Learning and Verbal Behavior 14, 56–72. Crain, Stephen 1991. Language acquisition in the absence of experience. Behavioral and Brain Sciences 14, 597–650. Crain, Stephen, Rosalind Thornton, Laura Conway & Diane Lillo-Martin 1996. Quantification without qualification. Language Acquisition 5, 83–153. Cysouw, Michael & Bernhard Wälchli 2007. Parallel texts: Using translational equivalents in linguistic typology. Special issue of Sprachtypologie und Universalienforschung 60, 95–99.
Damasio, Hanna, Daniel Tranel, Thomas Grabowski, Ralph Adolphs & Antonio Damasio 2004. Neural systems behind word and concept retrieval. Cognition 92, 179–229. Drenhaus, Heiner, Peter beim Graben, Doug Saddy & Stefan Frisch 2006. Diagnosis and repair of negative polarity constructions in the light of symbolic resonance analysis. Brain & Language 96, 255–268. Eberhard, Kathleen, Michael J. Spivey-Knowlton, Judy Sedivy & Michael Tanenhaus 1995. Eye movement as a window into real-time spoken language comprehension in a natural context. Journal of Psycholinguistic Research 24, 409–436. Ehrlich, Kate & Keith Rayner 1983. Pronoun assignment and semantic integration during reading: Eye movement and immediacy of processing. Journal of Verbal Learning and Verbal Behavior 22, 75–87. Fellbaum, Christiane (ed.) 1998. WordNet: An Electronic Lexical Database. Cambridge, MA: The MIT Press. Fischler, Ira, Donald G. Childers, Teera Achariyapaopan & Nathan W. Perry 1985. Brain potentials during sentence verification: Automatic aspects of comprehension. Biological Psychology 21, 83–105. Frazier, Lyn & Keith Rayner 1990. Taking on semantic commitments: Processing multiple meanings vs. multiple senses. Journal of Memory and Language 29, 181–200. Frege, Gottlob 1892. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25–50. Gadamer, Hans Georg 1960. Wahrheit und Methode. Grundzüge einer philosophischen Hermeneutik. Tübingen: Mohr. Gärdenfors, Peter 2000. Conceptual Spaces – The Geometry of Thought. Cambridge, MA: The MIT Press. Gerken, LouAnn & Michele E. Shady 1998. The picture selection task. In: D. McDaniel, C. McKee & H. S. Cairns (eds.). Methods for Assessing Children's Syntax. Cambridge, MA: The MIT Press, 125–146. Gildea, Daniel & Daniel Jurafsky 2002. Automatic labeling of semantic roles. Computational Linguistics 28, 245–288. Goddard, Cliff 1998. Semantic Analysis. A Practical Introduction. Oxford: Oxford University Press. Goodman, Nelson 1955. Fact, Fiction, and Forecast. Cambridge, MA: Harvard University Press. Gordon, Peter 1998. The truth-value judgment task. In: D. McDaniel, C. McKee & H. S. Cairns (eds.). Methods for Assessing Children's Syntax. Cambridge, MA: The MIT Press, 211–235. Gregory, Robert J. 2003. An early history of land on Tanna, Vanuatu. The Anthropologist 5, 67–74. Grice, Paul 1957. Meaning. The Philosophical Review 66, 377–388. Hahne, Anja & Angela D. Friederici 2002. Differential task effects on semantic and syntactic processes are revealed by ERPs. Cognitive Brain Research 13, 339–356. Haviland, John B. 1974. A last look at Cook's Guugu-Yimidhirr wordlist. Oceania 44, 216–232. von Heusinger, Klaus 2002. Specificity and definiteness in sentence and discourse structure. Journal of Semantics 19, 245–274. Hill, Clifford 1982. Up/down, front/back, left/right: A contrastive study of Hausa and English. In: J. Weissenborn & W. Klein (eds.). Here and There: Cross-Linguistic Studies on Deixis and Demonstration. Amsterdam: Benjamins, 11–42. Hirsh-Pasek, Kathryn & Roberta M. Golinkoff 1996. The Origin of Grammar: Evidence from Early Language Comprehension. Cambridge, MA: The MIT Press. Kaplan, David 1978. On the logic of demonstratives. Journal of Philosophical Logic VIII, 81–98. Reprinted in: P. French, T. Uehling & H.K. Wettstein (eds.). Contemporary Perspectives in the Philosophy of Language. Minneapolis, MN: University of Minnesota Press, 1979, 401–412. Kay, Paul, Brent Berlin, Luisa Maffi & William R. Merrifield 2008.
The World Color Survey. Stanford, CA: CSLI Publications. Kempton, William 1981. The Folk Classification of Ceramics. New York: Academic Press.
Kim, Christina 2008. Processing presupposition: Verifying sentences with 'only'. In: J. Tauberer, A. Eilam & L. MacKenzie (eds.). Proceedings of the 31st Penn Linguistics Colloquium. Philadelphia, PA: University of Pennsylvania. Kounios, John & Phillip J. Holcomb 1992. Structure and process in semantic memory: Evidence from event-related brain potentials and reaction times. Journal of Experimental Psychology. General 121, 459–479. Kranstedt, Alfred, Andy Lücking, Thies Pfeiffer & Hannes Rieser 2006. Deixis: How to determine demonstrated objects using a pointing cone. In: Sylvie Gibet, Nicolas Courty & Jean-François Kamp (eds.). Gesture in Human-Computer Interaction and Simulation. Berlin: Springer, 300–311. Krüger, Frank, Antje Nuthmann & Elke van der Meer 2001. Pupillometric indices of the temporal order representation in semantic memory. Zeitschrift für Psychologie 209, 402–415. Kutas, Marta & Steven A. Hillyard 1980. Reading senseless sentences: Brain potentials reflect semantic incongruity. Science 207, 203–205. Kutas, Marta, Cyma K. van Petten & Robert Kluender 2006. Psycholinguistics electrified II 1994–2005. In: M.A. Gernsbacher & M. Traxler (eds.). Handbook of Psycholinguistics. New York: Elsevier, 83–143. Lakoff, George & Mark Johnson 1980. Metaphors we Live by. Chicago, IL: The University of Chicago Press. Lau, Ellen F., Colin Phillips & David Poeppel 2008. A cortical network for semantics: (de)constructing the N400. Nature Reviews Neuroscience 9, 920–933. McDaniel, Dana, Cecile McKee & Helen Smith Cairns (eds.) 1996. Methods for Assessing Children's Syntax. Cambridge, MA: The MIT Press. Majid, Asifa, Melissa Bowerman, Miriam van Staden & James Boster 2006. The semantic categories of cutting and breaking events: A crosslinguistic perspective. Cognitive Linguistics 18, 133–152. Matthewson, Lisa 2004. On the methodology of semantic fieldwork. International Journal of American Linguistics 70, 369–415. Noveck, Ira 2001. When children are more logical than adults: Experimental investigations of scalar implicatures. Cognition 78, 165–188. Noveck, Ira 2004. Pragmatic inferences linked to logical terms. In: I. Noveck & D. Sperber (eds.). Experimental Pragmatics. Basingstoke: Palgrave Macmillan, 301–321. Pickering, Martin J., Brian McElree & Steven Frisson 2006. Underspecification and aspectual coercion. Discourse Processes 42, 131–155. Piñango, Maria M., Aaron Winnick, Rashad Ullah & Edgar Zurif 2006. Time course of semantic composition: The case of aspectual coercion. Journal of Psycholinguistic Research 35, 233–244. Portner, Paul & Katsuhiko Yabushita 1998. The semantics and pragmatics of topic phrases. Linguistics & Philosophy 21, 117–157. Potts, Christopher 2005. The Logic of Conventional Implicatures. Oxford: Oxford University Press. Pylkkänen, Liina & Brian McElree 2007. An MEG study of silent meaning. Journal of Cognitive Neuroscience 19, 1905–1921. Quine, Willard van Orman 1960. Word and Object. Cambridge, MA: The MIT Press, 26–79. Rayner, Keith 1998. Eye movements in reading and information processing: 20 years of research. Psychological Bulletin 124, 372–422. Rooth, Mats, Stefan Riezler, Detlef Prescher & Glenn Carroll 1999. Inducing a semantically annotated lexicon via EM-based clustering. In: R. Dale & K. Church (eds.). Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics. College Park, MD: ACL, 104–111. Saddy, Douglas, Heiner Drenhaus & Stefan Frisch 2004.
Processing polarity items: Contrastive licensing costs. Brain & Language 90, 495–502. van der Sandt, Rob A. 1988. Context and Presupposition. London: Croom Helm. Sedivy, Julie C., Michael K. Tanenhaus, Craig C. Chambers & Gregory N. Carlson 1999. Achieving incremental semantic interpretation through contextual representation. Cognition 71, 109–147. Shamay-Tsoory, Simone, Rachel Tomer & Judith Aharon-Peretz 2005. The neuroanatomical basis of understanding sarcasm and its relationship to social cognition. Neuropsychology 19, 288–300.
Smith, Carlotta L. 1980. Quantifiers and question answering in young children. Journal of Experimental Child Psychology 30, 191–205. Stalnaker, Robert 1974. Pragmatic presuppositions. In: M. K. Munitz & P. K. Unger (eds.). Semantics and Philosophy. New York: New York University Press, 197–214. Thornton, Rosalind 1998. Elicited production. In: D. McDaniel, C. McKee & H. S. Cairns (eds.). Methods for Assessing Children's Syntax. Cambridge, MA: The MIT Press, 77–95. Tomasello, Michael 2008. Why don't apes point? In: R. Eckardt, G. Jäger & T. Veenstra (eds.). Variation, Selection, Development. Probing the Evolutionary Model of Language Change. Berlin: Mouton de Gruyter, 375–394. Tyler, Lorraine K., Billi Randall & Emmanuel A. Stamatakis 2008. Cortical differentiation for nouns and verbs depends on grammatical markers. Journal of Cognitive Neuroscience 20, 1381–1389. Weber, Andrea, Bettina Braun & Matthew W. Crocker 2006. Finding referents in time: Eye-tracking evidence for the role of contrastive accents. Language and Speech 49, 367–392. Weinreich, Uriel 1964. Webster's Third. A critique of its semantics. International Journal of American Linguistics 30, 405–409.
Manfred Krifka, Berlin (Germany)
13. Methods in cross-linguistic semantics
1. Introduction
2. The need for cross-linguistic semantics
3. On semantic fieldwork methodology
4. What counts as a universal?
5. How do we find universals?
6. Semantic universals in the literature
7. Variation
8. References
Abstract This article outlines methodologies for conducting research in cross-linguistic semantics, with an eye to uncovering semantic universals. Topics covered include fieldwork methodology, types of evidence for semantic universals, different types of semantic universals (with examples from the literature), and semantic parameters.
1. Introduction This article outlines methodologies for conducting research in cross-linguistic semantics, with an eye to uncovering semantic universals. Section 2 briefly motivates the need for cross-linguistic semantic research, and section 3 outlines fieldwork methodologies for work on semantics. Section 4 addresses the issue of what counts as a universal, and section 5 discusses how one finds universals. Section 6 provides examples of semantic universals, and section 7 discusses variation. The reader is also referred to articles 12 (Krifka) Varieties of semantic evidence and 95 (Bach & Chao) Semantic types across languages.
2. The need for cross-linguistic semantics It is by now probably uncontroversial that our field's empirical base needs to be as broad as possible. Typologists have always known this; generativists have also mostly advanced beyond the view that we can uncover Universal Grammar by doing in-depth study of a single language. Although semantics was the last sub-field of formal linguistics to undertake widespread cross-linguistic investigation, such work has been on the rise ever since the synthesis of generative syntax and Montague semantics in the 1980s. In the 20 years since Comrie (1989: 4) wrote that for a 'Chomskyist' for whom universals are abstract principles, "there is no way in which the analysis of concrete data from a wide range of languages would provide any relevant information", we have seen many instances where empirical cross-linguistic work has falsified abstract universal principles. Among many others, Maria Bittner's pioneering work on Kalaallisut (Eskimo-Aleut, Greenland) is worth mentioning here (e.g., Bittner 1987, 1994, 2005, 2008), as well as the papers in Bach et al.'s (1995) quantification volume, and the body of research responding to Chierchia's (1998) Nominal Mapping Parameter. It is clear that we will only come close to discovering semantic universals by looking at data from a wide range of languages.
3. On semantic fieldwork methodology The first issue for a cross-linguistic semantic researcher is how to obtain usable data in the field. The material presented here draws on Matthewson (2004); the reader is also referred to article 12 (Krifka) Varieties of semantic evidence. The techniques described here are theory-neutral; I assume that regardless of one’s theoretical assumptions, similar data-collection processes are appropriate.
3.1. In support of direct elicitation It was once widely believed that direct elicitation is an illegitimate methodology; see for example Harris & Voegelin (1953: 59). Even in this century we find statements like the following:

The referential meaning of nouns (in terms of definiteness and specificity) is an intricate topic that is extremely hard to investigate on the basis of elicitation. In the end it is texts or connected discourse in general in the language under investigation which provide the most important clues for analysis of these grammatical domains. (Dimmendaal 2001: 69)
It is true that connected discourse is an indispensable part of any investigation into noun phrase semantics. However, this methodology alone is insufficient. Within semantics, there are two main reasons why textual materials are insufficient as an empirical base. The first is shared with syntactic research, namely that one cannot obtain negative evidence from texts. One cannot claim with any certainty that any structure is impossible, if one only investigates spontaneously produced data. The second reason is specific to semantics, namely that texts provide little direct evidence about meaning. A syntactician working with a textual corpus has at least a number of sentences which may be assumed to be grammatical. But a text is paired with at most a translation – an incomplete representation of meaning. Direct elicitation results in more detailed,
targeted information than a translation. (This is not to deny that examination of texts is useful; texts are good sources of information about topic-tracking devices or reference time maintenance, for example.)
3.2. Eliciting semantic judgments The semantics of utterances or parts of utterances is not consciously accessible to native speakers. Comments, paraphrases and translations offered by consultants are all useful clues for the fieldworker (see article 12 (Krifka) Varieties of semantic evidence). However, just as a syntactician does not ask a native speaker to provide a tree structure (but rather asks for grammaticality judgments of sentences which are designed to test for constituency, etc.), so a semanticist does not ask a native speaker to conduct semantic analysis. This includes generally avoiding questions such as 'what does this word mean?', 'when is it okay to use this word?' and 'does this sentence have two meanings?' Instead, we mainly proceed by requesting judgments on the acceptability of sentences in particular discourse contexts. A judgment is an acceptance or rejection of some linguistic object, and ideally reflects the speaker's native competence. (However, see Carden 1970, Schütze 1996 on the problem of speaker variability in judgments.) I assume there is only one kind of judgment within semantics, that of whether a grammatical sentence (or string of sentences) is acceptable in a given discourse context. Native speakers are not qualified to give 'judgments' about technical concepts such as entailment, tautology, ambiguity or vagueness (although of course consultants' comments can give valuable clues about these issues). All of these concepts can and should be tested for via acceptability judgment tasks. The acceptability judgment task is similar to the truth value judgment task used in language acquisition research (cf. Crain & Thornton 1998 and much other work; see article 103 (Crain) Meaning in first language acquisition). The way to elicit an acceptability judgment is to first provide a discourse context to the consultant, then present an utterance, and ask whether the utterance is acceptable in that context. The consultant's answer potentially gives us information about the truth-value of the sentence in that context, and about the pragmatic felicity of the sentence in that context (e.g., whether its presuppositions are satisfied). The relation of the acceptability judgment task to truth values derives from Grice's (1975) Maxim of Quality: we assume that a speaker will only accept a sentence S in a discourse context C if S is true in C. Conversely, if S is false in C, a speaker will reject S in C. From this it follows that acceptance of a sentence gives positive information about truth value (the sentence is true in this discourse context), but rejection of a sentence gives only partial information: the sentence may be false, but it may also be rejected on other grounds. Article 103 (Crain) Meaning in first language acquisition offers useful discussion of the influence of the Cooperative Principle on responses to acceptability judgment tasks. And see von Fintel (2004) for the claim that speakers will assign the truth-value 'false' to sentences which involve presupposition failure, and which are therefore actually infelicitous (and analyzed as having no truth value). When conducting an acceptability judgment task, it is important to present the consultant with the discourse context before presenting the object language sentence(s). If the consultant hears the sentence in the absence of a context, s/he will spontaneously think of a suitable context or range of contexts for the sentence. If the speaker's imagined
context differs from the one the researcher is interested in, false negatives can arise, particularly if the researcher is interested in a dispreferred or non-obvious reading. The other important question is how to present the discourse context. In section 3.4 I will discuss the use of non-verbal stimuli when presenting contexts. Concentrating for now on verbal strategies, we have two choices: explain the context in the object language, or in a meta-language (a language the researcher is fluent in). Either method can work, and I believe that there is little danger in using a meta-language to describe a context. The context is merely background information; its linguistic features are not relevant. In fact, presenting the context in a meta-language has the advantage that the consultant is unlikely to copy structural features from the context-description to the sentence being judged. (In Matthewson 2004, I argue that even in a translation task, the influence of meta-language syntax and semantics is less than has sometimes been feared. Based on examples drawn from Salish, I argue that consultants will only copy structures up to the point allowed by their native grammar, and that any problems of cross-language influence are easily solvable by means of follow-up elicitation.) See article 12 (Krifka) Varieties of semantic evidence for further discussion of issues in presenting discourse contexts.
3.3. Eliciting translations Translations have the same status as paraphrases or additional comments about meaning offered by the consultant: they are a clue to meaning, and should be taken seriously, but they are not primary data. A translation represents the consultant's best effort to express the same truth conditions in another language – but often, there is no way to express exactly the same truth conditions in two different languages. Even if we assume effability, i.e. that every proposition expressible in one language is expressible in any language (Katz 1976), a translation task will usually fall short of pairing a proposition in one language with a truth-conditionally identical counterpart in the other language. The reasons for this include the fact that what is easily expressible in one language may be expressible only with difficulty in another, or that what is expressed using an unambiguous proposition in one language is best rendered in another language by an utterance which is ambiguous or vague. Furthermore, what serves as a good translation of a sentence in one discourse context may be a poor or inappropriate translation of the same sentence in a different discourse context. As above, the only direct evidence about truth conditions comes from acceptability judgments in particular contexts. The problem of ambiguity is acute when dealing with a translation or paraphrase task. If an object-language sentence is ambiguous, its translation or paraphrase often only produces the preferred reading. And a consultant will often want to explicitly disambiguate; after all, Grice's Manner Maxim exhorts a cooperative speaker to avoid ambiguity. If one suspects that a sentence may be ambiguous, one should not simply give the sentence and ask what it means. Instead, present a discourse context, and elicit an acceptability judgment. Eliciting the dispreferred reading first is a good idea (or, if the status of the readings is not known, using different elicitation orders on different days). It is also a good idea to construct discourse contexts which pragmatically favour the dispreferred reading. Manipulating pragmatic plausibility can also establish the absence of ambiguity: if a situation pragmatically favours a certain possible reading, but the consultant rejects the sentence in that context, one can be pretty sure that the reading is absent.
When dealing with presuppositions, or other aspects of meaning which affect felicity conditions, translations provide notoriously poor information. Felicity conditions are usually ignored in the translation process; the reader is referred to Matthewson (2004) for examples of this. When eliciting translations, the basic rule is only to ask for translations of complete, grammatical sentences. (See Nida 1947: 140 for this point; Harris & Voegelin 1953: 70–71, on the contrary, advise that one begin by asking for translations of single morphemes.) Generally speaking, any sub-sentential string will probably not be translatable with any accuracy by a native speaker. A sub-sentential translation task rests on the faulty assumption that the meaning which is expressed by a certain constituent in one language is expressible by a constituent in another language. Even if the syntactic structures are roughly parallel in the two languages, the meaning of a sub-sentential string may depend on its environment (cf. for example languages where bare noun phrases are interpreted as specific or definite only in certain positions). The other factor to be aware of when eliciting translations is that a sentence which would result in ungrammaticality if it were translated in a structure-preserving way will be restructured. In Matthewson (2004) I provide an example of this involving testing for Condition C violations in Salish. Due to pro-drop and free word order, the St'át'imcets (Lillooet Salish, British Columbia) sentence which translates Mary saw her mother is structurally ambiguous with She saw Mary's mother. The wrong way to test for potential Condition C-violating structures (which has been tried!) is to ask for translations of the potentially ambiguous sentence into English. Even if the object language does allow Condition C violations (i.e., does allow the structure She_i saw Mary_i's mother), the consultant will never translate the sentence into English as She saw Mary's mother. (See Davis 2009 for discussion of strategies for dealing with the particular elicitation problems of Condition C.)
3.4. Elicitation using non-verbal stimuli It was once widely believed that elicitation should proceed by means of non-verbal stimuli, which are used to generate spontaneous speech in the object language, and that a metalanguage should be entirely avoided (see for example Hayes 1954, Yegerlehner 1955, or Aitken 1955). It will already be clear that I reject this view, as do most modern researchers. Elicitation using only visual stimuli cannot provide negative evidence, for example. Visual stimuli are routinely used in language acquisition experiments; see Crain & Thornton (1998), among others. The methodology is equally applicable to fieldwork with adults. The stimuli can be created using computer technologies, but they do not need to be high-tech; they can involve puppets, small toys, or line-drawings. Usually, these methodologies are not intended to replace verbal elicitation, but serve as a support and enhancement for it. For example, a picture or a video will be shown, and may be used to elicit spontaneous discourse, but is also usually followed up with judgment questions. The advantage of visual aids is obvious: the methodology avoids potential interference from linguistic features of the context description, and it can potentially more closely approximate a real-life discourse context. This is particularly relevant for pragmatically sensitive questions, where it is almost impossible to hold cues in the metalanguage (e.g., intonation) constant. This methodology also allows standardization across different elicitation sessions, different fieldworkers, and different languages. When researchers share
13. Methods in cross-linguistic semantics their video clips, for example, we can be certain that the same contexts are being tested and can compare cross-linguistic results with more confidence. Another advantage of visual stimuli concerns felicity conditions. It is quite challenging to elicit information about presupposition failure using only verbal descriptions of discourse contexts, as the necessary elements of the context include information about the interlocutors’ belief or knowledge states. However, video, animation or play-acting give a potential way to represent a character’s knowledge state, and if carefully constructed, the stimulus can rule out potential presupposition accommodation. The consultant can then judge the acceptability of an utterance by one of the characters in the video/play which potentially contains presupposition failure. The main disadvantage of these methodologies is logistics: compared to verbally describing a context, visually representing it takes more time and effort. Many hours can go into planning (let alone filming or animating) a video for just one discourse context. In many cases, verbal strategies are perfectly adequate and more efficient. For example, when trying to establish whether a particular morpheme encodes past tense, it is relatively simple to describe a battery of discourse contexts for a single sentence and obtain judgments for each context. One can quickly and unambiguously explain that an event took place yesterday, is taking place at the utterance time, or will take place tomorrow, for example. In sum, verbal elicitation still remains an indispensable and versatile methodology. The next section turns to semantic universals. We first discuss what semantic universals look like, and then how one gets from cross-linguistic data to the postulation of universals.
4. What counts as a universal? Universals come in two main flavours: absolute/unconditional ('all languages have x'), and implicational ('if a language has x, it has y'). Examples of each type are given in (1) and (2); all of these except (1b) can be found in the Konstanz Universals Archive.

(1) Absolute/unconditional universals:
a. Every language has an existential element such as a verb or particle (Ferguson 1972: 78–79).
b. Every language has expressions which are interpreted as denoting individuals (Article 95 (Bach & Chao) Semantic types across languages).
c. Every language has N's and NPs of the type e → t (Partee 2000).
d. The simple NPs of any natural language express monotone quantifiers or conjunctions of monotone quantifiers (Barwise & Cooper 1981: 187).

(2) Implicational universals:
a. If a language has adjectives for shape, it has adjectives for colour and size (Dixon 1977).
b. A language distinguishes between count and mass nouns iff it possesses configurational NPs (Gil 1987).
c. There is a simple NP which expresses the [monotone decreasing] quantifier ~Q if and only if there is a simple NP with a weak non-cardinal determiner which expresses the [monotone increasing] quantifier Q (Barwise & Cooper 1981: 186).
Within the typological literature, 'universals' are often actually tendencies ('statistical universals'). Many of the entries in the Konstanz Universals Archive contain phrases like 'with more than chance frequency', 'more likely to', 'most often', 'typically' or 'tend to'. For obvious reasons, one does not find much attention paid to statistical universals in the generativist literature. Another difference between typological and formal research relates to whether more importance is attached to implicational or unconditional universals. Typological research finds the implicational kind more interesting, and there is a strand of belief according to which it is difficult to distinguish unconditional universals from the preconditions of one's theory. For example, take (1c) above. Is this a semantic universal, or a theoretical decision under which we describe the semantic behaviour of languages? (Thanks to a reviewer for discussion of this point.) There are certainly unconditional universals which seem to be empirically unfalsifiable. One example is semantic compositionality, which is assumed by many formal semanticists to hold universally. As discussed in von Fintel & Matthewson (2008) and references therein, compositionality is a methodological axiom which is almost impossible to falsify empirically. On the other hand, many unconditional universals are falsifiable; for example, Barwise and Cooper's NP-Quantifier Universal, introduced in section 5 below. Even (1c) does make a substantive claim, although worded in theoretical terms. It asserts that all languages share certain meanings; it is in a sense irrelevant if one doesn't agree that we should analyze natural language using the types e and t. If typologists are suspicious of unconditional universals, formal semanticists may tend to regard implicational universals as merely a stepping stone to a deeper discovery. Many of the implications as they are stated are probably not primitively part of Universal Grammar. They may either have a functional explanation, or they may be descriptive generalizations which should be derivable from the theory rather than forming part of it. For example, take the semantic implicational universal in (3) (provided by a reviewer):

(3) If a language has a definite article, then a type shift from ⟨e,t⟩ to e is not freely available (without using the article).

(3) is ideally merely one specific instance of a general ban on type-shifting when there is an overt element which does the job. This in turn may follow from some general economy principle, and not need to be stated separately. A final point of debate relates to the level of abstractness. Formal semanticists often propose constraints which are only statable at a high level of abstractness, but some researchers believe that universals should always be surface testable. For example, Comrie (1989) argues that abstractly-stated universals have the drawback that they rely on analyses which may be faulty. And Levinson (2003: 320) argues for surface-testability because "highly abstract generalizations … are impractical to test against a reasonable sample of languages." Abstractness is required in the field of semantics because superficial statements almost never reveal what is really going on. This becomes clear if one considers how little reliable information about meaning can be gleaned from sources like descriptive grammars.
In fact, the inadequacy of superficial studies is illustrated by the very examples cited by Levinson (2003) in support of his claim that "Whorf's (1956: 218) emphasis on the 'incredible degree of diversity of linguistic systems over the globe' looks considerably better informed than the opinions of many contemporary thinkers." Levinson's
examples include variation in lexical categories, and the fact that some languages have no tense (Levinson 2003: 328). However, these are areas where examination of superficial data can lead to hasty conclusions about variation, and where deeper investigation often reveals underlying similarities.

With respect to lexical categories, Jelinek (1995) famously advanced the strong and interesting hypothesis that Straits Salish (Washington and British Columbia) lacks categorial distinctions. The claim was based on facts such as that any open-class lexical item can function as a predicate in Straits, without a copular verb. However, subsequent research has found, among other things, that the head of a relative clause cannot be occupied by just any open-class lexical item. Instead, this position is restricted to elements which correspond to nouns in languages like English (Demirdache & Matthewson 1995, Davis & Matthewson 1999, Davis 2002; see also Baker 2003). This is evidence for categorial distinctions, and Jelinek herself has since rejected the category-neutral view of Salish (orally, at the 2002 International Conference on Salish and Neighbouring Languages; see also Montler 2003).

As for tense, while tenseless languages may exist, we cannot determine whether a language is tensed or not based on superficial evidence. I will illustrate this based on St'át'imcets; see Matthewson (2006b) for details. St'át'imcets looks like a tenseless language on the surface, as there is no obligatory morphological encoding of the distinction between present and past. (4) illustrates this for an activity predicate; the same holds for all aspectual classes:

(4) mets-cál=lhkan
    write-act=1sg.subj
    'I wrote / I am writing.'

St'át'imcets does have obligatory marking for future interpretations; (4) cannot be interpreted as future. The main way of marking the future is with the future modal kelh.

(5) mets-cál=lhkan=kelh
    write-act=1sg.subj=FUT
    'I will write.'

The analytical problem is non-trivial here: the issue is whether there is a phonologically unpronounced tense in (4). The assumption of null tense would make St'át'imcets similar to English, and therefore would be the null hypothesis (see section 5.1). The hypothesis is empirically testable language-internally, but only by looking beyond obvious data such as that in (4)–(5). For example, it turns out that St'át'imcets and English behave in a strikingly parallel manner with respect to the temporal shifting properties of embedded predicates. In English, when a stative past-tense predicate is embedded under another past tense, a simultaneous reading is possible. The same is true in St'át'imcets, as shown in (6).

(6) tsút=tu7 s=Pauline [kw=s=guy't-ál'men=s=tu7]
    say=then nom=Pauline [det=nom=sleep-want=3poss=then]
    'Pauline said that she was tired.'
    OK: Pauline said at a past time t that she was tired at t.
On the other hand, in English a future embedded under another future does not allow a simultaneous reading, but only allows a forward-shifted reading (Enç 1996, Abusch 1998, among others). The same is true of St'át'imcets, as shown in (7). Just as in English, the time of Pauline's predicted tiredness must be later than the time at which Pauline will speak.

(7) tsút=kelh s=Pauline [kw=s=guy't-ál'men=s=kelh]
    say=FUT nom=Pauline [det=nom=sleep-want=3poss=FUT]
    'Pauline will say that she will be tired.'
    *Pauline will say at a future time t that she is tired at t.
    OK: Pauline will say at a future time t that she will be tired at t' after t.

The parallel behaviour of the two languages can be explained if we assume that the St'át'imcets matrix clauses in (6)–(7) contain a phonologically covert non-future tense morpheme (Matthewson 2006b). The embedding data can then be afforded the same explanation as the corresponding facts in English (see e.g., Abusch 1997, 1998, Ogihara 1996 for suggestions). We can see that there is good reason to believe that St'át'imcets is tensed, but the evidence for the tense morpheme is not surface-obvious.

Of course, there are other languages which have been argued to be tenseless (see e.g., Bohnemeyer 2002, Bittner 2005, Lin 2006, Ritter & Wiltschko 2005, and article 97 (Smith) Tense and aspect). However, I hope to have shown that it is useless to confine one's examination of temporal systems to superficial data. Doing this would lead to the possibly premature rejection of core universal properties of temporal systems in human language. This in turn suggests that we cannot say that semantic universals must be surface-testable.
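The logic of this argument can be made concrete with a toy formalization (my own illustrative sketch in Haskell, not part of Matthewson's 2006b analysis): assume that times are integers, that the utterance time is a parameter, and that a tense denotes a constraint relating the reference time to the utterance time. All names are hypothetical.

type Time = Int

-- A tense denotes a constraint on the reference time t, relative to the
-- utterance time t0.
type Tense = Time -> Time -> Bool   -- arguments: utterance time, reference time

pastE, presE, nonFut, fut :: Tense
pastE  t0 t = t <  t0   -- English past
presE  t0 t = t == t0   -- English present
nonFut t0 t = t <= t0   -- hypothesized covert St'at'imcets tense: past or present
fut    t0 t = t >  t0   -- future (English 'will', St'at'imcets 'kelh')

-- Reference times compatible with a tense, within a toy five-point timeline.
readings :: Tense -> Time -> [Time]
readings tns t0 = [t | t <- [t0 - 2 .. t0 + 2], tns t0 t]

main :: IO ()
main = do
  print (readings nonFut 0)  -- [-2,-1,0]: past and present readings, as in (4)
  print (readings fut 0)     -- [1,2]: only future readings, as in (5)

On this sketch a single underspecified morpheme covers exactly the readings that English distributes over two tenses, a situation returned to in section 7 below.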
5. How do we find universals?

Given that we must postulate universals based on insufficient data (as data is never available for all human languages), we employ a range of strategies to get from data to universals. I will not discuss sources of evidence such as language change or cognitive-psychological factors; the reader is referred to article 99 (Fritz) Theories of meaning change, article 100 (Geeraerts) Cognitive approaches to diachronic semantics, and article 27 (Talmy) Cognitive Semantics for discussion. Article 12 (Krifka) Varieties of semantic evidence is also relevant here.
5.1. Assume universality

One approach to establishing universals is simply to assume them, based on study of however many languages one has data from. This is actually what all universals research does, as no phenomenon has ever been tested in all of the world's languages. A point of potential debate is the number of languages which need to be examined before a universal claim should be made. This issue arises for semanticists because the nature of the fieldwork means that testing semantic universals necessarily proceeds quite slowly.
My own belief is that one should always assume universality in the absence of evidence to the contrary – and then go and look for evidence to the contrary. In earlier work I dubbed this strategy the 'No variation null hypothesis' (Matthewson 2001); see also article 95 (Bach & Chao) Semantic types across languages. The assumption of universality is a methodological strategy which tells us to begin with the strongest empirically falsifiable hypothesis. It does not entail that there is an absence of cross-linguistic variation in the semantics, and it does not mean that our analyses of unfamiliar languages must look parallel to those of English. On the contrary, work which adopts this strategy is highly likely to uncover semantic variation, and to help establish the parameters within which natural language semantics may vary.

The null hypothesis of universality is assumed by much work within semantics. In fact, some of the most influential semantic universals to have been proposed, those of Barwise & Cooper (1981), advance data only from English in support. (See also Keenan & Stavi 1986 for related work, and article 95 (Bach & Chao) Semantic types across languages for further discussion.) We saw two of Barwise and Cooper's universals above; here are two more for illustration. U1 is a universal semantic definition of the noun phrase (DP, in modern terminology):

U1. NP-Quantifier universal (Barwise & Cooper 1981: 177)
Every natural language has syntactic constituents (called noun-phrases) whose semantic function is to express generalized quantifiers over the domain of discourse.
U3 contains the well-known conservativity constraint, as well as asserting that there are no languages which lack determiners in the semantic sense.

U3. Determiner universal (Barwise & Cooper 1981: 179)
Every natural language contains basic expressions (called determiners) whose semantic function is to assign to common count noun denotations (i.e., sets) A a quantifier that lives on A.
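Although Barwise and Cooper offer no computational procedure, the 'lives on' (conservativity) property in U3 can be checked mechanically on finite models. The following sketch (in Haskell, with sets modeled as lists over a small domain; all names are illustrative) tests, by brute force, whether a determiner denotation D satisfies D(A)(B) = D(A)(A ∩ B) for all A and B:

import Data.List (intersect, subsequences)

-- A determiner denotation maps a restrictor set A and a scope set B to a truth value.
type Det a = [a] -> [a] -> Bool

every', some', only' :: Eq a => Det a
every' a b = all (`elem` b) a   -- 'every A B': A is a subset of B
some'  a b = any (`elem` b) a   -- 'some A B': A and B overlap
only'  a b = all (`elem` a) b   -- 'only', construed as a determiner

-- D lives on A (is conservative) iff D A B == D A (A `intersect` B) for all A, B.
conservative :: Eq a => [a] -> Det a -> Bool
conservative dom d =
  and [ d a b == d a (a `intersect` b)
      | a <- subsequences dom, b <- subsequences dom ]

main :: IO ()
main = do
  print (conservative [1 .. 4 :: Int] every')  -- True
  print (conservative [1 .. 4 :: Int] some')   -- True
  print (conservative [1 .. 4 :: Int] only')   -- False

Note that only, if construed as a determiner, fails the test; this is one classical reason for denying it determiner status.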
Barwise and Cooper do not offer data from any non-English language, yet postulate universal constraints. This methodology was clearly justified, and the fact that not all of their proposed universals have stood the test of time is largely irrelevant. By proposing explicit and empirically testable constraints, Barwise and Cooper's work inspired a large amount of subsequent cross-linguistic research, which has expanded our knowledge of quantificational systems in languages of the world. For example, several of the papers in the Bach et al. (1995) volume specifically address the NP-Quantifier Universal; see also Bittner & Trondhjem (2008) for a recent criticism of standard analyses of generalized quantifiers.

It is sometimes asserted that the strategy of postulating universals as soon as one can, or even of claiming that an unfamiliar language shares the same analysis as English, is Euro-centric (see e.g., Gil 2001). But this is not correct. Since English is a well-studied language, it is legitimate to compare lesser-studied languages to analyses of English. It is our job to see whether we can find evidence for similarities between English and unfamiliar languages, especially when the evidence may not be superficially obvious. The important point here is that the null hypothesis of universality does not give analyses of any one language priority over language-faithful analyses of any other. The null hypothesis is that all languages are the same – not that all languages are like English. Thus, it is just as likely that study of a non-Indo-European language will cause us
to review and revise previous analyses of English. One example of this is Bittner's (2008) analysis of temporal anaphora. Bittner proposes a universal system of temporal anaphora which "instead of attempting to extend an English-based theory to a typologically distant language … proceeds in the opposite direction – extending a Kalaallisut-based theory to English" (Bittner 2008: 384). Another example is the analysis of quantification presented in Matthewson (2001). I observe there that the St'át'imcets data are incompatible with the standard theory of generalized quantifiers, because St'át'imcets possesses no determiners which create a generalized quantifier by operating on a common noun denotation, as English every or most do. (The insight that Salish languages differ from English in their quantificational structures is originally due to Jelinek 1995.) I propose that the analysis suggested by the St'át'imcets data may be applicable also to English. In a similar vein, see Bar-el (2005) for the claim that the aspectual system of Skwxwú7mesh invites us to reanalyze the aspectual system of English.

The null hypothesis of universality guides empirical testing in a fruitful way; it gives us a starting hypothesis to test, and it allows us to use knowledge gained from study of one language in study of another. It inspires us to look beyond the superficial for underlying similarities, yet it does not force us to assume similarity where none exists. It also explicitly encodes the belief that there are limits on cross-linguistic variation. In contrast, an extreme non-universalist position (cf. Gil 2001) would essentially deny the idea that we have any reason to expect similarity between languages. This can lead to premature acceptance of exotic analyses.
5.2. Language acquisition and learnability

Generative linguists are interested in universals which are likely to have been provided by Universal Grammar, and clues about this can come from acquisition research and learnability theory. As argued by Crain & Pietroski (2001: 150) among others, the features which are most likely to be innately specified by UG are those which are shared across languages but which are not accessible to a child in the Primary Linguistic Data (the linguistic input available to the learner). Crain and Pietroski offer the examples of coreference possibilities for pronouns and strong crossover; the restrictions found in these areas are by hypothesis not learnable based on experience. The discussion of tense above is another case in point, as the data required to establish the existence of the null tense morpheme are not likely to be often presented to children in their first few years of life.
5.3. A typological survey

It may seem that a typological survey is at the opposite end of the spectrum from the research approaches discussed so far. However, even nativists should not discount the benefits of typological research. Such studies can help us gain perspective on which research questions are of central interest from a cross-linguistic point of view.

One example of this involves determiner semantics. A preliminary study by Matthewson (2008) (based on grammars of 33 languages from 25 language families) suggests that English is in many ways typologically unusual with respect to the syntax and semantics of determiners. For example, cross-linguistically, articles are the exception rather than the rule. Demonstratives are present in almost all languages, but rarely seem
to occupy determiner position as they do in English; strong quantifiers often appear to occupy a different position again. In many languages, the only strong quantifiers are universals. While distributive universal quantifiers are frequently reduplications or affixes, non-distributive ones are not. What we see is that a broad study can lead us to pose questions that would perhaps not otherwise be addressed, such as what the reason is for the apparent correlation between the syntax of a universal quantifier and its semantics. There are many examples of typological studies which have inspired formal semantic analyses (e.g., Cusic 1981, Comrie 1985, Corbett 2000, to name a few, and see also Haspelmath et al. 2001 for a comprehensive overview of typology and universals).
6. Semantic universals in the literature

It is occasionally implied that since there is no agreed-upon set of semantic universals, they must not exist. For example, Levinson (2003: 315) claims: "There are in fact very few hypotheses about semantic universals that have any serious, cross-linguistic backing." However, I believe that the reason there is not a large number of agreed-upon semantic universals is simply that semantic universals have so far only infrequently been explicitly proposed and subjected to cross-linguistic testing. And the situation is changing: there is a growing body of work which addresses semantic variation and its limits, and the literature has reached a level of maturity where the relevant questions can be posed in an interesting way. (Some recent dissertations which employ semantic fieldwork are Faller 2002, Wilhelm 2003, Bar-el 2005, Gillon 2006, Tonhauser 2006, Deal 2010, Murray 2010, Peterson 2010, among others.) Many more semantic universals will be proposed as the field continues to conduct in-depth, detailed study of a wide range of languages.

In this section I provide a very brief overview of some universals within semantics, illustrating universals from a range of different domains. In the area of lexical semantics we have for example Levinson's (2003: 315) proposal restricting elements encoding spatial relations. He argues that universally, there are at most three frames of reference upon which languages draw, each of which has precise characteristics that could have been otherwise. Levinson's proposals are discussed further in article 107 (Landau) Space in semantics and cognition; see also article 27 (Talmy) Cognitive Semantics for discussion of location.

A proposal which restricts composition methods and semantic rules is that of Bittner (1994). Bittner claims that there are no language-specific or construction-specific semantic rules. She proposes a small set of universal operations which essentially consist of functional application, type-lifting, lambda-abstraction, and a rule interpreting empty nodes. Although her proposal may seem to be worded in a theory-dependent way, claims about possible composition rules can be, and have been, challenged on at least partly empirical grounds; see for example Heim & Kratzer (1998), Chung & Ladusaw (2003) for challenges to the idea that the only compositional rule is functional application. Another proposal of Bittner's involves a set of universals of temporal anaphora, based on comparison of English with Kalaallisut. These include statements about the location of states, events, processes and habits relative to topical instants or periods, and restrictions on the default topic time for different event-types. The constraints are explicitly formulated independently of syntax: "Instead of aligning LF structures, this strategy aligns communicative functions" (Bittner 2008: 383).
A relatively common type of semantic universal involves constraints on semantic types. For example, Landman (2006) proposes that universally, traces and pro-forms can only be of type e. This is a purely semantic constraint: it is independent of whether the syntax of a language allows elements occupying the relevant positions to be of different categories.

A universal constraint on the syntax-semantics mapping is proposed by Gillon (2006). Gillon argues that universally, all and only elements which occupy D(eterminer) position introduce a contextual domain restriction variable (von Fintel 1994). (Gillon has to claim, contra Chierchia 1998, that languages like Mandarin have null determiners.) An interesting constructional universal is suggested by Faller (2007). She argues that the semantics of reciprocal constructions (involving plurality, distinctness of co-arguments, universal quantification and reflexivity) may be cross-linguistically uniform.
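Gillon's proposal can be pictured schematically: a D-position element composes with a contextual set C which intersects the restrictor before quantification. The following fragment (a sketch in Haskell under the same finite-set modeling as above; the lexical denotations are hypothetical) is illustrative only and is not Gillon's own formalization:

import Data.List (intersect)

type Det a = [a] -> [a] -> Bool

-- A contextual domain restriction variable C (cf. von Fintel 1994):
-- the restrictor A is intersected with C before the determiner applies.
restrict :: Eq a => [a] -> Det a -> Det a
restrict c d a b = d (a `intersect` c) b

every' :: Eq a => Det a
every' a b = all (`elem` b) a

main :: IO ()
main = do
  let students = [1, 2, 3] :: [Int]   -- hypothetical denotation of 'student'
      passed   = [1, 2]               -- hypothetical denotation of 'passed'
      context  = [1, 2]               -- contextually salient individuals
  print (every' students passed)                   -- False: student 3 did not pass
  print (restrict context every' students passed)  -- True, relative to the context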
7. Variation

The search for universals is a search for limits on variation. If we assume a null hypothesis of universality, any observed variation necessitates a rejection of our null hypothesis and leads us to postulate restrictions on the limits of variation. Restrictions on variation are known in the formal literature as 'parameters', but the relation between parameters and universals is so tight that restricted sets of options from which languages choose can be classified as a type of universal (cf. article 95 (Bach & Chao) Semantic types across languages).

One prime source for semantic variation is independent syntactic variation. Given compositionality, the absence of certain structures in a language may result in the absence of certain semantic phenomena. Within the semantics proper, one well-known parameter is Chierchia's (1998) Nominal Mapping Parameter, which states that languages vary in the denotation of their NPs. A language may allow NPs to be mapped into predicates, into arguments (i.e., kinds), or both. These choices correlate with a cluster of co-varying properties across languages. The Nominal Mapping Parameter represents a weakening of the null hypothesis that the denotations of NPs will be constant across languages. Whether or not it is correct is an empirical matter, and Chierchia's work has inspired a large amount of discussion (e.g., Cheng & Sybesma 1999, Schmitt & Munn 1999, Chung 2000, Longobardi 2001, 2005, Doron 2004, Dayal 2004, Krifka 2004, among others). As with Barwise and Cooper's universals, Chierchia's parameter has done a great deal to advance our understanding of the denotations of NPs across languages. The Nominal Mapping Parameter also illustrates the restricted nature of the permitted variation; there are only three kinds of languages. Relevant work on restricted variation within semantics is discussed in article 96 (Doetjes) Count/mass distinction, article 97 (Smith) Tense and aspect, and article 98 (Pederson) The expression of space.

There is often resistance to the postulation of semantic parameters. One possible ground for this resistance is the belief that semantics should be more cross-linguistically invariant than other areas of the grammar. For example, Chomsky (2000: 185) argues that "there are empirical grounds for believing that variety is more limited for semantic than for phonetic aspects of language." It is not clear to me, however, that we have empirical grounds for believing this. We are still in the early stages of research into semantic variation; we probably do not know enough yet to say whether semantics varies less than other areas of the grammar.
Another possible ground for skepticism about semantic parameters is the belief that semantic variation is not learnable. However, there is no reason why this should be so. Chierchia, for example, outlines a plausible mechanism by which the Nominal Mapping Parameter is learnable, with Chinese representing the default setting in line with Manzini & Wexler's (1987) Subset Principle (Chierchia 1998: 400–401). Chierchia argues that his parameter "is learned in the same manner in which every other structural difference is learned: through its overt morphosyntactic manifestations. It thus meets fully the reasonable challenge that all parameters must concern live morphemes."

A relatively radical semantic parameter is the proposal that some languages lack pragmatic presuppositions in the sense of Stalnaker (1974) (Matthewson 2006a, based on data from St'át'imcets). This parameter is not tied to specific lexical items; it is stated globally. However, the presupposition parameter places languages in a subset-superset relation, and as such is potentially learnable, as long as the initial setting is the St'át'imcets one. A learner who assumes that her language lacks pragmatic presuppositions can learn that English possesses such presuppositions on the basis of overt evidence (such as 'Hey, wait a minute!' responses to presupposition failure, cf. von Fintel 2004).

One common way in which languages vary in their semantics is in degree of underspecification. For example, I argued above that St'át'imcets shares basic tense semantics with English. The languages differ in that the St'át'imcets tense morpheme is semantically underspecified, failing to distinguish past from present reference times. Another case of variation in underspecification involves modality. According to Rullmann, Matthewson & Davis (2008), St'át'imcets modals allow similar interpretations to English modal auxiliaries. However, the interpretations are distributed differently across the lexicon: while in English, a single modal allows a range of different conversational backgrounds (giving rise to deontic, circumstantial or epistemic interpretations, Kratzer 1991), in St'át'imcets, the conversational background is lexically encoded in the choice of modal. Conversely, while in English, the quantificational force of a modal is lexically encoded, in St'át'imcets it is not; each modal is compatible with both existential and universal interpretations.

One strong hypothesis would be that all languages possess the same functional categories, but that languages differ in the levels of underspecification in the lexical entries of the various morphemes, and in the way in which they 'bundle' the various aspects of meaning into morphemes. As an example of the latter case, Lin's (2006) analysis of Chinese involves partially standard semantics for viewpoint aspect and for tense, but a different bundling of information from that found in English (for example, the perfective aspect in Chinese includes a restriction that the reference time precedes the utterance time). In terms of lexical aspect, there appears to be cross-linguistic variation in the semantics of basic aspectual classes. Bar-el (2005) argues that stative predicates in Skwxwú7mesh Salish include an initial change-of-state transition. Bar-el implies that while the basic building blocks of lexical entries are provided (e.g., that events can either begin or end with BECOME transitions; cf.
Dowty 1979, Rothstein 2004), languages combine these in different ways to give different semantics for the various aspectual classes. Thus, the semantics we traditionally attribute to ‘accomplishments’ or to ‘states’ are not primitives of the grammar. While the Salish/English aspectual differences appear not to be reducible to underspecification or bundling, further decompositional analysis may reveal that they are.
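The contrast between underspecification and 'bundling' can be given a schematic rendering (a sketch of my own in Haskell, not drawn from Rullmann, Matthewson & Davis 2008): modal meanings are cross-classified by quantificational force and conversational background, and each lexicalization strategy fixes one dimension while leaving the other to context.

-- A toy cross-classification of modal meanings (cf. Kratzer 1991).
data Force      = Existential | Universal
                  deriving (Show, Enum, Bounded)
data Background = Deontic | Epistemic | Circumstantial
                  deriving (Show, Enum, Bounded)

-- An English-style modal fixes force, leaving the background to context;
-- a St'at'imcets-style modal fixes the background, leaving force to context.
englishStyle :: Force -> [(Force, Background)]
englishStyle f = [(f, b) | b <- [minBound .. maxBound]]

statimcetsStyle :: Background -> [(Force, Background)]
statimcetsStyle b = [(f, b) | f <- [minBound .. maxBound]]

main :: IO ()
main = do
  print (englishStyle Universal)    -- e.g. 'must': fixed force, any background
  print (statimcetsStyle Deontic)   -- a deontic modal: fixed background, either force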
I would like to thank Henry Davis for helpful discussion during the writing of this article, and Klaus von Heusinger and Noor van Leusen for helpful feedback on the first draft.
8. References

Abusch, Dorit 1997. Sequence of tense and temporal de re. Linguistics & Philosophy 20, 1–50.
Abusch, Dorit 1998. Generalizing tense semantics for future contexts. In: S. Rothstein (ed.). Events and Grammar. Dordrecht: Kluwer, 13–33.
Aitken, Barbara 1955. A note on eliciting. International Journal of American Linguistics 21, 83.
Bach, Emmon et al. (eds.) 1995. Quantification in Natural Languages. Dordrecht: Kluwer.
Baker, Mark 2003. Lexical Categories: Verbs, Nouns and Adjectives. Cambridge: Cambridge University Press.
Bar-el, Leora 2005. Aspectual Distinctions in Skwxwú7mesh. Ph.D. dissertation. University of British Columbia, Vancouver, BC.
Barwise, Jon & Robin Cooper 1981. Generalized quantifiers and natural language. Linguistics & Philosophy 4, 159–219.
Bittner, Maria 1987. On the semantics of the Greenlandic antipassive and related constructions. International Journal of American Linguistics 53, 194–231.
Bittner, Maria 1994. Cross-linguistic semantics. Linguistics & Philosophy 17, 53–108.
Bittner, Maria 2005. Future discourse in a tenseless language. Journal of Semantics 22, 339–388.
Bittner, Maria 2008. Aspectual universals of temporal anaphora. In: S. Rothstein (ed.). Theoretical and Crosslinguistic Approaches to the Semantics of Aspect. Amsterdam: Benjamins, 349–385.
Bittner, Maria & Naja Trondhjem 2008. Quantification as reference: Kalaallisut Q-verbs. In: L. Matthewson (ed.). Quantification: A Cross-Linguistic Perspective. Amsterdam: Emerald, 7–66.
Bohnemeyer, Jürgen 2002. The Grammar of Time Reference in Yukatek Maya. Munich: Lincom Europa.
Carden, Guy 1970. A note on conflicting idiolects. Linguistic Inquiry 1, 281–290.
Cheng, Lisa & Rint Sybesma 1999. Bare and not-so-bare nouns and the structure of NP. Linguistic Inquiry 30, 509–542.
Chierchia, Gennaro 1998. Reference to kinds across languages. Natural Language Semantics 6, 339–405.
Chomsky, Noam 2000. New Horizons in the Study of Language and Mind. Cambridge: Cambridge University Press.
Chung, Sandra 2000. On reference to kinds in Indonesian. Natural Language Semantics 8, 157–171.
Chung, Sandra & William Ladusaw 2003. Restriction and Saturation. Cambridge, MA: The MIT Press.
Comrie, Bernard 1985. Tense. Cambridge: Cambridge University Press.
Comrie, Bernard 1989. Language Universals and Linguistic Typology: Syntax and Morphology. Oxford: Blackwell.
Corbett, Greville 2000. Number. Cambridge: Cambridge University Press.
Crain, Stephen & Paul Pietroski 2001. Nature, nurture, and universal grammar. Linguistics & Philosophy 24, 139–186.
Crain, Stephen & Rosalind Thornton 1998. Investigations in Universal Grammar: A Guide to Experiments on the Acquisition of Syntax. Cambridge, MA: The MIT Press.
Cusic, David 1981. Verbal Plurality and Aspect. Ph.D. dissertation. Stanford University, Stanford, CA.
Davis, Henry 2002. Categorial restrictions in St'át'imcets (Lillooet) relative clauses. In: C. Gillon, N. Sawai & R. Wojdak (eds.). Papers for the 37th International Conference on Salish and Neighbouring Languages. Vancouver, BC: University of British Columbia, 61–75.
Davis, Henry 2009. Cross-linguistic variation in anaphoric dependencies: Evidence from the Pacific Northwest. Natural Language and Linguistic Theory 27, 1–43.
Davis, Henry & Lisa Matthewson 1999. On the functional determination of lexical categories. Revue Québécoise de Linguistique 27, 27–67.
Dayal, Veneeta 2004. Number marking and (in)definiteness in kind terms. Linguistics & Philosophy 27, 393–450.
Deal, Amy Rose 2010. Topics in the Nez Perce Verb. Ph.D. dissertation. University of Massachusetts, Amherst, MA.
Demirdache, Hamida & Lisa Matthewson 1995. On the universality of syntactic categories. In: J. Beckman (ed.). Proceedings of the North Eastern Linguistic Society (= NELS) 25. Amherst, MA: GLSA, 79–93.
Dimmendaal, Gerrit 2001. Places and people: Field sites and informants. In: P. Newman & M. Ratliff (eds.). Linguistic Fieldwork. Cambridge: Cambridge University Press, 55–75.
Dixon, Robert M.W. 1977. Where have all the adjectives gone? Studies in Language 1, 19–80.
Doron, Edit 2004. Bare singular reference to kinds. In: R. Young & Y. Zhou (eds.). Proceedings of Semantics and Linguistic Theory (= SALT) XIII. Ithaca, NY: Cornell University, 73–90.
Dowty, David 1979. Word Meaning in Montague Grammar. Dordrecht: Reidel.
Enç, Murvet 1996. Tense and modality. In: S. Lappin (ed.). The Handbook of Contemporary Semantic Theory. Oxford: Blackwell, 345–358.
Faller, Martina 2002. Semantics and Pragmatics of Evidentials in Cuzco Quechua. Ph.D. dissertation. Stanford University, Stanford, CA.
Faller, Martina 2007. The ingredients of reciprocity in Cuzco Quechua. Journal of Semantics 24, 255–288.
Ferguson, Charles 1972. Verbs of 'being' in Bengali, with a note on Amharic. In: J. Verhaar (ed.). The Verb 'Be' and its Synonyms: Philosophical and Grammatical Studies, 5. Dordrecht: Reidel, 74–114.
von Fintel, Kai 1994. Restrictions on Quantifier Domains. Ph.D. dissertation. University of Massachusetts, Amherst, MA.
von Fintel, Kai 2004. Would you believe it? The king of France is back! Presuppositions and truth value intuitions. In: A. Bezuidenhout & M. Reimer (eds.). Descriptions and Beyond. Oxford: Oxford University Press, 261–296.
von Fintel, Kai & Lisa Matthewson 2008. Universals in semantics. The Linguistic Review 25, 49–111.
Gil, David 1987. Definiteness, noun phrase configurationality, and the count-mass distinction. In: E. Reuland & A. ter Meulen (eds.). The Representation of (In)Definiteness. Cambridge, MA: The MIT Press, 254–269.
Gil, David 2001. Escaping Euro-centrism: Fieldwork as a process of unlearning. In: P. Newman & M. Ratliff (eds.). Linguistic Fieldwork. Cambridge: Cambridge University Press, 102–132.
Gillon, Carrie 2006. The Semantics of Determiners: Domain Restriction in Skwxwú7mesh. Ph.D. dissertation. University of British Columbia, Vancouver, BC.
Grice, H. Paul 1975. Logic and conversation. In: P. Cole & J. L. Morgan (eds.). Syntax and Semantics 3: Speech Acts. New York: Academic Press, 41–58. Reprinted in: S. Davis (ed.). Pragmatics. Oxford: Oxford University Press, 1991, 305–315.
Harris, Zellig S. & Charles F. Voegelin 1953. Eliciting in linguistics. Southwestern Journal of Anthropology 9, 59–75.
Haspelmath, Martin et al. (eds.) 2001. Language Typology and Language Universals: An International Handbook (HSK 20.1–2). Berlin: de Gruyter.
Hayes, Alfred 1954. Field procedures while working with Diegueño. International Journal of American Linguistics 20, 185–194.
Heim, Irene & Angelika Kratzer 1998. Semantics in Generative Grammar. Oxford: Blackwell.
Jelinek, Eloise 1995. Quantification in Straits Salish. In: E. Bach et al. (eds.). Quantification in Natural Languages. Dordrecht: Kluwer, 487–540.
Katz, Jerrold 1976. A hypothesis about the uniqueness of human language. In: S. Harnad, H. Steklis & J. Lancaster (eds.). Origins and Evolution of Language and Speech. New York: New York Academy of Sciences, 33–41.
Keenan, Edward & Jonathan Stavi 1986. A semantic characterization of natural language determiners. Linguistics & Philosophy 9, 253–326.
Konstanz Universals Archive 2002–. The Universals Archive, maintained by the University of Konstanz, Department of Linguistics. http://typo.uni-konstanz.de/archive/intro/, March 5, 2009.
Kratzer, Angelika 1991. Modality. In: D. Wunderlich & A. von Stechow (eds.). Semantics: An International Handbook of Contemporary Research (HSK 6). Berlin: de Gruyter, 639–650.
Krifka, Manfred 2004. Bare NPs: Kind-referring, indefinites, both, or neither? In: R. Young & Y. Zhou (eds.). Proceedings of Semantics and Linguistic Theory (= SALT) XIII. Ithaca, NY: Cornell University, 180–203.
Landman, Meredith 2006. Variables in Natural Language. Ph.D. dissertation. University of Massachusetts, Amherst, MA.
Levinson, Stephen 2003. Space in Language and Cognition: Explorations in Cognitive Diversity. Cambridge: Cambridge University Press.
Lin, Jo-Wang 2006. Time in a language without tense: The case of Chinese. Journal of Semantics 23, 1–53.
Longobardi, Giuseppe 2001. How comparative is semantics? A unified parametric theory of bare nouns and proper names. Natural Language Semantics 9, 335–361.
Longobardi, Giuseppe 2005. A minimalist program for parametric linguistics? In: H. Broekhuis et al. (eds.). Organizing Grammar: Linguistic Studies for Henk van Riemsdijk. Berlin: de Gruyter, 407–414.
Manzini, M. Rita & Kenneth Wexler 1987. Parameters, binding theory, and learnability. Linguistic Inquiry 18, 413–444.
Matthewson, Lisa 1998. Determiner Systems and Quantificational Strategies: Evidence from Salish. The Hague: Holland Academic Graphics.
Matthewson, Lisa 2001. Quantification and the nature of cross-linguistic variation. Natural Language Semantics 9, 145–189.
Matthewson, Lisa 2004. On the methodology of semantic fieldwork. International Journal of American Linguistics 70, 369–415.
Matthewson, Lisa 2006a. Presuppositions and cross-linguistic variation. In: C. Davis, A. Deal & Y. Zabbal (eds.). Proceedings of the North Eastern Linguistic Society (= NELS) 36. Amherst, MA: GLSA, 63–76.
Matthewson, Lisa 2006b. Temporal semantics in a superficially tenseless language. Linguistics & Philosophy 29, 673–713.
Matthewson, Lisa 2008. Strategies of quantification in St'át'imcets and the rest of the world. To appear in Strategies of Quantification. Oxford: Oxford University Press.
Montler, Timothy 2003. Auxiliaries and other categories in Straits Salishan. International Journal of American Linguistics 69, 103–134.
Murray, Sarah 2010. Evidentiality and the Structure of Speech Acts. Ph.D. dissertation. Rutgers University, New Brunswick, NJ.
Nida, Eugene 1947. Field techniques in descriptive linguistics. International Journal of American Linguistics 13, 138–146.
Ogihara, Toshi-Yuki 1996. Tense, Attitudes and Scope. Dordrecht: Kluwer.
Partee, Barbara 2000. 'Null' determiners vs. no determiners. Lecture read at the 2nd Winter Typology School, Moscow.
Peterson, Tyler 2010. Epistemic Modality and Evidentiality in Gitksan at the Semantics-Pragmatics Interface. Ph.D. dissertation. University of British Columbia, Vancouver, BC.
Ritter, Elizabeth & Martina Wiltschko 2005. Anchoring events to utterances without tense. In: J. Alderete, C-h. Han & A. Kochetov (eds.). Proceedings of the 24th West Coast Conference on Formal Linguistics (= WCCFL). Somerville, MA: Cascadilla Proceedings Project, 343–351.
Rothstein, Susan 2004. Structuring Events. Oxford: Blackwell.
Rullmann, Hotze, Lisa Matthewson & Henry Davis 2008. Modals as distributive indefinites. Natural Language Semantics 16, 317–357.
Schmitt, Cristina & Alan Munn 1999. Against the nominal mapping parameter: Bare nouns in Brazilian Portuguese. In: P. Tamanji, M. Hirotani & N. Hall (eds.). Proceedings of the North Eastern Linguistic Society (= NELS) 29. Amherst, MA: GLSA, 339–353.
Schütze, Carson 1996. The Empirical Base of Linguistics: Grammaticality Judgments and Linguistic Methodology. Chicago, IL: The University of Chicago Press.
Stalnaker, Robert 1974. Pragmatic presuppositions. In: M.K. Munitz & P.K. Unger (eds.). Semantics and Philosophy. New York: New York University Press, 197–213.
Tonhauser, Judith 2006. The Temporal Semantics of Noun Phrases: Evidence from Guaraní. Ph.D. dissertation. Stanford University, Stanford, CA.
Whorf, Benjamin 1956. Language, Thought and Reality: Selected Writings of Benjamin Lee Whorf. Edited by J. Carroll. Cambridge, MA: The MIT Press.
Wilhelm, Andrea 2003. Telicity and Durativity: A Study of Aspect in Dene Suline (Chipewyan) and German. Ph.D. dissertation. University of Calgary, Calgary, AB.
Yegerlehner, John 1955. A note on eliciting techniques. International Journal of American Linguistics 21, 286–288.
Lisa Matthewson, Vancouver (Canada)
14. Formal methods in semantics

1. Introduction
2. First order logic and natural language
3. Formal systems, proofs and decidability
4. Semantic models, validity and completeness
5. Formalizing linguistic methods
6. Linguistic applications of syntactic methods
7. Linguistic applications of semantic methods
8. Conclusions
9. References
Abstract

Covering almost an entire century, this article reviews in general and non-technical terms how formal, logical methods have been applied to the meaning and interpretation of natural language. This paradigm of research in natural language semantics produced important new linguistic results and insights, but logic also profited from such innovative applications. Semantic explanation requires properly formalized concepts, but only provides genuine insight when it accounts for linguistic intuitions on meaning and interpretation, or for the results of empirical investigations. The creative tension between the linguistic demand for cognitively realistic models of human linguistic competence and the logician's demand for a proper and explicit account of all and only the
valid reasoning patterns initially led to an interesting divergence of methods and associated research agendas. With the maturing of natural language semantics as a branch of cognitive science, an increasing number of logicians trained in linguistics and linguists adept in formal methods are pursuing increasingly convergent empirical issues in interdisciplinary research programs.
1. Introduction

The scientific analysis of patterns of human reasoning properly belongs to the ancient discipline of logic, bridging more than twenty centuries from its earliest roots in the ancient Greek philosophical treatises of Plato and Aristotle on syllogisms to its contemporary developments connecting dynamic reasoning in context to underlying neurobiological and cognitive processes. In reasoning with information from various sources available to us, we systematically exploit (i) the meaning of the words, (ii) the way they are put together in clauses, (iii) the relations between these clauses, and (iv) the circumstances in which we received the information, in order to arrive at a conclusion (Frege 1892; Tarski 1956).

The formal analysis of reasoning patterns not only offers an important window on the meaning and interpretation of logical languages, but also of ordinary, acquired, i.e. natural, languages. It constitutes a core component of cognitive science, providing the proper scientific methods to model human information processing as constitutive structure and form. Any explanatory scientific theory of the meaning and interpretation of natural language must at some level aim to characterize all and only those patterns of reasoning that guarantee to preserve, in one way or another, the assumed truth of the information on which the conclusions are based. In analyzing patterns of inference, content and specific aspects of the interpretation can only be taken into consideration if they can be expressed in syntactic form and constitutive structure or in the semantic meta-language. The contemporary research program of natural language semantics has significantly expanded the expressions of natural language to be subjected to such formal methods of logical analysis to cover virtually all syntactic categories, as well as relations between sentences and larger sections of discourse or text, mapping syntactic, configurational structures to sophisticated notions of semantic content or information structure (Barwise & Perry 1983; Chierchia 1995; Cresswell 1985; Davidson & Harman 1972; Dowty, Wall & Peters 1981; Gallin 1975; Partee, ter Meulen & Wall 1990).

The classical division of labor between syntactic and semantic theories of reasoning is inherited from the mathematical logical theories developed in the early twentieth century, when logical, i.e. unnatural and purposefully designed, languages were the primary subject of investigation. Syntactic theories of reasoning exploit as explanatory tools merely constitutive, configurational methods: structural, i.e. formal, operations such as substitution and the pure symbol manipulation of the associated proof theories. Semantic theories of reasoning require an interpretation of such constitutive, formal structure in models, to characterize truth-conditions and validity of reasoning as a systematic interaction between form and meaning (Boolos & Jeffrey 1980).

The syntactic, proof-theoretic strategy, with its customary disregard for meaning as intangible, was originally pursued most vigorously for natural language in the research paradigm of generative grammar. Its earliest mathematical foundational research in automata theory,
formal grammars and their associated design languages regarded structural operations as the only acceptable formal methods. Reasoning or inference was as such not the target of their investigations, as grammars were principally limited to characterizing sentence-internal properties and their computational complexity (Chomsky 1957; Chomsky 1959; Chomsky & Miller 1958, 1963; Hopcroft & Ullman 1979; Gross & Lentin 1970). The semantic, model-theoretic strategy of investigating human reasoning has been pursued most vigorously in natural language semantics, Lambek and Montague grammars and game-theoretic semantics, and their 21st-century successors, the various dynamic theories of meaning and interpretation founded on developments in intensional and epistemic logics of (sharing) belief and knowledge (Barwise 1989; van Benthem 1986; van Benthem & ter Meulen 1997; Cresswell 1985; Davidson & Harman 1972; Kamp 1981; Kamp & Reyle 1993; Lambek 1958; Lewis 1972, 1983; Montague 1974; Stalnaker 1999).

Formal methods deriving from logic are also applied in what has come to be known as formal pragmatics, where parameters other than worlds or situations, such as context or speaker/hearer, time of utterance or other situational elements, may serve in the models to determine meaning and situated inference. This article will address formal methods in semantics as its main topic; formal pragmatics may be considered a further generalization of these methods to serve wider linguistic applications, but as such does not differ in any intrinsic way from the formal methods used in semantics.

In both logical and natural languages, valid forms of reasoning in their most general characteristics exploit the information available in the premises, assumed to be true in an arbitrary given model, to draw a conclusion guaranteed to be also true in that model. Preserving this assumed truth of the premises is a complex process that may be modeled in various formal systems, but if their admitted inference rules are somehow violated in the process, the conclusion of the inference is not guaranteed to be true. This common deductive approach accounts for validity in forms or patterns of reasoning based on the stable meaning of the logical vocabulary, regardless of the class of models under consideration. It has constituted the methodological cornerstone of the research program of natural language semantics, where ordinary language expressions are translated into logical expressions, their 'logical form', to determine their truth conditions in models as a function of their form and subsequently to characterize their valid forms of reasoning (May 1985; Montague 1974; Dowty, Wall & Peters 1981).

Some important formal methods of major schools in this research program are reviewed here, mostly to present their conceptual foundations and discuss their impact on linguistic insights, while referring the reader for more technical expositions and formal details to the relevant current literature and to articles 10 (Newen & Schröder) Logic and semantics, 11 (Kempson) Formal semantics and representationalism, 33 (Zimmermann) Model-theoretic semantics, 43 (Keenan) Quantifiers, 37 (Kamp & Reyle) Discourse Representation Theory and 38 (Dekker) Dynamic semantics (see also van Benthem & ter Meulen 1997; Gabbay & Guenthner 1983; Partee, ter Meulen & Wall 1990).
Excluded from consideration as formal methods in the sense intended here are other mathematical methods, such as statistical, inductive inference systems, where the assumptions and conclusions of inferences are considered more or less likely, or various quantitative approaches to meaning based on empirical studies, or optimality systems, which ordinarily do not account for inference patterns, but rank possible interpretations according to a given set of constraints of diverse kinds. In such systems form does not directly and functionally determine meaning and hence the role of inference patterns, if
any, is quite different from the core role it plays in the logical, deductive systems of reasoning which constitute our topic. Of course, as with any well-defined domain of scientific investigation, such systems too may be further formalized and perhaps eventually even axiomatized as a logical, formal system. But in the current state of linguistics their methods are often informal, appealing to semantic notions such as meaning only implicitly, resulting in fragmented theories which, however unripe for formalization, may still provide genuinely interesting and novel linguistic results.

In section 2 of this article the best known logical system, first order logic, is discussed. It served as the point of departure of natural language semantics, in spite of its apparent limitations and idealizations. In section 3 the classical definition of a formal system is specified with its associated notions of proof and theorem, and these are related to its early applications in linguistic grammars. The hierarchy of structural complexity generated by the various kinds of formal grammars is still seen to direct the current quest in natural language semantics for proper characterizations of cognitive complexity. Meta-logical properties such as decidability of the set of theorems are introduced. In section 4 the general notion of a semantic model with its definition of truth conditions is presented, without formalization in set-theoretic notation, serving primarily to distinguish conceptually between contingent truth (at a world) in a given model and logical truth in all models, which represents valid forms of reasoning. Completeness is presented as a desirable, but perhaps not always feasible, property of logical systems that have attained a perfect harmony between their syntactic and semantic sides.

Section 5 discusses how the syntactic and semantic properties of pronouns first provided a strong impetus for using formal methods in separate linguistic research programs, which have later converged on a more integrated account of their behavior, currently still very much under investigation. In section 6 we review a variety of proof-theoretic methods that have been developed in logic, to understand how each of them had an impact on formal linguistic theories later. This is where constraints on derivational complexity and resource-bounded generation of expressions are seen to have their proper place, issues that have only gained in importance in contemporary linguistic research. Section 7 presents an overview of the semantic methods that have been developed in logic over the past century, to see how they have influenced the development of natural language semantics as a flourishing branch of cognitive science, where formal methods provide an important contribution to its scientific methods. The final section 8 contains the conclusion of this article: that the application of formal methods deriving from logical theories has greatly contributed to the development of linguistics as an independent academic discipline with an integrated, interdisciplinary research agenda. Seeking a continued convergence of syntactic and semantic methods in linguistic applications will serve to develop new insights into the cognitive capacity underlying much of human information processing.
2. First order logic and natural language

Many well known systems of first order logic (FOL), in which quantifiers may only range over individuals, i.e. not over properties or sets of individuals, have been designed to study inference patterns deriving from the meaning of the classical Boolean connectives of conjunction (… and …), disjunction (… or …), negation (not …), conditionals (if … then …) and biconditionals (… if and only if …), besides the universal (every N) and existential (some N) quantifiers. Syntactically these FOL systems differed in the number
of axioms, logical vocabulary or inference rules they admitted; some were optimized for simplicity of proofs, others were more congenial to the novice user, relying on the intuitive meaning of the logical vocabulary (Boolos & Jeffrey 1980; Keenan & Faltz 1985; Link 1991).

The strongly reformist attitudes of the early 20th century logicians Bertrand Russell, Gottlob Frege and Alfred Tarski, today considered the great-grandfathers of modern logic, initially steered FOL developments away from natural language, as it was regarded as too ambiguous, hopelessly vague, or content- and context-dependent. Natural language was even considered prone to paradox, since you can explicitly state that something is or is not true. This is what is meant when natural language is accused of "containing its own truth-predicate", for the semantics of the predicate "is true" cannot be formulated in a non-circular manner for a perfectly grammatical, but self-referential sentence like This statement is false, which is clearly true just in case it is false. Similarly treacherous forms of self-referential acts with circular truth-conditions are found in simple statements such as I am lying, resembling the ancient Cretan Liar Paradox, and the syntactically overtly self-referential, but completely comprehensible The claim that this sentence contains eleven words is false. Such cases led later model-theoretic logicians to develop innovative models of self-reference and logically sound forms of circularity, abandoning the need for rock-bottom atomic elements of sets, as classical set theory had always required (Barwise & Etchemendy 1987).

Initially FOL was advocated as a 'good housekeeping' act in understanding elementary forms of reasoning in natural language, although little serious attention was paid to just how natural language expressions should be systematically translated into FOL, given their syntactic constituent structure. This raised objections to what linguists often called 'miraculous translation', appealing to implicit intuitions on truth-functional meaning to which only trained logicians apparently had easy access. Although this limited logical language of FOL was never intended to come even close to modeling the wealth of expressive power in natural languages, it was considered the basic Boolean algebraic core of the logical inference engine.

The descriptive power of FOL systems was subsequently enriched in important ways to facilitate the Fregean adage of compositional translation by admitting lambda abstraction, representing the denotation of a predicate as a set A by a characteristic function f_A that tells you for each element d in the underlying domain D whether or not d is an element of that set A, i.e. f_A(d) is true if and only if d is an element of the set A (Partee, ter Meulen & Wall 1990). Higher order quantification with quantifiers ranging over sets of properties, or sets of those ad infinitum, required a type structure, derived from the abstract lambda calculus, a universal theory of functional structure. Typing formal languages resembled in some respects Bertrand Russell's solution to avoid vicious circularity in logic (Barendregt 1984; Carpenter 1997; Montague 1974; Link 1991). More complex connectives or other truth-functional expressions and operators were added to FOL, and situations or worlds were added to the models to analyze modal, temporal or epistemic concepts (van Benthem 1983; Carnap 1947; Kripke 1972; Lewis 1983; McCawley 1981).
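The notion of a characteristic function, and the higher-order treatment of quantifiers that lambda abstraction makes possible, can be rendered directly in a typed functional language. The following Haskell fragment (a sketch over a small finite domain; all names are illustrative) mirrors the definitions just given:

-- A predicate denotation as a characteristic function over a finite domain D.
type Domain = [Int]
type Pred   = Int -> Bool

domain :: Domain
domain = [1 .. 5]

-- The set A = {2,4}, represented by its characteristic function f_A.
fA :: Pred
fA d = d `elem` [2, 4]

-- Lambda abstraction lets quantifiers take predicates as arguments:
-- 'every' and 'some' as higher-order functions on characteristic functions.
every', some' :: Pred -> Pred -> Bool
every' p q = all (\d -> not (p d) || q d) domain
some'  p q = any (\d -> p d && q d) domain

main :: IO ()
main = do
  print (fA 2, fA 3)       -- (True,False): 2 is in A, 3 is not
  print (every' fA even)   -- True: every element of A is even
  print (some' fA (> 3))   -- True: some element of A exceeds 3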
The formal methods of FOL semantics varied from classical Boolean full bi-valuation with total functions that ultimately take only true and false as values, to three- or more-valued models in which formulas may not always have a determined truth value, initially proposed to account for presupposition failure of definite descriptions that did not have a referent in the domain of the intended model. Weaker logical systems
were also proposed, admitting fewer inference rules. The intuitionistic logics are perhaps the best known of these, rejecting for philosophical and perhaps conceptual reasons the classical law of double negation and the correlated rule of inference that allows one to infer a conclusion if its negation has been shown to lead to a contradiction. In classical FOL the truth-functional definition of the meaning of negation as set-theoretic complement makes double negation logically equivalent to no negation at all; e.g. It is not the case that every student did not hand in a paper should at least truth-functionally mean the same as Some student handed in a paper (Partee, ter Meulen & Wall 1990). Admitting partial functions that allow quantifiers to range over possible extensions of their already fixed, given range ventured into higher order methods that were innovative for logic as well, driven by linguistic considerations of pronoun resolution in discourse and various sophisticated forms of quantification found in natural language, to which we return below (Chierchia 1995; Gallin 1975; Groenendijk & Stokhof 1991; Kamp & Reyle 1993).
3. Formal systems, proofs and decidability

Whatever its exact language and forms of reasoning characterized as valid, any particular logical system must adhere to some very specific general requirements if it is to count as a formal system. A formal system must consist of four components:

(i) a lexicon specifying the terminal expressions or words, and a set of non-terminal symbols or categories;
(ii) a set of production rules which determine how the well-formed expressions of any category of the formal language may be generated;
(iii) a set of axioms, expressions of the lexicon that are considered primitive;
(iv) a set of inference rules, determining how expressions may be manipulated.
A formal system may be formulated purely abstractly without being intended as a representation of anything, or it may be designed to serve as a description or simulation of some domain of real phenomena or, as intended in linguistics, to model aspects of empirical, linguistic data.

A formal proof is the product of a formal system, consisting of (i) axioms, expressions of the language that serve as intuitively obvious or in any case unquestionable first principles, assumed to be true or taken for granted no matter what, and (ii) applications of the rules of inference that generate the sequence of steps in the proof, resulting in its conclusion, the final step, also called a theorem.

The grammar of a language, whether logical or natural, is a system of rules that generates all and only the grammatical or well-formed sentences of the language. But this does not mean that we can always get a definite answer to the general question whether an arbitrary string belongs to a particular language, something a child learning its first language may actually need. There is no general decision procedure determining for any arbitrary given expression whether it is or is not derivable in any particular formal system (Arbib 1969; Davis 1965; Savitch, Bach & Marsh 1987). However, this question is provably decidable for sizable fragments of natural language, even if some form of higher order quantification is permitted (Nishihara, Morita & Iwata 1990; Pratt-Hartmann 2003; Pratt-Hartmann & Third 2006).
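The four components, and the generation of theorems by rule application, can be illustrated with a deliberately trivial formal system (a variant of Hofstadter's well-known MIU system, chosen here purely for illustration and not drawn from the logical literature under discussion): an alphabet {M, I, U}, one axiom, and two inference rules, with theorems enumerated by bounded search.

import Data.List (nub)

-- A toy formal system: one axiom and two string-rewriting inference rules.
axiom :: String
axiom = "MI"

rules :: String -> [String]
rules s =
  [ s ++ "U" | last s == 'I' ] ++              -- Rule 1: xI => xIU
  [ 'M' : rest ++ rest | ('M':rest) <- [s] ]   -- Rule 2: Mx => Mxx

-- Theorems derivable in at most n rule applications; since each step is
-- finite and effective, this bounded fragment is trivially decidable.
theorems :: Int -> [String]
theorems 0 = [axiom]
theorems n = nub (prev ++ concatMap rules prev)
  where prev = theorems (n - 1)

main :: IO ()
main = print (theorems 2)
-- ["MI","MIU","MII","MIUIU","MIIU","MIIII"]

For unrestricted rewrite systems in general, however, no such decision procedure for derivability exists – the point of the undecidability results just cited.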
complexity, in an attempt to characterize formal counterparts to experimentally obtained results on human limitations of cognitive resources, such as memory or processing time. Automated theorem provers certainly assist people in detecting proofs for complex theorems or huge domains. People actually use smart, but still little understood, heuristics in finding derivations, trimming down the search space of alternative variable assignments, for example, by marked prosody and intonational meaning. Much of the contemporary research in applying formal methods to natural language is motivated by the attempt to restrain the complexity or computational power of the formal systems to less powerful, learnable and decidable fragments. In such cognitively realistic systems the inference rules constrain the search space of valuation functions or exploit limited resources in linguistically interesting and insightful ways. The general research program of determining the generative or computational complexity of natural languages first produced, in the late 1950s, the well known Chomsky Hierarchy of formal language theory, which classifies formal languages, their corresponding automata and the phrase-structure grammars that generate the languages as regular, context-free, context-sensitive or unrestricted rewrite systems (Arbib 1969; Gross & Lentin 1970; Hopcroft & Ullman 1979). Initially, Chomsky (1957, 1959) claimed that the rich structure of natural languages, with their complex forms of agreement, required the strength of unrestricted rewrite systems, or Turing machines. Linguists quickly realized that for grammars to be learnable, and for them to claim to model our human linguistic competence, whether or not innate, in any cognitively realistic way, their generative power should be substantially restricted (Peters & Ritchie 1973). Much of the contemporary research on resource-bounded categorial grammars and the economy of derivation in minimalist generative grammar, comparing the formal complexity of derivations, is still seeking to distill universal principles of natural languages, considered cognitive constants of human information processing, linguistically expressed in structurally very diverse phenomena (Ades & Steedman 1982; Moortgat 1996; Morrill 1994, 1995; Pentus 2006; Stabler 1997).
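Standard textbook illustrations of the four classes of the hierarchy, in conventional rewrite notation, may help to fix ideas:

regular:            S → aS | a          (the strings a, aa, aaa, …)
context-free:       S → aSb | ab        (the strings aⁿbⁿ, beyond any regular grammar)
context-sensitive:  the language aⁿbⁿcⁿ, beyond any context-free grammar
unrestricted:       any language a Turing machine can enumerate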
4. Semantic models, validity and completeness

On the semantic side of formal methods, the notion of a model M plays a crucial role in the definition of truth-conditions for formulas generated by the syntax. The meaning of the Boolean connectives is considered not to vary from one model to the next, as they are set apart in the vocabulary as the closed class of logical expressions. The conjunction (… and …) is true just in case each of the conjuncts is; the disjunction (… or …) is false only in case neither disjunct is true. The conditional is false only in case the antecedent (if … clause) is true but the consequent (then … clause) is false. The bi-conditional (… if and only if …) is true just in case the two parts have the same truth value. Negation (not …) simply reverses the truth value of the expression it applies to. For the interpretation of the quantifiers an additional tool is required, a variable assignment function f, which assigns to each variable x in the vocabulary of variables a referent d, an element of the domain D of the model M. Proper names and the descriptive vocabulary containing all kinds of predicates, i.e. adjectives, nouns, and verbs, are interpreted by a function P, given with the model M, that specifies who is the bearer of a name, who has the property corresponding to a one-place predicate, and who stands in the relation corresponding to a relational predicate in the given model M.
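These truth-functional clauses translate directly into Boolean operations, as the following minimal Python sketch shows (the function names are chosen for the illustration):

def conj(p, q): return p and q        # true just in case both conjuncts are
def disj(p, q): return p or q         # false only if neither disjunct is true
def cond(p, q): return (not p) or q   # false only if antecedent true, consequent false
def bicond(p, q): return p == q       # true just in case both sides agree
def neg(p): return not p              # reverses the truth value

# the conditional is false in exactly one of its four cases:
assert [cond(p, q) for p in (True, False) for q in (True, False)] == [True, False, True, True]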
Linguists were quick to point out that in natural language at least conjunctions and disjunctions connect not only full clauses, but also noun phrases, which is not always reducible to a sentential connective, e.g. John and Mary met ≠ John met and Mary met. This kind of linguistic criticism quickly led to generalizations of the Boolean operations in a logical system, where variable assignment functions are generalized and given the flexibility to assign referents of appropriate complex types in a richer higher order logic, but such complexities need not concern us here. A formal semantic model M hence consists of: (i) a domain D of semantic objects, sometimes classified into types or given a particular internal structure, and (ii) a function P that assigns appropriate denotations to the descriptive vocabulary of the language interpreted. If the language also contains quantifiers, the model M comes equipped with a given variable assignment function g, often considered arbitrary, to provide a referent, an element of D, for all free variables. The given variable assignment function is sometimes considered to represent the current context in contemporary systems that investigate context dependencies and indexicals. In FOL the universal quantifier every x [N(x)] is interpreted as true in the model M if all possible alternative assignment functions g′ to the variable x that it binds provide a referent d in the denotation of N. So not only does the given variable assignment function g provide a d in the denotation of N, but so do all alternative assignments g′ that may provide another referent but could equally well have been considered the given one. For the existential quantifier some x [N(x)] only one such variable assignment, the given one or another alternative variable assignment function, suffices to interpret the quantifier as true in the model M. An easy way to understand the effect of the interpretation of quantifiers in clauses is to see that for a universal NP the set denoted by N should be a subset of the set denoted by the VP, e.g. every student sings requires that the singers include all the students. Similarly, the existential quantifier requires that the intersection of the denotation of the N and the denotation of the VP is not empty, e.g. some student sings means that among the singers there is at least one student. Intensional models generalize this elementary model theory for extensional FOL to characterize truth in a model relative to a possible world, situation or some other index, representing, for instance, temporal or epistemic variability. Intensional operators typically require a clause in their scope to be true at all or at only some such indices, mirroring strong, universal and existential quantifiers respectively. The domain of intensional models must be enriched with the interpretation of the variables referring to such indices, if such meta-variables are included in the language to be interpreted. Otherwise they are considered to be external to the model, added as a set of parameters indexing the function interpreting the language. Domains may also be structured as (semi)lattices or other partial orders, which has proven useful for the semantics of plurals, mass terms and temporal reference to events, but such further variations on FOL must remain outside the scope of the elementary exposition in this article.
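The equivalence of the assignment-based clauses with the subset and intersection conditions can be checked directly; the toy domain and denotations below are invented for the purpose:

D = {'ann', 'bea', 'carl'}
student = {'ann', 'bea'}
sings = {'ann', 'bea', 'carl'}

# every x [student(x) -> sings(x)]: every alternative assignment of a referent d to x verifies the body
def every(D, N, VP): return all(d not in N or d in VP for d in D)

# some x [student(x) & sings(x)]: one alternative assignment suffices
def some(D, N, VP): return any(d in N and d in VP for d in D)

assert every(D, student, sings) == (student <= sings)    # subset condition
assert some(D, student, sings) == bool(student & sings)  # non-empty intersection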
To characterize logical truths, reflecting the valid reasoning patterns, one simply generalizes over all formulas true in all logically possible models, obtaining those formulas that must be true as a matter of necessity or form alone, due to the meaning assigned to their logical vocabulary, irrespective of what happens to be factually the case in the models. By writing out syntactic proofs with all their premises conjoined as the antecedent of a conditional whose consequent is the conclusion, one may test in
a semantic way the validity of the inference. If such a conditional cannot be falsified, the proof is valid, and vice versa. A semantic interpretation of a language, natural or otherwise, is considered formal only if it provides such precise logical models in which the language can be systematically interpreted. Accordingly, contingent truths, which depend for their truth on what happens to be the case in the given model, are properly distinguished from logical truths, which can never be false in any possible model, because the meaning of the logical vocabulary is fixed outside the class of models and hence remains invariable. A logical system in which every proof of a theorem can be proven to correspond to a semantically valid inference pattern, and vice versa, is called a complete system, as FOL is. If a formal system contains statements that are true in every possible model but cannot be proven as theorems within that system by the admitted rules of inference and the given axiom base, the system is considered incomplete. Familiar systems of arithmetic have been proven to be incomplete, but that does not disqualify them from sound use in practice, and they definitely still serve as a valuable tool in applications.
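Schematically, for premises P1, …, Pn and conclusion C, the semantic test and the completeness property just described amount to:

P1, …, Pn ⊢ C is a valid proof   iff   (P1 ∧ … ∧ Pn) → C is true in every model
⊢ A (A is a theorem)   iff   A is true in every model   (soundness and completeness of FOL)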
5. Formalizing linguistic methods

Given the fundamental distinction between syntactic, proof-theoretic and semantic, model-theoretic characterizations of reasoning in theories that purport to model meaning and interpretation, the general question arises what (dis)advantages these two methodologically distinct approaches may respectively have for linguistic applications. Although in this generality the question may not be answerable in a satisfactory way, it is clear that at least at the outset of linguistics as its own, independent scientific discipline, quite different and mostly disconnected, if not mutually antagonistic, research communities were associated with the two strategies. This separation of minds was only too familiar to logicians from the early days of modern logic, where proof theorists and model-theoretic semanticists often drew blood in their disputes on the priority, conceptual or otherwise, of their respective methods. Currently, seeking convergence of research issues in syntax and semantics is much more en vogue, and moving easily between syntactic and semantic methods has already proven to pay off in obtaining the best linguistic explanations and new insights. One of the best examples of how linguistic questions could fruitfully be addressed by both syntactic and semantic formal methods, at first separately, but later in tandem, is the thoroughly studied topic of binding pronouns and its associated concept of quantifier scope. Syntacticians focused primarily on the clear configurational differences between free (1a), bound (1b), and reflexive pronouns (1c), which all depend in different ways on the subject noun phrase that precedes and commands them in the same clause. Co-indexing was their primary method of indicating binding, though no interpretive procedure was specified with it, as meaning was considered elusive (Reinhart 1983a, 1983b).

(1) a. [Every student]i who knows [John/a professor]j loves [him]*i, j.
    b. [Every student]i loves [[his]i teacher]*i, j.
    c. [Every student]i who knows [John/a professor]j loves [himself]i, *j.
This syntactic perspective on the binding behavior of pronouns within clauses had deep relations to constraints on movement as a transformation on a string, to which we return below. It limited its consideration of data to pronominal dependencies among clauses within sentences, disregarding the fact that singular universal quantifiers cannot bind pronouns across sentential boundaries (2a,b), whereas proper names, plurals and indefinite noun phrases typically can (2c).

(2) a. [Every student]i handed in a paper. [He]*i, j passed the exam.
    b. [Every student]i who handed in a paper [ t ]i passed the exam.
    c. [John/A student/All students]i handed in a paper. [He/they]i, j passed the exam.

Semanticists had to understand how pronominal binding in natural language was in some respects similar to, but in other respects quite different from, the ordinary variable binding of FOL. From a semantic perspective the first puzzle was how ordinary proper names and other referring expressions, including freely referring pronouns, which were considered to have no logical scope and hence could not enter into scope ambiguities, could still force pronouns to corefer (3a), even in intensional contexts (3b).

(3) a. John/he loves his mother.
    b. John believes that Peter loves his mother.
    c. Every student believes that Peter loves his mother.

In (3a) the reference of the proper name John or of the contextually referring free pronoun he fixes the reference of the possessive pronoun his. In (3b) the possessive pronoun his in the subordinate clause can be interpreted as dependent on Peter, but equally easily as dependent upon John in the main clause. If FOL taught semanticists to identify bound variables with those variables that were syntactically within the scope of an existential or universal quantifier, proper names had to be reconsidered as quantifiers having scope, yet referring rigidly to the same individual, even across intensional contexts. By considering quantifying in as a primary semantic technique to bind variables simultaneously, first introduced in Montague (1974), semanticists fell into the logical trap of identifying binding with configurational notions of linear scope. This fundamental connection ultimately had to be abandoned, when the infamous Geach sentence (4)

(4) Every farmer who owns a donkey beats it.

made syntacticians as well as semanticists realize that existential noun phrases in restrictive relative clauses of universal subject noun phrases could inherit, as it were, their universal force, creating for the interpretation of (4) cases of farmers and donkeys over which the quantifier is supposed to range. This is called unselective binding, first introduced by David Lewis (Lewis 1972, 1983), but brought to the forefront of research in semantic circles in Kamp (1981). Generalizing quantifying in as a systematic procedure to account also for intersentential binding of pronouns, as in (2), was soon realized to produce counterintuitive results as well. This motivated an entirely new development of dynamic semantics, where the interpretation of a given sentence partially determines the interpretation of the next sentence in a text and pronominal binding was conceptually once
and for all separated from the logical or configurational notion of scope (Groenendijk & Stokhof 1991; Kamp & Reyle 1993). The interested reader is referred to article 38 (Dekker) Dynamic semantics and article 40 (Büring) Pronouns for further discussion and details of the resulting account.
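For concreteness, the truth conditions standardly assigned to the Geach sentence (4) under unselective binding quantify over farmer–donkey pairs:

∀x ∀y [(farmer(x) ∧ donkey(y) ∧ own(x, y)) → beat(x, y)]

Here the indefinite a donkey surfaces with universal rather than existential force, which is exactly what a direct translation respecting linear FOL scope fails to deliver.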
6. Linguistic applications of syntactic methods

In logic itself quite a few different flavors of formal systems had been developed based on formal proof-theoretic (syntactic) or model-theoretic (semantic) methods. The best known on the proof-theoretic side are axiomatic proof theory (Jeffrey 1967), Gentzen sequent calculus (Gentzen 1934), combinatorial logic (Curry 1961), and natural deduction (Jeffrey 1967; Partee, ter Meulen & Wall 1990), each of which has made distinct and significant contributions to various developments in semantics. Of the more semantically flavored developments the most familiar are Tarski's classical notion of satisfaction in models (Tarski 1956), the Lambek and other categorial grammars (Ajdukiewicz 1935; Bar-Hillel 1964; van Benthem 1987, 1988; Buszkowski 1988; Buszkowski & Marciszewski 1988; Lambek 1958; Oehrle, Bach & Wheeler 1988), tightly connecting syntax and semantics via type theory (Barendregt 1984; van Benthem 1991; Carpenter 1997; Morrill 1994), Beth's tableaux method (Beth 1970), and game-theoretic semantics (Hintikka & Kulas 1985), besides various intensional logical systems, enriched with indices representing possible worlds or other modal notions, which have each also led to distinctive semantic applications (Asher 1993; van Benthem 1983; Cresswell 1985; Montague 1974). The higher order enrichment of FOL with quantification over sets, properties of individuals or properties of properties in Montague Grammar (Barwise & Cooper 1981; van Benthem & ter Meulen 1985; Montague 1974) at least initially directed linguists' attention away from the global linguistic research program of seeking to constrain the generative capacity and computational complexity of the formal methods in order to model realistically the cognitive capacities of human language users, as it focused everyone's attention on compositionality, type theory and type shifting principles, and adopted a fully generalized functional structure. The next two sections selectively review some of the formal methods of these logical systems that have led to important semantic applications and innovative developments in linguistic theory. An axiomatic characterization of a formal system is the best way to investigate its logic, as it matches the system to its semantics in order to prove its soundness and completeness straightforwardly, i.e. to demonstrate that every provable theorem is true in all models (semantically valid) and vice versa. For FOL a finite, in fact small, number of logically true expressions suffices to derive all and only the valid expressions, i.e. FOL is provably complete. But in actually constructing new proofs an axiomatic characterization is much less useful, for it does not offer any reliable heuristics to guide an effective proof search. The main culprit is the rule of inference guaranteeing the transitivity of composition: to derive A → C from A → B and B → C requires finding an expression B which has no trace in the conclusion A → C. Since there are infinitely many possible such Bs, one cannot search for them exhaustively. The Gentzen sequent calculus (Gentzen 1934) is known to be equivalent to the axiomatic characterization of FOL, and it was proven that any proof of a theorem using the transitivity of composition, or its equivalent, the so-called Cut inference in Gentzen sequent calculus, may be transformed into a proof that avoids using this rule.
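In sequent notation, one standard formulation of this Cut rule is:

from Γ ⊢ A and A, Δ ⊢ C, infer Γ, Δ ⊢ C    (Cut)

The cut formula A, like the intermediate B above, need not occur in the conclusion, which is exactly what makes an unguided search for it hopeless.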
Therefore, transitivity of composition is considered 'logically harmless', since the Gentzen sequent calculus effectively limits the search for a proof of any theorem to the expressions constituting the theorem to be derived, the so-called subformula property. The inference rules in Gentzen sequent calculus are hence guaranteed to decompose the complexity of the expressions in a derivation, making the question whether an expression is a theorem decidable. But from the point of view of linguistic applications the original Gentzen calculus harbored another drawback as a generative system. It allowed structural inference rules that permuted the order of premises in a proof or rebracketed any triple (associativity), making it hard to capture linguistically core notions of dominance, governance or precedence between constituents of a sentence to be proven grammatical. To restore constitutive order, the premises had to be regarded as forming a linearly ordered sequence or n-tuple, or a multiset (in mathematics, a multiset, or bag, is a generalization of a set in which a member may occur more than once, while each member of a set has only one, unique occurrence). This is now customary within the current categorial grammars deriving from Gentzen's system, reviewed below among the semantic methods that affected developments in natural language semantics (Moortgat 1997). Perhaps the most familiar proof-theoretic characterization of FOL is Natural Deduction, in which rules of inference systematically introduce or eliminate the connectives and quantifiers (Jeffrey 1967; Partee, ter Meulen & Wall 1990). This style of constructing proofs is often taught in introductory logic classes, as it does provide certain intuitive heuristics for constructing proofs and closely follows the truth-conditional meaning given to the logical vocabulary, while systematically decomposing the conclusion and the given assumptions until atomic conditions are obtained. Proving a theorem with natural deduction rules still requires a certain amount of ingenuity and insight, which may be trained by practice. But human beings will never attain logical omniscience, i.e. the power to find each and every possible proof of a theorem. Failure to find a proof may mean either that you have to work harder at finding it or that the expression is not a theorem, but you never know for sure which situation you are in (Boolos & Jeffrey 1980; Stalnaker 1999). The fundamental demand that grammars of natural languages must realistically model the human cognitive capacities to produce and understand language has led to a wealth of developments searching for ways to cut down the generative power of formal grammars and their corresponding automata. Early in the development of generative grammar, the unrestricted deletion transformation was quickly identified as the most dangerously powerful operation in an unrestricted rewrite system or Turing machine, as it permitted the deletion of arbitrary expressions that were redundant in generating the required surface expression (Peters & Ritchie 1973). Although deletion as such has still not been eliminated altogether as a possible effect of movement, it is now always constrained to leave a trace or some other formal expression, making deleted material recoverable. Hence no expressions may simply disappear in the course of a derivation.
The Empty Category Principle (ECP) substantiated this requirement further, stating that all traces of moved noun phrases and variables must be properly governed (Chomsky 1981; Haegeman 1991; van Riemsdijk & Williams 1986). This amounts to requiring them to be c-commanded by the noun phrase interpreted as binding them (c-command is a binary relation between nodes in a tree structure, defined as follows: node A c-commands node B iff A ≠ B, A does not dominate B and B does not dominate A, and every node that dominates A also dominates B).
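This relational definition is directly computable; a minimal sketch, with an invented toy tree in which the subject NP1 c-commands the object NP2 but not vice versa:

tree = {'S': ['NP1', 'VP'], 'VP': ['V', 'NP2']}   # children of each node
nodes = {'S', 'NP1', 'VP', 'V', 'NP2'}

def dominates(a, b):                              # a properly dominates b
    return any(c == b or dominates(c, b) for c in tree.get(a, []))

def c_commands(a, b):
    if a == b or dominates(a, b) or dominates(b, a):
        return False
    # every node dominating a must also dominate b
    return all(dominates(n, b) for n in nodes if dominates(n, a))

assert c_commands('NP1', 'NP2') and not c_commands('NP2', 'NP1')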
Extractions of an adjunct phrase out of a wh-island as in (5)

(5) *Howi did Mary ask whether someone had fixed the car ti?

or moving wh-expressions out of a that-clause as in (6)

(6) *Whoi does Mary believe that ti will fix the car?

are clearly ungrammatical, because they violate this ECP condition: the traces ti are co-indexed with, and hence intended to be interpreted as bound by, expressions outside their proper governance domain. The classical logical notion of quantifier scope is much less restricted, as quantifiers may ordinarily bind variables in intensional contexts without raising any semantic problems of interpretation, as we saw above in (3c) (Dowty, Wall & Peters 1981; Partee, ter Meulen & Wall 1990). For instance, in intensional semantics the sentence (7)

(7) Mary believes that someone will fix the car.

has at least one so-called 'de re' interpretation, in which someone is 'quantified in' and assigned a referent in the actual world, of whom Mary believes that he will fix the car in a future world in which Mary's beliefs have come true. Such wide scope interpretations of noun phrases occurring inside a complementizer that-CP are considered in generative syntax a form of opaque or hidden movement at LF, regarded as a matter of semantic interpretation, and hence external to grammar, i.e. not a question of the derivation of syntactic surface word order (May 1985). Surface wide scope wh-quantifiers in such 'de re' constructions binding overt pronouns occurring within the intensional context, i.e. within the that-clause, are perfectly acceptable, as in (8).

(8) Of whomi does Mary believe that hei will fix the car?

Anyone intending to convey that Mary's belief regarded a particular person, rather than someone hypothetically assumed to exist, would be wise to use such an overt wide scope clause as in (8), according to the background pragmatic view that speakers should select the optimal syntactic form to express their thought and to avoid any possible misunderstanding.
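The two readings of (7) can be made explicit in standard intensional notation, with believe taking a propositional argument and tense suppressed:

de re:     ∃x [person(x) ∧ believe(mary, fix(x))]
de dicto:  believe(mary, ∃x [person(x) ∧ fix(x)])

On the de re reading the existential quantifier takes scope outside the belief context; on the de dicto reading it remains inside it, and no particular individual need exist of whom the belief holds.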
This theoretical division of labor between tangible, syntactic movement to generate proper surface word order and intangible movement to allow disambiguation of the semantic scope of quantifiers has perhaps been the core bone of contention over many years between, on the one hand, the generative formal methods, in which semantic ambiguity does not have to be syntactically derived, and, on the other hand, categorial grammar and its later developments in Montague grammar, which required full compositionality, i.e. syntactic derivation must determine semantic interpretation, disambiguating quantifier scope by syntactic derivation. In generative syntax every derivational difference had to be meaningful, however implicit this core notion remained, but in categorial grammars certain derivational differences could be provably semantically equivalent and hence meaningless, often denigratingly called the problem of spurious ambiguities. To characterize the logical equivalence of syntactically distinguished derivations required an independent semantic characterization of their truth conditions, considered a suitable task for logic, falling outside the scope of linguistic grammar proper, according to most generativists. This problem of characterizing which expressions with different derivational histories would be true in exactly the same models, and hence would be logically equivalent, simply does not arise in generative syntax, as it hides semantic ambiguities as LF movement not reflected in surface structure, avoiding syntactic disambiguation of semantic ambiguities. A much stronger requirement on grammatical derivations is to demand methodologically that all and only the constitutive expressions of the derived expression must be used in a derivation, often called surface compositionality (Cresswell 1985; Partee 1979). This research program aims to eliminate from linguistic theory anything that is not absolutely necessary. Chomsky (1995) claimed that both deep structure, completely determined by lexical information, and surface structure, derived from it by transformations, may be dispensed with. Given that a language consists of expressions which match sound structure to representations of their meaning, Universal Grammar should consist merely of a set of phonological, semantic, and syntactic features, together with an algorithm to assemble features into lexical expressions and a small set of operations, including move and merge, that constitute syntactic objects, the computational system of human languages. The central thesis of this minimalist framework is that the computational system is the optimal, most simple solution to legibility conditions at the phonological and semantic interfaces. The goal is to explain all the observed properties of languages in terms of these legibility conditions and properties of the computational system. Often advocated since the early 1990s is the Lexicalist Hypothesis, requiring that syntactic transformations may operate only on syntactic constituents and can only insert or delete designated elements, but cannot be used to insert, delete, permute, or substitute parts of words. This Lexicalist Hypothesis, which is certainly not unchallenged even among generativists, comes in two versions: (a) a weak one, prohibiting transformations from being used in derivational morphology, and (b) a strong version, prohibiting the use of transformations in inflection. It constitutes the most fundamentally challenging attempt from a syntactic perspective to approach surface compositionality as seen in the well-known theories of natural language semantics to which we now turn.
7. Linguistic applications of semantic methods

The original insight of the Polish logician Alfred Tarski was that the truth-conditional semantics of any language must be stated recursively in a distinct meta-language, in terms of the satisfaction of formulas consisting of predicates and free variables, to avoid the paradoxical forms of self-reference alluded to above (Tarski 1956; Barwise & Etchemendy 1987). By defining satisfaction directly, and deriving truth conditions from it, a proper recursive definition could be formulated for the semantics of any complex expression of the language. For instance, in FOL an assignment satisfies the complex sentence S and S′ if and only if it satisfies S and it also satisfies S′. For universal quantification an assignment f satisfies the sentence Every x sings if and only if every other assignment f′, assigning the same things as f to all variables other than x, satisfies sings(x), i.e. the value of every such f′(x) is an element in the set of singers.
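The recursive character of the definition is easy to make concrete; in the following sketch formulas are nested tuples, and the quantifier clause runs through the x-variants of the given assignment (the representation is chosen for the example):

M = {'D': {'a', 'b'}, 'I': {'sings': {'a', 'b'}, 'student': {'a'}}}

def satisfies(M, g, phi):
    op = phi[0]
    if op == 'pred':                    # e.g. ('pred', 'sings', 'x')
        return g[phi[2]] in M['I'][phi[1]]
    if op == 'and':
        return satisfies(M, g, phi[1]) and satisfies(M, g, phi[2])
    if op == 'every':                   # ('every', 'x', body): every x-variant satisfies body
        return all(satisfies(M, {**g, phi[1]: d}, phi[2]) for d in M['D'])
    raise ValueError(op)

# Every x sings is true no matter which assignment happens to be the given one:
assert satisfies(M, {'x': 'b'}, ('every', 'x', ('pred', 'sings', 'x')))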
Tarski's definition of satisfaction is compositional, since whether an assignment satisfies a complex expression depends only on the syntactic composition of its constituents and their semantics, as Gottlob Frege had originally required (Frege 1892). Truth conditions can subsequently be stated relative to a model and an arbitrary given assignment, which assigns all free variables their reference. Truth cannot be compositionally defined directly for Every x sings in terms of the truth of sings(x), because sings(x) has a free variable x, so its truth depends on which assignment happens to be the given one. The Tarskian truth-conditional semantics of FOL also provided the foundation for natural language semantics, limited to fragments that contain no truth or falsity predicate, no verbs like to lie, and no other expressions directly concerned with veridicality. The developments of File Change Semantics (Heim 1982), Discourse Representation Theory (Kamp 1981; Kamp & Reyle 1993), Situation Theory (Barwise & Perry 1983; Seligman & Moss 1997) and dynamic Montague Grammar (Chierchia 1995; Groenendijk & Stokhof 1991), which all allowed free variables or reference markers representing certain uses of pronouns to be interpreted as if bound by a widest scope existential quantifier, even if they occurred in different sentences, fully exploit this fundamental Tarskian approach to compositional semantics by satisfaction. Other formal semantic methods for FOL were subsequently developed in the second half of the 20th century as alternatives to Tarskian truth-conditional semantics. Beth (1970) designed a tableaux method in which a systematic search for counterexamples to the assumed validity of a reasoning pattern, seeking to verify the premises but falsify the conclusion, leads in a finite number of decompositional steps either to such a counterexample, if one exists, or to closure, tantamount to a proof that no such counterexample exists (Beth 1970; Partee, ter Meulen & Wall 1990). This semantic tableaux method provided a procedure to enumerate the valid theorems of FOL, because it requires only a finite number of substitutions in deriving a theorem: (i) the expression itself, (ii) all of its constituent expressions, and (iii) certain simple combinations of the constituents, depending on the premises. Hence any tableau for a valid theorem eventually closes, and the method produces a positive answer. It does not, however, constitute a decision procedure for testing the validity of any derivation, since it does not enumerate the set of expressions that are not theorems of FOL. Game-theoretic semantics characterizes the semantics of FOL and of richer, intensional logics in terms of rules for playing a verification game between a truth-seeking player and falsification-seeking, omniscient Nature (Hintikka & Kulas 1985; Hintikka & Sandu 1997; Hodges 1985). Its interactive and epistemic flavor made it especially suitable for the semantics of interrogatives, in which requests for information are acts of inquiry resolved by the answerer, who provides the solicited information (Hintikka 1976). Such information-theoretic methods are currently further explored in the generalized context of dynamic epistemic logic, where communicating agents each have access to partial, private and publicly shared information and seek to share or hide the information they may have, depending on their communicative needs and intentions (van Ditmarsch, van der Hoek & Kooi 2007).
Linguistic applications to the semantics of dialogue or multi-agent conversations in natural language already seem promising (Ginzburg & Sag 2000). It was first shown in Skolem (1920) how second order methods could provide novel tools for logical analysis, by rewriting any linear FOL formula with an existential quantifier in the scope of a universal quantifier into a formula with a quantification prefix consisting of existential quantifiers ranging over assignment functions, followed by only monadic (one-place) universal quantifiers binding individual variables.
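In its simplest instance this rewriting, now known as Skolemization, takes the form:

∀x ∃y R(x, y)   is equivalent to   ∃f ∀x R(x, f(x))

where f is a second-order function variable.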
The dependent first order existential quantifier is eliminated by allowing such quantification over second-order choice functions, which assign the value of the existentially quantified dependent variable as a function of the referent assigned to the monadic, universally quantified individual variable preceding it. Linguistic applications using such Skolem functions have been given in the semantics of questions (Engdahl 1986) and in the resolution of functional pronouns (Winter 1997). The general strategy of liberating FOL from the linear dependencies of quantifiers by allowing higher order quantification or partially ordered, i.e. branching, quantifier prefixes was linguistically exploited in the semantic research on branching quantifiers (Hintikka & Sandu 1997; Barwise 1979). From a linguistic point of view the identification of linear quantifier scope with bound occurrences of variables in their bracketed ranges never really seemed justified, since informational dependencies such as coreference of pronouns bound by an indefinite noun phrase readily cross sentential boundaries, as we saw in (2c). Furthermore, retaining perfect information on the referents already assigned to all preceding pronouns smells of unrealistic logical omniscience, where human memory limitations and contextual constraints are disregarded. It is obviously much too strong an epistemic requirement on ordinary people sharing their necessarily always limited, partial information (Seligman & Moss 1997). Game-theoretic semantics rightly insisted that a proper understanding of the logic of informational independence, and hence of the lack of information, was just as much needed for natural language applications as the logic of binding and other informational dependencies. Such strategic reconsiderations of the limitations of foundational assumptions of logical systems have prompted innovative research in logical research programs, considerably expanding the formal methods available in natural language semantics (Muskens, van Benthem & Visser 1997). By exploiting the full-scale higher order quantification of the type-theoretic categorial grammars, Montague Grammar first provided a fully compositional account of the translation of syntactically disambiguated natural language expressions to logical expressions by treating referential noun phrases semantically on a par with quantificational ones, as generalized quantifiers denoting properties of sets of individuals. This was obviously accomplished at the cost of generating spurious ambiguities ad libitum, giving up on the program of modeling linguistic competence realistically (van Benthem & ter Meulen 1985; Keenan & Westerståhl 1997; Link 1991; Montague 1974; Partee 1979). Its type theory, based only on two primitive types, e for individual denoting expressions and t for truth-value denoting expressions, forged a perfect fit between the syntactic categories and the function-argument structure of their semantics. For instance, all nouns are considered syntactic objects that require a determiner on their left side to produce a noun phrase, and semantically denote a set of entities, of type ⟨e,t⟩, which is an element in the generalized quantifier of type ⟨⟨e,t⟩,t⟩ denoted by the entire noun phrase. Proper names, freely referring pronouns, universal and existential NPs are hence all treated semantically on a par, as denoting a set of sets of individuals.
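This uniform treatment is easily sketched: every NP denotes a function from VP denotations of type ⟨e,t⟩, modeled here as sets, to truth values, i.e. in effect a set of sets (the lexical material is invented):

students = {'ann', 'bea'}
singers = {'ann', 'bea', 'carl'}

def every(N): return lambda VP: N <= VP         # {X : N is a subset of X}
def some(N):  return lambda VP: bool(N & VP)    # {X : N and X overlap}
def name(j):  return lambda VP: j in VP         # {X : j is in X}, a 'lifted' individual

# every student sings, some student sings and Ann sings combine in the same way:
assert every(students)(singers) and some(students)(singers) and name('ann')(singers)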
This fruitful strategy led to a significant expansion of the fragments of natural languages that were provided with a compositional model-theoretic semantics, including many kinds of adverbial phrases, degree and measurement expressions, unusual and complex quantifier phrases, presuppositions, questions, imperatives, causal and temporal expressions, but also lexical relations that affected reasoning patterns (Chierchia 1995; Krifka 1989). Logical properties of generalized quantifiers prove to be very useful in explaining, for instance, not only which noun
phrases are acceptable in pleonastic or existential contexts, but also why the processing time of noun phrases may vary in a given experimental situation and how their semantic complexity may constrain their learnability. In pressing on for a proper, linguistically adequate account of pronouns in discourse, and for a cognitively realistic logic of information sharing in changing contexts, new tools that allow for non-linear structures to represent information content play an important, conceptually clarifying role in separating quantifier scope from the occurrence of variables in the linear or partial order of formulas of a logical language, while retaining the core model-theoretic insight of modeling inference as a concept based on Tarskian satisfaction conditions.
8. Conclusions

The development of formal methods in logic has contributed essentially to the emancipation of linguistic research into an academic community where formal methods are given their proper place as explanatory tools in scientific theories of meaning and interpretation. Although logical languages are often designed with a particular purpose in mind, they reflect certain interesting computational or semantic properties also exhibited, though sometimes implicitly, in natural languages. The properties of natural languages that lend themselves to analysis and explanation by formal methods have increased steadily over the past century, as the formal tools of logical systems were more finely chiseled to fit the purpose of linguistic explanation better. Even more properties will most likely become accessible to linguistic explanation by formal methods over the next century. The issues of cognitive complexity, characterized at many different levels, from the neurobiological, molecular structure detected in neuro-imaging to interactive behavioral studies and experimental investigations of processing time, provide a new set of empirical considerations in the application of formal methods to natural language. They drive experimental innovations and require an interdisciplinary research agenda to integrate the various modes of explanation into a coherent model of human language use and communication of information. The current developments in dynamic natural language semantics constitute major improvements in expanding linguistic applications to a wider range of discourse phenomena. The forms of reasoning in which context dependent expressions may change their reference during the processing of the premises are now considered to be interesting aspects of natural languages that logical systems are challenged to simulate, rather than to avoid, as our great-grandfathers' advice originally directed us to do. There is renewed attention to limiting, in a principled and empirically justified way, the search space complexity to decidable fragments of FOL and to restricting the higher order methods in order to reduce their complexity to model cognitively realistic human processing power. Such developments in natural language semantics will eventually converge with the syntactic research programs focusing on universals of language as constants of human linguistic competence.
9. References

Ades, Anthony E. & Mark J. Steedman 1982. On the order of words. Linguistics & Philosophy 4, 517–558.
Ajdukiewicz, Kazimierz 1935. Die syntaktische Konnexität. Studia Philosophica 1, 1–27. English translation in: S. McCall (ed.). Polish Logic. Oxford: Clarendon Press, 1967.
Arbib, Michael A. 1969. Theories of Abstract Automata. Englewood Cliffs, NJ: Prentice Hall.
Asher, Nicholas 1993. Reference to Abstract Objects in Discourse. Dordrecht: Kluwer.
Bar-Hillel, Yehoshua 1964. Language and Information: Selected Essays on their Theory and Application. Reading, MA: Addison-Wesley.
Barendregt, Hendrik P. 1984. The Lambda Calculus. Amsterdam: North-Holland.
Barwise, Jon 1979. On branching quantifiers in English. Journal of Philosophical Logic 8, 47–80.
Barwise, Jon 1989. The Situation in Logic. Stanford, CA: CSLI Publications.
Barwise, Jon & Robin Cooper 1981. Generalized quantifiers and natural language. Linguistics & Philosophy 4, 159–219.
Barwise, Jon & John Etchemendy 1987. The Liar. An Essay on Truth and Circularity. Oxford: Oxford University Press.
Barwise, Jon & John Perry 1983. Situations and Attitudes. Cambridge, MA: The MIT Press.
van Benthem, Johan 1983. The Logic of Time: A Model-Theoretic Investigation into the Varieties of Temporal Ontology and Temporal Discourse. Dordrecht: Reidel.
van Benthem, Johan 1986. Essays in Logical Semantics. Dordrecht: Reidel.
van Benthem, Johan 1987. Categorial grammar and lambda calculus. In: D. Skordev (ed.). Mathematical Logic and its Applications. New York: Plenum, 39–60.
van Benthem, Johan 1988. The Lambek calculus. In: R.T. Oehrle, E. Bach & D. Wheeler (eds.). Categorial Grammars and Natural Language Structures. Dordrecht: Reidel, 35–68.
van Benthem, Johan 1991. Language in Action: Categories, Lambdas and Dynamic Logic. Amsterdam: North-Holland.
van Benthem, Johan & Alice ter Meulen 1985. Generalized Quantifiers in Natural Language. Dordrecht: Foris.
van Benthem, Johan & Alice ter Meulen 1997. Handbook of Logic and Language. Amsterdam: Elsevier.
Beth, Evert 1970. Aspects of Modern Logic. Dordrecht: Reidel.
Boolos, George & Richard Jeffrey 1980. Computability and Logic. Cambridge: Cambridge University Press.
Buszkowski, Wojciech 1988. Generative power of categorial grammars. In: R.T. Oehrle, E. Bach & D. Wheeler (eds.). Categorial Grammars and Natural Language Structures. Dordrecht: Reidel, 69–94.
Buszkowski, Wojciech & Witold Marciszewski 1988. Categorial Grammar. Amsterdam: Benjamins.
Carnap, Rudolf 1947. Meaning and Necessity. Chicago, IL: The University of Chicago Press.
Carpenter, Bob 1997. Type-Logical Semantics. Cambridge, MA: The MIT Press.
Chierchia, Gennaro 1995. Dynamics of Meaning: Anaphora, Presupposition, and the Theory of Grammar. Chicago, IL: The University of Chicago Press.
Chomsky, Noam 1957. Syntactic Structures. The Hague: Mouton.
Chomsky, Noam 1959. On certain formal properties of grammars. Information and Control 2, 137–167.
Chomsky, Noam 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam 1995. The Minimalist Program. Cambridge, MA: The MIT Press.
Chomsky, Noam & George A. Miller 1958. Finite-state languages. Information and Control 1, 91–112.
Chomsky, Noam & George A. Miller 1963. Introduction to the formal analysis of natural languages. In: R.D. Luce, R. Bush & E. Galanter (eds.). Handbook of Mathematical Psychology, vol. 2. New York: Wiley, 269–321.
Cresswell, Max J. 1985. Structured Meanings: The Semantics of Propositional Attitudes. Cambridge, MA: The MIT Press.
Curry, Haskell B. 1961. Some logical aspects of grammatical structure. In: R. Jakobson (ed.). Structure of Language and its Mathematical Aspects. Providence, RI: American Mathematical Society, 56–68.
Davidson, Donald & Gilbert Harman 1972. Semantics of Natural Language. Dordrecht: Reidel.
Davis, Martin 1965. The Undecidable: Basic Papers on Undecidable Propositions, Unsolvable Problems and Computable Functions. Hewlett, NY: Raven Press.
van Ditmarsch, Hans, Wiebe van der Hoek & Barteld Kooi 2007. Dynamic Epistemic Logic. Dordrecht: Springer.
Dowty, David, Robert Wall & Stanley Peters 1981. Introduction to Montague Semantics. Dordrecht: Reidel.
Engdahl, Elisabeth 1986. Constituent Questions. Dordrecht: Reidel.
Frege, Gottlob 1892. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25–50. Reprinted in: G. Patzig (ed.). Funktion, Begriff, Bedeutung. Fünf logische Studien, 3rd edn. Göttingen: Vandenhoeck & Ruprecht, 1969, 40–65. English translation in: P. Geach & M. Black (eds.). Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980, 56–78.
Gabbay, Dov & Franz Guenthner 1983. Handbook of Philosophical Logic, vol. 1–4. Dordrecht: Reidel.
Gallin, Daniel 1975. Intensional and Higher-Order Modal Logic. With Applications to Montague Semantics. Amsterdam: North-Holland.
Gentzen, Gerhard 1934. Untersuchungen über das logische Schließen I & II. Mathematische Zeitschrift 39, 176–210, 405–431.
Ginzburg, Jonathan & Ivan Sag 2000. Interrogative Investigations: The Form, Meaning and Use of English Interrogatives. Stanford, CA: CSLI Publications.
Groenendijk, Jeroen & Martin Stokhof 1991. Dynamic Predicate Logic. Linguistics & Philosophy 14, 39–100.
Gross, Maurice & Andre Lentin 1970. Introduction to Formal Grammars. Berlin: Springer.
Haegeman, Liliane 1991. Introduction to Government and Binding Theory. Oxford: Blackwell.
Heim, Irene R. 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation. University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms.
Hintikka, Jaakko 1976. The Semantics of Questions and the Questions of Semantics: Case Studies in the Interrelations of Logic, Semantics, and Syntax. Amsterdam: North-Holland.
Hintikka, Jaakko & Jack Kulas 1985. Anaphora and Definite Descriptions: Two Applications of Game-Theoretical Semantics. Dordrecht: Reidel.
Hintikka, Jaakko & Gabriel Sandu 1997. Game-theoretical semantics. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 361–410.
Hodges, Wilfrid 1985. Building Models by Games. Cambridge: Cambridge University Press.
Hopcroft, John E. & Jeffrey D. Ullman 1979. Introduction to Automata Theory, Languages and Computation. Reading, MA: Addison-Wesley.
Jeffrey, Richard 1967. Formal Logic: Its Scope and Limits. New York: McGraw-Hill.
Kamp, Hans 1981. A theory of truth and semantic representation. In: J. Groenendijk (ed.). Formal Methods in the Study of Language. Amsterdam: Mathematical Centre, 277–322.
Kamp, Hans & Uwe Reyle 1993. From Discourse to Logic. Dordrecht: Kluwer.
Keenan, Edward L. & Leonard M. Faltz 1985. Boolean Semantics for Natural Language. Dordrecht: Reidel.
Keenan, Edward & Dag Westerståhl 1997. Generalized quantifiers in linguistics and logic. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 837–893.
Krifka, Manfred 1989. Nominal reference, temporal constitution and quantification in event semantics. In: R. Bartsch, J. van Benthem & P. van Emde Boas (eds.). Semantics and Contextual Expressions. Dordrecht: Foris, 75–115.
Kripke, Saul A. 1972. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 253–355 and 763–769.
Lambek, Joachim 1958. The mathematics of sentence structure. American Mathematical Monthly 65, 154–170.
Lewis, David 1972. General semantics. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 169–218.
Lewis, David 1983. Philosophical Papers, vol. 1. Oxford: Oxford University Press.
Link, Godehard 1991. Formale Methoden in der Semantik. In: A. von Stechow & D. Wunderlich (eds.). Semantik. Ein internationales Handbuch der zeitgenössischen Forschung (HSK 6). Berlin: de Gruyter, 835–860.
May, Robert 1985. Logical Form: Its Structure and Derivation. Cambridge, MA: The MIT Press.
McCawley, James D. 1981. Everything that Linguists Have Always Wanted to Know About Logic But Were Ashamed to Ask. Chicago, IL: The University of Chicago Press.
Montague, Richard 1974. Formal Philosophy. Selected Papers of Richard Montague. Edited and with an introduction by Richmond H. Thomason. New Haven, CT: Yale University Press.
Moortgat, Michael 1996. Multimodal linguistic inference. Journal of Logic, Language and Information 5, 349–385.
Moortgat, Michael 1997. Categorial type logics. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 93–179.
Morrill, Glyn 1994. Type Logical Grammar. Categorial Logic of Signs. Dordrecht: Kluwer.
Morrill, Glyn 1995. Discontinuity in categorial grammar. Linguistics & Philosophy 18, 175–219.
Muskens, Reinhard, Johan van Benthem & Albert Visser 1997. Dynamics. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 587–648.
Nishihara, Noritaka, Kenichi Morita & Shigenori Iwata 1990. An extended syllogistic system with verbs and proper nouns, and its completeness proof. Systems and Computers in Japan 21, 96–111.
Oehrle, Richard T., Emmon Bach & Deirdre Wheeler 1988. Categorial Grammars and Natural Language Structures. Dordrecht: Reidel.
Partee, Barbara 1979. Semantics – mathematics or psychology? In: R. Bäuerle, U. Egli & A. von Stechow (eds.). Semantics from Different Points of View. Berlin: Springer.
Partee, Barbara, Alice ter Meulen & Robert Wall 1990. Mathematical Methods in Linguistics. Dordrecht: Kluwer.
Pentus, Mati 2006. Lambek calculus is NP-complete. Theoretical Computer Science 357, 186–201.
Peters, Stanley & Richard Ritchie 1973. On the generative power of transformational grammars. Information Sciences 6, 49–83.
Pratt-Hartmann, Ian 2003. A two-variable fragment of English. Journal of Logic, Language & Information 12, 13–45.
Pratt-Hartmann, Ian & Allan Third 2006. More fragments of language. Notre Dame Journal of Formal Logic 47, 151–177.
Reinhart, Tanya 1983a. Anaphora and Semantic Interpretation. London: Croom Helm.
Reinhart, Tanya 1983b. Coreference and bound anaphora: A restatement of the anaphora question. Linguistics & Philosophy 6, 47–88.
van Riemsdijk, Henk & Edwin Williams 1986. Introduction to the Theory of Grammar. Cambridge, MA: The MIT Press.
Savitch, Walter J., Emmon Bach & William Marsh 1987. The Formal Complexity of Natural Language. Dordrecht: Reidel.
Seligman, Jerry & Lawrence S. Moss 1997. Situation theory. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 239–280.
Skolem, Thoralf 1920. Logisch-kombinatorische Untersuchungen über die Erfüllbarkeit oder Beweisbarkeit mathematischer Sätze nebst einem Theorem über dichte Mengen. Videnskapsselskapets Skrifter 1, Matem.-naturv. Kl. I 4, 1–36.
Stabler, Edward 1997. Derivational minimalism. In: C. Retoré (ed.). Logical Aspects of Computational Linguistics. Berlin: Springer, 68–95.
Stalnaker, Robert 1999. Context and Content: Essays on Intentionality in Speech and Thought. Oxford: Oxford University Press.
Tarski, Alfred 1956. Logic, Semantics, Metamathematics: Papers from 1923 to 1938. Oxford: Clarendon Press.
Winter, Yoad 1997. Choice functions and the scopal semantics of indefinites. Linguistics & Philosophy 20, 399–467.
Alice G.B. ter Meulen, Geneva (Switzerland)
15. The application of experimental methods in semantics

1. Introduction
2. The stumbling blocks
3. Off-line evidence for scope interpretation
4. Underspecification vs. full interpretation
5. On-line evidence for representation of scope
6. Conclusions
7. References
Abstract

The purpose of this paper is twofold. On the methodological side, we shall attempt to show that even relatively simple and accessible experimental methods can yield significant insights into semantic issues. At the same time, we argue that experimental evidence, both of the type collected in simple questionnaires and in measures of on-line processing, can inform semantic theories. The specific case that we address here concerns the investigation of quantifier scope. In this area, where judgements are often subtle and controversial, the gradient data that psycholinguistic experiments provide can be a useful tool to distinguish between competing approaches, as we demonstrate with a case study. Furthermore, we describe how a modification of existing experimental methods can be used to test predictions of underspecification theories. The program of research we outline here is not intended to be a prescriptive set of instructions for researchers, telling them what they should do; rather it is intended to illustrate some problems an experimental semanticist may encounter, but also the profit of this enterprise.
1. Introduction

A wide range of data types and sources are used in the field of semantics, as is demonstrated by the related article 12 (Krifka) Varieties of semantic evidence in this volume. The aim of this article is to show, with an example series of research studies, what sort of questions can be addressed with experimental tools, and to suggest that these methods can deliver valuable data relevant to basic assumptions in semantics. This text also attempts to address the constraints on and limits to such an approach. These are both methodological and theoretical: it has long been recognized that links between empirical measures and theoretical constructs require careful argumentation to establish.
The authors therefore have two aims: one related to experimental methodologies and the other to do with the value of processing data. The first is to show that even relatively simple and accessible experimental methods can yield significant insights into semantic issues. The second is to illustrate that experimental evidence, such as that gathered in their eye-tracking study, has the potential to inform semantic theory. Semanticists have of course always sought confirmatory evidence to support their analyses. There is, on the one hand, fairly extensive use of computational techniques and corpus data in the field, and a growing body of experimental work on semantic processing, language acquisition, and pragmatics, but in the area of theoretical and formal semantics experimental methods are less frequently employed. There are good reasons for this: inherent factors related to the accessibility of the relevant measures explain why controlled data gathering techniques are still somewhat less frequent in this field than in some others. We shall discuss what these reasons are and demonstrate with a case study what constraints they place on empirical studies, particularly experimental studies. The example research program that we shall report is thus not simply a recipe for others for what should be done; rather, it is an illustration of the difficulties involved, which aims to explore some of the boundaries of what is accessible to experimental studies. The specific case that we address here concerns the investigation of quantifier scope, a perennial issue in semantics. Previous attempts to account for the complex data patterns to be found in natural languages have met with the difficulty that the causal factors and preferences need first to be identified before a realistic model can be developed. This requires as an initial step the capture and measurement of the relevant effects and their interactions, which is no trivial task. The next section lays out a range of reasons why semanticists do not routinely seek to test the empirical bases of their theories with simple experiments. Section 3 reports the series of empirical investigations on quantifier scope carried out by Bott and Radó in ongoing research. Section 4 lays out some of the theoretical background and the importance of these studies for current theory (the underspecification debate). The final section takes Bott and Radó (2009) as a starting point to suggest how some of the problems noted in section 3 may be overcome with a more sophisticated experimental procedure.
2. The stumbling blocks

As Manfred Krifka notes in his neighbouring article 12 (Krifka) Varieties of semantic evidence, a major problem with investigating meaning is that we cannot yet fully define what it is. This is indeed a root cause of difficulty, but here we shall attempt to illustrate in more practical detail what effects this has on attempts to conduct experiments in this field.
2.1. Specifying meaning without using language

The essential feature distinguishing experimental procedure is control. In language experiments we may distinguish three (sets of) variables: linguistic form, context, and meaning. In the typical experiment we will keep two of them constant and systematically vary the other. Much semantic research concerns the systematic interdependence of form, context, and meaning. These issues can be investigated for example by:
a) keeping form and context constant, manipulating meaning systematically, and measuring the felicity of the outcome (in judgements, reaction times, or processing effort), or
b) manipulating (at least one of) form and context, and measuring perceived meaning.

The first requires the experimenter to manipulate meaning as a variable, which entails expressing meaning in a form other than language (pictures, situation descriptions, etc.); the second requires the experimenter to measure perceived meaning, which again normally demands reference to meanings captured in non-linguistic form. But precisely this expression of tightly constrained meaning in non-linguistic form is very difficult. To show how this factor affects studies in semantics disproportionately, it is worth noting how it makes controlled studies in semantics more challenging than in syntax. Work in experimental syntax is often interested in addressing precisely those effects of form change which are independent of meaning. The variable meaning can thus be held constant, but this does not require it to be exactly specified. Thus only the syntactic analysis need be controlled, which makes empirical studies in syntax a whole parameter less difficult than those in semantics.
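To make the two design types concrete, here is a minimal sketch in Python of how the condition sets might be assembled; all factor levels, item sentences, and file names are invented for illustration and are not drawn from the studies discussed in this article.

```python
from itertools import product

# Design (a): form and context held constant, meaning manipulated.
# The "meanings" must be supplied non-linguistically (e.g. as pictures),
# which is exactly the difficulty discussed above.
design_a = [
    {"form": "Every kid climbed a tree.",      # constant
     "context": "neutral",                     # constant
     "meaning": m,                             # manipulated (as a picture)
     "measure": "felicity judgement"}
    for m in ["one-tree-per-kid.png", "single-shared-tree.png"]
]

# Design (b): form (and/or context) manipulated, perceived meaning measured.
forms = ["Every kid climbed a tree.", "A tree was climbed by every kid."]
contexts = ["neutral", "tree-salient"]
design_b = [
    {"form": f, "context": c, "measure": "perceived meaning (picture choice)"}
    for f, c in product(forms, contexts)
]

for trial in design_a + design_b:
    print(trial)
```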
2.2. The boundaries of form, context, and meaning

A further problem for exact studies of meaning is that the three variables are not always clearly distinguished, in part because they systematically covary, but also in part because linguists do not always agree about the boundaries. This is particularly visible when we seek to identify where an anomaly lies. Views have changed over time in linguistics about the nature and location of ill-formedness (e.g. the discussion of the status of I am lurking in a culvert in Ross 1970), but the fundamental ambiguity is still with us. For example, Weskott & Fanselow (2009) give the following examples and judgements of syntactic and semantic well-formedness: (1a) is syntactically ill-formed (*), (1b) is semantically ill-formed (#), and (1c) is ill-formed on both accounts (*#).

(1) a. *Die Suppe wurde gegen versalzen.
       the soup was against oversalted
    b. #Der Zug wurde gekaut.
       the train was chewed
    c. *#Das Eis wurde seit entzündet.
       the ice was since inflamed
Our own judgements suggest that the structures in (1a) and (1c) have no acceptable syntactic analysis, and therefore no semantic analysis can be constructed – they are thus both syntactically and semantically ill-formed. Crucially, the semantic anomaly is dependent upon the syntactic problem; the lack of a recognizable compositional interpretation is a result of the lack of a possible structural analysis. We would therefore regard these examples as primarily syntactically unacceptable. This contrasts with (1b), which we regard as well-formed on both parameters, being merely implausible, except in a small child's playroom, where a train being chewed is an entirely normal situation (cf. Hahne & Friederici 2002).
2.3. Plausibility

Such examples highlight another problem in manipulating meaning as an experimental variable: the human drive to make sense of linguistic forms. We associate possible meanings with things that we can accept as being true or plausible. So 'the third-floor apartment reappeared today', which is both syntactically and semantically flawless, will cause irrelevant experimental effects, since subjects will find it difficult to fit the meaning into their mental model of the world. Zhou & Gao (2009) for example argue that participants interpret Every robber robbed a bank in the surface scope reading because it is more plausible that each robber robbed a different bank. This ties in to a wider discussion of the role of plausibility as a factor in semantic processing and as a filter on possible readings. Zhou & Gao (2009) claim that such doubly quantified sentences are ambiguous in Mandarin: their experimental evidence suggests that both interpretations are built up in parallel, but one reading is subsequently filtered out by plausibility, which accounts for the contrary judgements in work on semantic theory (e.g. Huang 1982, Aoun & Li 1989).
2.4. Meaning as a complex measure

The meaning of a structure is not fixed or unique, even when linguistic, social, and discourse context are fixed. First, a single expression may have multiple readings, which compete for dominance. Often a specific relevant reading of a structure needs to be forced in an experiment. Some readings of theoretical interest may be quite inaccessible, though nevertheless real. This raises the issue of expert knowledge, which again contrasts with the situation in syntax. Syntactic well-formedness judgements are generally available and accessible to any native speaker and require no expertise. On the other hand, it can require specialist knowledge to 'get' some readings, since access to variant readings is usually via different analyses. This is a crucial point in semantics, since it reduces the likelihood that the intuitions of the naïve native speaker can be the final arbiter in this field, as they can reasonably be argued to be in syntax (Chomsky 1965). A fine example of this is from Hobbs & Shieber (1987):

(2) Two representatives of three companies saw most samples.

They claim that this sentence is ambiguous in five ways. Park (1995) however denies the existence of one of these readings (three > most > two). It is doubtful whether this question can be resolved by asking naïve informants. Even within a given analysis of a construction, the meaning may not be fully determined. Aspects of meaning are left unspecified, which means that two different perceivers can interpret a single structure in different ways. This too requires great care and attention to detail when designing experiments which aim to be exact.
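The combinatorics behind this dispute can be made explicit in a few lines of Python. Three quantifiers yield 3! = 6 scope orderings; Hobbs & Shieber (1987) argue that only five of these are genuine readings. Which ordering they rule out is our assumption below (the one in which most samples intervenes between the host NP and its embedded quantifier); the reading contested by Park (1995) is taken from the text above.

```python
from itertools import permutations

quantifiers = ["two", "three", "most"]  # two reps, three companies, most samples

# Assumption: Hobbs & Shieber exclude the ordering in which 'most samples'
# intervenes between the host NP 'two representatives' and its embedded
# quantifier 'three companies'.
excluded_by_hs = ("two", "most", "three")
contested_by_park = ("three", "most", "two")

for order in permutations(quantifiers):
    if order == excluded_by_hs:
        status = "excluded (Hobbs & Shieber 1987, assumed)"
    elif order == contested_by_park:
        status = "contested (Park 1995)"
    else:
        status = "available"
    print(" > ".join(order), "-", status)
```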
2.5. The observer's paradox

A frequent aim in semantic experiments is to discover how subjects interpret linguistic input under normal conditions. A constant problem is how experimenters can access this information, because whatever additional task we instruct the subjects to carry out renders the conditions abnormal. For example, if we ask them to choose which one of a
pair of pictures illustrates the interpretation that they have gathered, or even if we just observe their eye movements, the very presence of two pictures is likely to make them more aware that more than one interpretation is possible, thus biasing the results. Even a single picture can alter or trigger the accessibility of a reading.
2.6. Inherent meaning and inferred meaning

One last linguistic distinction which we should note here is that between the inherent meaning of an expression ("what is said") and the inferred meaning of a given utterance of an expression. This distinction is fundamental to the division of research on meaning into separate fields, but it is in practice very difficult to apply in experimental work, since naïve informants do not naturally differentiate the two. The recent 'literal Lucy' approach of Larson et al. (2010) is a promising solution to this problem; in this paradigm participants must report how 'literal Lucy', who only ever perceives the narrowly inherent meaning of utterances and makes no inferences, would understand example sentences. This distinction is particularly important when an experimental design requires a disambiguation, and extreme care must be taken that its content is not merely inferred. For example, in (3), it is implicated that every rugby player broke one of their own fingers, but this is not necessarily the case. This example thus cannot offer watertight disambiguation.

(3) Every rugby player broke a finger.
    Implicature: Every rugby player broke one of their own fingers.
2.7. Experimental measures and the object of theory

As a rule, semantic theory makes no predictions about semantic processing. Instead it concerns itself with the final stable interpretation which is achieved after a whole linguistic expression, usually at the sentence level, has been processed and all reanalyses, for example as a result of garden paths, have been resolved. It fundamentally concerns the static, holistic result of the processing of an expression; indeed, many theoretical approaches regard meaning as only coming about in a full sentence (cf. article 8 (Meier-Oeser) Meaning in pre-19th century thought). But the processing of a sentence is made up of many incremental steps which interact strongly with each other, partly predicting, partly parsing input as it arrives, partly confirming or revising previous analyses. Much of the experimental evidence available to us provides direct evidence only of these processing steps. It thus follows that, for many semantics practitioners, much of the empirical evidence which we can gather concerns at best our predictions about what a sentence is going to mean, not aspects of its actual meaning. The time course of our arriving at a particular reading, whether it be remote or readily accessible, has no direct implications for the theory, since the theory makes no predictions about processing speed (cf. Phillips & Wagers 2007). One aim of this article is to show that experimental techniques can nevertheless deliver data which can contribute to theory building.
2.8. Categorical predictions and gradient data

Predictions of semantic theories typically concern the availability of particular interpretations. Experiments deliver more fine-grained data that reflect the relative preferences
among the interpretations. Mapping these gradient data onto the categorical predictions, that is, drawing the line between still-available and impossible readings, is a non-trivial task. At the same time, the ability to distinguish preferences among the "intermediate" interpretations may be highly relevant for testing predictions concerning readings that fall between the clearly available and the clearly impossible.
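One simple, admittedly crude way to operationalize this mapping is to place a cut-off on the gradient measure. Everything about the scale, threshold, and data below is invented for illustration and is not part of the studies discussed here:

```python
# Hypothetical mean acceptability ratings (z-scores) for four readings
# of some construction, from clearly available to clearly impossible.
ratings = {
    "surface scope":      0.9,
    "inverse scope":      0.2,
    "intermediate scope": -0.4,
    "crossed reading":    -1.1,
}

THRESHOLD = -0.8  # assumed cut-off separating 'available' from 'impossible'

for reading, score in ratings.items():
    status = "available" if score > THRESHOLD else "impossible"
    print(f"{reading:20s} {score:+.1f}  -> {status}")

# The interesting cases are the readings near the threshold: the categorical
# prediction says nothing about them, but the gradient data still orders them.
```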
2.9. Outlook

In the remainder of this paper we will discuss two ways in which systematically collected experimental data can contribute to semantic theorizing. We will use quantifier scope as an example of a phenomenon where the results of psycholinguistic experiments can make significant contributions to the theoretical discussion. We will not attempt to review here the considerable psycholinguistic literature on the processing of quantifiers (for a comprehensive survey cf. article 102 (Frazier) Meaning in psycholinguistics). Instead we will concentrate on a small set of studies that show the usefulness of end-of-sentence judgements in establishing the available interpretations of quantified sentences. Then we will sketch an experiment addressing aspects of the unfolding interpretation of quantifier scope which are of interest to theoretical semanticists as well.
3. Off-line evidence for scope interpretation

Semantic theories are typically based on the introspective judgements of a handful of theoreticians. The judgements concern the available readings of a sentence, possibly ranked by how easily available these readings are. Not surprisingly, judgements of this sort are subtle and often controversial. For instance, the sentence Everyone loves someone has alternately been considered to allow only the wide-scope universal reading (e.g. Hornstein 1995; Beghelli & Stowell 1997) or to be fully ambiguous (May 1977, 1985; Hornstein 1984; Higginbotham 1985). Example (2) above illustrates the same point: Park (1995) and Hobbs & Shieber (1987) disagree about the number of available readings. The data problem has been known for a long time. Studies as early as Ioup (1975) and VanLehn (1978) used the intuitions of naïve speakers in developing an empirically motivated theory. However, it has been clear from the beginning that "obvious" tasks such as paraphrasing a presumably ambiguous doubly-quantified sentence, or asking informants to choose a (preferred) paraphrase, are rather complex and that linguistically untrained participants may not be able to carry them out reliably. Another purely linguistic task has proved problematic for a different reason. Researchers have tried to combine the quantified sentence with a disambiguating continuation, as in (4).

(4) Every kid climbed a tree.
    (a) The tree was full of apples.
    (b) The trees were full of apples.

Disambiguation of this type was used by Gillen (1991), Kurtzman & MacDonald (1993), Tunstall (1998) and Filik, Paterson & Liversedge (2004), for instance. Here the plural continuation is only acceptable if multiple trees are instantiated, that is, if the wide-scope universal interpretation (every kid > a tree) is chosen, whereas the singular continuation
is intended to fit only the wide-scope existential interpretation (a tree > every kid). Unfortunately the singular continuation fails to disambiguate the sentence, as Tunstall (1998) points out: the tree in (4a) can easily be taken to mean the tree the kid climbed, thus making it compatible with the wide-scope universal interpretation as well (see also Bott & Radó 2007 and article 102 (Frazier) Meaning in psycholinguistics). Problems of these kinds have prompted researchers to look for non-linguistic methods of disambiguation. Gillen (1991) used, among other methods, simple pictures resembling set diagrams. In her experiments subjects either drew diagrams to represent the meaning of quantified sentences, chose the diagram that corresponded to the (preferred) reading, or judged how well the situation depicted in the diagram fitted the sentence. Bott & Radó (2007) tested a somewhat modified form of the last of these methods, using diagrams like those in Fig. 15.1, to see whether they constitute a reliable mode of disambiguation that naïve informants can use easily. They found that participants consistently delivered the expected judgements both for scopally unambiguous quantified sentences (i.e. sentences where one scope reading was excluded due to an intervening clause boundary) and for ambiguous quantified sentences where expected preferences could be determined on the basis of theoretical considerations and corpus studies. These results show that there is no a priori reason to exclude the judgements of non-linguist informants from consideration.

Fig. 15.1: Disambiguating diagrams for the sentence "Exactly one novel was read by each student" (panel A: exactly one > each; panel B: each > exactly one)
For informative experiments, however, we need to be able to derive testable hypotheses from existing semantic proposals. Although semantic theories are not formulated to make predictions about processing, it is still possible to identify areas where different approaches lead to different predictions concerning the judgement of particular constructions. The interpretation of quantifiers provides an example here as well. One way of classifying theories of quantifier scope concerns the way different factors are supposed to affect the scope properties of quantifiers. In configurational models such as Reinhart (1976, 1978, 1983, 1995) and Beghelli & Stowell (1997), quantifiers move to, or are interpreted in, different structural positions. A quantifier higher in the (syntactic) tree will always outscope lower ones. The absolute position in the tree is irrelevant; what matters is the position relative to the other quantifier(s). While earlier proposals only considered syntactic properties of quantifiers, Beghelli and Stowell also include semantic factors in the hierarchy of quantifier positions. Taking distributivity as an example, assuming that a +dist quantifier is interpreted in Spec,QP, which is the
highest position available for quantifiers, Q1 will outscope Q2 if only Q1 is +dist, regardless of what other properties Q1 or Q2 may have. An effect of other factors will only become apparent if neither of the quantifiers is +dist. By contrast, the basic assumption in multi-factor theories of quantifier scope is that each factor has a certain amount of influence on quantifier scope regardless of the presence or absence of other factors (cf. Ioup 1975; Kurtzman & MacDonald 1993; Kuno 1991 and Pafel 2005). The effects of different factors can be combined, resulting in greater or lesser preference for a particular interpretation. Theories differ in whether one of the readings disappears when it is below some threshold, or whether sentences with multiple quantifiers are always necessarily ambiguous.

Let us assume that the two scope-relevant factors we are interested in are distributivity and discourse-binding, the latter indicated by the partitive NP one of these N, see (6). Crossing these factors yields four possible combinations: +dist/+d-bound, +dist/-d-bound, -dist/+d-bound, and -dist/-d-bound. In a configurational theory there will presumably be a structural position reserved for discourse-bound phrases. Let us consider the case where this position is lower than that for +dist, but higher than the lowest scope position available for quantifiers. Thus Q1 should outscope Q2 in the first two configurations, Q2 should outscope Q1 in the third, and the last one may in fact be fully scope ambiguous unless some additional factors are at play as well. Moreover, as configurational theories of scope have no mechanism to predict the relative strength of scope preferences, the first two configurations should show the same size of preference for a wide-scope interpretation of Q1. In statistical terms, we expect an interaction: d-binding should have an effect when Q1 is -dist, but not when it is +dist. In multi-factor theories, on the other hand, the prediction would usually be that the effects of the different factors should add up. That is, the difference in scope bias between a d-bound and a non-d-bound +dist quantifier should be the same as between a d-bound and a non-d-bound -dist quantifier. A given factor should be able to exert its influence regardless of the other factors present.

Bott and Radó have been testing these predictions in ongoing work. In two questionnaire studies subjects read doubly-quantified German sentences and used magnitude estimation to indicate how well disambiguating set diagrams fitted the interpretation of the sentence. Experiment 1 manipulated distributivity and linear order and used materials like (5). Experiment 2 tested the factors distributivity and d-binding using sentences like (6).

(5) a. Genau einen dieser Professoren haben alle Studentinnen verehrt.
       Exactly one these professors have all female students adored.
       'All female students adored exactly one of these professors.'
    b. Genau einen dieser Professoren hat jede Studentin verehrt.
       Exactly one these professors has each female student adored.
       'Each female student adored exactly one of these professors.'
    c. Alle Studentinnen haben genau einen dieser Professoren verehrt.
       All female students have exactly one these professors adored.
       'All female students adored exactly one of these professors.'
    d. Jede Studentin hat genau einen dieser Professoren verehrt.
       Each female student has exactly one these professors adored.
       'Each female student adored exactly one of these professors.'

(6) a. Genau einen Professor haben alle diese Studentinnen verehrt.
       Exactly one professor have all these female students adored.
       'All of these female students adored exactly one professor.'
    b. Genau einen dieser Professoren haben alle Studentinnen verehrt.
       Exactly one these professors have all female students adored.
       'All female students adored exactly one of these professors.'
    c. Genau einen Professor hat jede dieser Studentinnen verehrt.
       Exactly one professor has each these female students adored.
       'Each of these female students adored exactly one professor.'
    d. Genau einen dieser Professoren hat jede Studentin verehrt.
       Exactly one these professors has each female student adored.
       'Each female student adored exactly one of these professors.'

Bott and Radó found clear evidence for the influence of all three factors. The distributive quantifier jeder took scope more easily than alle, and both d-binding of a quantifier and linear precedence resulted in a greater tendency to take wide scope. Crucially, the effects were additive, which is compatible with the predictions of multi-factor theories but unexpected under configurational approaches. These results show that even simple questionnaire studies can deliver theoretically highly relevant data. This is particularly important in an area like quantifier scope, where the judgements are typically subtle and not always accessible to introspection. Of course the study reported here cannot address all possible questions concerning the interpretation of quantified sentences like those in (5)–(6). It cannot for example clarify whether the processor initially constructs a fully specified representation of quantifier scope or whether it first builds only an underspecified structure which is compatible with both possible readings, an outstanding question of much current interest in semantics. The data that we have presented so far are off-line, in that they measure preferences only at the end of the sentence, when its content has been disambiguated. In section 5 we present an experimental design which allows investigation of the on-going (on-line) processing of scope ambiguities. In the next section we relate the semantic issue of underspecification to experimental data and predictions for on-line processing.
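Before turning to that, the contrast between the two families of predictions can be made concrete with a toy calculation. The numbers below are invented purely to illustrate the predicted patterns and are not the study's actual data; what matters is that in the additive model the d-binding effect is constant across levels of distributivity, whereas in the configurational model it vanishes when Q1 is +dist:

```python
# Predicted wide-scope bias for Q1 (arbitrary units) in the four conditions.
BASE = 1.0
DIST_EFFECT = 2.0     # boost if Q1 is +dist (hypothetical magnitude)
DBOUND_EFFECT = 1.0   # boost if Q1 is d-bound (hypothetical magnitude)

def additive(dist, dbound):
    # Multi-factor prediction: every factor always contributes.
    return BASE + DIST_EFFECT * dist + DBOUND_EFFECT * dbound

def configurational(dist, dbound):
    # Configurational prediction: +dist fixes the highest position,
    # so d-binding only matters when Q1 is -dist.
    return BASE + (DIST_EFFECT if dist else DBOUND_EFFECT * dbound)

for model in (additive, configurational):
    print(model.__name__)
    for dist in (1, 0):
        for dbound in (1, 0):
            print(f"  dist={dist} dbound={dbound}: bias={model(dist, dbound):.1f}")
    # d-binding effect at each level of dist (the interaction test):
    print("  d-binding effect (+dist):", model(1, 1) - model(1, 0))
    print("  d-binding effect (-dist):", model(0, 1) - model(0, 0))
```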
4. Underspecification vs. full interpretation

It is generally agreed that syntactic processing is incremental in nature (e.g. van Gompel & Pickering 2007), i.e. a full-fledged syntactic representation is assigned to every incoming word. Whether semantic processing is incremental in the strict sense is far from
settled and is still an empirical question. To formulate hypotheses about the time-course of semantic processing, we will now look at the ongoing debate in semantic theory on underspecification in semantic representations. Underspecified semantic representations are a tool intended to handle the problem of ambiguity. The omission of parts of the semantic information allows one single representation to be compatible with a whole set of different meanings (for an overview of underspecification approaches, see e.g. Pinkal 1999; articles 24 (Egg) Semantic underspecification and 108 (Pinkal & Koller) Semantics in computational linguistics). It is thus an economical method of dealing with ambiguity, in that it avoids costly reanalysis, used above all in computational applications. From the psycholinguistic perspective, one would predict that constructing underspecified representations in semantically ambiguous regions of a sentence avoids processing difficulties both within the ambiguous region and at the point of disambiguation (Frazier & Rayner 1990). Underspecification can be contrasted with an approach that assumes strict incrementality and thus immediate full interpretation even in ambiguous regions. This would predict processing difficulties in cases of disambiguation towards non-preferred readings. A candidate for a semantic processing principle guiding the choice of one specified semantic representation would be a complexity-sensitive one (for example "avoid quantifier raising", captured in Tunstall's (1998) Principle of Scope Interpretation and Anderson's (2004) Processing Scope Economy). In the psycholinguistic investigation of coercion phenomena, the experimental evidence is interpreted along these lines. Processing difficulties at the point of disambiguation are taken as evidence for full semantic interpretation (see e.g. Piñango, Zurif & Jackendoff 1999; Todorova, Straub, Badecker & Frank 2000), whereas the lack of measurable effects is seen as support for an underspecified semantic representation (see e.g. Pylkkänen & McElree 2006; Pickering, McElree, Frisson, Chen & Traxler 2006). Analogously, in the processing of quantifier scope ambiguities, experimental evidence for processing difficulties at the point of disambiguation will be interpreted as support for full interpretation. However, this need not be taken as final. If we look at underspecification approaches in semantics, non-semantic factors are mentioned which might explain (and predict) difficulties in processing local scope ambiguities (see article 24 (Egg) Semantic underspecification, section 6.4.1.). And these are exactly the factors which are assumed by multi-factor theories to have an impact on quantifier scope: syntactic structure and function, context, and type of quantifier. The relative weighting and interaction of these factors are not made fully explicit, however. For the full picture, it would be necessary to examine not only the point of disambiguation but also the ambiguous part of the input, for it is there that the effects of these factors might be identified. Underspecification is normally only temporary, however, and a full interpretation will presumably be constructed at some stage (but see Sanford & Sturt 2002). This might be recognizable, for example, in behavioral measures, but the precise predictions of underspecification theory are not always clear.
For example, it might be assumed that even representations which are never fully specified by the input signal (or context) do receive more specific interpretations at some later stage. This of course raises the question of what domains of interpretation are relevant here (sentence boundary, utterance, ...). In the next section we present experimental work which may offer a starting point for the empirical investigation of such issues.
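To fix the two hypotheses at stake before turning to the experimental work, the following minimal sketch encodes where each account locates processing cost; the cost values are qualitative placeholders, not quantitative predictions:

```python
def predicted_cost(hypothesis, region, disambiguates_to_preferred=True):
    """Qualitative cost predictions (0 = no extra cost, 1 = extra cost)."""
    if hypothesis == "underspecification":
        # A single underspecified representation covers all readings:
        # no cost in the ambiguous region, none at disambiguation.
        return 0
    if hypothesis == "full interpretation":
        # One reading is committed to immediately; cost arises when the
        # disambiguation contradicts that commitment.
        if region == "disambiguation" and not disambiguates_to_preferred:
            return 1
        return 0
    raise ValueError(hypothesis)

for h in ("underspecification", "full interpretation"):
    for preferred in (True, False):
        c = predicted_cost(h, "disambiguation", preferred)
        print(f"{h:20s} preferred={preferred}: cost={c}")
```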
5. On-line evidence for representation of scope

The underspecification view would predict that relative scope should remain underspecified as long as neither interpretation is forced. Indeed there should not even be any preference for one reading. The results of the questionnaire studies reported in Section 3 already indicate that this view cannot be right: a particular combination of factors was found to systematically support a certain reading. Furthermore it is unlikely that the task itself introduced a preference towards one interpretation – although the diagram representing the wide-scope existential reading was somewhat more complex, this did not seem to interfere with participants' performance. The observed preferences must thus be due to the experimental manipulation. That is, even if all possible interpretations are available up to the point where disambiguating information arrives, there must be some inherent ranking of the various scope-determining factors that results in certain interpretations being more activated than others. Off-line results such as those discussed above are thus equally compatible with two different explanations: one where quantifier scope is fully determined (at least) by the end of the sentence, and another where several (presumably all combinatorially possible) interpretations are available but weighted differently. A different methodology is needed to find out whether there is any psycholinguistic support for an underspecified view of quantifier scope.

As it turns out, the currently existing results of on-line studies are no more able to distinguish the two alternatives than are off-line studies. In on-line experiments a scope-ambiguous initial clause is followed by a second one that is only compatible with one scope reading. An indication of difficulty during the processing of the second sentence is typically taken as evidence that the disambiguation is incompatible with the (sole) interpretation that had been entertained up to that point. However, there is another way to look at such effects. When the disambiguation is encountered, the underspecified representation needs to be enriched to allow only one reading and exclude all others. It is conceivable that updating the representation may require more or less effort depending on the ultimate interpretation that is required. This situation poses a dilemma for researchers investigating the interpretation of quantifier scope. If explicit disambiguation is provided we can only test how easily the required reading is available – the results do not tell us what other reading(s) may have been constructed. Without explicit disambiguation, however, reading time (or other) data cannot be interpreted, since we do not know what reading(s) the participants had in mind.

Bott & Radó (2009) approached this problem by tracking the eye movements of participants while they read ambiguous sentences and then asking them to report the interpretation they computed. Although their results are only partly relevant for the underspecification debate, we will describe the experiment in some detail, since it provides a good starting point for a more conclusive investigation. We will then sketch a modification of the method that makes it possible to avoid some problems with the original study. The scope-ambiguous sentences in Bott and Radó's study were instructions like those in (7):

(7) a. Genau ein Tier auf jedem Bild sollst du nennen!
       Exactly one animal on each picture should you name!
       'Name exactly one animal from each picture!'
    b. Genau ein Tier auf allen Bildern sollst du nennen!
       Exactly one animal on all pictures should you name!
       'Name exactly one animal from all pictures!'

Fig. 15.2: Display following inverse linking constructions
The first quantifier (Q1) was always the indefinite genau ein 'exactly one'. The second (Q2) was either distributive (jeder) or not (alle). In one set of control conditions Q1 was replaced by a definite NP (das Tier 'the animal'). In another set of control conditions the two possible interpretations of (7) (one animal that is present in all fields vs. a possibly different animal from each field on a display) were expressed by scope-unambiguous quantified sentences, as in (8).

(8) a. Name exactly one animal that is found on all pictures.
    b. From each picture name exactly one animal.

In each experimental trial participants first read one of these instruction sentences while their eye movements were monitored. Then the instruction sentence disappeared and a picture display as in Fig. 15.2 replaced it. Participants inspected this display and had to provide an answer within four seconds. The displays were constructed to be compatible with both possible readings: the wide-scope universal reading, where the participant should select any one animal per field, but also the wide-scope existential reading, where the one element common to all fields must be named (e.g. the monkey in Fig. 15.2). To make the quantifier exactly one felicitous, the critical displays always allowed two potential answers for the wide-scope existential interpretation. The scope-ambiguous instructions were so-called inverse linking constructions, in which the two quantifiers are contained within one NP. It has been assumed (e.g. May &
Bale 2006) that in inverse linking constructions the linearly second quantifier preferentially takes scope over the first. The purpose of the study was to test this prediction and to investigate to what extent the distributivity manipulation is able to modulate it. Based on earlier results (Bott & Radó 2007) it was assumed that jeder would prefer wide scope, which should further enhance the preference for the inverse reading. When alle occurred as Q2, there should be a conflict between the preferences inherent to the construction and those arising from the particular quantifiers. The experimental setup made it possible to look both at the process of computing the relative scope of the quantifiers (eye-movement behavior while reading the instructions) and at the final interpretation (the answer participants gave), without providing any disambiguation. Thus the answers could be taken to reflect the scope preferences at the end of the sentence, whereas processing difficulty during reading would serve as an indication that scope preferences are computed at a point where no decision is yet required. The off-line answers showed the expected effects. There was an overall preference for the inverse scope reading, which was significantly stronger with jeder than with alle. Crucially, the reading time data showed clear evidence of a conflict between the scope factors: there was a significant slow-down at the second quantifier in (7b). The effect was present already in first-pass reading times, suggesting that scope preferences were computed immediately. Bott and Radó interpret these results as a strong indication that readers regularly disambiguate sentences during normal reading. However, this conclusion may be too strong. In Bott and Radó's experiment participants had to choose a particular interpretation in order to carry out the instructions (i.e. name an animal). Although they did not have to settle on that interpretation while they were reading the instruction, they had to make a decision as to the preferred reading immediately after the end of the sentence. This may have caused them to disambiguate constructions that are typically left ambiguous during normal interpretation. Moreover, the instructions used in the experiment were highly predictable in structure: they always contained a complex NP with two quantifiers (experimental items), a definite NP1 followed by a quantified NP2 (fillers A), or else an unambiguous sentence with two quantifiers. Although the content of NP1 (animal, vehicle, flag) and the distributivity of Q2 were varied, the rest of the instruction was the same: sollst du nennen 'you should name'. This pattern was easy to recognize and may have resulted in a strategy of starting to compute the scope preferences as soon as the second NP had been received. To rule out this explanation, Bott and Radó compared responses provided in the first and the last third of each experimental session and failed to find any indication of strategic behavior. Still the possibility remains that consistent early disambiguation in the experiment resulted from the task of having to choose a reading quickly in order to provide an answer. The ultimate test of underspecification would have to avoid such pressure to disambiguate fast. We are currently conducting a modification of Bott and Radó's experiment that may not only avoid this pressure but actually encourage participants to delay disambiguation. In this experiment participants have to judge the accuracy of sentences like those in (9):

(9) a. Genau eine geometrische Form auf allen Bildern ist rechteckig.
       Exactly one geometrical shape on all pictures is rectangular.
       'Exactly one geometrical shape on all pictures is rectangular.'
    b. Genau eine geometrische Form auf jedem Bild ist rechteckig.
       Exactly one geometrical shape on each picture is rectangular.
       'Exactly one geometrical shape on each picture is rectangular.'

Fig. 15.3: Disambiguating displays in the proposed experiment (panel A: wide-scope existential disambiguation; panel B: wide-scope universal disambiguation)
The experimental procedure is as before. The sentences will be paired with unambiguous displays supporting either the wide-scope universal or the wide-scope existential reading (Fig. 15.3). In (9), full processing of the semantic content is not possible until the critical information (rechteckig) has been received. Since the display following the sentence is only compatible with one reading, which the participant cannot anticipate, they are better off waiting to see which interpretation will be required for the answer. If underspecification is indeed the preferred strategy, there should be no difference in reading times across the different conditions, nor should there be any difficulty in judging any kind of sentence-display pair. Assuming immediate full specification of scope, however, we would expect the same pattern of results as in Bott and Radó's study: slower reading times in (9a) than in (9b) at the second quantifier, as well as slower responses to displays requiring the wide-scope existential interpretation, the latter presumably modulated by the distributivity of Q2. This experiment should also be able to distinguish intermediate positions between the two extremes of complete underspecification and immediate full interpretation. It is conceivable, for instance, that scope interpretation is only initiated when the perceiver can be reasonably sure that they have received all (or at least sufficient) information. This would correspond to the same reading time effects (and the same answering behavior) as predicted under immediate full interpretation, but the effects would be somewhat delayed. Another possibility is an initial underspecification of scope, but the construction of a fully specified interpretation at the boundary of some interpretation domain such as the clause boundary. That would predict a complete lack of reading time effects, but answer times showing the same incompatibility effects as under versions of the full interpretation approach. It is worth emphasizing how this design differs from existing studies. First, it looks at the ambiguous region and not just the disambiguation point. Second, it differs from
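The competing accounts thus map onto distinct combinations of reading-time and answer-time effects. A compact way to keep track of them is a small lookup table; the labels below simply restate the predictions in the preceding paragraph:

```python
# Predicted effects in the proposed experiment, per hypothesis:
# (slow-down at Q2 while reading, incompatibility effect in answer times)
predictions = {
    "complete underspecification":      ("none",    "none"),
    "immediate full interpretation":    ("at Q2",   "present"),
    "delayed full interpretation":      ("delayed", "present"),
    "specification at clause boundary": ("none",    "present"),
}

print(f"{'hypothesis':34s} {'reading-time effect':20s} answer-time effect")
for hypothesis, (reading, answering) in predictions.items():
    print(f"{hypothesis:34s} {reading:20s} {answering}")
```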
Filik, Paterson & Liversedge (2004), who also measured reading times in the ambiguous region, but who used the kind of disambiguation that we criticized in section 3.
6. Conclusions

In this article we have attempted to show that experimentally obtained data can, in spite of certain complicating and confounding factors, be of relevance to semantic theory, providing both support for and, in some cases, falsification of its assumptions and constructs. In section 2 we noted that the field of theoretical semantics has made comparatively little use of experimental verification of its analyses and assumptions. We have seen that there are some quite good reasons for this and laid out what some of the problematic factors are. While some of these are shared to a greater or lesser degree with other branches of linguistics, some are peculiar to semantics or are especially severe in this case. The main part of our paper reports a research program addressing the issue of relative scope in doubly quantified sentences. We present this work as an example of the ways in which experimental approaches can contribute to the development of theory. These studies also illustrate some of the practical constraints upon such work. For example, we have seen that clear disambiguation is not always easy to achieve; in particular, it is difficult to achieve without biasing the interpretational choices of the experimental participant. The use of eye-tracking and fully ambiguous picture displays is a real advance on previous practice (Bott & Radó 2009). Section 3 shows how experimental procedures which are simple enough for non-specialist experimenters can nevertheless yield evidence of value for the development of semantic theories: a carefully constructed and counter-balanced design can produce data of sufficient quality to answer outstanding questions with some degree of finality. In this particular case the configurational account of scope can be seen as failing to account for data that the multi-factor account succeeds in capturing. The unsupported account is thus shown to need adaptation or development. Experimentation can make theory development more dynamic and adaptive; an account which repeatedly fails to capture evidence gathered in controlled studies, and which cannot economically be extended to do so, will eventually need to be reconsidered. In section 5 we describe an experiment designed to provide evidence which distinguishes between two accounts (section 4) of the way that perceivers deal with ambiguity in the input signal: underspecification vs. full interpretation. This is an example of how processing data can, under certain circumstances, provide decisive evidence which distinguishes between theoretical accounts. It is of course often the case that theory does not make any direct predictions about psycholinguistically testable measures of processing. The collaboration of psycholinguists and semanticists may yet reveal testable predictions more often than has sometimes been assumed. We therefore argue for experimental linguists and semanticists to cooperate more and take more notice of each other's work, for their mutual benefit. Semanticists will gain additional ways to falsify theoretical analyses or aspects of them, which can deliver a boost to theory development. This will be possible because experimenters can tailor experimental methods, tasks, and designs to their specific requirements. Experimenters for their part will benefit by having the questioning eye of the semanticist look over their experimental materials, which will surely avoid many experiments
being carried out whose materials fail to uniquely fulfill the requirements of the design. An example of this is the mode of disambiguation which we discussed in section 3. Beyond this, experimenters will doubtless be able to derive more testable predictions from semantic theories if they discuss the finer workings of these theories with specialist semanticists. We might mention here the example of semantic underspecification: can we find evidence for its psychological reality? Further questions might be: if some feature of an expression remains underdetermined by the input, how long can the representation remain underspecified? Is it possible for a final representation of a discourse to have unspecified features and nevertheless be fully meaningful? We conclude, therefore, that controlled experimentation can provide a further source of evidence for semantics. These data can, under certain circumstances, give a more detailed picture of the states of affairs which theories aim to account for. This additional evidence could be the catalyst for advances in semantic theory and explanation, in the same way that it has been in syntactic theory.
7. References

Anderson, Catherine 2004. The Structure and Real-time Comprehension of Quantifier Scope Ambiguity. Ph.D. dissertation. Northwestern University, Evanston, IL.
Aoun, Joseph & Yen-hui Audrey Li 1989. Scope and constituency. Linguistic Inquiry 16, 623–637.
Beghelli, Filippo & Tim Stowell 1997. Distributivity and negation: The syntax of each and every. In: A. Szabolcsi (ed.). Ways of Scope Taking. Dordrecht: Kluwer, 71–107.
Bott, Oliver & Janina Radó 2007. Quantifying quantifier scope. In: S. Featherston & W. Sternefeld (eds.). Roots: Linguistics in Search of its Evidential Base. Berlin: Mouton de Gruyter, 53–74.
Bott, Oliver & Janina Radó 2009. How to provide exactly one interpretation for every sentence, or what eye movements reveal about quantifier scope. In: S. Winkler & S. Featherston (eds.). The Fruits of Empirical Linguistics, Volume 1: Process. Berlin: de Gruyter, 25–46.
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press.
Filik, Ruth, Kevin B. Paterson & Simon P. Liversedge 2004. Processing doubly quantified sentences: Evidence from eye movements. Psychonomic Bulletin & Review 11, 953–959.
Frazier, Lyn & Keith Rayner 1990. Taking on semantic commitments: Processing multiple meanings vs. multiple senses. Journal of Memory and Language 29, 181–200.
Gillen, Kathryn 1991. The Comprehension of Doubly Quantified Sentences. Ph.D. dissertation. Durham University.
van Gompel, Roger P.G. & Martin J. Pickering 2007. Syntactic parsing. In: G. Gaskell (ed.). The Oxford Handbook of Psycholinguistics. Oxford: Oxford University Press, 455–504.
Hahne, Anja & Angela D. Friederici 2002. Differential task effects on semantic and syntactic processes as revealed by ERPs. Cognitive Brain Research 13, 339–356.
Higginbotham, James 1985. On semantics. Linguistic Inquiry 16, 547–594.
Hobbs, Jerry & Stuart M. Shieber 1987. An algorithm for generating quantifier scopings. Computational Linguistics 13, 47–63.
Hornstein, Norbert 1984. Logic as Grammar. Cambridge, MA: The MIT Press.
Hornstein, Norbert 1995. Logical Form: From GB to Minimalism. Oxford: Blackwell.
Huang, Cheng-Teh James 1982. Logical Relations in Chinese and the Theory of Grammar. Ph.D. dissertation. MIT, Cambridge, MA.
Ioup, Georgette 1975. The Treatment of Quantifier Scope in Transformational Grammar. Ph.D. dissertation. The City University of New York, New York.
Kuno, Susumu 1991. Remarks on quantifier scope. In: H. Nakajima (ed.). Current English Linguistics in Japan. Berlin: Mouton de Gruyter, 261–287.
Kurtzman, Howard S. & Maryellen C. MacDonald 1993. Resolution of quantifier scope ambiguities. Cognition 48, 243–279.
Larson, Meredith, Ryan Doran, Yaron McNabb, Rachel Baker, Matthew Berends, Alex Djalali & Gregory Ward 2010. Distinguishing the said from the implicated using a novel experimental paradigm. In: U. Sauerland & K. Yatsushiro (eds.). Semantics and Pragmatics: From Experiment to Theory. Houndmills: Palgrave Macmillan.
May, Robert 1977. The Grammar of Quantification. Ph.D. dissertation. MIT, Cambridge, MA. Reprinted: Bloomington, IN: Indiana University Linguistics Club, 1982.
May, Robert 1985. Logical Form: Its Structure and Derivation. Cambridge, MA: The MIT Press.
May, Robert & Alan Bale 2006. Inverse linking. In: M. Everaert & H. van Riemsdijk (eds.). Blackwell Companion to Syntax. Oxford: Blackwell, 639–667.
Pafel, Jürgen 2005. Quantifier Scope in German (Linguistics Today 84). Amsterdam: Benjamins.
Park, Jong C. 1995. Quantifier scope and constituency. In: H. Uszkoreit (ed.). Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics. Boston, MA: Morgan Kaufmann, 205–212.
Phillips, Colin & Matthew Wagers 2007. Relating structure and time in linguistics and psycholinguistics. In: M.G. Gaskell (ed.). The Oxford Handbook of Psycholinguistics. Oxford: Oxford University Press, 739–756.
Pickering, Martin J., Brian McElree, Steven Frisson, Lilian Chen & Matthew J. Traxler 2006. Underspecification and aspectual coercion. Discourse Processes 42, 131–155.
Piñango, Maria, Edgar Zurif & Ray Jackendoff 1999. Real-time processing implications of enriched composition at the syntax-semantics interface. Journal of Psycholinguistic Research 28, 395–414.
Pinkal, Manfred 1999. On semantic underspecification. In: H. Bunt & R. Muskens (eds.). Computing Meaning. Dordrecht: Kluwer, 33–55.
Pylkkänen, Liina & Brian McElree 2006. The syntax-semantics interface: On-line composition of meaning. In: M. A. Gernsbacher & M. Traxler (eds.). Handbook of Psycholinguistics. 2nd edn. New York: Elsevier, 537–577.
Reinhart, Tanya 1976. The Syntactic Domain of Anaphora. Ph.D. dissertation. MIT, Cambridge, MA.
Reinhart, Tanya 1978. Syntactic domains for semantic rules. In: F. Guenthner & S.J. Schmidt (eds.). Formal Semantics and Pragmatics for Natural Language. Dordrecht: Reidel, 107–130.
Reinhart, Tanya 1983. Anaphora and Semantic Interpretation. London: Croom Helm.
Reinhart, Tanya 1995. Interface Strategies (OTS Working Papers in Linguistics). Utrecht: Utrecht University.
Ross, John R. 1970. On declarative sentences. In: R. Jacobs & P. Rosenbaum (eds.). Readings in English Transformational Grammar. Waltham, MA: Ginn, 222–272.
Sanford, Anthony J. & Patrick Sturt 2002. Depth of processing in language comprehension: Not noticing the difference. Trends in Cognitive Sciences 6, 382–386.
Todorova, Marina, Kathy Straub, William Badecker & Robert Frank 2000. Aspectual coercion and the online computation of sentential aspect. In: L. R. Gleitman & A. K. Joshi (eds.). Proceedings of the 22nd Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum Associates, 3–8.
Tunstall, Susanne L. 1998. The Interpretation of Quantifiers: Semantics and Processing. Ph.D. dissertation. University of Massachusetts, Amherst, MA.
VanLehn, Kurt A. 1978. Determining the Scope of English Quantifiers. Technical Report AI-TR 483. Cambridge, MA: Artificial Intelligence Laboratory, MIT.
Weskott, Thomas & Gisbert Fanselow 2009. Scaling issues in the measurement of linguistic acceptability. In: S. Winkler & S. Featherston (eds.). The Fruits of Empirical Linguistics, Volume 1: Process. Berlin: de Gruyter, 229–246.
Zhou, Peng & Liqun Gao 2009. Scope processing in Chinese. Journal of Psycholinguistic Research 38, 11–24.
Oliver Bott, Sam Featherston, Janina Radó, and Britta Stolterfoht, Tübingen (Germany)
IV. Lexical semantics

16. Semantic features and primes

1. Introduction
2. Background assumptions
3. The form of semantic features
4. Semantic aspects of morpho-syntactic features
5. Interpretation of semantic features
6. Kinds of elements
7. Primes and universals
8. References
Abstract

Semantic features or primes, like phonetic and morpho-syntactic features, are usually considered as basic elements from which the structure of linguistic expressions is built up. Only a small set of semantic features seems to be uncontroversial, however. In this article, semantic primes are considered as basic elements of Semantic Form, the interface level between linguistic expressions and the full range of mental structures representing the content to be expressed. Semantic primes are not just features, but elements of a functor-argument structure, on which the internal organization of lexical items and their combinatorial properties, including their Thematic Roles, are based. Three types of semantic primes are distinguished: systematic elements, which are related to morpho-syntactic conditions like tense or causativity; idiosyncratic features, not corresponding to grammatical distinctions, but likely to manifest primitive conceptual conditions like color or taste; and a large range of elements called dossiers, which correspond to hybrid mental configurations, integrating varying modalities, but providing unified conceptual entities. (Idiosyncratic features and dossiers together account for what is sometimes called distinguishers or completers.) A restricted subsystem of semantic primes can reasonably be assumed to be directly fixed by Universal Grammar, while the majority of semantic primes is presumably due to general principles of mental organization and triggered by experience.
1. Introduction

The concept of semantic features, although frequently used in pertinent discussion, is actually in need of clarification with respect to both of its components. The term feature, referring in ordinary discourse to a prominent or distinctive aspect, quality or characteristic of something, became a technical term in the structural linguistics of the 1930s, primarily in phonology, where it identified the linguistically relevant properties as opposed to other aspects of sound shape. The systematic extension of the concept from phonology to other components of linguistic structure led to a wide variety of terms with similar but not generally identical interpretation, including component, category, atom, feature value, attribute, primitive element and a number of others. The qualification semantic, which competes with a smaller number of alternatives, conceptual being one of them, is
no less in need of elucidation, though, depending among other things on the delimitation of the relevant domains, including syntactic, semantic, conceptual, discourse and pragmatic structure, and the intricate distinction between linguistic and encyclopedic knowledge. In what follows, I will use the term semantic feature in the sense of basic component of linguistic meaning, adding amendments and changes where necessary. Different approaches and theoretical frameworks of linguistic analysis recognized the need for an overall concept of basic elements in terms of which linguistic phenomena are to be analyzed. Hjelmslev (1938) for instance based his theory of Glossematics on the assumption that linguistic expressions are made up of ultimate, irreducible invariants called glossemes. Following the Saussurean view of content and expression as the interdependent planes of linguistic structure, he furthermore distinguished kenemes and pleremes as the glossemes of the expression and the content plane, respectively. A recent and rather different case in point is the Minimalist Program discussed in Chomsky (1995, 2000), where Universal Grammar is assumed to make available "a set F of features (linguistic properties) and operations CHL (the computational procedure for human language) that access F to generate expressions", which are pairs 〈PF, LF〉 of Phonetic Form and Logical Form, determining the sound shape and the meaning of linguistic expressions. The general notion of features as primitive components constituting the structure of language must not obscure the fundamental differences between basic elements of the phonetic, morphological, syntactic and semantic aspects of linguistic expressions. On the phonetic side, the nature of distinctive features as properties of segments and perhaps syllables is fairly clear in principle and subject to dispute only with respect to interesting detail. The nature and role of primitive elements on the semantic side, however, are subject to problems that are unsolved in crucial respects. A number of questions immediately arise:

(1) a. What is the formal character of basic semantic elements?
    b. Can linguistic meaning be exhaustively reduced to semantic features?
    c. Is there a fixed set of semantic features?
    d. What is the origin and interpretation of semantic features?
Any attempt to deal with these questions must obviously rely on general assumptions about the framework which at least makes it possible to formulate the problems.
2. Background assumptions

An indispensable assumption of all approaches concerns the fact that a natural language L provides a sound-meaning correspondence, relating an unlimited range of signals to an equally unlimited range of objects, situations, and conditions the expressions are about. What linguistics is concerned with, however, is neither the set of signals nor the range of things they are about, but rather the invariant patterns on which the production and recognition of signals is based and the distinctions in terms of which things and situations are experienced, organized or imagined and related to the linguistic expressions. Schematically, these general assumptions can be represented as in (2), where PF and SF (for Phonetic Form and Semantic Form, respectively) indicate the structure of the sound shape and meaning, with A-P (for Articulation and Perception) and C-I (for Conceptualization and Intention) abbreviating the complex mental systems by which linguistic
expressions are realized and related to their intended interpretations. Although I will comply with general terminology as far as possible, tension cannot always be avoided. Thus, the Semantic Form SF corresponds in many respects to the Logical Form LF of Chomsky (1986 and subsequent work), to the Conceptual Structure CS of Jackendoff (1984 and later work), and to the Discourse Representation Structure DRS of Kamp and Reyle (1993), to mention just a few comparable approaches (cf. also article 30 (Jackendoff) Conceptual Semantics, article 31 (Lang & Maienborn) Two-level Semantics and article 37 (Kamp & Reyle) Discourse Representation Theory).

(2) Signal ↔ A-P ↔ [ PF ↔ Morpho-Syntax ↔ SF ] ↔ C-I ↔ External and Internal Environment
    (the bracketed portion constitutes L; A-P, L, and C-I belong to the Mind/Brain)
This schema is a simplification in several respects, but it enables us to fix a number of relevant points. First, PF and SF represent the conditions that L imposes on or extracts from extra-linguistic systems. Hence PF and SF are what is often called interfaces, by which language interacts with other mental systems, abridged here as A-P and C-I. Semantic and phonetic features can now be identified as the primitive elements of SF and PF, respectively. Hence their formal and substantive nature is determined by the character and role of these interfaces. Second, the function of language, to systematically assign meaning to signals, is accomplished by the component indicated here as Morpho-Syntax, which establishes the connection between PF and SF. For obvious reasons, the rules and principles of Morpho-Syntax are the primary concern of all pertinent theories, which in spite of differences that will not be dealt with here agree on the observation that the relation between PF and SF depends on the lexical system LS of L and is organized by rules and principles which make up the Grammar G of L. The content of Morpho-Syntax is relevant in the present context to the extent to which it relies on morphological and syntactic features that relate to features of SF. Third, according to (2), PF and SF and their primes are abstract in the sense that they do not reflect properties of the signal or the external world directly, but represent the units and configurations in terms of which the mental mechanisms of A-P and C-I perceive or organize external phenomena. Features of PF should, according to Halle (1983), most plausibly be construed as instructions for the vocal gestures in terms of which speech sounds are articulated and perceived. In a similar vein, features of SF must be construed not as elements of the external reality, but as conditions according to which elements, properties, and situations are experienced or construed. Hence semantic features as components of SF require an essentially mentalistic approach to meaning. This is clearly at variance with various theories of meaning, among them in particular versions of formal or model-theoretic semantics like, e.g., Lewis (1972) or Montague (1974), which consider semantics as the relation of linguistic expressions to strictly non-mental, external entities and conditions. It should be emphasized, though, that the mentalistic view underlying (2) does not deny external objects and their properties, corresponding to structures in SF under appropriate conditions. The crucial point is that it is the mental organization of C-I which provides the correspondence to the external reality – if the correspondence
actually obtains. Analogous considerations apply, by the way, to PF and its relation to the external signal. For further discussion of these matters and the present position see, e.g., Jackendoff (2002, chapter 10).

It must finally be noted that the symmetrical status of PF and SF suggested in (2) is in need of modification in order to account for the fundamental differences between the two interfaces. While the mechanisms of A-P, which PF has access to, constitute a complex but highly specialized mental system, the range of capacities abbreviated as C-I is not restricted in any comparable way. As a matter of fact, SF has access to the whole range of perceptual modalities, spatial orientation, motor control, conceptual organization, and social interaction – in short, to all aspects of experience. This is not merely a quantitative difference in the diversity, size, and complexity of the domains covered; it raises problems for the very status of SF as an interface, as we will see. In any case, essential differences concerning the nature of basic elements and their combination in PF and SF have to be recognized.

Two remarks must be added about the items of the lexical system LS. First, whether elements like killed, left, rose, or went are registered in LS as possibly complex but nevertheless fixed lexical items or are derived by morphological rules of G from the underlying elements kill, leave, rise, go plus abstract morphemes for Person, Number, and Tense is a matter of dispute, with different answers in different theories. In any case, regular components like kill and -ed as well as idiosyncratic conditions holding for left, rose, and went need to be captured, and it must be acknowledged that the items in question consist of component parts. This is related to, but different from, the question whether basic lexical items like kill, leave, or go are themselves complex structures. As to PF, the analysis of words into segments and features is obvious, but for SF their decomposition is a matter of debate, to which we will return in more detail, claiming that lexical items are semantically complex in ways that clearly bear on the nature of semantic features. The second remark concerns the question whether and to what extent rules and principles of G determine the internal structure of lexical items and of the complex expressions made up of them. Elements of LS are stored in long-term memory and are thus plausibly assumed to be subject to conditions that differ from those of complex expressions. Surprisingly, though, the character of features and their combination at least seems to be the same within and across lexical items, as we will see.

As to notation, I will adopt the following standard conventions: Features of PF will be enclosed in square brackets like [ + nasal ], [ – voiced ], etc.; morphological and syntactic features are enclosed in square brackets and marked by initial capitals like [ + Past ], [ – Plural ], [ + Nominal ], etc.; and semantic features are given in small capitals, like [ human ], [ male ], [ cause ].
3. The form of semantic features

3.1. Features in phonology and morpho-syntax

It is useful to first consider the primitive elements of the phonetic side, where the conditions determining their formal character are fairly clear. PF is based on sequentially organized segments, which represent abstract time slots corresponding to the temporal structure of the phonetic signal. Now features are properties of segments, i.e. one-place predicates specifying articulatory (and perceptual) conditions on temporal units. The
predicates either do or do not apply to a given segment, whence basic elements of PF are mostly conceived as binary features. Thus, a feature combination like [+ nasal, + labial, + voiced] would specify a segment by means of the simultaneous articulatory conditions indicated as nasal, labial, and voiced. Systematic consequences and extensions of these conditions need to be added, such as the difference between presence and absence of a property, marked by the plus- and minus-value of the features, leading to the markedness asymmetry in Chomsky & Halle (1968). Further extensions include relations between features in the feature geometry of Clements (1985), the predictability of unspecified features, the supra-segmental properties of stress and intonation in Halle & Vergnaud (1987), and the arrangement of segmental properties along different tiers, leading to three-dimensional representations, to mention the more obvious extensions of PF. These extensions, however, do not affect the basic conditions on phonetic features, which can be summarized as follows:

(3) Phonetic features are
    a. binary one-place predicates, represented as [ ± F ];
    b. simultaneous conditions on sequentially connected segments (or syllables);
    c. interpreted as instructions on the mechanisms in A-P.

Some of the considerations supporting distinctive features in phonology have also been applied to syntax and morphology. Thus in Jakobson (1936), Hjelmslev (1938), or Bierwisch (1967), categories like Case, Number, Gender, or Person are characterized in terms of binary features. Similarly, Chomsky (1970) reduced categories like Noun, Verb, Adjective, and their projections Noun Phrase etc. to binary syntactic features. While features of this kind are one-place predicates and thus in line with condition (3a), they clearly don't meet condition (3b): morphological and syntactic features are not conditions on segments of PF connected by sequential ordering, but on lexical and syntactic units, which are related by syntactic conditions like dominance or constituency according to the morpho-syntactic rules of G.

The most controversial aspect of morpho-syntactic features is their interpretation analogously to (3c). As these features do not belong to PF or SF directly, but are essentially elements of the mediation between PF and SF, they can only indirectly participate in the interface structures and their interpretation. But they do affect PF as well as SF, although in rather different ways. As to PF, morphological features determine the choice of segments or features by means of inflectional rules like (4), where / – d / abbreviates the features identifying the past tense suffix d in cases like climbed, rolled, etc.

(4) [ + Past ] → / – d /

Actual systems of rules are more complex, taking into account intricate dependencies between different categories, features, and syntactic conditions. For systematic accounts of these matters see, e.g., Bierwisch (1967), Halle & Marantz (1993), or Wunderlich (1997a). In any case, morphological or syntactic features are not part of PF, but can only influence its content via particular grammatical rules.

As to SF, it is often assumed that features of morphological categories like tense or number are subject to direct conceptual interpretation, alongside other semantic elements, and are therefore considered as a particular type of semantic features. This
was the position of Jakobson (1936), Hjelmslev (1938), and related work of early structuralism, where in fact no principled distinction between morphological and semantic features was made, supporting the notion of semantic primes as binary features. This view cannot be generally upheld, however, for various reasons. To be sure, a feature like [+ Past ] has a more stable and motivated relation to the temporal condition attached to it than to its various phonetic realizations, e.g., in rolled, left, went, or was; still, there is no reasonable way to treat morphological or syntactic features as elements of SF, as will be discussed in section 4.
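To make the shape of these conditions concrete, here is a minimal sketch – my illustration, not part of the handbook's formalism, and all identifiers are hypothetical – of segments as bundles of binary features in the sense of (3a), together with a spell-out rule in the spirit of (4):

# Minimal sketch: a PF segment as a bundle of binary one-place predicates,
# cf. (3a), and a spell-out rule in the spirit of (4): [+Past] -> /-d/.

SEGMENT_M = {"nasal": "+", "labial": "+", "voiced": "+"}   # e.g. /m/
SEGMENT_B = {"nasal": "-", "labial": "+", "voiced": "+"}   # e.g. /b/

def spell_out_past(morph_features):
    # Rule (4), for regular verbs only; idiosyncratic forms like 'went'
    # or 'was' would be listed lexically rather than derived by this rule.
    return "/-d/" if morph_features.get("Past") == "+" else ""

print(SEGMENT_M["nasal"], SEGMENT_B["nasal"])   # + -
print(spell_out_past({"Past": "+"}))            # /-d/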
3.2. Features and types

Turning to the actual elements of SF, we notice that they differ from the features of PF in crucial respects. To begin with, condition (3a), according to which features are binary one-place predicates, seems to be satisfied by apparently well-established ordinary predicates like [ male ], [ alive ] or [ open ]. There are, however, equally well-established elements like [ parent-of ], [ part-of ], [ perceive ], [ before ], or [ cause ], representing relations or functions of a different type, which clearly violate this condition. More generally, SF must be based on an overall system of types integrating the different kinds of properties, relations, and operators.

This leads directly to condition (3b), concerning the overall organization of PF in terms of sequentially ordered segments. It should be obvious that SF cannot rely on this sort of organization: it neither consists of segments nor exhibits linear ordering. Whatever components one might identify in the meaning of, e.g., nobody arrived in time, or he needs to know it, or any other expression, there is no temporal or other sequential ordering among them, neither for the meaning of words nor for their components. Although semantic processing in comprehension and production does of course have temporal characteristics, these do not belong to the resulting structure of meaning, in contrast to the linear structure related to phonetic processing. See Bierwisch & Schreuder (1992) for further discussion of this point. Hence SF is organized according to other conditions, which naturally follow from the type system just mentioned: Elements of SF, including in particular its primitive elements, are connected by the functor-argument relation based on the type-structure of the elements being combined. Notice that this is the minimal assumption to be made, just like the linear organization of PF, with no arbitrary stipulations. We now have the following conditions on semantic features, corresponding to the conditions (3a) and (3b) on the phonetic side:

(5) Semantic elements, including irreducible features, are
    a. members of types determining their combinatorial conditions;
    b. participants of type-based hierarchical functor-argument structures.

These are just two interdependent aspects of the general organization of linguistic meaning, which is, by the way, the counterpart of the linear concatenation of segments in PF. There are various notational proposals to make these conditions explicit, usually extending or adapting basic conventions of predicate logic. The following considerations seem to be indispensable in order to meet the requirements in (5). First, each element of SF must belong to some specific type; second, if the element is a functor, its type determines two things: (a) the type of the elements it combines with, and (b) the type the
resulting combination belongs to. Thus functor-types must be of the general form 〈α,β〉, where α is the type of the required argument, and β that of the resulting complex. This kind of type-structure was introduced in Ajdukiewicz (1935) in a related but different context, and has since been adapted in various notational ways by Lewis (1972), Cresswell (1973), and Montague (1974), to mention just a few; cf. also article 85 (de Hoop) Type shifting. Following standard assumptions, at least two basic types are to be recognized: e (for entity, i.e. objects in the widest sense) and t (for truth-value bearer, roughly propositions or situations). From these basic types, functor types like 〈e,t〉 and 〈e,〈e,t〉〉 for one- and two-place predicates, 〈t,t〉 and 〈t,〈t,t〉〉 for one- and two-place propositional functions, etc. are built up. To summarize:

(6) Elements of SF belong to types, where
    a. e and t are basic types;
    b. if α and β are types, then 〈α,β〉 is a type.

A crucial point to be added is the possibility of empty slots, i.e. positions to be filled in by way of the syntactic combination of lexical items with the complements they admit or require. Empty slots are plausibly represented as variables, which are assigned to types like elements of SF in general. The kind of resulting structure is illustrated in (7) by an example adapted from Jackendoff (1990: 46), representing the SF of the verb enter in three notational variants, indicating the relevant hierarchy of types by a labeled tree (7a), and the corresponding labeled bracketing in (7b) and (7c):
(7) a. [tree diagram, equivalent to the bracketing in (7b): the functor go (type 〈e,〈e,t〉〉) applies to the complex [to [in y]] (type e, built by the functors to and in, both of type 〈e,e〉, from the variable y of type e), yielding a predicate of type 〈e,t〉, which applies to x (type e) to yield a structure of type t]
    b. [t [e x] [〈e,t〉 [〈e,〈e,t〉〉 go] [e [〈e,e〉 to] [e [〈e,e〉 in] [e y]]]]]
    c. [t [〈e,t〉 [〈e,〈e,t〉〉 go] [e [〈e,e〉 to] [e [〈e,e〉 in] [e y]]]] [e x]]

While (7b) simply projects the tree branching of (7a) into the equivalent labeled bracketing, (7c) follows the so-called Polish notation, where functors systematically precede their arguments. This becomes more obvious if the type-indices are dropped, as in: [go [to [in y]] x]. Notice, however, that in all three cases the linear ordering is theoretically irrelevant. What counts is only the functor-argument relation. For the sake of illustration, go, to, and in are assumed to be primitive constants of SF with more or less obvious interpretation: in is a function that picks out the interior region of its argument, to turns its argument into the end of a path, and go specifies the motion of its "higher" or external argument along the path indicated by its "lower" or internal argument.
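To see the combinatorics of (6) and (7) at work, the following sketch – an illustration of mine, not part of the article's formalism, with hypothetical identifiers throughout – implements the recursive type definition in (6) and checks the functor-argument applications underlying (7):

# Types per (6): basic e and t, and functor types <arg,res> built recursively.
from dataclasses import dataclass

@dataclass(frozen=True)
class Ty:
    name: str = ""                 # "e" or "t" for basic types
    arg: "Ty" = None               # argument type of a functor type
    res: "Ty" = None               # result type of a functor type
    def __str__(self):
        return self.name if self.name else f"<{self.arg},{self.res}>"

E, T = Ty("e"), Ty("t")

def apply(f, a):
    # Functor-argument combination: <a,b> applied to something of type a yields b.
    if f.arg == a:
        return f.res
    raise TypeError(f"cannot apply {f} to {a}")

IN = Ty(arg=E, res=E)                  # picks out the interior of its argument
TO = Ty(arg=E, res=E)                  # turns its argument into the end of a path
GO = Ty(arg=E, res=Ty(arg=E, res=T))   # combines with the path, then the mover

path = apply(TO, apply(IN, E))         # [to [in y]] : e
print(apply(apply(GO, path), E))       # [[go [to [in y]]] x] : prints "t"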
Jackendoff's actual treatment of this example is given in (8a), with the equivalent tree representation (8b). In spite of a gross similarity between (7) and (8), a different conception of the combinatorial type system seems to be involved.

(8) a. [Event go ([Thing ]i, [Path to ([Place in ([Thing ]j)])])]
    b. [tree diagram, equivalent to (8a): the Event-Function go combines the Thing-indexed argument i with a Path, which the Path-Function to builds from the Place that the Place-Function in derives from the Thing-indexed argument j, yielding an Event]
Putting minor notational details aside (such as Jackendoff's notation [ x ([ y ]) ] to indicate that a component [ y ] is the argument of a functor x, or his use of indices i and j instead of variables x and y to identify the relevant slots), the main difference between (7) and (8) consists in the general attachment of substantive content to type nodes in (8), in contrast to the minimal conditions assumed in (7), where types merely determine the combinatorial aspect of their elements. Thus [ in [ y ] ] in (7) is of the same type e as the variable y, while its counterpart [ in [ j ] ] in (8) is assigned to the type Place, differing from the type Thing of the variable [ j ]. It is not obvious whether the differences between (7) and (8) have any empirical consequences, after stripping away notational redundancies by which, e.g., the functor Path-Function generates a complex of type Path, or the Place-Function a Place, without specifying substantial information. With respect to the form of semantic features, however, the two systems can be taken as largely equivalent versions, together with a fair number of further alternatives, including Miller & Johnson-Laird (1976), Kamp & Reyle (1993), and to some extent Dowty (1979), to mention some well-known proposals. Katz (1972) pursues similar goals by means of a somewhat clumsy and mixed notational system.
3.3. SF and argument structure

One of the central consequences emerging from the systems exemplified in (7) and (8) is the account of combinatorial properties of lexical items, especially the selection restrictions they give rise to. Although a systematic survey of these matters would go far beyond the concern for the nature of semantic features, at least the following points must be made. First, as already mentioned, variables like x and y in (7) determine the conditions syntactic complements have to meet semantically, as their SF has to be compatible with the position of the corresponding variables. Second, the position the variables occupy within the hierarchy of a lexical item's SF determines to a large extent the syntactic conditions that the complements in question must meet. What is at issue here is the semantic underpinning of what is usually called the argument structure of a lexical item. As Jackendoff (1990: 48) puts it, "argument structure can be thought of as an abbreviation for the part of conceptual structure that is 'visible' to syntax." There are various ways in which parts of SF can be made accessible to syntax. Jackendoff (1990) relies
on coindexing of semantic slots with syntactic subcategorization; Bierwisch (1996) and Wunderlich (1997b) extend the type system underlying (7) by standard lambda operators, as illustrated in (9) for open, where the features [+V(erbal), –N(ominal)] identify the entry's morpho-syntactic categorization, and act, cause, become, and open make up its semantic form:

(9) / open / ; [+ V, –N] ; λy λx [t [t [〈e,t〉 act] [e x]] [〈t,t〉 [〈t,〈t,t〉〉 cause] [t [〈t,t〉 become] [t [〈e,t〉 open] [e y]]]]]

λy and λx mark the argument positions for object and subject, respectively, defining in this way the semantic selection restrictions the complements are subject to. It is worth noting, incidentally, that these operators in some way correspond to the functional categories AgrO and AgrS assumed in Chomsky (1995 and related work) to relate the verb to its complements. In spite of relevant similarities, though, there are important differences that cannot be pursued here.

Representations like (9) also provide a plausible account of the way in which the transitive and intransitive variants of open and other "(un)ergative" verbs like close, change, break etc. are related: The intransitive version is derived by dropping the causative elements [act x] and [cause] and the operator λx, turning λy into the subject position:

(10) / open / ; [+ V, –N] ; λy [t [〈t,t〉 become] [t [〈e,t〉 open] [e y]]]
These observations elucidate how syntactic properties of lexical items are rooted in configurations of their SF, determining thereby the semantic effect of the syntactic head-complement construction.
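The derivation of (10) from (9) can be pictured with a small sketch – hypothetical and mine, not the article's machinery – representing the SF as nested lists (functor first, Polish style) and stripping the causative layer:

# Causative 'open' as in (9), with binders listed separately.
CAUSATIVE_OPEN = {
    "binders": ["lam y", "lam x"],
    "body": [["act", "x"], ["cause", ["become", ["open", "y"]]]],
}

def detransitivize(entry):
    # Drop [act x], cause, and the operator lam x, so that lam y
    # becomes the subject position, as in the step from (9) to (10).
    _act, caused = entry["body"]          # caused == ["cause", [...]]
    return {"binders": ["lam y"], "body": caused[1]}

print(detransitivize(CAUSATIVE_OPEN))
# {'binders': ['lam y'], 'body': ['become', ['open', 'y']]}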
3.4. Extensions

The principles exemplified in (7) to (10) provide the skeleton of SF, determining the form of semantic features and the structure of meaning of lexical items and complex expressions. They do not account for all aspects of linguistically determined meaning, though, but must be augmented with respect to phenomena like shift or continuation of reference, topic-focus articulation, presupposition, illocutionary force, and discourse relations. To capture these facts, Jackendoff (1990, 2002) enriched semantic representations by distinguishing a referential tier and information structure from the descriptive tier. Similar ideas are pursued in Kamp & Reyle (1993), where a referential "universe of discourse" is distinguished from the proper semantic representation, which in Kamp (2001) is furthermore split up into a descriptive and a presuppositional component. Distinctions of this type do not affect the nature of the basic semantic components, which they necessarily rely on. Hence we need not go into the details of these extensions. We only mention that components like become involve presuppositions, as observed in Bierwisch (2010), or that the extensively discussed definiteness operator def depends on the universe of discourse.

Two further points must be added to this exposition, both of which show up in ordinary nouns like bottle, characterized by the following highly provisional lexical entry:

(11) / bottle / ; [–V, +N] ; λx [[physical x] [artifact x] [container x] [bottle x]]
The first point concerns the role of standard classificatory features like [physical], [object], [artifact], [furniture], [container], etc., which are actually the root of the notion feature, descending from the venerable tradition of analyzing concepts in terms of genus proximum and differentia specifica. Although natural concepts of objects or any other sort of entities obviously do not comply with orderly hierarchies of classification, the logical relations and the (at least partial) taxonomies that classificatory features give rise to clearly play a fundamental role in conceptual structure and lexical meaning. Entries like (11) indicate how these conditions are reflected in SF: Features like [person], [artifact], [physical] etc. are elements of type 〈e,t〉, i.e. properties that classify the argument they apply to. Different conditions that jointly characterize an object x, e.g. as a bottle, must according to the general structure of SF make up an integrated unit of a resulting common type. Intuitively, the four conditions in (11), which are all propositions of type t, are connected by logical conjunction, yielding a complex proposition, again of type t. This can be made explicit by means of an (independently needed) propositional functor & of type 〈t,〈t,t〉〉, which turns two propositions into simultaneous conditions of one complex proposition. It might be noted that & is asymmetrical, in line with the general definition of functor types in (6b), according to which a functor takes exactly one argument, such that two-place functors combine with their arguments in two hierarchical steps. This is due to the lack of linear order in SF, by which the relation between elements becomes structurally asymmetrical. In the case of conjunction, which is generally assumed to be symmetrical with regard to the conjuncts, this asymmetry might to some extent correspond to the specificity of conditions, such that the SF of (11) would come out as (12):

(12) λx [ [ physical x ] [ & [ artifact x ] [ & [ container x ] [ & [ bottle x ] ] ] ] ]

Thus [ & [ container x ] [ & [ bottle x ] ] ] is the more specific condition added to the condition [ artifact x ]. Whether this is in fact the appropriate way to reflect hierarchical classification must be left open, especially as there are innumerable cases of joint conditions that do not make up hierarchies but reflect cross-classification or just incidental combinations of conditions – differences the asymmetry of & cannot reflect. It might be added that various proposals have been made to formally represent joint conditions within lexical items as well as in standard types of adverbial or adnominal modification like little, green bottle or walk slowly through the garden.

The second point to be added concerns elements like [ bottle ], whose status is problematic and characteristic in various respects. On the one hand, standard dictionary definitions of bottle like with a narrow neck, usually made of glass suggest that [ bottle ] is a heuristic abbreviation to be replaced by a more systematic analysis, turning it into a complex condition consisting of proper elementary features. On the other hand, any further analysis might run into problems that cannot reasonably be captured in SF. Thus even if the conditions indicated by narrow neck could be accounted for by appropriate basic elements, it is unclear whether the features of neck, designating the body part connecting head and shoulders, can be used in bottle without creating a vicious circle, which requires bottle to identify the right sort of neck. What is important here is not a matter of particular detail, but a kind of problem which shows up in innumerable places and in fact marks the limits of semantic structure. The issue has been treated in different ways. Laurence & Margolis (1999) appropriately called it "the problem of Completers",
dealing with the residue of systematic analysis. It will be taken up below; cf. also article 32 (Hobbs) Word meaning and world knowledge.

Features of PF and SF were compared in §3.2. with respect to form and combination, but not with respect to their interpretation, which for PF according to condition (3c) consists in instructions on mechanisms in A-P. A straightforward counterpart for SF would consider its basic elements as conditions on mechanisms of C-I. This analogy, apparently suggested in schema (2), would be misleading for various reasons, though. The size of the repertoire, the diversity of the involved components of C-I, and the status of SF as interface all raise problems wildly differing from those of PF. The subsequent sections deal with the conditions on the interpretation of semantic features in more detail. Distinctions between at least three kinds of basic elements will have to be made, not only because of the ambivalent status of the completers just mentioned.
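For concreteness, the stepwise, asymmetrical combination by which the binary functor & builds (12) from the flat list of conditions in (11) can be sketched as follows – an illustration of mine, not the article's formalism:

def conjoin(conditions):
    # & is of type <t,<t,t>>: it combines with one conjunct at a time,
    # so n conditions are integrated in hierarchical steps, with the
    # most specific condition nested innermost, as in (12).
    first, *rest = conditions
    def nest(cs):
        head, *tail = cs
        return f"[& [{head}] {nest(tail)}]" if tail else f"[& [{head}]]"
    return f"[[{first}] {nest(rest)}]"

print(conjoin(["physical x", "artifact x", "container x", "bottle x"]))
# [[physical x] [& [artifact x] [& [container x] [& [bottle x]]]]]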
4. Semantic aspects of morpho-syntactic features

4.1. Interpretability of features and feature instances

The semantic aspect of morphological and syntactic categories is a matter of continuous debate. As already mentioned, morphological features specifying categories like Case, Number, Gender, Tense etc. were considered by Jakobson, Hjelmslev, and others as grammatical elements with essentially semantic content, independently of the PF-realization assigned to them by rules like (4). We cannot deal here with the interesting results of these approaches in any detail. It must be noted, however, that the insights into conceptual aspects of morphological categories were never incorporated into systematic and coherent semantic representations; their integration was left to common sense understanding – mainly because appropriate frameworks of semantic representation were not available.

To account for conceptual aspects of morpho-syntactic features, two important distinctions must be recognized. First, two different roles of feature instances are to be distinguished, which might be called "dominant" and "subdominant", for lack of better terms. Dominant instances are essentially elements that categorize the expression they belong to; subdominant instances are conditions on participants of various syntactic relations, including selection of complements, agreement, and concord. Thus the pronoun her is categorized by the dominant features [+Feminine, –Plural, +Accusative], among others, while prepositions like with or verbs like know specify their selection restriction by subdominant instances of the feature [+Accusative]. Similarly, the proper name John is categorized by the dominant features [+N, +3.Person, –Plural], while the (inflected) verb knows has the subdominant features [+3.Person, –Plural]. The intricate details of the relevant morpho-syntactic relations and the technicalities of their formal treatment are notoriously complex and cannot be dealt with here. We merely note that different feature instances are decisive in mediating the relation between PF and SF. The crucial point is that subdominant feature instances have no semantic interpretation, neither in selection restrictions nor as conditions in agreement or concord. Thus, in a case like these boys knew her, the feature [+Plural] of boys, the feature [+Past] in knew, and the feature [+Feminine] in her can participate in semantic interpretation, but the feature [+Plural] of these and the Number- and Person-features of the verb cannot.
The second distinction to be made concerns the interpretability of dominant feature instances. Three possibilities have to be recognized: first, features that have a constant interpretation, with all dominant instances determining a fixed semantic effect; second, features with a conditional interpretation, subject to different interpretations under different circumstances; and third, features with no interpretation, whether dominant or not. Clear cases of the first type are Tense and Person, whose interpretation is invariant. Gender and Number in many languages are examples of the second type. Thus [+Plural] has a clear conceptual effect in nouns like boys, nails, clouds, or women, which it doesn't have in scissors, glasses or trousers, and [+Feminine] is semantically vacuous in the case of nouns designating ships. The paradigm case of the third type is structural Case. Lack of conceptual content holds even for instances where a clear semantic effect seems to be related to Case-distinctions, such as the contrast between German Dative and Accusative in locative auf dem Dach (on the roof) vs. directional auf das Dach (onto the roof), because the contrast is actually due to the SF of the preposition, not the dominant Case of its object, as shown in Bierwisch (1997). We will return to abstract Case shortly; see also article 78 (Kiparsky & Tonhauser) Semantics of inflection.

This three-way distinction is realized quite differently in different languages. A particularly intriguing case in point is Gender in German, which has dominant instances in the categorization of all nouns. Now, the feature [+Feminine] corresponds to the concept female in cases like Frau (woman) or Schwester (sister), but has no interpretation in the majority of inanimate nouns like Zeit (time), Brücke (bridge), Wut (rage), many of which inherit the feature [+Feminine] from derivational suffixes like -t, -ung, -heit, as in Fahr-t (drive), Wirk-ung (effect), Dumm-heit (stupidity), etc. Moreover, some nouns like Weib (woman) or Mädchen (girl) are categorized as [–Feminine], irrespective of their SF. Even more intricate are [+Feminine] cases like Katze (cat) or Ratte (rat), which designate either the species (without specified sex) or the female animal. Further complications come with nouns like Person (person) with the feature [+Feminine], which are animate and may or may not be female.

Three conclusions follow from these observations: First, morpho-syntactic features must be distinguished from their possible semantic impact, because these features sometimes do, sometimes do not have a conceptual interpretation. Second, the relation between morpho-syntactic features and elements of SF, which mediate the conceptual content, must be determined by conditions comparable to those sketched in (4) relating morphological features to PF, even if the source and the content of the conditions are remarkably different. Third, the relation of morphological features to SF must be conditional, applying to dominant instances depending on sometimes rather special morphological or semantic circumstances.
4.2. The interpretation of morphological features

Morpho-syntactic features are binary conditions of the computational system that accounts for the combinatorial matching of PF and SF. The semantic value corresponding to these features consists of elements of the functor-argument structure of SF. Hence the conditions that capture this correspondence must correlate elements and configurations of two systems, both of which are subject to their respective principles. (13) illustrates the point for the feature [+Past] occurring, e.g., in the door opened, indicating that an
utterance u of this sentence denotes an event s the time T(s) of which precedes the time T(u) of u, where T is a functor of type 〈e,e〉 and before a relation of type 〈e,〈e,t〉〉:

(13) [ +Past ] ↔ [ [ T s ] [ before [ T u ] ] ]

This is a simplification which does not reflect the complex details discussed in the huge literature about temporal relations, but (13) should indicate in principle how the feature [+Past], syntactically categorizing the verb (or whatever one's syntactic theory requires), can contribute to meaning. More specifically, most semantic analyses agree that tense information applies to events instantiating in one way or another the propositional condition specified by the verb and its complements. With this proviso, we get (14) as the SF of the door opened, if we expand (10) by the event-instantiation expressed by the operator inst of type 〈t,〈e,t〉〉.

(14) [ [ [ T s ] [ before [ T u ] ] ] & [ s [ inst [ become [ open [ def x [ door x ] ] ] ] ] ] ]

As [+Past] has a constant interpretation in English, (13) needs no restricting condition: Instances of [+Past] are always subject to (13). It might be considered as a kind of lexical entry for [+Past] which applies to all occurrences of the feature, whether its morpho-phonological realization is regular or idiosyncratic. There are further non-trivial questions related to conditions like (13). One concerns the appropriate choice of the event argument s, which obviously depends on the "scope" of [+Past], i.e. the constituent it categorizes. Another one is the question what (13) means for the interpretation of [–Past], i.e. morphological present tense. One option is to take [–Past] to determine the negation not [ [ T s ] [ before [ T u ] ] ]. A more general strategy would consider [–Past] as the unmarked case, which is semantically unspecified and simply doesn't determine a temporal position of the event s. These issues concerning the conceptual effect of morphological categories are highly controversial and need further clarification.

Conditional interpretation of morphological features is exemplified by [+Plural] in (15), where [ collection ] of type 〈e,t〉 must be construed as imposing the condition that its argument consists of elements of equal kind, rather than being a set in the sense of set theory, since the denotation of a plural NP like these students is of the same type as that of the singular the student. See, e.g., Kamp & Reyle (1993) for discussion of these matters and some consequences.

(15) [ +Plural ] ↔ [ collection x ]

Like (13), condition (15) might be construed as a kind of lexical information about the feature [+Plural], which would have the effect that for a plurale tantum like people the regular collective reading follows from the idiosyncratic categorization by [+Plural]. Now, the conditional character of (15) would have to be captured by lexically blocking nouns like glasses or measles from the application of (15). Thus measles would be marked by two exceptions: idiosyncratic categorization as [+Plural], but at the same time exemption from the feature's interpretation. This again raises the question of interpreting [–Plural], and again, the strategy of leaving the unmarked value of the feature [±Plural] without semantic effect seems promising, assuming that the default type of reference is just neutral with respect to the condition represented as [ collection ]. A revealing
illustration of lexical idiosyncrasies is the contrast between German Ferien (vacation) and its near synonym Urlaub, where Ferien is a plurale tantum with idiosyncratic singular reading, much like measles, while Urlaub allows for [+Plur] and [–Plur], with (15) applying in the former case.

Still more complex conditions obtain, as already noted, for the category Gender in German. The two features [± Masculine] and [± Feminine] identify three Genders by two features as follows:

(16) a. Masculine: [+ Masculine ];               Mann (man),   Löffel (spoon)
     b. Feminine:  [+ Feminine ];                Frau (woman), Gabel (fork)
     c. Neuter:    [– Masculine, – Feminine ];   Kind (child), Messer (knife)
Only plus values of Gender features are related to semantic conditions, and only for animate nouns. This can be expressed as follows:

(17) a. [ + Masculine ] ↔ [ male x ] / [ animate x ]
     b. [ + Feminine ] ↔ [ female x ] / [ animate x ]

The correspondence expressed in (17) holds only for cases where the semantic condition [ animate x ] is present; hence the Gender features of nouns like Löffel or Gabel have no semantic impact. According to the strategy that leaves minus-valued features semantically uninterpreted, nouns like Kind and Messer would both be equally unaffected by (17), while derivational affixes like -ung, -heit, -t, -schaft project their inherent Gender-specification even when it is not interpretable by (17). Conditions like (17) and (15) might reasonably be considered as saving lexical redundancy, such that the categorization as [+Masculine] is predictable for nouns like Vater (father), the SF of which contains [ animate x & [ male x ] ]. Idiosyncratic specifications would then be involved in various cases of blocking (17). Thus Weib (woman) is [ female x ] in SF, but categorized as [–Masculine, –Feminine]. On the other hand, Zeuge (witness) is categorized as [+Masculine], but blocked for (17), although marked as [ animate x ]. Similarly Person (person) is [+Feminine], but still not [ female x ], as eine männliche Person (a male person) is not contradictory.

Finally, a short remark is in order about structural or abstract Case, the paradigm case of features devoid of SF-interpretation. Two points have to be made. First, structural Case must be distinguished from semantic or lexical Case, e.g., in Finno-Ugric and Caucasian languages, representing locative and directional relations of the sort expressed by prepositions in Indo-European languages. Semantic Cases like Adessive, Inessive, etc. correspond to in, on, at, etc. Although the borderline between abstract and semantic Case is not always clear-cut, raising non-trivial questions of detail, the role of abstract Case we are concerned with is clear enough in principle. There is, of course, a natural temptation to extend successful analyses of semantic Cases to the Case system in general, relying on increasingly abstract characterizations, which are compatible with practically all concrete conditions. Hjelmslev (1935) is an impressive example of an ingenious strategy of this sort; cf. also article 78 (Kiparsky & Tonhauser) Semantics of inflection.

Second, the semantic aspect of morphological features must not be confused with their participation in expressions whose semantic structure differs for independent
reasons, as already noted with regard to the locative/directional distinction of the German prepositions in, an, auf, etc. The most plausible candidates for possible semantic effects of abstract Case are thematic roles in constructions like he hit him, where Nominative and Accusative relate to Agent and Patient. Identity or difference of meaning is not due to semantic features assigned to abstract Case, however, as can be seen from constructions like (18), where the verb anziehen (put on, dress) assigns the same role Recipient alternatively by means of Dative, Accusative, Nominative, and Genitive in the four cases in (18a–d), while in (18b) and (18e) the different roles Recipient and Theme are marked by the same Case, Accusative.

(18) a. er zieht dem PatientenDat den MantelAcc an (he puts the coat on the patient)
     b. er zieht den PatientenAcc an (he dresses the patient)
     c. der PatientNom wird angezogen (the patient is dressed)
     d. das Anziehen des PatientenGen (the dressing of the patient)
     e. er zieht den MantelAcc an (he puts on the coat)
Hence the roles of complements cannot derive from Case features, but must be determined by the SF of the verb in the way indicated in (9) and (10). The selection restrictions the verb imposes on its complements are inherent lexical conditions of the verb, modulated by passivization in (18c) and nominalization in (18d). See Wunderlich (1996b) for a discussion of further interactions between grammatical and semantic conditions; cf. also article 84 (Wunderlich) Operations on argument structure.

To sum up, morphological features must clearly be distinguished from elements of SF, but may correspond to them by systematic conditions; and they may participate in making semantic distinctions explicit even if they don't have any semantic purport at all. This point is worth emphasizing in view of considerations like those in Svenonius (2007), who adumbrates conceptual content for morphological features in general, including Gender and abstract Case, not distinguishing instances with and without semantic purport (as in Gender), or semantic purport and the grammatical distinction it depends on (as in structural Case).
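The three interpretability patterns of this section – constant ([+Past], (13)), conditional ([+Plural] in (15), Gender in (17)), and uninterpreted (minus values, abstract Case) – can be summarized in a schematic sketch; this is my own illustration with hypothetical names, not a proposal from the article:

def interpret(feature, sf, blocked=frozenset()):
    # Map a dominant morphological feature instance to an SF condition;
    # subdominant instances (agreement, concord) never reach this mapping.
    if feature in blocked:                  # lexical exemption: measles, Zeuge
        return None
    if feature == "+Past":                  # constant interpretation, (13)
        return "[T s] before [T u]"
    if feature == "+Plural":                # conditional, (15)
        return "[collection x]"
    if feature == "+Masculine":             # conditional on animacy, (17a)
        return "[male x]" if "[animate x]" in sf else None
    if feature == "+Feminine":              # conditional on animacy, (17b)
        return "[female x]" if "[animate x]" in sf else None
    return None                             # minus values, abstract Case

print(interpret("+Masculine", {"[animate x]"}))     # [male x]   (Mann)
print(interpret("+Feminine", set()))                # None       (Gabel)
print(interpret("+Plural", set(), {"+Plural"}))     # None       (measles)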
4.3. Syntactic categories

While categories like Noun, Verb, Preposition, Determiner, etc. are generally assumed to determine the combinatorial properties of lexical items and complex expressions, the identification of their features is still a matter of debate. Incidentally, the terminology vacillates between syntactic and lexical categories, depending on whether functional categories like Complementizer, Determiner, Tense, etc. and phrasal categories like NP, VP, DP, etc. are included. For the sake of clarity, I will talk about syntactic categories and their features. Alternatives to the initial proposal [ ± Verbal, ± Nominal ] generally recognize the determination of argument structure as the major effect of the features in question. One point to be noted is that nouns and adjectives do not have strong argument positions, i.e. they allow their complements to be dropped, in contrast to verbs and prepositions, which normally require their argument positions to be syntactically realized. Hence the feature [+Nominal] could be construed as indicating the argument structure to be weak in this sense. This is more appropriately expressed by an inverse feature like [–Strong Arguments]. Furthermore, verbs and nouns are recognized to require functional heads
(roughly Determiner for nouns, and Tense and Complementizer for verbs), ultimately supporting the referential properties of their constituents, a condition that does not apply to adjectives and prepositions. This can be expressed by a feature [+Referential], which replaces the earlier feature [Verbal]. A similar proposal is advocated in Wunderlich (1996a). It might be noted that the feature [+Referential] – or whatever counterpart one might prefer – has a direct bearing on the categorization of the functional categories Determiner, Tense, and Complementizer, and the constituents they dominate. These matters go beyond the present topic, though. We would thus get the following, still provisional, proposal to capture the distinctions just noted:

(19)               Verb   Noun   Adjective   Preposition
     Strong Arg     +      –        –            +
     Referential    +      +        –            –
Further systematic effects these features impose on the argument structure and selection restrictions depend on general principles of Universal Grammar and on language-particular morphological categories. Thus in English and German, Nominative- and Accusative-complements of verbs must be realized as Genitive-attributes of corresponding nouns, as shown by well-known examples like John discovered the solution vs. John's discovery of the solution. On the whole, then, the features in (19) have syntactic consequences and regulate morphological conditions, and don't seem to call for any semantic interpretation.

There is, however, a traditional, if not very clear, view according to which syntactic categories have an essentially semantic or conceptual aspect, where nouns, verbs, adjectives, and prepositions denote, roughly speaking, things, events, properties, and relations, respectively. Even Hale & Keyser (1993) adopt within the "Minimalist Program" a notional view of syntactic categories, assuming verbs to have the type "(dynamic) event" e, prepositions the type "interrelation" r, adjectives the type "state" s, with only the type n of nouns left without further specification. There are, however, various kinds of counterexamples, such as adjectives denoting relations like similar or dynamic aspects of events like sudden. Primarily, though, all these conditions are neatly accounted for by the SF and the argument structure of the items in question, as shown by a rough illustration like (20). According to (20a), a transitive verb like open denotes an event s which instantiates an activity relating x to y; (20b) shows that the relational noun father denotes a person x related to the (syntactically optional) individual y; (20c) indicates that the adjective open denotes a property of the entity y; and (20d) sketches the preposition in, which denotes a relation locating the entity x in the interior of y.

(20) a. /open/    λy λx λs [ s inst [ [ act x ] [ cause [ become [ open y ] ] ] ] ]
     b. /father/  λy λx [ person x [ & [ male x ] [ & [ x [ parent-of y ] ] ] ] ]
     c. /open/    λy [ open y ]
     d. /in/      λy λx [ x loc [ interior y ] ]
Clearly enough, the relevant sortal or ontological aspects of the items are taken care of without further ado. This is not the whole story, though. Even if the semantic neutrality of the features in (19) might be taken for granted, one is left with the well-known but intriguing asymmetry between nouns and verbs, both marked [+Referential], but subject
to different constraints: While nouns have access to practically all sortal or ontological domains, verbs are restricted to events, processes, and states. The asymmetry is highlighted by basic lexical items like (21), which can show up as verbs as well as nouns, but with different semantic consequences:

(21) a. run, walk, jump, rise, sleep, use, …
     b. bottle, box, shelf, fence, house, saddle, …

Items like (21a) have essentially the same SF as nouns and verbs, differing only in their morpho-syntactic realization, as in Eve runs vs. Eve's run. By contrast, items like (21b) occurring as verbs differ semantically from the corresponding noun, as illustrated by Eve bottles the wine, Max saddles the horse, etc. In other words, nouns allow for object- as well as event-reference, while verbs cannot refer to objects. Hence items like run need only allow for alternative values of the feature [± Strong Arg] to yield nominal or verbal realizations, while nouns like bottle with object denotation achieve the necessary event reference only through additional semantic components like cause, become, and loc. These conditions can be made explicit in various ways. What we are left with in any case is the observation that the event-reference of verbs seems to be constrained by syntactic features. Event reference is used here (as before) in the general sense of eventuality discussed in Bach (1986), including events, processes, states, activities – in short, entities that instantiate propositions and are subject to temporal identification, as assumed in (13) above for Tense features.

Two points should be made in this respect: First, the constraint is unidirectional, as verbs require event-reference, but event reference is not restricted to verbs, in view of nouns like decision, event, etc. Second, as it concerns verbs but none of the other categories, it cannot be due to one feature alone, whether (19) or any other choice is adopted. Using the present notation, the constraint could be expressed as follows:

(22) [ + Referential, + Strong Arg ] → λs [ s inst [ p ] ]

This condition singles out verbs, as desired, and it determines the relevant notional condition on their referential capacity, if we consider inst as imposing the sortal requirement of eventualities on its first argument. It also has the correct automatic result of fixing the event reference for verbs, which is the basis for the semantic interpretation of Tense, Aspect, and Mood. Something like (22) might, in fact, not only be part of the (perhaps universal) conditions on syntactic features in general and those of verbs in particular, but can also be considered as a redundancy condition on lexical entries, providing verbs with the event instantiation as their general semantic property. If so, the lexical information of entries like (20a) could be reduced to (9), with the event reference being supplied automatically.
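Read as a redundancy rule, (22) can be given a schematic rendering – again my own sketch with hypothetical names, not the article's implementation – which expands a verbal entry like (9) to the event-instantiated form in (20a):

def add_event_instantiation(entry):
    # (22): entries marked [+Referential, +Strong Arg] (i.e. verbs)
    # automatically receive the event instantiation lam s [s inst [p]].
    if {"+Referential", "+StrongArg"} <= entry["features"]:
        return {**entry,
                "binders": entry["binders"] + ["lam s"],
                "body": f"[s inst {entry['body']}]"}
    return entry

open_v = {"features": {"+Referential", "+StrongArg"},
          "binders": ["lam y", "lam x"],
          "body": "[[act x] [cause [become [open y]]]]"}

v = add_event_instantiation(open_v)
print(" ".join(v["binders"]), v["body"])
# lam y lam x lam s [s inst [[act x] [cause [become [open y]]]]]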
5. Interpretation of semantic features

5.1. Conditions on interpretation

Basic elements of SF, including those corresponding to morpho-syntactic features, are eventually related to or interpreted by the structures of C-I. As already mentioned, this
interpretation, its conditions, and consequences are analogous to the interpretation of phonetic features in A-P, yet different in fundamental respects. Analogies and differences can be characterized by four major points.

First, phonetic and semantic features alike are primitive elements of the representational systems they belong to; they cannot be reduced to smaller elements. Structural differences of PF cannot be due to distinctions within elements like [–voiced] or [+nasal], just as distinct representations of SF cannot be due to distinctions within components like [ animate ], [ cause ], or [ interior ], although acoustic properties of an utterance (say, because of different speakers), or variants of animacy or causation, might well be at issue (e.g., due to differences in intentionality). In other words, representations of PF and SF and their correspondence via morpho-syntax cannot affect or depend on internal structures of the primitive elements of PF and SF.

Second, the interpretation of basic elements is nevertheless likely to be complex, and in any case subject to additional, rather different principles and relations. Basically, primitive elements of PF and SF recruit patterns based on different conditions of mental organization for the structural building blocks of language: The correlates of phonetic features in A-P integrate conditions and mechanisms of articulation and auditory perception, whereas correlates of semantic features in C-I involve patterns of various domains of perception, intention, and complex structural dependencies, as shown by cases like [ animate ] or [ artifact ], combining sensory with abstract, purpose-oriented conditions. This is the very gist of positing interface elements, which deal with items of two currencies, so to speak, valid under intra- and extra-linguistic conditions, albeit in different ways. As the present article necessarily focuses on the interpretation of basic elements, it has to leave aside encroaching conditions involving larger structures, where compositional principles of PF and SF are partially suspended. With respect to PF, overlay effects can be seen, e.g., in whispering, where suspending [ ± voiced ] has consequences for various other features. With respect to SF, the wide range of metaphor, for instance, is based on more or less systematic shifts within conceptually coherent areas. Projecting, e.g., conditions of animacy into inanimate domains, as is frequently done in talk about computers or other technology, obviously alters the interpretation of features beyond [ animate ]. Phenomena of this sort may partially suspend the compositionality of interpretation, still presupposing the baseline of compositional matching, though.

Third, PF interacts with one complex but unified domain, viz. the production and processing of acoustic signals, and it shares with A-P the linear, time-dependent structure as the basic principle of organization. SF, on the other hand, deals with practically all dimensions of experience, motivating the abstract functor-argument structure sketched above. It necessarily cannot match the different organizational principles inherent in domains as diverse as visual perception, social relations, emotional values or practical intentions; hence it must be neutral with respect to all of them. In other words, the relation of SF to the diverse subsystems of C-I cannot mean that it shares the specific structure of different domains of interpretation.
To mention just one case in point: Color perception, though highly integrated with other aspects of vision, is structured by autonomous principles, which are completely different from the distinctions of color categories and their relations in SF, as discussed in Berlin & Kay (1969) and subsequent work. It follows from these observations that the interpretation of SF must be compatible with conditions obtaining in disparate domains – not only with regard to the compositionality of complex configurations, but also with regard to the nature and content of its primitive elements.
Fourth, while for good reasons PF is assumed to be the interface with both articulation and perception, possibly even granting their integration, it is unclear whether in the same way SF could be the interface integrating all the subsystems and modules of C-I. Notice that this is not the same issue as the fact that two or more systems organized along different principles might well have a common interface. Singing, for instance, integrates language and music, and visual perception integrates shape and color, which originate from quite different neuronal mechanisms. On the one hand, for reasons of mental economy one would expect language, which forces SF to interface with the whole range of different modules of mental organization anyway, to be the designated representational system unifying the systems of C-I, setting the stage for coherent experience and mental operations like planning, reasoning, and orientation. The central system posited in Fodor (1983) as the global domain the modular input/output systems interface with might be taken to stipulate this kind of overall interface. On the other hand, there are well-known systems integrating different modules of C-I independently of SF and according to autonomous, different principles of organization. The most extensively studied case in point is spatial orientation, which integrates various modes of perception and locomotion and is fundamental for cognition in general, as it also provides a representational format for various other domains, including patterns of social relation or abstract values, as documented in the vast literature about many aspects of spatial orientation, its neurophysiological basis, its computational realization and psychological representation, including the particular conditions of spatial problem solving. An instructive survey is found in Jackendoff (2002) and the contributions in Bloom et al. (1996), which deal in particular with the representation of space in language; see also article 107 (Landau) Space in semantics and cognition. Similar considerations hold for other domains, like music, emotion, or plans of action.

The question whether SF serves as the central instance, providing the common interface that directly integrates the different modules and subsystems of C-I, or whether there are separate interfaces mediating among parts of C-I independently of language, must be left open here. In any case, SF and its basic elements must eventually interface with all aspects of experience we can talk about. In other words, color, music, emotions, other people's minds, and theories about everything must all in some way participate in interpreting the basic elements of SF.
5.2. Remarks on Conceptual Structure

One of the major problems originating from these considerations is the difficulty of specifying, at least tentatively, the format of the interpretations which elements of SF receive in C-I and its subsystems. To be sure, different branches of psychology and the cognitive sciences provide important and elaborate theories for particular mental domains, such as Marr's (1982) seminal theory of vision. But because of their genuine task and orientation, they deal with specific domains within the mental world. Thus they are far from covering the whole range of areas that elements of SF have to deal with, and have little to say about the interpretation of linguistic expressions. Proposals seriously dealing with the question of how extra-linguistic structures correspond to linguistic elements and configurations are largely linguo-centric, i.e. they approach the problem via insights and hypotheses developed and motivated by the analysis of linguistic expressions. An impressive large-scale approach of this sort is Miller & Johnson-Laird (1976), providing an extensive survey of perceptual and conceptual structures that interpret
linguistic expressions. The approach systematically distinguishes the "Sensory" Field and representations of the "Perceptual World" it gives rise to from the semantic structure of linguistic expressions, the format of which is very similar to that of SF as discussed here. (23) illustrates elements of the perceptual world, (24) the semantic elements based on them.

(23) a. Cause (e, e')   Event e causes event e'
     b. Event (e)       e is an event

(24) a. happen (S): An event x characterized by S happens at time t if:
        (i) Eventt (x)
     b. cause (S, S'): Something characterized by S causes something S' if:
        (i) happen (S)
        (ii) happen (S')
        (iii) Cause ((i),(ii))

These examples cannot do justice to the general approach, but they correctly show that perceptual and semantic representations have essentially the same format, suggesting that semantic structures and their interpretation are of roughly the same sort and are related by conditions as in (24). (This is surprising in view of the fact that Johnson-Laird (1983) explores the mental structure of spatial representations and the inferences based on them, showing that they are systematically different from the semantic structure of verbal expressions describing the same spatial situations, thus supporting different inferences.) In other words, perceptual interpretation could be (mis)construed as turning the organization of SF into principles of perception – something Miller and Johnson-Laird clearly do not suggest.

In a different but comparable approach, Jackendoff (1984, 2002) assumes representations of the sort illustrated in (8), which he calls Conceptual Structures, to cover semantic representations and much of their perceptual interpretation. Thus, like Miller & Johnson-Laird, he takes the format of semantic representations to be the model of mental systems at large – with one important exception. In Jackendoff (1996, 2002) a system of Spatial Representation SR, which Conceptual Structure CS interfaces with, is explicitly assumed as a system based on separate principles, some of which, like deictic frameworks, identification of axes, or orientation of objects, are made explicit. The systematic interaction of CS and SR, however, is left implicit, except that its roots are fixed by means of hybrid combinations inside lexical items, as indicated by the following entry for dog, adapted from Jackendoff (1996: 11).

(25) PF:             / dog /
     Categorization: [ +N, –V, +Count, … ]
     CS:             [ animal x & [ [ carnivore x ] & [ possible [ pet x ] ] ] ]
     SR:             [ 3-D model with motion affordances ]
     Auditory:       [ sound of barking ]
The notion of a 3-D model invoked here is taken from the theory of vision in Marr (1982), extended by a mental representation of motion types that Marr's theory does not deal with. The actual model must be construed as the prototype of dogs in the sense of Rosch & Mervis (1975) and related work. Jackendoff's inclusion of auditory information points (without further comment) to the natural condition that the concept of dogs includes the characteristic voice, the mental representation of which requires access to the organization of auditory perception. Besides this purely sensory quality, however, barking is a linguistically classified aspect of dogs (and foxes) and as such belongs to the SF-information of the entry of dog. In general, then, what kinds of representations SF must serve as an interface for is anything but obvious. Principles of propositional structure, on which SF is based, are certainly not excluded from mental systems outside of language, but it is more than unlikely that they can all be couched in this format, as appears to be suggested in Miller & Johnson-Laird (1976), Jackendoff (2002) and (at least implicitly) many others. Spatial reasoning, to mention just one domain, does not rely on descriptions using the principles of predicate logic, as shown in Johnson-Laird (1983).
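An entry like (25) can be pictured as a record whose fields belong to different representational modules, with CS the only one that has recursive structure accessible to linguistic combination. The following is a minimal sketch, not Jackendoff's own notation; all type and field names are our illustrative choices, and SR and the auditory pattern are reduced to opaque placeholders:

```haskell
-- A hybrid lexical entry in the spirit of (25). Only CS has internal
-- structure visible to linguistic combination; SR and Auditory stand in
-- for pointers into other mental modules.
data CS = Prime String        -- basic condition; the variable x is left implicit
        | And CS CS           -- (asymmetric) conjunction
        | Possible CS         -- modal operator
        deriving Show

data Entry = Entry
  { pf       :: String        -- phonological form
  , cat      :: [String]      -- morpho-syntactic categorization
  , cs       :: CS            -- conceptual structure
  , sr       :: String        -- stand-in for a 3-D model reference
  , auditory :: String        -- stand-in for an auditory pattern reference
  } deriving Show

dog :: Entry
dog = Entry "/dog/" ["+N", "-V", "+Count"]
        (And (Prime "animal") (And (Prime "carnivore") (Possible (Prime "pet"))))
        "3-D model with motion affordances"
        "sound of barking"
```

The sketch makes the point at issue tangible: the entry is a hybrid, and nothing in the record itself says how the cs field and the sr field constrain one another.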
5.3. The inventory
The diversity of domains SF has to cope with leads to difficulties in identifying the repertoire of its primes – a problem that does not arise in PF, where the repertoire of primes is uncontroversial, at least in principle. For SF, two contrary tendencies can be recognized, which might be called minimal and maximal decomposition. Fodor et al. (1980) defend the view that concepts (which are roughly the meanings of simple lexical items) are essentially basic, unstructured elements. On this account, the inventory of semantic primes (for which the term features ceases to be appropriate) is by and large identical to the repertoire of semantically distinct lexical entries. Although Fodor (1981) admits a certain amount of lexical decomposition, recognizing some indisputable structure within lexical items, for the majority of cases he considers word meanings to be primes. As a matter of fact, the problem boils down to the role of Completers, noted above – a point to which we will return. The opposite view requires the SF of all lexical items to be reducible to a fixed repertoire of basic elements, corresponding to the features of PF. There are various versions of this orientation, but explicit reflections on its feasibility are rare. Jackendoff (2002: 334ff) at least sketches its consequences, even contemplating the possibility of combinatorial principles inside lexical items that differ from lexicon-external combination, a view that is clearly at variance with detailed proposals he pursues elsewhere. On closer inspection, the claims of minimal and maximal decomposition are not really incompatible, though, allowing for intermediate positions with more or less extensive decomposition. As a matter of fact, the contrary perspectives end up, albeit in rather different ways, with the need to account for the nature of irreducible elements. Two points are to be made, however, before these issues are taken up. First, with regard to the size of the repertoire, there is no reliable estimate on the market. It is clear that the number of PF-features is on the order of ten or so, but for SF-features the order of magnitude is not even a matter of serious debate. If decomposition is denied, the repertoire of basic elements is on the order of the number of lexical items, but the decompositional view per se does not lead to interesting estimates, either, as there is no direct relation between the number of lexical items and the number of primes. The combinatorial principles of SF would allow for any number of different lexical items on
the basis of whatever number of primes is proposed. Hence for standard methodological reasons, parsimonious assumptions should be expected. However, since reasonable assumptions must be empirically motivated, they would have to rely on systematically analyzed, representative sets of lexical items. But so far, available results have not led to any converging estimates. To give at least an idea of possible estimates, the comprehensive analysis of larger domains of linguistic expressions in Miller & Johnson-Laird (1976) can be taken – with all necessary caveats – to deal with considerably more than a hundred basic elements of the sort illustrated in (24). This repertoire is due to well-motivated considerations of the authors, leading to interesting results with respect to central cognitive domains but with no general implications, except that huge parts of the English vocabulary could not even be touched. One completely different attempt to set up and motivate a general repertoire of basic elements should at least be mentioned: Wierzbicka (1996) has a growing list of 55 primes on the basis of what is called Natural Semantic Metalanguage, a rather idiosyncratic framework, a more detailed presentation of which would by far exceed the present limits, but cf. article 17 (Engelberg) Frameworks of decomposition. Among the peculiar assumptions of the approach is the tenet that semantic primes are the meanings of designated, irreducible lexical items, like I, you, this, not, can, good, bad, do, happen, where, etc. Second, as to the content, the repertoire of primes has to respond not only to the richness of the vocabulary of different languages, but first of all to the diversity and nature of the mental domains to be covered. Distinctions must be captured and categorized not only for different perceptual modalities, but also for "higher order", more central domains like social dependencies, goals of action, generating explanations, etc. There is, after all, no simple and unconditional pattern by which to order sets of primes. Two perspectives emerge from the contrary views about decomposition. On the one hand, even strong opponents of decomposition grant a certain amount of formal analysis, which requires a limited, not strictly closed core of basic semantic elements. These elements show up in different guises in practically all pertinent approaches. Besides components related to morpho-syntactic features as noted above, there is a small set of primes like cause, become, and a few others that have been treated as direct syntactic elements in different ways. The "light verbs" in Hale & Keyser (1993) and much related work around the Minimalist Program, for instance, are just syntactically realized semantic components. These different perspectives, emphasizing either the semantic or the syntactic role of Logical Form, are the reason for the terminological tensions mentioned at the outset. Three decades earlier, an even stronger syntactification of semantic elements had been proposed in Generative Semantics by McCawley (1968) and subsequent work, also relying on cause, become, not, and the like. In spite of deep differences between the alternative frameworks, there are clearly recurring reasons to identify central semantic elements and the structures they give rise to; cf. also article 7 (Engelberg) Lexical decomposition, article 17 (Engelberg) Frameworks of decomposition, and article 81 (Harley) Semantics in Distributed Morphology.
On the other hand, even if one keeps to the program according to which lexical items are completely decomposed into primitive elements, the problem of idiosyncratic residues, identified as Distinguishers in Katz & Fodor (1963) and Katz (1972) or as Completers in Laurence & Margolis (1999), must be taken seriously. One might consider elements like [ bottle ] in (11), or taxonomic specifications like [ canine ] for dog or [ feline ] for cat, or distinguishers like [ knight serving under the standard of another
knight ] in Katz & Fodor (1963) as provisional candidates, to be replaced by more systematic items. But there are different domains where one cannot get rid of elements of this sort for quite principled reasons. Taxonomies of animals, plants, or artifacts are by their very nature based on elements that must be identified by idiosyncratic conditions. For different reasons, distinctions in perceptual dimensions like color, heat, pressure, etc. and other mental domains must be identified by primitive elements. In short, there is a complex variety of possibly systematic sources for Completers, such that an apparently undetermined set of basic elements of this sort must be acknowledged. For two reasons, these elements cannot be considered as a side issue, to be dismissed in view of the more principled, linguistically motivated semantic elements. First, they make up the majority of the inventory of basic semantic elements, which is, moreover, not closed on principled grounds, but open to amendments within systematic limits. Second, they are of remarkably different character with respect to their interpretation and their position within SF and the role of SF as the interface with C-I. To sum up, the overall repertoire that SF-representations are made up of comprises elements of different kinds. They can reasonably be grouped according to two conditions into (a) elements that are related to language-internal systematic conditions of L as opposed to ones that are not, and (b) elements with homogeneous as opposed to potentially heterogeneous interpretation – an important aspect that will be made explicit below. The emerging kinds of elements can be characterized as follows:

(26)
                     Systematic   Idiosyncratic   Dossiers
                     Features     Features
     L-systematic        +              –             –
     Homogeneous         +              +             –
The distinctions between these kinds of elements are based on systematic conditions with principled theoretical foundations, although the resulting differences need not always be clear-cut, as is usual with empirical phenomena. As closer inspection will show, however, the classification reasonably reflects the fact that linguistically entrenched elements exhibit increasingly homogeneous and systematic properties.
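The content of (26) amounts to two Boolean properties jointly determining three kinds of elements. A trivial encoding of the table (all names are our own):

```haskell
-- The two classificatory dimensions of (26) as Boolean properties.
data Kind = SystematicFeature | IdiosyncraticFeature | DossierElement
  deriving Show

lSystematic, homogeneous :: Kind -> Bool
-- Only systematic features bear on language-internal conditions of L.
lSystematic SystematicFeature = True
lSystematic _                 = False

-- Dossiers alone have potentially heterogeneous interpretation.
homogeneous DossierElement = False
homogeneous _              = True
```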
6. Kinds of elements

6.1. Systematic semantic features
The distinction between systematic elements and the rest seems reminiscent of that between semantic markers and distinguishers in Katz & Fodor (1963), but it differs fundamentally, both in principle and in empirical detail. There are at least two ways in which semantic primes can be systematic in the sense of bearing on morpho-syntactic distinctions. First, there are elements that participate in the interpretation of morphological and syntactic features as discussed in §4, including both categorization and the various types of selection. These include properties like [ female ], [ animate ], [ physical ], relations like [ before ], [ location ], or functions like [ surface ], [ proximate ], [ vertical ], etc. Second, there are components on which much of the argument structure of expressions, and hence their syntactic behavior, depends. Besides well-known and widely discussed elements like [ cause ], [ become ], [ act ], features of relational nouns like [ parent ], [ color ], [ extension ] and others also belong to this group. For general
reasons, negation, conjunction, possibility, and a number of other connectives and logical operators must be subsumed here, as they are clearly involved in the way in which SF accounts for the logical organization of cognitive structures. It must be noted, however, that logical constants, explored in pertinent theories of logic, are not necessarily identical with their natural counterparts in SF. The functor & assumed in (12), for example, differs from standard conjunction at least by its formal asymmetry. Further differences between logical and linguistic perspectives have been recognized with regard to quantification and modality. In any case, the interpretation of these elements consists in conceptually basic, integrated conditions, which may or may not be related to particular sensory domains. Causality, for instance, is a fundamental dimension of experience, which relies on integrated perceptual conditions, incorporating information from various domains like change and position, as discussed, e.g., in Miller & Johnson-Laird (1976). More specifically, it is not extracted from, but imposed on perceptual information. Dowty (1979) furthermore shows that the conceptual conditions of causality and change have straightforward external, model-theoretic correlates. Similarly, shape, location, relative size, part-whole relation, or possession are integrated and often fairly abstract conditions organizing perception and action. For instance, the relation [ loc ] or functions like [ interior ], [ surface ], etc. cover conditions of rather different kinds, depending on the sortal aspect of their arguments. Thus a page in the novel, a chapter in the novel, an error in the novel, or a scene in the novel pick out different aspects by means of the same semantic conditions represented by the features assumed for in in (20d). Other and perhaps still more complex and integrated aspects of experience are involved in the recognition of animacy or the identification of male and female sex. Even if some ingredients of these complex patterns can be sorted out and verbalized separately (identifying, e.g., physiological properties or behavioral characteristics), the features are holistic components with no internal structure within SF. This corresponds to the fact that the content of PF-features is sometimes accessible to conscious control and verbalization. The articulatory gestures of features like [voiced] or [labial], for instance, may well be reflected on, without their encapsulated character being undercut. In general, systematic features are to be construed as the stabilized, linguistic reflex of mechanisms involved in organizing and conceptualizing experience. Fundamental conditions of this sort seem to provide a core of possibilities the language faculty can draw on. Which of these options are eventually implemented in a given language obviously depends to some extent on its morphological and lexical regularities.
6.2. Idiosyncratic semantic features
Two factors converge in this kind of feature. First, there is the fundamental observation, granted even under a radical denial of lexical decomposition, that the meanings of linguistic expressions respond to distinctions within basic sensory domains, from which more complex concepts are constructed. An extensively debated and explored domain in this respect is color perception. Items like yellow, green, blue differ semantically on the basis of perceptual conditions, but with no other linguistic consequences, except common properties of the group as a whole, which are captured by the common, systematic feature [ color ], while the different chromatic categories are represented by idiosyncratic features like [ red ], [ green ], [ black ], etc. It might be noted in this respect that color terms –
unlike those for size, value, speed, etc. – are adjectives that can also occur as nouns, as shown by the red of the car versus *the huge of the car, due to the component [ color ] as opposed to the idiosyncratic values [ red ], [ blue ], which still are by no means arbitrary, as Berlin & Kay (1969) and subsequent work have shown. Various other domains categorizing sensory distinctions have attracted different degrees of attention, with limited systematic insight. Intriguing problems arise, e.g., with respect to haptic qualities, concerning texture and plasticity of surfaces or even the perceptual classification of temperature. It might be added that besides perceptual domains, there are biologically determined motor patterns, as in grasp or the distinction between walk and run, which also provide the interpretation of basic elements of SF. The second factor supporting idiosyncratic features is the obvious need to account for differences of meaning that have no systematic status in morpho-syntax or more general semantic patterns. Whether or not the distinctions between break, shatter, smash can be traced back to homogeneous sensory modalities need not be a primary concern, since in any case, besides basic modes of perception and motor control, more complex conditions like those involved in the processes denoted by cough, breathe, or even sleep, laugh, cry and smile are likely to require semantic features that are motivated by nothing other than the distinctions assigned to them in C-I. These distinctions might well be based on processes that are physiologically or physically complex – they still lead to self-contained features, as long as their interpretation is experientially homogeneous. One easily realizes that lexical knowledge abounds with elements like shatter, splinter, smash, and plenty of other sets of items that cannot get along without features of this idiosyncratic kind. They seem to be the paradigm cases demonstrating the need for Distinguishers or Completers. This is only half of the truth, though, because distinguishing elements like [ canine ], [ feline ] etc. are usually assumed to be equally typical of Completers as are [ green ] or [ smooth ] and the like. Characteristic elements identifying the peculiarities of a particular species or a specific artifact are, of course, plausibly considered as Completers or Distinguishers. But they do have an essentially different character, as their interpretation can be heterogeneous for principled reasons, combining information from separate mental modalities. Similar considerations apply to what Pustejovsky (1995) calls the Qualia structure of lexical items, which he supposes to characterize their particular distinctive properties, using the term Qualia, however, in a rather different sense from the original notion of subjective (usually monomodal) percepts, e.g., in Goodman (1951). The integration of different types of information acknowledged in these considerations might be a matter of degree, which yields a fuzzy boundary with respect to the kind of elements to be considered next.
6.3. Dossiers
The notion of Dossiers, proposed in Bierwisch (2007), takes up a problem that is present but not made explicit in much of the literature that deals with the phenomena in question. The point is illustrated by the entry (25) proposed by Jackendoff for dog, the semantically relevant parts of which are repeated here as (27):

(27) CS:       [ animal x & [ [ carnivore x ] & [ possible [ pet x ] ] ] ]
     SR:       [ 3-D model with motion affordances ]
     Auditory: [ sound of barking ]
The elements in CS (Jackendoff's counterpart of SF) are fairly abstract conceptual conditions, and in fact plausible candidates for systematic features classifying animals. But instead of the distinguisher [ canine ] that would have to identify dogs, we have SR and Auditory information to account for their specificity. Now, SR would either require the domain of 3-D models to be subdivided into prototypes of dogs, cats, horses, spiders, etc. – comparable to the division of the color space by cardinal categories like red, green, yellow, etc. –, clearly at variance with the theory of 3-D representations, or it would leave the information about shape and movement of dogs outside the range of the features of SF. Similar remarks apply to the auditory information. The essential point is, however, that 3-D models (of any kind of object, not just of dogs or animals) are Spatial Representations, which are subject to completely different principles of organization than SF (or CS, for that matter). SR-representations by their very nature preserve three-dimensional relations, allowing for spatial properties and inferences, as discussed in Johnson-Laird (1983) and elsewhere, conditions that SF is incapable of supporting. Moreover, besides shape, movement, and acoustic characteristics, the distinguishing information about dogs includes a variety of other aspects of behavior, capacities, possible functions, etc., some of which might require propositional specifications that can be represented in the format of SF. In other words, the replacement or interpretation of [ canine ] does not only concern SR and Auditory in (27), but involves a presumably extendible, heterogeneous cluster of conditions, i.e. a dossier of different conditions, which is nevertheless a unified element within SF, as the abbreviation [ canine ] in the SF of dog would correctly indicate. The comments made on the SF of dog are easily extended to a large range of lexical items. Take, for the sake of illustration, nouns like ski or bike, which would be classified by systematic features like [ artifact ] and [ for motion ], while the more-or-less complex specificities are indicated by dossiers like [ ski ] or [ bicycle ], respectively. Again, what is involved are different domains of C-I, indicating, e.g., conditions on substance, bits and pieces, and technology, in addition to spatial models, and interpreting integrated elements, which, incidentally, enter into operations of verbalization noted with respect to (21b), such that ski and bike become verbs denoting movement by means of the specified objects. This in turn shows that dossiers are by no means restricted to the SF of nouns. Three comments should be made here. First, as noted in (26), dossiers are elements with inhomogeneous interpretation that allow for conditions from different modules of C-I. To the extent to which these conditions may or may not be integrated, the borderline between dossiers and idiosyncratic features gets fuzzy, as noted earlier. With respect to SF, however, they are basic elements, subject to the general type structure and the combinatorial conditions it imposes, as shown by constructions like we were skiing or she biked home. As a consequence, the Lexical System LS of a given language L must be able to contain a remarkable collection of primitive elements whose idiosyncratic, heterogeneous interpretation is attached to, but not really part of, the lexical entries.
In other words, lexical entries are connected to C-I not only by means of the general, systematic primes, but also by means of idiosyncratic files calling up different modules. One might observe that entries like (25) reflect this situation by making the extra-linguistic (e.g. spatial and auditory) information part of the lexical items directly. That way one would avoid ambivalent primes of the sort called Dossier, but the Lexical System would become a hybrid combination of linguistic and other knowledge – which perhaps it is. On that view, dossiers would behave as unified basic elements, a point that will be taken up below. Under the opposite view, dossiers might be construed to a large extent as
the meaning of lexical entries without decomposition. And that is in fact what dossiers are – precisely to the extent to which primes resist language-internal decomposition. Second, for this very reason dossiers are crucial joints for the role of SF as the interface with the different modules of C-I. To the extent to which they include, e.g., 3-D models that allow for spatial characteristics and spatial similarity judgments, dossiers situate the elements they specify in SF with respect to spatial possibilities, whose actual conditions, however, are given by the system of Spatial Representation. Similar considerations apply to auditory or interpersonal information. In general, then, insofar as dossiers address different mental subsystems, they directly mediate between language and the different modes of experience. Third, as already noted, dossiers may integrate propositional information, in particular if they are enriched through individual experience, such that, e.g., the file [ frog ] includes conditions of procreation and development or usual habitats besides the characteristic visual and auditory information. Much of this information might be of propositional character, sharing in principle the format of SF. With respect to these components, dossiers are transparent in the sense that on demand their content is available for verbalization. In a way, dossiers might thus be means of sluicing information into the proper lexical representations. It is worth noting in this respect that Fodor (1981) supports the claim that lexical meanings are basic, indefinable elements by explaining them as Janus-headed entities, which are logically primitive but may nevertheless depend on and integrate other (basic or complex) elements. The decisive point is that the dependence in question must not be construed as a logical combination of the integrated elements, since this would lead to analyzable complex items, but as something which Fodor calls mental chemistry in the sense of John Stuart Mill (1967), who suggests that by way of mental chemistry simple ideas generate, rather than compose, the complex ones. In other words, the way in which the prerequisites of such a double-faced element are involved is not the kind of combination on which SF is based, but a type of integration by which the mental architecture supports hybrid elements that are primitive in one respect, while still recruiting resources that are complex in other respects. This is exactly what Fodor's lexical meanings share with dossiers or completers. Mill's and Fodor's mental chemistry, the obviously necessary amalgamation of different mental resources and modalities, is a permanent challenge for the cognitive sciences dealing with perception and conceptual organization. The notion of frames, introduced, e.g., in Fillmore (1985) and systematically developed in Barsalou (1992, 1999), is one such proposal to relate the meaning of linguistic expressions to the different dimensions of their interpretation, where frames are assumed to integrate and organize the different dimensions of experience; cf. also article 29 (Gawron) Frame Semantics. In conclusion, in view of the controversial properties of (in)decomposable complex semantic elements it might be unavoidable to recognize ambivalent elements, which are at the same time primitive features of the internal architecture of L and heterogeneous elements at the border of SF.
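The double status just described – a single prime within SF whose interpretation is an open-ended, heterogeneous cluster of conditions from different C-I modules – can be pictured as a name paired with module-tagged entries. A minimal sketch; the module inventory and the sample conditions are our own illustrative choices:

```haskell
-- A dossier: one unified SF-internal prime whose interpretation is an
-- extendible cluster of conditions drawn from different C-I modules.
data Module = SpatialRep | AuditoryRep | Propositional deriving Show

data Dossier = Dossier
  { sfName  :: String               -- the name visible to SF, e.g. "canine"
  , cluster :: [(Module, String)]   -- heterogeneous, open-ended conditions
  } deriving Show

canine :: Dossier
canine = Dossier "canine"
  [ (SpatialRep,    "3-D prototype with motion affordances")
  , (AuditoryRep,   "characteristic barking")
  , (Propositional, "possible pet")  -- verbalizable on demand
  ]
```

Within SF, only sfName participates in combination; the cluster is what the prime calls up in the different modules of C-I.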
6.4. Combinatorial effects
The phenomena just noted are related to a number of widely discussed problems. One of them concerns the variation on the basis of the so-called literal meaning, in particular the
combinatorial effect that SF-features may exert on their interpretation. (28) is a familiar exemplification of one type of interaction:

(28) a. Tom opened the door.
     b. Sally opened her eyes.
     c. The carpenters opened the wall.
     d. Sam opened his book to page 37.
     e. The surgeon opened the wound.
     f. The chairman opened the meeting.
     g. Bill opened a restaurant.
Searle (1983) remarks about these examples that open has the same literal meaning in (28a–e), adumbrating a somewhat different meaning for (28f) and (28g), as incidentally suggested by the German glosses öffnen for (28a–e), but eröffnen for (28f/g), although he takes it as obvious that the verb is understood differently in all cases. As far as this is correct, the differences are connected to the objects the verb combines with, inducing different acts and processes of opening. If the SF of open given in (20a), repeated as (29), represents the invariant meaning of the occurrences in question, then the differences noted by Searle must arise from the feature-interpretation in C-I:

(29) / open /   λy λx λs [ s inst [ [ act x ] [ cause [ become [ open y ] ] ] ] ]
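Read compositionally, (29) is a three-place abstraction whose term structure stays constant across (28a–g); only the values saturating y and x differ. A sketch of the term and of argument saturation (constructor names and arities are our own encoding, not part of the SF formalism):

```haskell
-- Skeleton of SF terms; constructor names and arities are illustrative.
data SF = E String          -- entity or event constant/variable
        | P String SF       -- one-place condition, e.g. P "open" y
        | Act SF            -- [ act x ]
        | Become SF         -- [ become p ], presupposing [ not p ] before
        | Cause SF SF       -- causing act paired with resulting transition
        | Inst SF SF        -- [ s inst p ]: s instantiates the proposition p
        deriving Show

-- (29): λy λx λs [ s inst [ [ act x ] [ cause [ become [ open y ] ] ] ] ]
openSF :: SF -> SF -> SF -> SF
openSF y x s = Inst s (Cause (Act x) (Become (P "open" y)))

-- (28a), before existential binding of the event variable s:
ex28a :: SF
ex28a = openSF (E "the door") (E "Tom") (E "s")
```

The terms for (28b–g) differ from ex28a only in the E-arguments; the divergent construals Searle notes are contributed by C-I's interpretation of the saturated term, not by the term itself.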
To sort out the relevant points, we first notice that the different types of transition towards the resulting state of y and their causation by the act of x are intimately related to the nature of the resulting state and the sort of activity that brings it about. In other words, the time course covered by [ become ] and the act causing the change are different if the eventual state is that of the door, the eye, a book, a bottle, or a meeting. Similarly, the character of s instantiating the transition is determined by the nature of x and y and their involvement. Hence [ s inst … ] and [ cause [ become … ] ] do not contribute to the variants of interpretation independently of [ act x ] and [ open y ]. As noted earlier, [ become p ] moreover involves a presupposition. It requires the preceding state to be [ not p ], which in turn depends on the interpretation of p, i.e. [ open y ] in the present case. Hence the actual choice in making sense of open depends – not surprisingly – on the values of x and y, which are provided by the subject and object of the verb. Now, the subject in all cases at hand has the feature [ human x ], allowing for intentional action and thus imposing no relevant differences on the interpretation of x. Variation in the interpretation of [ act x ] is therefore not due to the value of x, but to the interpretation of [ open y ], which in turn depends on y and the property abbreviated by open. This property is less clear-cut than it appears, however. What it requires is that y does not preclude access, where y is either the container whose interior is at issue, as in he opened the bottle, or its boundary or possible barrier, as in he opened the door, while in cases like she opened her eyes or he opened his hand the alternative seems altogether inappropriate. Bowerman (2005) shows that differences of this sort may lead to different lexical items in different languages according to conditions on the specification of y. How these and further variations are to be reflected in SF must be left open here. As a first approximation, [ open y ] could be replaced by something like [ y allow-for [ access-to [ interior ] ] ], leaving undecided how the interior relates to y, i.e. to the container or its boundary. In any case, the interpretation of
[ open y ] depends on how the object of open, delivering the value of y, is to be interpreted in C-I in all relevant respects. This in turn modulates the interpretation of [ act x ], fostering the specific activity by which the resulting state can be brought about. Without going through further details involved in cases similar to (28), it should be clear that the SF-features are not interpreted in isolation, but only with regard to connected scenarios C-I must provide on the basis of experience from different modules. Searle (1983) calls these conditions the "Background" of meaning, without which literal meaning could not be understood. The Background cannot itself be part of the meaning, i.e. of the semantic structure, without leading to an infinite regress, as the background elements would in turn bring in their own background. It can be verbalized, however, to the extent to which it is accessible to propositional representation. Roughly the same distinction is made in Bierwisch & Lang (1989) and Bierwisch (1996) between Semantic Form and Conceptual Structure; cf. also article 31 (Lang & Maienborn) Two-level Semantics. A different conclusion with respect to the same phenomena is drawn by Jackendoff (1996, 2002, and related work, cf. article 30 (Jackendoff) Conceptual Semantics), who proposes Conceptual Structure, i.e. the representation of literal meaning, to include spatial as well as any other sensory and motor representation. Carefully comparing various versions to relate semantic, spatial, and other conceptual information, Jackendoff (2002) ends up with the assumption that no principled separation of linguistic meaning from other aspects of meaning is warranted. But besides the fact that there are different types of reasoning based, e.g., on spatial and propositional representations, as Jackendoff is well aware, the problem remains whether and how to capture the different interpretations related to the same lexical meaning, as illustrated by cases like (28) and plenty of others, without corresponding differences in Conceptual Structure. In any case, if one does not assume the verb open to be indefinitely ambiguous, with equally many different representations in SF (or CS), it is indispensable to have some way to account for the unified literal meaning as opposed to the multitude of its interpretations. This problem has many facets and consequences, one of which is exemplified by the following contrast:

(30) a. Mary left the institute two hours ago.
     b. Mary left the institute two years ago.

Like open in (28), leave has the same literal meaning in (30a) and (30b), which can roughly be indicated as follows, assuming that [ x at y ] provisionally represents the condition that x is in some way connected to y.

(31) / leave /   λy λx λs [ s inst [ [ act x ] [ cause [ become [ not [ x at y ] ] ] ] ] ]
Under the preferred interpretation, (30a) denotes a change of place, while (30b) denotes a change of affiliation. The alternatives rely on the interpretation of institute as a building or as a social institution. Pustejovsky (1995) calls items of this sort dot-objects. A dot-object connects ontologically heterogeneous conditions by means of a particular type of combination which makes them compatible with alternative qualifications. The entry for institute has the feature [ building x ] in "dotted" combination (hence the name) with [ institution x ], which are picked up alternatively under appropriate combinatorial conditions, as shown in (32). Whether [ building ] and [ institution ] are actually primes
or rather configurations of SF built on more basic elements such as [ physical ] and [ social ], etc., can be left open. The point is that they have complementary conditions with incompatible sortal properties.

(32) a. The institute has six floors and an elevator.
     b. The institute has three directors and twenty permanent employees.

The cascade of dependencies in (30) turns on the different values for y in (31) – the "dotted" institute – which induces a different interpretation of the relation at. This in turn implies a different type of change, caused by different sorts of activity, involving crucially different aspects of the actor Mary, who must either cause a change of her location or of her social relations, participating as a physical or a social, i.e. intentional, agent. More generally, the feature [ human x ] must be construed as a "dotted" combination tantamount to the very basis of the mind-body problem. Now, the choice between the physical and the social interpretation is triggered by the time adverbials two hours ago vs. two years ago, which set the limits for the event e, which in turn affects the interpretation of the time course covered by [ change … ]. The intriguing point is that the contrasting elements hour vs. year have a fixed temporal interpretation with no dotted properties. Hence they must trigger the relevant cascades via general, extra-linguistic background knowledge. A remarkable consequence of these observations is the fact that there are clear violations of the principle of compositionality, according to which the interpretation of a complex expression derives from the interpretation of its constituent parts. Simple and straightforward cases like (30) illustrate the point: The interpretation of Mary, institute and leave – or rather of the components [ person ], [ act ], [ building ] etc. they contain – depends on background or context information not part of SF at all, and definitely outside the SF of the respective constituents. Hence even if SF is not supposed to be systematically separate from conceptual structure and background information, its compositional aspect, following the logic of its type-structure, must be distinguished from differently organized aspects of knowledge.
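The mechanics of the dotted entry can be pictured as a pair of sortal facets bundled under one word, with the combinatorial context selecting a facet. A hedged sketch; the sort inventory and the selection interface are our own simplification, not Pustejovsky's formalism:

```haskell
-- A dot-object bundles sortally incompatible facets in a single entry.
data Sort = Physical | Social deriving (Eq, Show)

data DotEntry = DotEntry { form :: String, facets :: [(Sort, String)] }

institute :: DotEntry
institute = DotEntry "institute" [(Physical, "building"), (Social, "institution")]

-- The sort demanded by the context (e.g. fixed via background knowledge
-- by "two hours ago" vs. "two years ago") picks out one facet:
select :: Sort -> DotEntry -> Maybe String
select s e = lookup s (facets e)

-- select Physical institute == Just "building"
-- select Social   institute == Just "institution"
```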
7. Primes and universals
As noted at the beginning, there are strong and plausible tendencies to consider the primitive elements that make up linguistic expressions as substantive universals, provided by the language faculty, the formal organization of which is characterized by Universal Grammar. In other words, UG is assumed to contain a universal repertoire of basic possibilities, which are activated or triggered by individual experience. Thus individual, ontogenetic processes select the actual distinctions from the general repertoire, which is part of the language capacity as such. This means that the distinctions indicated by features like [ tense ] or [ round ] may or may not appear in the system of PF-features in a given system L, but if the features appear, they just realize options UG provides as one of the prerequisites for the acquisition and use of language. If these considerations are extended to features of SF, two aspects must be distinguished, which at PF are often considered as essentially one phenomenon, namely features and their interpretation. For articulation this identification is a plausible abbreviation, but it does not hold on the conceptual side. For SF, the mental systems that provide the interpretation of linguistic expressions must be conceived as having a rich and
to a large extent biologically determined structure of their own, independent of (or in parallel to) the language capacity. Spatial structure is the most obvious, but by no means the only case in point. General conditions of experience such as three-dimensionality of spatial orientation, identification of verticality, dimensions and distinctions of color perception, and many other domains and distinctions may directly correspond to possible features in SF, very much like parameters of articulation correspond to possible features of PF. Candidates in this respect are systematic features like [ vertical ] or [ before ] and their interpretation in C-I, as discussed earlier. Similarly, idiosyncratic features, which correspond to biologically determined conditions of perception, motor control, or emotion, might be candidates of this sort. To what extent observations about the role of focal colors or the body schema and natural patterns of motor activity like walking, grasping, chewing, swallowing, etc. can be considered as options recruited as features of SF, similar to articulatory conditions for features of PF, must be left open. Notice, however, that we are talking about features of SF predetermined by UG, not merely about perceptual or motor correlates in C-I. In any case, because of the number and diversity of phenomena to be taken into account, there is a problem of principle, which precludes simply extending to SF the notion of universal options predetermined to be triggered by experience. The problem comes from the wide variety of basic elements that must be taken to be available in principle, but cannot be conceived without distinct experience. It primarily concerns dossiers, but also a fair range of idiosyncratic features, and it arises independently of the question whether one believes in decomposition or takes, like Fodor (1981), all concepts or possible word meanings to be innate. It is most easily demonstrated with regard to 3-D models (or visual prototypes) that must belong to many dossiers. We easily identify cats, dogs, birds, chairs, trumpets, trees, flowers, etc. on the basis of limited experience, but it does not make sense to stipulate that the characteristic prototypes are all biologically fixed, ready to be triggered like, for instance, the prototype of the human face, which is known to unfold along a fixed maturational path. The same holds for the wide variety of auditory patterns (beyond those recruited for features of PF), by which we distinguish frogs, dogs, nightingales, or flutes and trombones, cars, bikes, and trains. Without adding further aspects and details, it should be obvious that whatever enters the interpretation of these kinds of basic elements, it cannot belong to conditions that are given prior to experience and need only be activated. At the same time, the range of biologically predisposed capacities, by means of which features can be interpreted in C-I and distinguished in SF, is by no means arbitrary, chaotic, and unstructured, but clearly determined by principles organizing experience. Taken together, these considerations suggest a modified perspective on the nature of primitive elements. Instead of taking features to be generally part of UG, triggered by experience, one might look at UG as affording principles, patterns, or guidelines along which features or primes are constructed or extracted, if experience provides the relevant information.
The envisaged distinction between primes and principles is not merely a matter of terminology. It corresponds, metaphorically speaking, to the difference between locations fixed on a map and the principles from which a map indicating the locations would emerge on the basis of exploration. More formally, the difference corresponds to that between the actual set of natural numbers and its construction from the initial element by means of the successor operation. Less metaphorically, the substantial content of the distinction can be explained by analogy with face recognition. Human beings can
normally distinguish and recognize a large number of faces on the basis of limited and often short exposure. The resulting knowledge is due to a particular, presumably innate disposition, but it cannot be assumed to result from triggering individual faces that were innately known, just waiting for their eventual activation. In other words, the acquisition of faces depends on incidental, personal information, processed by the biologically determined capacity to identify and recognize faces, a capacity that ontogenetically emerges along a biologically determined schema and a fixed developmental path. With these considerations in mind, UG need not be construed as containing a fixed and finite system of semantic features, but as providing conditions and principles according to which distinctions in C-I can make up elements of SF. The origin and nature of these conditions are twofold, corresponding to the interface character of the emerging elements. On the one hand, the conditions and principles must plausibly be assumed to rely on characteristic structures of the various interpretive domains which are independently given by the modules of C-I. Fundamentals of spatial and temporal orientation, causal explanation, principles of good form, discovered in Gestalt psychology, focal colors, principles of pertinence and possession, or the body schema and its functional determinants are likely candidates. On the other hand, conditions on primes are given by the organization of linguistic expressions, i.e. the format of representations and the principles by which they are built up and mapped onto each other. In this respect, even effects of morpho-syntax come into play through constraints on government, agreement, or selection restrictions, which depend on features at least partially interpreted in C-I, as discussed in §4. In general, then, a more or less complex configuration of conditions originating from principles and distinctions within C-I serves as a feature of SF just in case this element plays a proper systematic role within the combinatorial structure of linguistic expressions. Under this perspective, semantic primes would emerge from the interaction of principles of strictly linguistic structure with those of sensory-motor and various other domains of mental organization – just as one should expect from their role as components of an interface level. The effect of this interaction, although resulting from universal principles of linguistic structure and general cognition, is determined (to different degrees) by particular languages and their lexical inventory, as convincingly demonstrated by Bowerman (2000), Slobin et al. (2008) and much related work. This applies already to fairly elementary completers like those indicated by [ interior ] in (20d), or [ at ] in (31). Thus Bowerman (2005) shows that conditions like containment, (vertical) support, tight fit, or flexibility can make up configurations matched differently by (presumably basic) elements in different languages, marking conditions that children are aware of at the age of 2. (33) is a rather incomplete illustration of distinctions that figure in the resulting situation of actions placing x in/on y.

(33)
                            English   Dutch   German   Korean
     block → pan            in        in      in       nehta
     book → fitted case     in        in      in       kkita
     ring → finger          on        om      an       kkita
     Lego → Lego stack      on        op      auf      kkita
     cup → table            on        op      auf      nohta
     hat → head             on        op      auf      ssuta
     towel → hook           on        aan     an       kelta
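The pattern in (33) can be read as each language mapping the same small set of spatial conditions onto its own markers, drawing the cuts in different places. A crude sketch under that reading; the two-condition description is our own simplification of Bowerman's richer inventory:

```haskell
-- A placement situation, crudely described by two of Bowerman's conditions.
data Placement = Placement { contained :: Bool, tightFit :: Bool }

english, korean :: Placement -> String
-- English ignores tight fit: only containment matters.
english (Placement True  _) = "in"
english (Placement False _) = "on"

-- Korean (per (33)) marks tight fit with kkita regardless of containment;
-- hat -> head (ssuta) and towel -> hook (kelta) would need further
-- conditions (clothing, hanging) not modeled here.
korean (Placement _     True)  = "kkita"   -- book, ring, Lego cases
korean (Placement True  False) = "nehta"   -- block -> pan
korean (Placement False False) = "nohta"   -- cup -> table
```

For instance, the ring-onto-finger situation, Placement False True, comes out as "on" in English but "kkita" in Korean: the same configuration, matched by differently cut basic elements.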
In a similar vein, joints of distinctions and generalizations are involved in elements like [ open ] in (20c) and (29), or [ animal ] and [ pet ] in (25), but also [ artifact ], [ container ], and [ bottle ] in (12) and many others. Further complexities along the same lines must be involved in Completers or dossiers specifying concepts like car, desk, bike, dog, spider, eagle, murky, clever, computer, equation, exaggerate, occupy and all the rest. In this sense, the majority of lexical items is likely to consist of elements that comprise a basic categorization in terms of systematic features and a dossier indicating their specificity, similarly to proper names, which combine – as argued in Bierwisch (2007) – a categorization of their referent with an individuating dossier. Remember that these specifying conditions are based on cross-modal principles in terms of which experience is organized, integrated by what is metaphorically called mental chemistry, such that dossiers are not, in general, logically combined sets of necessary and sufficient classificatory criteria. The upshot of these considerations is that primes of SF are clearly determined by UG, as they emerge from principles of the faculty of language and are triggered by conditions of mental organization in general. They are not, however, necessarily elements fixed in UG prior to experience. They are not learned in the sense of inductive learning, but induced or triggered by structures of experience, which linguistic expressions interface with. This may lead, among other things, to cross-linguistic differences within the actual repertoire of primes, as suggested in (33). It does not exclude, however, that certain core elements like [ cause ], [ become ], [ not ], [ loc ] and quite a few others are explicitly fixed and pre-established in UG, providing the initial foundation for interfacing linguistic expressions with experience. This view leads to a less paradoxical reading of Fodor's (1981) claim that lexical concepts must all be innate, including not only nose and triangle, but also elephant, electron or grandmother, because they cannot logically be decomposed into basic sensory features. Since Fodor concedes a (very restricted) amount of lexical decomposition, his claim concerns essentially the status of dossiers. Now, these elements can well be considered as innate if at least the principles by which they originate are innate – either via UG or by conditions of mental organization at large. Under this construal, the specific dossier of, e.g., horse or electron can be genetically determined without being actually represented prior to triggering experience, just as, e.g., the knowledge of all prime numbers could be considered as innate if the successor operation and multiplication are innate, identifying any prime number on demand, without an infinite set of them being actually represented. As to the general principles of possible concepts, which under this view are the innate aspect of possible entities (analogous to prime numbers fixed by their indivisibility), Fodor assumes a hierarchy of triggering configurations, with a privileged range of Basic-level concepts in the sense of Rosch (1977) as the elements most directly triggered. Their particular status is due to two general conditions, ostensibility and accessibility, both presupposing conceptual configurations to depend on each other along the triggering hierarchy (or perhaps network).
Ostensibility requires a configuration to be triggered by way of ostension or a direct presentation of exemplars, given the elements triggered so far. Accessibility relates to dependence on prior (ontogenetic or intellectual) acquisition of other configurations. Thus the basic-level concept tree is prior to the superordinate plant, but also to the subordinates oak or elm, etc. Fodor is well aware that ostension and accessibility are plausible, descriptive constraints on the underlying principles, the actual specification of which is still a research program rather than a simple result. What needs to be clarified are at least three problems: first, the functional
integration of conditions from different domains (the mental chemistry) within complex but unified configurations of CS; second, the abstraction or filtering by which these configurations are matched with proper, basic elements of SF; and third, the dependency of clusters or configurations in CS along the triggering hierarchy, which cannot generally be reduced to logical entailment. Some of these dependencies are characteristically reflected within SF by means of ordinary semantic elements and relations, while much of cross-linguistic variation, metaphor and other aspects of reasoning are a matter of essentially extra-linguistic principles. It might finally be added that for quite different reasons a similar orientation seems to be indicated even with respect to features of PF. As pointed out in Bierwisch (2001), if the discoveries about sign language are taken seriously, the language capacity cannot be restricted to articulation by means of the vocal tract. As has been shown by Klima & Bellugi (1979) and subsequent work, sign languages are organized by the same general principles and with the same expressive power as spoken languages. Now, if the faculty of language includes the option of sign languages as a possibility activated under particular conditions, the principles of PF could still be generally valid, using time slots and articulatory properties, but the choice and interpretation of features can no longer be based on elements like [ nasal ], [ voiced ], [ palatal ] etc. Hence even for PF, what UG is likely to provide are principles that create appropriate structural elements from the available input information, rather than a fixed repertoire of articulatory features to be triggered. Logically, of course, UG can be assumed to contain a full set of features for spoken and signed languages, of which normally only those of one modality are triggered, but this is not a very plausible scenario. To sum up, semantic primitives are based on principles of Universal Grammar, even if they do not make up a finite list of fixed elements. UG is assumed to provide the principles that determine both the mapping of an unlimited set of structured signals to an equally infinite array of meanings and the format of representations this mapping requires. These principles specify in particular the formal conditions for possible elements whose interpretation is provided by different aspects of experience, such as spatial orientation, motor control, social interaction, etc. Although this view does not exclude UG from supporting fixed designated features for certain aspects of linguistic structure, it allows for the repertoire of primitive semantic elements to be extended on demand.
8. References

Ajdukiewicz, Kazimierz 1935. Über die syntaktische Konnexität. Studia Philosophica 1, 1–27.
Bach, Emmon 1986. The algebra of events. Linguistics & Philosophy 9, 1–16.
Barsalou, Lawrence W. 1992. Frames, concepts, and conceptual fields. In: A. Lehrer & E. Kittay (eds.). Frames, Fields, and Contrasts. Hillsdale, NJ: Erlbaum, 21–74.
Barsalou, Lawrence W. 1999. Perceptual symbol systems. Behavioral and Brain Sciences 22, 577–660.
Berlin, Brent & Paul Kay 1969. Basic Color Terms. Berkeley, CA: University of California Press.
Bierwisch, Manfred 1967. Syntactic features in morphology: General problems of so-called pronominal inflection in German. In: To Honor Roman Jakobson, vol. I. The Hague: Mouton, 239–270.
Bierwisch, Manfred 1996. How much space gets into language? In: P. Bloom et al. (eds.). Language and Space. Cambridge, MA: The MIT Press, 31–76.
Bierwisch, Manfred 1997. Lexical information from a minimalist point of view. In: C. Wilder, H.-M. Gärtner & M. Bierwisch (eds.). The Role of Economy Principles in Linguistic Theory. Berlin: Akademie Verlag, 227–266.
Bierwisch, Manfred 2001. Repertoires of primitive elements – prerequisite or result of acquisition? In: J. Weissenborn & B. Höhle (eds.). Approaches to Bootstrapping, vol. II. Amsterdam: Benjamins, 281–307.
Bierwisch, Manfred 2007. Semantic form as interface. In: A. Späth (ed.). Interfaces and Interface Conditions. Berlin: de Gruyter, 1–32.
Bierwisch, Manfred 2010. become and its presuppositions. In: R. Bäuerle, U. Reyle & T. E. Zimmermann (eds.). Presupposition and Discourse. Bingley: Emerald Group Publishing, 189–234.
Bierwisch, Manfred & Ewald Lang (eds.) 1989. Dimensional Adjectives. Berlin: Springer.
Bierwisch, Manfred & Rob Schreuder 1992. From concepts to lexical items. Cognition 42, 23–60.
Bloom, Paul, Mary A. Peterson, Lynn Nadel & Merrill F. Garrett 1996. Language and Space. Cambridge, MA: The MIT Press.
Bowerman, Melissa 2000. Where do children's meanings come from? Rethinking the role of cognition in early semantic development. In: P. Nucci, G. Saxe & E. Turiel (eds.). Culture, Thought, and Development. Mahwah, NJ: Erlbaum, 199–230.
Bowerman, Melissa 2005. Why can't you 'open' a nut or 'break' a cooked noodle? Learning covert object categories in action word meanings. In: L. Gershkoff-Stowe & D. H. Rakison (eds.). Building Object Categories in Developmental Time. Mahwah, NJ: Erlbaum, 209–243.
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press.
Chomsky, Noam 1970. Remarks on nominalization. In: R. A. Jacobs & P. S. Rosenbaum (eds.). Readings in English Transformational Grammar. The Hague: Mouton, 11–61.
Chomsky, Noam 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam 1985. Knowledge of Language: Its Nature, Origin, and Use. New York: Praeger.
Chomsky, Noam 1995. The Minimalist Program. Cambridge, MA: The MIT Press.
Chomsky, Noam 2000. Minimalist inquiries. In: R. Martin, D. Michaels & J. Uriagereka (eds.). Step by Step. Cambridge, MA: The MIT Press, 89–155.
Chomsky, Noam & Morris Halle 1968. The Sound Pattern of English. New York: Harper & Row.
Clements, George N. 1985. The geometry of phonological features. Phonology Yearbook 2, 225–252.
Cresswell, Max 1973. Logics and Languages. London: Methuen.
Dowty, David 1979. Word Meaning and Montague Grammar. Dordrecht: Reidel.
Fillmore, Charles J. 1985. Frames and the semantics of understanding. Quaderni di Semantica 6, 222–255.
Fodor, Jerry A. 1981. Representations. Cambridge, MA: The MIT Press.
Fodor, Jerry A. 1983. The Modularity of Mind. Cambridge, MA: The MIT Press.
Fodor, Jerry A., Merrill F. Garrett, Edward Walker & Cornelia Parkes 1980. Against definitions. Cognition 8, 263–367.
Goodman, Nelson 1951. The Structure of Appearance. Cambridge, MA: Harvard University Press.
Hale, Kenneth & Samuel J. Keyser 1993. Argument structure and the lexical expression of syntactic relations. In: K. Hale & S. J. Keyser (eds.). The View from Building 20. Cambridge, MA: The MIT Press, 53–109.
Halle, Morris 1983. On distinctive features and their articulatory implementation. Natural Language and Linguistic Theory 1, 91–105.
Halle, Morris & Alec Marantz 1993. Distributed morphology and the pieces of inflection. In: K. Hale & S. J. Keyser (eds.). The View from Building 20. Cambridge, MA: The MIT Press, 111–176.
Halle, Morris & Jean-Roger Vergnaud 1987. An Essay on Stress. Cambridge, MA: The MIT Press.
Hjelmslev, Louis 1935. La Catégorie des Cas I. Aarhus: Universitetsforlaget.
Hjelmslev, Louis 1938. Essai d'une théorie des morphèmes. In: Actes du IVe Congrès international de linguistes 1936. Copenhagen, 140–151. Reprinted in: L. Hjelmslev. Essais linguistiques. Copenhague: Nordisk Sprog- og Kulturforlag, 1959, 152–164.
Hjelmslev, Louis 1953. Prolegomena to a Theory of Language. Madison, WI: The University of Wisconsin Press.
Jackendoff, Ray 1984. Semantics and Cognition. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1990. Semantic Structures. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1996. The architecture of the linguistic-spatial interface. In: P. Bloom et al. (eds.). Language and Space. Cambridge, MA: The MIT Press, 1–30.
Jackendoff, Ray 2002. Foundations of Language. Oxford: Oxford University Press.
Jakobson, Roman 1936. Beitrag zur allgemeinen Kasuslehre. Travaux du Cercle Linguistique de Prague 6, 240–288.
Johnson-Laird, Philip N. 1983. Mental Models: Toward a Cognitive Science of Language, Inference and Consciousness. Cambridge, MA: Harvard University Press.
Kamp, Hans 2001. The importance of presupposition. In: C. Rohrer, A. Roßdeutscher & H. Kamp (eds.). Linguistic Form and its Computation. Stanford, CA: CSLI Publications, 207–254.
Kamp, Hans & Uwe Reyle 1993. From Discourse to Logic. Dordrecht: Kluwer.
Katz, Jerrold J. 1972. Semantic Theory. New York: Harper & Row.
Katz, Jerrold J. & Jerry A. Fodor 1963. The structure of a semantic theory. Language 39, 170–210.
Klima, Edward S. & Ursula Bellugi 1979. The Signs of Language. Cambridge, MA: Harvard University Press.
Laurence, Stephen & Eric Margolis 1999. Concepts and cognitive science. In: E. Margolis & S. Laurence (eds.). Concepts: Core Readings. Cambridge, MA: The MIT Press, 3–81.
Lewis, David 1972. General semantics. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 169–218.
Marr, David 1982. Vision. San Francisco, CA: Freeman.
McCawley, James D. 1968. Lexical insertion in a transformational grammar without deep structure. In: B. J. Darden, C.-J. N. Bailey & A. Davison (eds.). Papers from the Regional Meeting of the Chicago Linguistic Society (= CLS) 4. Chicago, IL: Chicago Linguistic Society, 71–80.
Mill, John S. 1967. System of Logic. Collected Works, vol. III. Toronto, ON: Toronto University Press.
Miller, George & Philip Johnson-Laird 1976. Language and Perception. Cambridge, MA: Harvard University Press.
Montague, Richard 1974. Formal Philosophy. Selected Papers of Richard Montague. Edited and with an introduction by Richmond H. Thomason. New Haven, CT: Yale University Press.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press.
Rosch, Eleanor 1977. Classification of real-world objects: Origins and representations in cognition. In: P. N. Johnson-Laird & P. C. Wason (eds.). Thinking – Readings in Cognitive Science. Cambridge: Cambridge University Press, 212–222.
Rosch, Eleanor & Carolyn Mervis 1975. Family resemblances: Studies in the internal structure of categories. Cognitive Psychology 7, 573–605.
Searle, John R. 1983. Intentionality. Cambridge: Cambridge University Press.
Slobin, Dan I., Melissa Bowerman, Penelope Brown, Sonja Eisenbeiß & Bhuvana Narasimhan 2008. Putting things in places: Developmental consequences of linguistic typology. In: J. Bohnemeyer & E. Pederson (eds.). Event Representation. Cambridge: Cambridge University Press, 1–28.
Svenonius, Peter 2007. Interpreting uninterpretable features. Linguistic Analysis 33, 375–413.
Wierzbicka, Anna 1996. Semantics: Primes and Universals. Oxford: Oxford University Press.
Wunderlich, Dieter 1996a. Lexical categories. Theoretical Linguistics 22, 1–48.
Wunderlich, Dieter 1996b. Dem Freund die Hand auf die Schulter legen. In: G. Harras & M. Bierwisch (eds.). Wenn die Semantik arbeitet. Tübingen: Niemeyer, 331–360.
Wunderlich, Dieter 1997a. A minimalist model of inflectional morphology. In: C. Wilder, H.-M. Gärtner & M. Bierwisch (eds.). The Role of Economy Principles in Linguistic Theory. Berlin: Akademie Verlag, 267–298.
Wunderlich, Dieter 1997b. Cause and the structure of verbs. Linguistic Inquiry 28, 27–68.
Manfred Bierwisch, Berlin (Germany)
17. Frameworks of lexical decomposition of verbs

1. Introduction
2. Generative Semantics
3. Lexical decomposition in Montague Semantics
4. Conceptual Semantics
5. LCS decompositions and the MIT Lexicon Project
6. Event Structure Theory
7. Two-level Semantics and Lexical Decomposition Grammar
8. Natural Semantic Metalanguage
9. Lexical Relational Structures
10. Distributed Morphology
11. Outlook
12. References
Abstract

Starting from early approaches within Generative Grammar in the late 1960s, this article describes and discusses the development of different theoretical frameworks of lexical decomposition of verbs. It presents the major subsequent conceptions of lexical decomposition, namely, Dowty's approach to lexical decomposition within Montague Semantics, Jackendoff's Conceptual Semantics, the LCS decompositions emerging from the MIT Lexicon Project, Pustejovsky's Event Structure Theory, Wierzbicka's Natural Semantic Metalanguage, Wunderlich's Lexical Decomposition Grammar, Hale and Keyser's Lexical Relational Structures, and Distributed Morphology. For each of these approaches, it (i) sketches their origins and motivation, (ii) describes the general structure of decompositions and their location within the theory, (iii) explores their explanatory value for major phenomena of verb semantics and syntax, and (iv) briefly evaluates the impact of the theory. Drawing on discussions in article 7 (Engelberg) Lexical decomposition, a number of theoretical topics are taken up throughout the article concerning the interpretation of decompositions, the basic inventory of decompositional predicates, the location of decompositions on the different levels of linguistic representation (syntactic, semantic, conceptual), and the role they play for the interfaces between these levels.
1. Introduction

The idea that word meanings are complex has been present ever since people have tried to explain and define the meaning of words. When asked about the meaning of the verb persuade, a competent speaker of the language would probably say something like (1a). This is not far from what semanticists put into a structured lexical decomposition as in (1b):

(1) a. You persuade somebody if you make somebody believe or do something.
    b. persuade(x,y,z): x cause (y believe z) (after Fillmore 1968a: 377)

However, it was not until the mid-1960s that intuitions about the complexity of verb meanings led to formal theories of their lexical decomposition. This article will review the history of lexical decomposition of verbs from that time on. For some general discussion of
the concept of decomposition and earlier decompositional approaches, cf. article 7 (Engelberg) Lexical decomposition.

The first theoretical framework to systematically develop decompositional representations of verb meanings was Generative Semantics (section 2), where decompositions were representations of syntactic deep structure. Later theories did not locate lexical decompositions on a syntactic level but employed them as representations on a lexical-semantic level, as in Dowty's Montague-based approach (section 3) and the decompositional approaches emerging from the MIT Lexicon Project (section 5), or on a conceptual level, as in Jackendoff's Conceptual Semantics (section 4). Other lexical approaches were characterized by the integration of decompositions into an Event Structure Theory (section 6), the conception of a comprehensive Natural Semantic Metalanguage (section 8), and the development of a systematic structure-based linking mechanism as in Lexical Decomposition Grammar (section 7). Parallel to these developments, new syntactic approaches to decompositions emerged, such as Hale and Keyser's Lexical Relational Structures (section 9) and Distributed Morphology (section 10).

Throughout the article a number of theoretical topics will be touched upon that are discussed in more detail in article 7 (Engelberg) Lexical decomposition. Of particular interest will be the questions on which level of linguistic representation (syntactic, semantic, conceptual) decompositions are located, how the interfaces to other levels of linguistic representation are designed, what evidence for the complexity of word meaning is assumed, how decompositions are semantically interpreted, what role the formal structure of decompositions plays in explanations of linguistic phenomena, and what the basic inventory of decompositional predicates is.

In the following sections, the major theoretical approaches involving lexical decompositions will be presented. Each approach will be described in four subsections that (i) sketch the historical development that led to the theory under discussion, (ii) describe the place lexical decompositions take in these theories and their structural characteristics, (iii) present the phenomena that are explained on the basis of decompositions, and (iv) give a short evaluation of the impact of the theory.
2. Generative Semantics

2.1. Origins and motivation

Generative Semantics was a school of syntactic and semantic research that opposed certain established views within the community of Generative Grammar. It was active from the mid-1960s through the mid-1970s. Its major proponents were George Lakoff, James D. McCawley, Paul M. Postal, and John Robert Ross. (For the history of Generative Semantics, cf. Binnick 1972; McCawley 1994.) At that time, the majority view held within Generative Grammar was that there is a single, basic structural level on which generative rules operate and to which all other structural levels are related by interpretive rules. This particular structural level was syntactic deep structure, from which semantic interpretations were derived. It was this view of 'interpretive semantics' that was not shared by the proponents of Generative Semantics. Although there never was a "standard theory" of Generative Semantics, a number of assumptions can be identified that were widespread in the GS-community (cf. Lakoff 1970; McCawley 1968; Binnick 1972): (i) Deep structures are more abstract than Chomsky (1965) assumed. In particular, they are semantic representations of sentences. (ii) Syntactic and semantic representations have the same formal status: They are
structured trees. (iii) There is one system of rules that relates semantic representations and surface structures via intermediary representations. Some more specific assumptions that are important when it comes to lexical decomposition were the following: (iv) In semantic deep structure, lexical items occur as decompositions where the semantic elements of the decomposition are distributed over the structured tree (cf. e.g., Lakoff 1970; McCawley 1968; Postal 1971). (v) Some transformations take place before lexical insertion (prelexical transformations, McCawley 1968). (vi) Semantic deep structure allows only three categories: V (corresponding to predicates), NP (corresponding to arguments), and S (corresponding to propositions). Thus, for example, verbs, adjectives, quantifiers, negation, etc. are all assigned the category V in semantic deep structure (cf. Lakoff 1970: 115ff; Bach 1968; Postal 1971). (vii) Transformations can change syntactic relations: Since Floyd broke the glass contains a structure expressing the glass broke as part of its semantic deep structure, the glass occurs as subject in semantic deep structure and as object in surface structure.
2.2. Structure and location of decompositions

In Generative Semantics, syntactic and semantic structures do not constitute different levels or modules of linguistic theory. They are related by syntactically motivated transformations; that is, although semantic by nature, the lexical decompositions occurring in semantic deep structure and intermediate levels of sentence derivation must be considered as parts of syntactic structure. In contrast to the "Aspects"-model of Generative Syntax (Chomsky 1965), the terminal constituents of semantic deep structure are semantic and not morphological entities. Particularly interesting for the development of theories of lexical decomposition is the fact that semantic deep structure contained abstract verbs like cause or change (Fig. 17.1) that were sublexical in the sense that they were part of a lexical decomposition. Moreover, it was assumed that all predicates that appear in semantic deep structure are abstract predicates. A basic abstract predicate like believe resembles the actual word believe in its meaning and its argument-taking properties, but unlike actual words, it is considered to be unambiguous. These abstract entities attach to the terminal nodes in semantic deep structure (cf. Fig. 17.1 and, for structures and derivations of this sort, Lakoff 1965; McCawley 1968; Binnick 1972).
Fig. 17.1: Semantic deep structure for David killed Goliath
Abstract predicates can be moved by a transformation called 'predicate raising', a form of Chomsky adjunction that has the effect of fusing abstract predicates into predicate complexes (cf. Fig. 17.2).
Fig. 17.2: Adjunction of ALIVE to NOT by predicate raising
Through a series of predicate-raising transformations, the tree in Fig. 17.2 is transformed into the tree in Fig. 17.4 via the tree in Fig. 17.3.
Fig. 17.3: Adjunction of NOT ALIVE to BECOME by predicate raising
Fig. 17.4: Adjunction of BECOME NOT ALIVE to CAUSE by predicate raising
Finally, the complex of abstract predicates gets replaced by a lexical item. The lexical insertion transformation '[cause[become[not[alive]]]] → kill' yields the tree in Fig. 17.5.
Fig. 17.5: Lexical insertion of kill, replacing [ CAUSE [ BECOME [ NOT ALIVE ] ] ]
According to McCawley (1968), predicate raising is optional. If it does not take place, the basic semantic components do not fuse. Thus, kill is only one option of expressing the semantic deep structure in Fig. 17.1 besides cause to die, cause to become dead, cause to become not alive, etc.
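To make the derivation in Figs. 17.1-17.5 concrete, the following sketch models it procedurally. It is a hypothetical Python reconstruction, not part of any Generative Semantics formalism: the semantic deep structure is encoded as a nested list, raise_predicates fuses the abstract predicates into a complex, and lexical_insertion consults a toy lexicon; the locality restrictions on predicate raising (discussed in section 2.3) are deliberately ignored.

```python
# Hypothetical sketch of predicate raising and lexical insertion.
# Semantic deep structure for "David killed Goliath" (cf. Fig. 17.1):
deep_structure = ["CAUSE", "david", ["BECOME", ["NOT", ["ALIVE", "goliath"]]]]

def raise_predicates(tree):
    """Fuse each embedded abstract predicate with the next higher one
    into a predicate complex (cf. Figs. 17.2-17.4). Locality restrictions
    on predicate raising are ignored in this sketch."""
    if not isinstance(tree, list):
        return [], [tree]                 # a bare argument like 'goliath'
    head, *rest = tree
    complex_pred, args = [head], []
    for sub in rest:
        sub_preds, sub_args = raise_predicates(sub)
        complex_pred += sub_preds         # predicate raising: fuse upward
        args += sub_args
    return complex_pred, args

# Toy lexicon: lexical insertion replaces a fused predicate complex.
LEXICON = {("CAUSE", "BECOME", "NOT", "ALIVE"): "kill"}

def lexical_insertion(tree):
    preds, args = raise_predicates(tree)
    verb = LEXICON.get(tuple(preds))      # None would model a lexical gap
    return verb, args

print(lexical_insertion(deep_structure))  # ('kill', ['david', 'goliath'])
```

If predicate raising applies only partially, the unfused residue corresponds to periphrastic realizations such as cause to become not alive, in line with McCawley's view that raising is optional.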
2.3. Linguistic phenomena

Since Generative Semantics considered itself a general theory of syntax and semantics, a large array of phenomena was examined within this school, in particular quantification, auxiliaries, tense, speech acts, etc. (cf. Immler 1974; McCawley 1994; Binnick 1972). In the following, a number of phenomena will be illustrated whose explanation is closely related to lexical decomposition.

(i) Possible words: Predicate raising operates locally. A predicate can only be adjoined to an adjacent higher predicate. For a sentence like (2a), Ross (1972: 109ff) assumes the semantic deep structure in (2b). The local nature of predicate raising predicts that the decompositional structure can be realized as try to find or, in case of adjunction of find to try, as look for. A verb conveying the meaning of 'try-entertain', on the other hand, is universally prohibited since entertain cannot adjoin to try.
(2) a. Fritz looked for entertainment.
    b. [try fritz [find fritz [entertain someone fritz]]]

(ii) Lexical gaps: Languages have lexical gaps in the sense that not all abstract predicate complexes can be replaced by a lexical item. For example, while the three admitted structures cause become red (redden), become red (redden), and red (red) can be replaced by lexical items, the corresponding structures for blue show accidental gaps: cause become blue (no lexical item), become blue (no lexical item), and blue (blue) (cf. McCawley 1968). Lexical items missing in English may exist in other languages, for example, French bleuir. Since the transformation of predicate raising is restricted in the way described above, Generative Semantics can distinguish between those non-existing lexical items that are ruled out in principle, namely, by the restrictions on predicate raising, and those that are just accidentally missing.

(iii) Related sentences: Lexical decompositions allow us to capture identical sentence-internal relations across a number of related sentences. The relation between Goliath and not alive in the three sentences David kills Goliath (cf. Fig. 17.1), Goliath dies,
and Goliath is dead is captured by assigning an identical subtree expressing this relation to the semantic deep structures of all three sentences. Lakoff (1970: 33ff) provides an analysis of the adjective thick, the intransitive verb thicken, and the transitive verb thicken that systematically bridges the syntactic differences between the three items by exploring their semantic relatedness through decompositions.

(iv) Cross-categorical transfer of polysemy: The fact that the liquid cooled has two readings ('the liquid became cool' and 'the liquid became cooler') is explained by inserting into the decomposition the adjective from which the verb is derived, where the adjective can assume the positive or the comparative form (Lakoff 1970).

(v) Selectional restrictions: Generative Semantics is capable of stating generalizations over selectional restrictions. The fact that the object of kill and the subjects of die and dead share their selectional restrictions is due to the fact that all three contain the abstract predicate not alive in their decomposition (cf. Postal 1971: 204ff).

(vi) Derivational morphology: Terminal nodes in lexical decompositions can be associated with derivational morphemes. McCawley (1968) suggests a treatment of lexical causatives like redden in which the causative morpheme en is inserted under the node become in the decomposition cause become red.

(vii) Reference of pronouns: Particular properties of the reference of pronouns are explained by relating pronouns to subtrees within lexical decompositions.
(3) a. Floyd melted the glass though it surprised me that he would do so.
    b. Floyd melted the glass though it surprised me that it would do so.

In (3b), in contrast to (3a), the pronoun picks up the decompositional subtree the-glass become melted within the semantic structure [Floyd cause [the-glass become melted]] (cf. Lakoff 1970; Lakoff & Ross 1972).

(viii) Semantics of adverbials: Adverbials often show scopal ambiguities (cf. Morgan 1969). Sentences like Rebecca almost killed Jamaal can have several readings depending on the scope of almost (cf. article 7 (Engelberg) Lexical decomposition, section 1.2). Assuming the semantic deep structure [do [Rebecca cause [become [Jamaal not alive]]]], the three readings can be represented by attaching the adverb above either do, cause, or not alive. Similar analyses have been proposed for adverbs like again (Morgan 1969), temporal adverbials (cf. McCawley 1971), and durative adverbials like for four years in the sheriff of Nottingham jailed Robin Hood for four years, where the adverbial only modifies the resultative substructure indicating that Robin Hood was in jail (Binnick 1968).
2.4. Evaluation

Generative Semantics has considerably widened the domain of phenomena that syntactic and semantic theories have to account for. It stimulated research not only in syntax but particularly in lexical semantics, where structures similar to the lexical decompositions proposed by Generative Semantics are still being used. Yet, Generative Semantics experienced quite vigorous opposition, in particular from formal semantics, psycholinguistics, and, of course, proponents of interpretive semantics within Generative Grammar
(e.g., Chomsky 1970). Some of the critical points pertaining to decompositions were the following:

(i) Generative Semantics was criticized for its semantic representations not conforming to standards of formal semantic theories. According to Bartsch & Vennemann (1972: 10ff), the semantic representations of Generative Semantics are not logical forms: The formation rules for semantic deep structures are uninterpreted; operations like argument deletion lead to representations that are not well-formed; and the treatment of quantifiers, negation, and some adverbials as predicates instead of operators is inadequate. Dowty (1972) emphasizes that Generative Semantics lacks a theory of reference.

(ii) While the rules defining deep structure and the number of categories were considerably reduced by Generative Semantics, the analyses were very complex, and semantic deep structure differed extremely from surface structure (Binnick 1972: 14).

(iii) It was never even approximately established how many and what transformations would be necessary to account for all sentential structures of a language (Immler 1974: 121).

(iv) It often remained unclear how the reduction to more primitive predicates should proceed, that is, what criteria allow one to decide whether dead is decomposed as not alive or alive as not dead (Bartsch & Vennemann 1972: 22) – an objection that also applies to most other approaches to decompositions.

(v) In most cases, a lexical item is not completely equivalent to its decomposition. De Rijk has shown that while forget and its presumed decomposition 'cease to know' are alike with respect to presuppositions, there are cases where argument-taking properties and pragmatic behaviour are not interchangeable. If somebody has friends in Chicago who suddenly move to Australia, it is appropriate to say I have ceased to know where to turn for help in Chicago but not I have forgotten where to turn for help in Chicago (de Rijk, after Morgan 1969: 57ff). More arguments of this sort can be found in Fodor's (1970) famous article Three Reasons for Not Deriving "Kill" from "Cause to Die", which McCawley (1994) could partly repudiate by reference to pragmatic principles.

(vi) A lot of the phenomena that Generative Semantics tried to explain by regular transformations on a decomposed semantic structure exhibited lexical idiosyncrasies and were less regular than would be expected under a syntactic approach. Here are some examples. Pronouns are sometimes able to refer to substructures in decompositions. Unlike example (3) above, in sentences with to kill, they cannot pick up the corresponding substructure x become dead (Fodor 1970: 429ff); monomorphemic lexical items often seem to be anaphoric islands:
(4) a. John killed Mary and it surprised me that he did so.
    b. John killed Mary and it surprised me *that she did so.

While to cool shows an ambiguity related to the positive and the comparative form of the adjective (cf. section 2.3), to open only relates to the positive form of the adjective (Immler 1974: 143f). Sometimes selectional restrictions carry over to related sentences displaying the same decompositional substructure; in other cases they do not.
While the child grew is possible, a decompositionally related structure does not allow child as the corresponding argument of grow: *the parents grew the child (Kandiah 1968). Adverbials give rise to structurally ambiguous sentence meanings, but they usually cannot attach to all predicates in a decomposition (Shibatani 1976: 11; Fodor et al. 1980: 286ff). In particular, Dowty (1979) showed that Generative Semantics overpredicted adverbial scope, quantifier scope, and syntactic interactions with cyclic transformations. He concluded that rules of semantic interpretation of lexical items are different from syntactic transformations (Dowty 1979: 284).

(vii) Furthermore, Generative Semantics was confronted with arguments derived from psycholinguistic evidence (cf. article 7 (Engelberg) Lexical decomposition, section 3.6).
3. Lexical decomposition in Montague Semantics

3.1. Origins and motivation

The interesting phenomena that emerged from the work of Generative Semanticists, on the one hand, and the criticism of the syntactic treatment of decompositional structures, on the other, led to new approaches to word-internal semantic structure. Dowty's (1972; 1976; 1979) goal was to combine the methods and results of Generative Semantics with Montague's (1973) rigorously formalized framework of syntax and semantics, where truth and denotation with respect to a model were considered the central notions of semantics. Dowty refuted the view that all interesting semantic problems only concern the so-called logical words and compositional semantics. Instead, he assumed that compositional semantics crucially depends on an adequate approach to lexical meaning. While he acknowledged the value of lexical decompositions in that respect, he considered decompositions incomplete unless they come with "an account of what meanings really are." Dowty (1979: v, 21) believed that the essential features of Generative Semantics can all be accommodated within Montague Semantics, where the logical structures of Generative Semantics get a model-theoretic interpretation and the weaknesses of the syntactic approaches, in particular overgeneration, can be overcome. In Montague Semantics, sentences are not interpreted directly but are first translated into expressions of intensional logic. These translations are considered the semantic representations that correspond to the logical structure (i.e., the semantic deep structure) of Generative Semantics (Dowty 1979: 22). Two differences between Dowty's approach and classical Generative Semantics have to be noted: Firstly, directionality is inverse. While Generative Semantics maps semantic structures onto syntactic surface structure, syntactic structures are mapped onto semantic representations in Dowty's theory. Secondly, in Generative Semantics but not in Dowty's theory, derivations can have multiple stages (Dowty 1979: 24ff).
3.2. Structure and location of decompositions

Lexical decompositions are used in Dowty (1979) mainly in order to reveal the different logical structures of verbs belonging to the several so-called Vendler classes. Vendler
(1957) classified verbs (and verb phrases) into states, activities, accomplishments, and achievements according to their behaviour with respect to the progressive aspect and temporal-aspectual adverbials (cf. also article 48 (Filip) Aspectual class and Aktionsart). Dowty (1979: 52ff) extends the list of phenomena associated with these classes, relates verbs of different classes to decompositional representations as in (5a-c), and also distinguishes further subtypes of these classes as in (5d-f) (cf. Dowty 1979: 123ff).

(5) a. simple statives: πⁿ(α₁,…,αₙ)
       John knows the answer.
    b. simple activities: do(α₁, [πⁿ(α₁,…,αₙ)])
       John is walking.
    c. simple achievements: become[πⁿ(α₁,…,αₙ)]
       John discovered the solution.
    d. non-intentional agentive accomplishments: [[do(α₁, [πⁿ(α₁,…,αₙ)])] cause [become[ρᵐ(β₁,…,βₘ)]]]
       John broke the window.
    e. agentive accomplishments with secondary agent: [[do(α₁, [πⁿ(α₁,…,αₙ)])] cause [do(β₁, [ρᵐ(β₁,…,βₘ)])]]
       John forced Bill to speak.
    f. intentional agentive accomplishments: do(α₁, [do(α₁, πⁿ(α₁,…,αₙ)) cause φ])
       John murdered Bill.
The different classes are built up out of stative predicates (πⁿ, ρᵐ) and a small set of operators (do, become, cause). Within an aspect calculus, the operators involved in the decompositions are given model-theoretic interpretations, and the stative predicates are treated as predicate constants. The interpretation of become (6a) is based on von Wright's (1963) logic of change; the semantics of cause (6b) as a bisentential operator (cf. Dowty 1972) is mainly derived from Lewis' (1973) counterfactual analysis of causality, and the less formalized analysis of do (6c) relates to considerations about will and intentionality in Ross (1972).

(6) a. [become φ] is true at I if there is an interval J containing the initial bound of I such that ¬φ is true at J and there is an interval K containing the final bound of I such that φ is true at K (Dowty 1979: 140).
    b. [φ cause ψ] is true if and only if (i) φ is a causal factor for ψ, and (ii) for all other φ′ such that φ′ is also a causal factor for ψ, some ¬φ-world is as similar or more similar to the actual world than any ¬φ′-world is. φ is a causal factor for ψ if and only if there is a series of sentences φ, φ₁,…, φₙ, ψ (for n ≥ 0) such that each member of the series depends causally on the previous member.
       φ depends causally on ψ if and only if φ, ψ, and ¬φ □→ ¬ψ are all true (Dowty 1979: 108f).
    c. □[do(α,φ) ↔ φ ∧ under_the_unmediated_control_of_the_agent_α(φ)] (Dowty 1979: 118)
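Dowty's definition of become lends itself to a small executable illustration. The following Python sketch simplifies (6a) by checking ¬φ at the initial bound of the interval and φ at its final bound, collapsing the quantification over the subintervals J and K; all names (become, cool) are illustrative, not part of Dowty's formalism.

```python
# Simplified sketch of Dowty's interval semantics for BECOME (after 6a):
# here [become phi] holds at interval I = (start, end) iff phi is false
# at the initial bound and true at the final bound. (Dowty's definition
# quantifies over subintervals J and K containing these bounds.)

def become(phi, interval):
    """phi: a function from time points to truth values."""
    start, end = interval
    return (not phi(start)) and phi(end)

# Toy model for "the soup cooled": the soup counts as cool from t = 5 on.
def cool(t):
    return t >= 5

print(become(cool, (0, 10)))  # True: a change from not-cool to cool
print(become(cool, (6, 10)))  # False: already cool at the initial bound
```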
By integrating lexical decompositions into Montague Semantics, Dowty wants to expand the treatment of the class of entailments that hold between English sentences (Dowty 1979: 31); he aims to show how logical words interact with non-logical words (e.g., words from the domain of tense, aspect, and mood with Vendler classes), and he expects that lexical decompositions help to narrow down the range of possible lexical meanings (Dowty 1979: 34f, 125ff). With respect to the semantic status of lexical decompositions, Dowty (1976: 209ff) explores two options; namely, that the lexical expression itself is decomposed into a complex predicate, or that it is related to a predicate constant via a meaning postulate (cf. article 7 (Engelberg) Lexical decomposition, section 3.1). While he mentions cases where a strong equivalence between predicate and decomposition provides evidence for the first option, he also acknowledges that the second option might often be empirically more adequate since it allows the weakening of the relation between predicate and decomposition from a biconditional to a conditional. This would account for the observation that the complex phrase cause to die has a wider extension than kill.
3.3. Linguistic phenomena

Dowty provides explanations for a wide range of phenomena related to Vendler classes. A few examples are the influence of mass nouns and indefinites on the membership of expressions in Vendler classes (Dowty 1979: 78ff), the interaction of derivational morphology with sublexical structures of meaning (Dowty 1979: 32, 206f, 256ff), explanations of adverbial and quantifier scope (Dowty 1976: 213ff), the imperfective paradox (Dowty 1979: 133), the progressive aspect (Dowty 1979: 145ff), resultative constructions (Dowty 1979: 219ff), and temporal-aspectual adverbials (for an hour, in an hour) (Dowty 1979: 332ff). Dowty also addresses aspectual composition. Since he conceives of Vendler classes in terms of lexical decomposition, he shows that decompositional structures involving cause, become, and do are not only introduced via semantically complex verbs but also via syntactic and morphological processes. For example, accomplishments that arise when verbs are combined with prepositional phrases get their become operator from the preposition (7a), where the preposition itself can undergo a process that adds an additional cause component (7b) (Dowty 1979: 211f). In (7c), the morphological process forming deadjectival inchoatives is associated with a become proposition (Dowty 1979: 206) that can be expanded with a cause operator in the case of deadjectival causatives (Dowty 1979: 307f).

(7) a. John walks to Chicago.
       walk′(john) ∧ become be-at′(john, chicago)
    b. John pushes a rock to the fence.
       ∃x[rock′(x) ∧ ∃y[∀z[fence′(z) ⇔ y = z] ∧ push′(john, x) cause become be-at′(x, y)]]
    c. The soup cooled.
       ∃x[∀y[soup′(y) ⇔ x = y] ∧ become cool′(x)]
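The way become and cause enter logical forms compositionally, as in (7), can be sketched in a few lines. The fragment below is an informal illustration with invented function names and logical forms built as plain strings; it mirrors (7c) and the causativization of deadjectival inchoatives.

```python
# Sketch of Dowty-style aspectual composition: a stative base is
# expanded by BECOME (inchoative morphology) and CAUSE (causativization).

def stative(state, theme):
    return f"{state}'({theme})"                 # e.g. cool'(x)

def inchoative(state, theme):
    return f"become {stative(state, theme)}"    # "the soup cooled"

def causative(agent, state, theme):
    # simplified: the causing activity is abbreviated here as do'(agent)
    return f"[do'({agent})] cause [{inchoative(state, theme)}]"

print(inchoative("cool", "soup"))           # become cool'(soup)
print(causative("john", "cool", "soup"))    # [do'(john)] cause [become cool'(soup)]
```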
3.4. Evaluation

Of the pre-1980s work on lexical decomposition, besides the work of Jackendoff (cf. section 4), it is probably Dowty's "Word Meaning and Montague Grammar" that still exerts the most influence on semantic studies. It initiated a long period of research dominated by approaches that located lexical decompositions in lexical semantics instead of syntax. It must be considered a major advancement that Dowty was committed to providing formal truth-conditions for the operators involved in decompositions. Many of the approaches preceding and following Dowty (1979) lack this degree of explicitness. His account of the compositional nature of many accomplishments, the interaction of aspectual adverbials with Vendler classes, and many other phenomena mentioned in section 3.3 served as a basis for discussion for the approaches to follow. Among the approaches particularly influenced by Dowty (1979) are van Valin's (1993) decompositions within Role and Reference Grammar, Levin and Rappaport Hovav's Lexical Conceptual Structures (cf. section 5), and Pustejovsky's Event Structures (cf. section 6).
4. Conceptual Semantics

4.1. Origins and motivation

Along with lexical decompositions in Generative Semantics, another line of research emerged where semantic arguments of predicates were associated with the roles they played in the events denoted by verbs (Gruber 1965; Fillmore 1968b). Thematic roles like agent, patient, goal, etc. were used to explain how semantic arguments are mapped onto syntactic structures (cf. also article 18 (Davis) Thematic roles). Early approaches to thematic roles assumed that there is a small set of unanalyzable roles that are semantically stable across the verbal lexicon. Thematic role approaches were confronted with a number of problems concerning the often vague semantic content of roles, their coarse-grained nature as a descriptive tool, the lack of reliable diagnostics for them, and empirically inadequate syntactic generalizations (cf. the overview in Levin & Rappaport Hovav 2005: 35ff; Dowty 1991: 553ff). As a consequence, thematic role theories developed in different ways: by decomposing thematic roles into features (Rozwadowska 1988), by reducing thematic roles to just two generalized macroroles (van Valin 1993) or to proto-roles within a prototype approach based on lexical entailments (Dowty 1991), by combining them with event structure representations (Grimshaw 1990; Reinhart 2002), and, in particular, by conceiving of them as notions derived from lexical decompositions (van Valin 1993). This last approach was pursued by Jackendoff (1972; 1976) and was one of the foundations of a semantic theory that came to be known as Conceptual Semantics, and which, over the years, has approached a large variety of phenomena beyond thematic roles (cf. also article 30 (Jackendoff) Conceptual Semantics). According to Jackendoff's (1983; 1990; 2002) Conceptual Semantics, meanings are essentially conceptual entities and semantics is "the organization of those thoughts that language can express" (Jackendoff 2002: 123). Meanings are represented on an autonomous level of cognitive representation called "conceptual structure" that is related to
syntactic and phonological structure, on the one side, and to non-linguistic cognitive levels like the visual, the auditory, and the motor system, on the other side. Conceptual structure is conceived of as a universal model of the mind's construal of the world (Jackendoff 1983: 18ff). Thus, Conceptual Semantics differs from formal, model-theoretic semantics in locating meanings not in the world but in the mind of speakers and hearers. Therefore, notions like truth and reference do not play the role they play in formal semantics but are relativized to the speaker's conceptualizations of the world (cf. Jackendoff 2002: 294ff). Jackendoff's approach also differs from many others in not assuming a strict division between semantic and encyclopaedic meaning and between grammatically relevant and irrelevant aspects of meaning (Jackendoff 2002: 267ff).
4.2. Structure and location of decompositions

Conceptual structure involves a decomposition of meaning into conceptual primitives (Jackendoff 1983: 57ff). Since Jackendoff (1990: 10f) assumes that there is an indefinitely large number of possible lexical concepts, conceptual primitives must be combined by generative principles to determine the set of lexical concepts. Thus, most lexical concepts are considered to be conceptually complex. Decompositions in Conceptual Semantics differ considerably in content and structure from lexical structure in Generative Semantics and Montague-based approaches to the lexicon. The sentence in (8a) would yield the meaning representation in (8b):

(8) a. John entered the room.
    b. [Event go ([Thing john], [Path to ([Place in ([Thing room])])])]

Each pair of square brackets encloses a conceptual constituent, where the capitalized items denote the conceptual content, which is assigned to a major conceptual category like Thing, Place, Event, State, Path, Amount, etc. (Jackendoff 1983: 52ff). There are a number of possibilities for mapping these basic ontological categories onto functor-argument structures (Jackendoff 1990: 43). For example, the category Event can be elaborated into two-place functions like go ([Thing], [Path]) or cause ([Thing/Event], [Event]). Furthermore, each syntactic constituent maps into a conceptual constituent, and partly language-specific correspondence rules relate syntactic categories to the particular conceptual categories they can express. In later versions of Conceptual Semantics, these kinds of structures are enriched by referential features and modifiers, and the propositional structure is accompanied by a second tier encoding elements of information structure (cf. Jackendoff 1990: 55f; Culicover & Jackendoff 2005: 154f). A lexical entry consists of a phonological, syntactic, and conceptual representation. As can be seen in Fig. 17.6, most of the conceptual structure in (8b) is projected from the conceptual structure of the verb, which provides a number of open argument slots (Jackendoff 1990: 46).
Fig. 17.6: Lexical entry for enter
The conceptual structure is supplemented with a spatial structure that captures finer distinctions between lexical items in a way closely related to non-linguistic cognitive modules (Jackendoff 1990: 32ff; 2002: 345ff). The same structure can also come about in a compositional way. While enter already includes the concept of a particular path (in), this information is contributed by a preposition in the following example:

(9) a. John ran into the room.
    b. [Event go ([Thing john], [Path to ([Place in ([Thing room])])])]

The corresponding lexical entries for the verb and the preposition (Fig. 17.7) account for the structure in (9b) (Jackendoff 1990: 45).
Fig. 17.7: Lexical entries for run and into
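How the verb's and the preposition's entries in Fig. 17.7 compose into the structure in (9b) can be made explicit in a short sketch. The encoding below is invented for illustration (it is not Jackendoff's notation): a conceptual constituent is a (category, content, arguments) triple, and the PP saturates the verb's open Path slot.

```python
# Informal sketch of conceptual-structure composition for (9):
# a constituent is a (CATEGORY, content, [arguments]) triple.

def C(category, content, *args):
    return (category, content, list(args))

def into(thing):
    # into: [Path TO ([Place IN ([Thing x])])] -- the Thing slot is open
    return C("Path", "to", C("Place", "in", thing))

def run(thing, path):
    # run: [Event GO ([Thing x], [Path y])] -- Thing and Path slots open
    return C("Event", "go", thing, path)

john = C("Thing", "john")
room = C("Thing", "room")

# "John ran into the room": the PP saturates the verb's Path argument,
# yielding the conceptual structure in (9b).
print(run(john, into(room)))
```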
An important feature of Jackendoff's (1990: 25ff; 2002: 356ff) decompositions is the use of abstract location and motion predicates in order to represent the meaning of words outside the local domain. For example, a change-of-state verb like melt is rendered as 'to go from solid to liquid'. Building on Gruber's (1965) earlier work, Jackendoff thereby relates different classes of verbs by analogy. In Jackendoff (1983: 188), he states that in any semantic field in the domain of events and states "the principal event-, state-, path-, and place-functions are a subset of those used for the analysis of spatial location and motion". To make these analogies work, predicates are related to certain fields that determine the character of the arguments and the sort of inferences. For example, if the two-place function be(x,y) is supplemented by the field feature spatial, it indicates that x is an object and y is its location, while the field feature possession indicates that x is an object and y the person who owns it. Only the latter involves inferences about the rights of y to use x (Jackendoff 2002: 359ff).

Jackendoff (2002: 335f) dissociates himself from approaches that compare lexical decompositions with dictionary definitions; the major difference is that the basic elements of decompositions need not be words themselves, just as phonological features, the basic components of phonological elements, are not sounds. He also claims that the speaker does not have conscious access to the decompositional structure of lexical items; this can only be revealed by linguistic analysis. The question of what elements should be used in decompositions is answered to the effect that if the meaning of lexeme A entails the meaning of lexeme B, the decomposition of A includes that of B (Jackendoff 1976). Jackendoff (2002: 336f) admits that it is hard to tell how the lower bound of decomposition can be determined. For example,
some approaches consider cause to be a primitive; others conceive of it as a family of concepts related through feature decomposition. In contrast to approaches that are only interested in finding those components that are relevant for the syntax-semantics interface, he argues that from the point of learnability the search for conceptual primitives has to be taken seriously beyond what is needed for the syntax. He takes the stance that it is just a matter of further and more detailed research before the basic components are uncovered.
4.3. Linguistic phenomena

Lexical decompositions within Conceptual Semantics serve much wider purposes than in many other approaches. First of all, they are considered a meaning representation in their own right, that is, not primarily driven by the need to explain linking and other linguistic interface phenomena. Moreover, as part of conceptual structure, they are linked not only to linguistic but also to non-linguistic cognitive domains.

As a semantic theory, Conceptual Semantics has to account for inferences. With respect to decompositions, this is done by postulating inference rules that link decompositional predicates. For example, from any decompositional structure involving go(x,y,z), we can infer that be(x,y) holds before the go event and be(x,z) after it. This is captured by inference rules as in (10a) that resemble meaning postulates in truth-conditional semantics (Jackendoff 1976: 114). Thus, we can infer from the train went from Kankakee to Mattoon that the train was in Kankakee before and in Mattoon after the event. Since other sentences involving a go-type predicate like the road reached from Altoona to Johnstown do not share this inference, the predicates are subtyped to the particular fields transitional versus extensional. The extensional version of go is associated with the inference that one part of x in go(x,y,z) is located in y and the other in z (10b) (Jackendoff 1976: 139):

(10) a. goTrans(x,y,z) at t₁ ⇒ for some times t₂ and t₃ such that t₂ < t₁ < t₃, beTrans(x,y) at t₂ and beTrans(x,z) at t₃.
     b. goExt(x,y,z) ⇒ for some v and w such that v ⊂ x and w ⊂ x, beExt(v,y) and beExt(w,z).

One of Jackendoff's main concerns is the mapping between conceptual and syntactic structure. Part of this mapping is the linking of semantic arguments into syntactic structures. In Conceptual Semantics this is done via thematic roles. Thematic roles are not primitives in Conceptual Semantics but can be defined on the basis of decompositions (Jackendoff 1972; 1987). They "are nothing but particular structural configurations in conceptual structure" (Jackendoff 1990: 47). For example, Jackendoff (1972: 39) identifies the first argument of the cause relation with the agent role. In later versions of his theory, Jackendoff (e.g., 1990: 125ff) expands the conceptual structure of verbs by adding an action tier to the representation. While the original concept of decomposition (the 'thematic tier') is couched in terms of location and motion, thereby rendering thematic roles like theme, goal, or source, the action tier expresses how objects are affected and accounts for roles like actor and patient. Thus, hit as in the car hit the tree provides a theme (the car, the "thing in motion" in the example sentence) and a goal (the tree) on
the thematic tier, and an actor (the car) and a patient (the tree) on the action tier. These roles can be derived from the representation in Fig. 17.8 (after Jackendoff 1990: 125ff).
Fig. 17.8: Thematic tier and action tier for the car hit the tree
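The claim that thematic roles are nothing but structural configurations can be illustrated directly: roles are read off argument positions in the decomposition. The sketch below is a hypothetical simplification of Jackendoff's definitions (e.g., first argument of cause = agent, first argument of go = theme); the position-to-role table and function names are invented for illustration.

```python
# Sketch: deriving thematic roles from positions in a decomposition.
# A decomposition is a nested tuple: (PRED, arg1, arg2, ...).
# Simplified mapping: 1st arg of CAUSE = agent, 1st arg of GO = theme,
# arg of TO = goal, arg of FROM = source.

ROLE_OF_POSITION = {("cause", 0): "agent", ("go", 0): "theme",
                    ("to", 0): "goal", ("from", 0): "source"}

def roles(decomp, found=None):
    found = {} if found is None else found
    pred, *args = decomp
    for i, arg in enumerate(args):
        role = ROLE_OF_POSITION.get((pred, i))
        if role:
            found[role] = arg          # the role filler may itself be complex
        if isinstance(arg, tuple):
            roles(arg, found)          # recurse into embedded subevents
    return found

# "John entered the room" ~ (go john (to (in room))), cf. (8b)
print(roles(("go", "john", ("to", ("in", "room")))))
# {'theme': 'john', 'goal': ('in', 'room')}

# a causative structure adds an agent on top of the go-subevent
print(roles(("cause", "john", ("go", "ball", ("to", "fence")))))
# {'agent': 'john', 'theme': 'ball', 'goal': 'fence'}
```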
The thematic roles derived from the thematic and the action tiers are ordered within a thematic hierarchy. This hierarchy is mapped onto a hierarchy of syntactic functions such that arguments are linked to syntactic functions according to the rank of their thematic role in the thematic hierarchy (Jackendoff 1990: 258, 268f; 2002: 143). Strict subcategorization can largely be dispensed with. However, Jackendoff (1990: 255ff; 2002: 140f) still acknowledges subcategorizational idiosyncrasies. Among the many other phenomena treated within Conceptual Semantics and related to lexical decompositions are argument structure alternations (Jackendoff 1990: 71ff), aspectual-temporal adverbials and their relation to the boundedness of events (Jackendoff 1990: 27ff), the semantics of causation (Jackendoff 1990: 130ff), and phenomena at the border between adjuncts and arguments (Jackendoff 1990: 155ff).
4.4. Evaluation

Jackendoff's approach to lexical decomposition has been a cornerstone in the development of lexical representations since it covers a wide domain of different classes of lexical items and phenomena associated with these classes. The wide coverage forced Jackendoff to expand the structures admitted in his decompositions. This in turn evoked criticism that his theory lacks sufficient restrictiveness. Furthermore, it has been criticized that the locational approach to decomposition needs to be stretched too far in order to make it convincing that it includes all classes of verbs (Levin 1995: 84). Wunderlich (1996: 171) considers Jackendoff's linking principle problematic since it cannot easily be applied to languages with case systems. In general, Jackendoff's rather heterogeneous set of correspondence rules has attracted criticism because it involves a considerable weakening of the idea of semantic compositionality (cf. article 30 (Jackendoff) Conceptual Semantics for Jackendoff's position).
5. LCS decompositions and the MIT Lexicon Project

5.1. Origins and motivation

An important contribution to the development of decompositional theories of lexical meaning originated in the MIT Lexicon Project in the mid-1980s. Its main proponents are Beth Levin and Malka Rappaport Hovav. Their approach is mainly concerned with the relation between semantic properties of lexical items and their syntactic behaviour. Thus, it aims at "developing a representation of those aspects of the meaning of a lexical item which characterize a native speaker's knowledge of its argument structure and determine the syntactic expression of its arguments" (Levin 1985: 4). The meaning representations were supposed to lead to definitions of semantic classes that show a uniform syntactic behaviour:
17. Frameworks of lexical decomposition of verbs (1) All arguments bearing a particular semantic relation are systematically expressed in certain ways. (2) Predicates fall into classes according to the arguments they select and the syntactic expression of these arguments. (3) Adjuncts are systematically expressed in the same way(s) and their distribution often seems to be limited to semantically coherent classes of predicates. (4) There are regular extended uses of predicates that are correlated with semantic class. (5) Predicates belonging to certain semantic classes display regular alternations in the expression of their arguments. (Levin 1985: 47)
Proponents of the approach criticized accounts that were solely based on thematic roles, as they were incapable of explaining diathesis alternations (Levin 1985: 49ff; 1995: 76ff). Instead, they proposed decompositions on the basis of Jackendoff’s (1972; 1976) conceptual structures, Generative Semantics, earlier work by Carter (1976) and Joshi (1974), and ideas from Hale & Keyser (1987).
5.2. Structure and location of decompositions

The main idea in the early MIT Lexicon Project Working Papers was that two levels of lexical representation have to be distinguished, a lexical-semantic and a lexical-syntactic one (Rappaport, Laughren & Levin 1987, later published as Rappaport, Levin & Laughren 1993; Rappaport & Levin 1988). The lexical-syntactic representation, PAS ("predicate argument structure"), "distinguishes among the arguments of a predicator only according to how they combine with the predicator in a sentence". PAS, which is subject to the projection principle (Rappaport & Levin 1988: 16), expresses whether the role of an NP-argument is assigned (i) by the verb ("direct argument"), (ii) by a different theta role assigner like a preposition ("indirect argument"), or (iii) by the VP via predication ("external argument") (Rappaport, Laughren & Levin 1987: 3). These three modes of assignment are illustrated in the PAS for the verb put:

(11) a. put, PAS: x <y, Ploc z>
     b. put, LCS: [x cause [y come to be at z]]

The lexical-semantic basis of PAS is a lexical decomposition, LCS ("Lexical Conceptual Structure") (Rappaport, Laughren & Levin 1987: 8). The main task for an LCS-based approach to lexical semantics is to find the mapping principles between LCS and PAS and between PAS and syntactic structure (Rappaport 1985: 146f). Not all researchers associated with the MIT Lexicon Project distinguished two levels of representation. Carter (1988) refers directly to argument positions of predicates within decompositions in order to explain linking phenomena. In later work, Levin and Rappaport do not make reference to PAS as a level of representation anymore. The distinction between grammatically relevant and irrelevant lexical information is now reflected in a distinction between primitive predicates that are embedded in semantic templates, which are claimed to be part of Universal Grammar, and predicate constants, which reflect the idiosyncratic part of lexical meaning. The templates pick up distinctions known from Vendler classes (Vendler 1957) and are referred to as event structure representations. For example, (12a) is a template for an activity, and (12b) is a template for a particular kind of accomplishment (Rappaport Hovav & Levin 1998: 108):
(12) a. [x ACT<MANNER>]
     b. [[x ACT<MANNER>] CAUSE [BECOME [y <STATE>]]]

The templates in (12) illustrate two main characteristics of this approach: templates can embed other templates, and constants (in angle brackets) can function as modifiers of predicates (e.g., with respect to ACT) or as arguments of predicates (e.g., with respect to BECOME). The representations are augmented by well-formedness conditions that require that each subevent in a template is represented by a lexical head in syntax and that all participants in lexical structure and all argument XPs in syntax are mapped onto each other. The principle of Template Augmentation makes it possible to build up complex lexical representations from simple ones such that the variants of sweep reflected in (13a-13c) are represented by the decompositions in (14a-14c):

(13) a. Phil swept the floor.
     b. Phil swept the floor clean.
     c. Phil swept the crumbs onto the floor.

(14) a. [x ACT<SWEEP> y]
     b. [[x ACT<SWEEP> y] CAUSE [BECOME [y <CLEAN>]]]
     c. [[x ACT<SWEEP> y] CAUSE [BECOME [z <PLACE>]]]

It is assumed that the nature of the constant can determine the range of templates that can be associated with it (cf. article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure). It should be noted that the approach based on LCS-type decompositions aims primarily at explaining the regularities of argument realization. Particularly in its later versions, it is not intended to capture different kinds of entailments, aspectual behaviour, or restrictions on adverbial modification. It is assumed that all and only those meaning components that are relevant to grammar can be isolated and represented as LCS templates.
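Template Augmentation can be pictured as embedding a basic template under additional event structure. The sketch below is a hypothetical rendering, with templates as plain strings and invented function names, deriving the three sweep variants in (14) from the activity template.

```python
# Hypothetical sketch of Template Augmentation (after (12)-(14)):
# a constant (in angle brackets) names the idiosyncratic content; the
# surrounding template supplies the grammatically relevant structure.

def activity(constant):
    return f"[x ACT<{constant}> y]"

def augment(act_template, result_constant, affected="y"):
    # embed the activity template under CAUSE [BECOME [...]]
    return f"[{act_template} CAUSE [BECOME [{affected} <{result_constant}>]]]"

sweep = activity("SWEEP")
print(sweep)                          # (14a) Phil swept the floor.
print(augment(sweep, "CLEAN"))        # (14b) Phil swept the floor clean.
print(augment(sweep, "PLACE", "z"))   # (14c) Phil swept the crumbs onto the floor.
```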
5.3. Linguistic phenomena

Within this approach, most research focused on argument structure alternations that verbs may undergo.

(15) a. Bill loaded cartons onto the truck.
     b. Bill loaded the truck with cartons.

With respect to alternations as in (15), Rappaport & Levin (1988: 19ff) argued that a lexical meaning representation has to account for (i) the near paraphrase relation between the two variants, (ii) the different linking behaviour of the variants, and (iii) the interpretation of the goal argument in (15b) as completely affected by the action. They argued that a theta role approach could not fulfill these requirements: If the roles for load were considered identical for both variants, for example, <agent, theme, goal>, requirement (i) is met but (ii) and (iii) are not, since the different argument realizations cannot follow from identical theta role assignments and the affectedness component in (15b) is not
expressed. If the roles for the two variants are considered different, for example, <agent, theme, goal> for (15a) versus <agent, goal, theme> for (15b), the near paraphrase relation gets lost, and the completeness interpretation of (15b) still needs stipulative interpretation rules. Within an LCS approach, linking rules make reference not to theta roles but to substructures of decompositions, for example: "When the LCS of a verb includes one of the substructures in [16], link the variable represented by x in either substructure to the direct argument variable in the verb's PAS.

(16) a. ... [x come to be at LOCATION] ...
     b. ... [x come to be in STATE] ..." (Rappaport & Levin 1988: 25)

With respect to load, it is assumed that the variant in (17b), with its additional meaning component of completion, entails the variant in (17a), giving rise to the following representations:

(17) a. load: [x cause [y to come to be at z] /LOAD]
     b. load: [[x cause [z to come to be in STATE]] BY MEANS OF [x cause [y to come to be at z]] /LOAD] (Rappaport & Levin 1988: 26)

Assuming that the linking rules apply to the main clause within the decomposition, the two decompositions lead to different PAS representations in which the direct argument is associated with the theme in (18a) and the goal in (18b):

(18) a. load: x <y, Ploc z>
     b. load: x <z, Pwith y>

Thus the three observations, (i) near-paraphrase relation, (ii) different linking behaviour, and (iii) complete affectedness of the theme in one variant, are accounted for. Among the contemporary studies that proceeded in a similar vein are Hale & Keyser's (1987) work on the middle construction and Guerssel et al.'s (1985) cross-linguistic studies on causative, middle, and conative alternations. It is particularly noteworthy that Levin and Rappaport have greatly expanded the range of phenomena in the domain of argument structure alternations that a lexical semantic theory has to cover. Their empirical work on verb classes determined by the range of argument structure alternations they allow is documented in Levin (1993): About 80 argument structure alternations in English lead to the definition of almost 200 verb classes. The theoretical work represented by the template approach to LCS focuses on finding the appropriate constraints that guide the extension of verb meanings and explain the variance in argument structure alternations.
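The linking rule in (16) is essentially procedural: find the 'come to be' substructure in the main clause of the LCS and link its variable to the direct argument slot in PAS. The fragment below is an illustrative reconstruction; the tuple encoding of (17), including the by-means-of wrapper, and all names are assumptions made for the sketch.

```python
# Sketch of LCS-to-PAS linking (after (16)-(18)): the variable that
# 'comes to be' at a location or in a state becomes the direct argument.

# LCSs for the two variants of "load", cf. (17); the with-variant's
# main clause is the CAUSE-BECOME-STATE structure.
load_locative = ("cause", "x", ("come-to-be-at", "y", "z"))
load_with = ("by-means-of",
             ("cause", "x", ("come-to-be-in-state", "z")),
             ("cause", "x", ("come-to-be-at", "y", "z")))

def direct_argument(lcs):
    """Apply the linking rule in (16) to the main clause of the LCS."""
    if lcs[0] == "by-means-of":
        lcs = lcs[1]            # linking only sees the main clause
    pred, _agent, result = lcs
    assert pred == "cause"
    return result[1]            # the variable that 'comes to be'

print(direct_argument(load_locative))  # y -- the theme, cf. (18a)
print(direct_argument(load_with))      # z -- the goal, cf. (18b)
```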
5.4. Evaluation

The work of Levin, Rappaport Hovav, and other researchers working with LCS-like structures had a large influence on later work on the syntax-semantics interface. By uncovering the richness of the domain of argument structure alternations, they defined what theories at the lexical syntax-semantics interface have to account for today. Among
the work inspired by Levin and Rappaport Hovav's theory are approaches whose goal is to establish linking regularities on more abstract, structural properties of decompositions (e.g., Lexical Decomposition Grammar, cf. section 7) and attempts to integrate elements of lexical decompositions into syntactic structure (cf. section 9). Levin and Rappaport Hovav's work is also typical of a large amount of lexical semantic research in the 1980s and 90s that has largely given up the semantic rigorousness characteristic of approaches based on formal semantics like Dowty (1979). Less rigorous semantic relations make theories more susceptible to circular argumentation when semantic representations are mapped onto syntactic ones (cf. article 7 (Engelberg) Lexical decomposition, section 3.5). It has also been questioned whether Levin and Rappaport Hovav's approach allows for a principled account of cross-linguistic variation and universals (Croft 1998: 26; Zubizarreta & Oh 2007: 8).
6. Event Structure Theory

6.1. Origins and motivation

In the late 1980s, two papers approaching verb semantics from a philosophical point of view inspired much research in the domain of aspect and Aktionsart, namely, Vendler's (1957) classification of expressions based on predicational aspect and Davidson's (1967) suggestion to reify events in order to explain adverbial modification. In connection with Dowty's (1979) work on decompositions within Montague Semantics, the intensification of research on grammatical aspect, predicational aspect, and Aktionsarten also stimulated event-based research in lexical semantics. In particular, Pustejovsky's (1988; 1991a; 1991b) idea of conceiving of verbs as referring to structured events added a new dimension to decompositional approaches to verb semantics.
6.2. Structure and location of decompositions

According to Pustejovsky (1988; 1991a; 1991b), each verb refers to an event that can consist of subevents of different types, where 'processes' (P) and 'states' (S) are simple types that can combine to yield the complex type 'transition' [P S]T via event composition. A process is conceived of as "a sequence of events identifying the same semantic expression", a state as "a single event, which is evaluated relative to no other event", and a transition as "an event identifying a semantic expression, which is evaluated relative to its opposition" (Pustejovsky 1991a: 56). In addition to this event structure (ES), Pustejovsky assumes a level LCS', where each subevent is related to a decomposition. Out of this, a third level of Lexical Conceptual Structure (LCS) can be derived, which contains a single lexical decomposition. The following examples illustrate how the meaning of sentences is based on these representational levels:

(19) a. Mary ran.
     b. Mary ran to the store.
     c. The door is closed.
     d. The door closed.
     e. John closed the door.
17. Frameworks of lexical decomposition of verbs
Fig. 17.9: Representation of Mary ran
Fig. 17.10: Representation of Mary ran to the store
Fig. 17.11: Representation of the door is closed
Fig. 17.12: Representation of the door closed
Fig. 17.13: Representation of John closed the door
In terms of Vendler classes, Fig. 17.9 describes an activity, Fig. 17.11 a state, Figs. 17.10 and 17.13 accomplishments, and Fig. 17.12 an achievement. According to Pustejovsky, achievements and accomplishments have in common that they lead to a result state and are distinguished in that achievements do not involve an act-predicate at LCS'. As in many other decompositional theories (Jackendoff 1972; van Valin 1993), thematic roles are considered epiphenomenal and can be derived from the structured lexical representations (Pustejovsky 1988: 27). Pustejovsky's event structure theory is part of his attempt to construct a theory of the Generative Lexicon (Pustejovsky 1995) that, besides Event Structure, also comprises Qualia Structure, Argument Structure, and Inheritance Structure (Pustejovsky 1991b, 1995). He criticises contemporary theories for focussing too much on the search for a finite set of semantic primitives:

Rather than assuming a fixed set of primitives, let us assume a fixed number of generative devices that can be seen as constructing semantic expressions. Just as a formal language is described in terms of the productions in the grammar rather than its accompanying vocabulary, a semantic language should be defined by the rules generating the structures for expressions rather than the vocabulary of primitives itself. (Pustejovsky 1991a: 54)
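Pustejovsky's event types can be rendered compactly as data types. The sketch below is an informal reconstruction (class names and the abbreviated LCS' annotations are invented): processes and states are simple types, transitions pair a subevent with a result state, and the achievement/accomplishment contrast reduces to the type of the initial subevent.

```python
# Informal sketch of Pustejovsky-style event structures: P and S are
# simple event types; a transition T pairs an initial subevent with a
# result state (cf. Figs. 17.9-17.13).

from dataclasses import dataclass
from typing import Union

@dataclass
class P:                 # process: sequence of events of the same type
    lcs: str

@dataclass
class S:                 # state: evaluated relative to no other event
    lcs: str

@dataclass
class T:                 # transition: evaluated relative to its opposition
    initial: Union[P, S]
    result: S

mary_ran    = P("act(m) & run(m)")                                  # Fig. 17.9
mary_ran_to = T(P("act(m) & run(m)"), S("at(m, store)"))            # Fig. 17.10
door_closed = T(S("not closed(d)"), S("closed(d)"))                 # Fig. 17.12
john_closed = T(P("act(j, d) & not closed(d)"), S("closed(d)"))     # Fig. 17.13

# Achievements vs. accomplishments: both are transitions, but only the
# accomplishment's initial subevent contains an act-predicate at LCS'.
print(isinstance(door_closed.initial, P))   # False -> achievement
print(isinstance(john_closed.initial, P))   # True  -> accomplishment
```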
6.3. Linguistic phenomena

The empirical coverage of Pustejovsky's theory is wider than that of many other decompositional theories: (i) The ambiguity of adverbials as in Lisa rudely departed is explained by attaching the adverb either to the whole transition T ('It was rude of Lisa to depart') or to the embedded process P ('Lisa departed in a rude manner') (Pustejovsky 1988: 31f). (ii) The mapping of Vendler classes onto structural event representations allows for a formulation of the restrictions on temporal-aspectual adverbials (in five minutes, for five minutes, etc.) (Pustejovsky 1991a: 73). (iii) The linking behaviour of verbs is related to LCS' components; for example, the difference between unaccusatives and unergatives is accounted for by postulating that a participant involved in a predicate opposition (as in Fig. 17.12) is mapped onto the internal argument position in syntax while the agentive participant in an initial subevent (as in Fig. 17.13) is realized as the external argument (Pustejovsky 1991a: 75). Furthermore, on the basis of Event Structure and Qualia Structure, a theory of aspectual coercion is developed (Pustejovsky & Bouillon 1995) as well as an account of lexicalizations of causal relations (Pustejovsky 1995).
6.4. Evaluation

Pustejovsky’s concept of event structures has been taken up by many other lexical semanticists. Some theories included event structures as an additional level of representation. Grimshaw (1990) proposed a linking theory that combined a thematic hierarchy and an aspectual hierarchy of arguments based on the involvement of event participants in Pustejovsky-style subevents. In Lexical Decomposition Grammar, event structures were introduced as a level expressing sortal restrictions on events in order to explain the distribution and semantics of adverbials (Wunderlich 1996). It has sometimes been criticised that Pustejovsky’s event structures were not fine-grained enough to explain
adverbial modification. Consequently, suggestions have been made as to how to modify and extend event structures (e.g., Wunderlich 1996; cf. also Engelberg 2006). Apart from those studies and theories that make explicit reference to Pustejovsky’s event structures, a number of other approaches emerged in which phasal or mereological properties of events are embedded in lexical semantic representations, among them work by Tenny (1987; 1988), van Voorst (1988), Croft (1998), and some of the syntactic approaches to be discussed in section 9. Even standard lexical decompositions are often conceived of as event descriptions and referred to as ‘event structures’, for example, the LCS structures in Rappaport Hovav & Levin (1998). Event structures by themselves can of course not be considered full decompositions that exhaust the meaning of a lexical item. As we have seen above, they are always combined with other lexical information, for example, LCS-style decompositions or thematic role representations. Depending on the kind of representation they are attached to, it is not quite clear whether they constitute an independent level of representation. In Pustejovsky’s approach, event structures are probably by and large derivable from the LCS structures they are linked to.
7. Two-level Semantics and Lexical Decomposition Grammar

7.1. Origins and motivation

Two-level Semantics originated in the 1980s, its main proponents being Manfred Bierwisch, Ewald Lang, and Dieter Wunderlich (Bierwisch 1982; 1989; 1997; Bierwisch & Lang 1989; Wunderlich 1991; 1997a; cf. article 31 (Lang & Maienborn) Two-level Semantics). In particular, Bierwisch’s contribution is remarkable for his attempt to define the role of the lexicon within Generative Grammar. Lexical Decomposition Grammar (LDG) emerged out of Two-level Semantics in the early 1990s. LDG has been particularly concerned with the lexical decomposition of verbs and the relation between semantic, conceptual, and syntactic structure. It has been developed by Dieter Wunderlich (1991; 1997a; 1997b; 2000; 2006) and other linguists from the Düsseldorf Institute for General Linguistics (Stiebels 1996; 1998; 2006; Kaufmann 1995a; 1995b; 1995c; Joppen & Wunderlich 1995; Gamerschlag 2005) – with contributions by Paul Kiparsky (cf. article 78 (Kiparsky & Tonhauser) Semantics of inflection).
7.2. Structure and location of decompositions

Two-level Semantics argues for separating semantic representations (semantic form, SF), which are part of the linguistic system, from conceptual representations, which are part of the conceptual system (CS). Only SF is seen as a part of grammar that is integrated into its computational mechanisms, while conceptual structure is a level of reasoning that builds on more general mental operations. How the interplay between SF and CS can be spelled out is shown in Maienborn (2003). She argues that spatial PPs can either function as event-external modifiers, locating the event as a whole as in (20a), or as event-internal modifiers, specifying a spatial relation that holds within the event as in (20b). While external event location is semantically straightforward, internal event location is subject to conceptual knowledge. Not only does the local relation expressed in (20b) require world knowledge about spatial relations in bike riding events, it is also re-interpreted as
an instrumental relation that is not lexically provided by the verb or the preposition. Furthermore, in sentences like (20c) the external argument of the preposition (the woman’s hand or some instrument the woman uses) is not even mentioned in the sentence but has to be supplied by conceptual knowledge.

(20) a. Der Bankräuber ist auf der Insel geflohen.
        the bank robber has on the island escaped
     b. Der Bankräuber ist auf dem Fahrrad geflohen.
        the bank robber has on the bicycle escaped
     c. Maria zog Paul an den Haaren aus dem Zimmer.
        Maria pulled Paul at the hair out of the room
Several tests show that event-internal modifiers attach to the edge of V and event-external modifiers to the edge of VP. Maienborn (2003: 487) suggests that the two syntactic positions trigger slightly different modification processes at the level of SF. While in both cases the lexical entries entering the semantic composition have the same decompositional representation (21a, b), the process of external modification identifies the external argument of the preposition with the event argument of the verb (21c; λQ applying to the PP, λP to the verb), whereas the process of internal modification turns the external argument of the preposition into a free variable, a so-called SF parameter (variable v in 21d), that is specified as a constituent part (part-of) of what will later be instantiated with the event variable.

(21) a. [P auf]: λy λx [LOC (x, ON (y))]
     b. [V fliehen]: λx λe [ESCAPE (e) & THEME (e, x)]
     c. MOD: λQ λP λx [P(x) & Q(x)]
     d. MOD’: λQ λP λx [P(x) & PART-OF (x, v) & Q(v)]
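For concreteness, the composition in (21) can be emulated with closures over term strings. This is a toy reconstruction under our own encoding, not Maienborn’s implementation; existential closure and the predicates contributed by the DPs are omitted.

# Lexical entries from (21a, b), as functions building term strings.
AUF = lambda y: lambda x: f"LOC({x}, ON({y}))"
FLIEHEN = lambda x: lambda e: f"ESCAPE({e}) & THEME({e}, {x})"

# (21c): external modification identifies the preposition's external
# argument with the verb's event argument.
MOD = lambda Q: lambda P: lambda x: f"{P(x)} & {Q(x)}"

# (21d): internal modification introduces the free SF parameter v,
# specified as a part of the event argument.
MOD_INT = lambda Q: lambda P: lambda x: f"{P(x)} & PART-OF({x}, v) & {Q('v')}"

pp = AUF("b")                     # auf dem Fahrrad, with b the bicycle
verb = lambda e: FLIEHEN("r")(e)  # fliehen, with r the bank robber

print(MOD(pp)(verb)("e"))      # external reading: the event is located on b
print(MOD_INT(pp)(verb)("e"))  # internal reading, the core of (22a):
# ESCAPE(e) & THEME(e, r) & PART-OF(e, v) & LOC(v, ON(b))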
The compositional processes yield the representation in (22a) for the sentence (20b), the variable v being uninstantiated. This representation will be enriched at the level of CS, which falls back on a large base of shared conceptual knowledge. The utterance meaning is achieved via abduction processes which lead to the most economical explanation that is consistent with what is in the knowledge base. Spelling out the relevant part of the knowledge base, i.e. knowledge about spatial relations, about event types in terms of participants serving particular functions, and about the part-whole organization of physical objects, Maienborn shows how the CS representation for (20b), given in (22b), can be formally derived (for details cf. Maienborn 2003: 492ff).

(22) a. SF: ∃e [ESCAPE (e) & THEME (e, r) & BANK-ROBBER (r) & PART-OF (e, v) & LOC (v, ON (b)) & BIKE (b)]
     b. CS: ∃e [EXTR-MOVE (e) & ESCAPE (e) & THEME (e, r) & BANK-ROBBER (r) & INSTR (e, b) & VEHICLE (b) & BIKE (b) & SUPPORT (b, r, τ(e)) & LOC (r, ON (b))]

The emergence of Lexical Decomposition Grammar out of Two-level Semantics is particularly interesting for the development of decompositional approaches
to verb meaning. LDG locates decompositional representations in semantics and rejects syntactic approaches to decomposition, arguing that they have failed to provide logically equivalent paraphrases and an adequate account of the scopal properties of adverbials (Wunderlich 1997a: 28f). It assumes four levels of representation: conceptual structure (CS), semantic form (SF), theta structure (TS), and morphological/syntactic structure (MS, in earlier versions of LDG also called phrase structure, PS). SF is a decomposition based on type logic that is related to CS by restrictive lexicalization principles; TS is derived from SF by lambda abstraction and encodes the argument hierarchy. TS in turn is mapped onto MS by linking principles (Wunderlich 1997a: 32; 2000: 249ff). The four levels are illustrated in Fig. 17.14 with respect to the German verb geben ‘give’ in (23).

(23) a. (als) [der Torwart [dem Jungen [den Ball gab]]]
        (when) the goalkeeper the boy the ball gave
     b. [DPx:nom [DPy:dat [DPz:acc geb-agrx]]]

TS:  λz [+hr, −lr]   λy [+hr, +lr]   λx [−hr, +lr]   λs
SF:  {ACT(x) & BEC POSS(y,z)}(s)
MS:  ACC             DAT             NOM/AGR
CS:  x = Agent or Controller, y = Recipient, z = Patient or Affected;
     causal event: ACT(x,s1); result state: POSS(y,z)(s2)

Fig. 17.14: The representational levels of LDG (Wunderlich 2000: 250)
Semantic form does not provide a complete characterization of a word’s meaning. It serves to represent those properties of predicate-argument structures that make it possible to account for their grammatical properties (Wunderlich 1996: 170). This level of representation must be finite and not subject to contingent knowledge. In contrast to semantic form, conceptual structures draw on an infinite set of properties and can be subject to contingent knowledge (Wunderlich 1997a: 29). Since SF decompositions consist of hierarchically ordered binary structures (assuming that a & b branches as [a [& b]]), arguments can be ranked according to how deeply they are embedded within this structure. TS in turn preserves the SF hierarchy of arguments in inverse order so that arguments can be discharged by functional application (Wunderlich 1997a: 44). Each argument role in TS is characterized as to whether there is a higher or lower role. Besides their thematic arguments, nouns and verbs also have referential arguments, which do not undergo linking. Referential arguments are subject to sortal restrictions that are represented as a structured index on the referential argument (Wunderlich 1997a: 34). With verbs, this sortal index consists of an event structure, similar in form to Pustejovsky’s event structures but slightly differing with respect to the distinctions expressed (Wunderlich 1996: 175ff).
The relation between the different levels is mediated by a number of principles. For example, Argument Hierarchy regulates the inverse hierarchical mapping from SF to TS, and Coherence requires that the subevents corresponding to SF predicates be interpreted as contemporaneous or causally related (Kaufmann 1995c). Thus, the causal interpretation of geben in (23) is not explicitly given in SF but left to Coherence as a general CS principle of interpretation.
7.3. Linguistic phenomena

The main concern of LDG is argument linking, and its basic assumption is that syntactic properties of arguments follow from hierarchical structures within semantic form. Structural linking is based on the assignment of two binary features to the arguments in TS, [±hr] ‘there is a / no higher role’ and [±lr] ‘there is a / no lower role’. The structural cases are associated with combinations of these two features: dative with [+hr, +lr], accusative with [+hr], ergative with [+lr], and nominative/absolutive with [ ] (cf. Fig. 17.14). All and only the structural arguments have to be matched with a structural linker. Besides structural linking, it is taken into account that arguments can be suppressed or realized by oblique markers. This also motivates the distinction between SF and TS (Wunderlich 1997b: 47ff; 2000: 252). The following examples show how structural arguments are matched with structural linkers in nominative-accusative (NA) and absolutive-ergative (AE) languages (Wunderlich 1997a: 49):

(24) a. intransitive verbs:                                    λx [−hr, −lr]
        NA:                                                    nom
        AE:                                                    abs
     b. transitive verbs:                     λy [+hr, −lr]    λx [−hr, +lr]
        NA:                                   acc              nom
        AE:                                   abs              erg
     c. ditransitive verbs:  λz [+hr, −lr]    λy [+hr, +lr]    λx [−hr, +lr]
        NA:                  acc              dat              nom
        AE:                  abs              dat              erg
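Since the pattern in (24) follows mechanically from the two features, it can be reproduced in a few lines. The following Python sketch is a simplified reconstruction (it assumes that all arguments are structural and that the most specific compatible linker is chosen), not a piece of LDG itself.

def theta_features(n):
    """Arguments ordered from lowest to highest get [±hr] ('there is a
    higher role') and [±lr] ('there is a lower role')."""
    return [{'hr': i < n - 1, 'lr': i > 0} for i in range(n)]

# Feature specifications of the structural linkers; a linker matches an
# argument if all of its specified features are positive on that argument.
LINKERS = {'dat': {'hr', 'lr'}, 'acc': {'hr'}, 'erg': {'lr'}, 'nom/abs': set()}

def link(features, system):
    """Choose the most specific compatible linker available in the
    case system ('NA' or 'AE')."""
    available = {'NA': ['dat', 'acc', 'nom/abs'],
                 'AE': ['dat', 'erg', 'nom/abs']}[system]
    result = []
    for f in features:
        positive = {k for k in ('hr', 'lr') if f[k]}
        compatible = [c for c in available if LINKERS[c] <= positive]
        result.append(max(compatible, key=lambda c: len(LINKERS[c])))
    return result

for n, label in [(1, 'intransitive'), (2, 'transitive'), (3, 'ditransitive')]:
    fs = theta_features(n)
    print(label, '| NA:', link(fs, 'NA'), '| AE:', link(fs, 'AE'))
# Reproduces (24): e.g. ditransitive NA: ['acc', 'dat', 'nom/abs'],
# AE: ['nom/abs', 'dat', 'erg'], with 'nom/abs' read as nom or abs.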
LDG pursues a strictly lexical account of argument extensions such as possessors, beneficiaries, or arguments introduced by word formation processes or resultative formation. These argument extensions are all handled within SF formation by adding predicates to an existing SF (Stiebels 1996; Wunderlich 2000). Thus, the complex verb in (25a) is represented as the complex SF (25c) on the basis of (25b) and an argument extension principle.

(25) a. Sie erschrieb sich den Pulitzer-Preis.
        she “er”-wrote herself the Pulitzer Prize
        ‘She won the Pulitzer Prize by her writing.’
     b. schreib- ‘write’: λyλxλs write(x,y)(s)
     c. erschreib-: λvλuλxλs ∃y {write(x,y)(s) & become poss(u,v)}(s)
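Schematically, the extension step from (25b) to (25c) can be rendered as an operation on a lambda prefix and a body. The encoding below is our own shorthand, intended only to mirror the example.

# SF of schreib- as in (25b): a lambda prefix plus a body, as plain strings.
schreib = (["y", "x", "s"], "write(x,y)(s)")

def extend_by_possession(sf):
    """Add a 'become poss' predicate with new roles u, v and bind the
    original object role y existentially, mirroring (25c)."""
    prefix, body = sf
    new_prefix = ["v", "u"] + [a for a in prefix if a not in ("y", "s")] + ["s"]
    new_body = "∃y {" + body + " & become poss(u,v)}(s)"
    return (new_prefix, new_body)

print(extend_by_possession(schreib))
# (['v', 'u', 'x', 's'], '∃y {write(x,y)(s) & become poss(u,v)}(s)')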
These processes are restricted by two constraints on possible verbs, Coherence and Connexion, the latter requiring that each predicate in SF share at least one, possibly implicit, argument with another predicate in SF (Kaufmann 1995c). In (25c), Coherence guarantees the causal interpretation, and Connexion accounts for the identification of the agent of writing with the possessor of the prize. The resulting SF is then subject to Argument Hierarchy and the usual linking principles. As we have seen in (25c), the morphological operation adds semantic content to SF, as do other operations like resultative formation. In other cases, morphology operates on TS in order to change linking conditions (e.g., passive) (Wunderlich 1997a: 52f). During the last 20 years, LDG has produced numerous studies on phenomena in a number of typologically diverse languages, dealing with agreement (Wunderlich 1994), word formation of verbs (Wunderlich 1997b; Stiebels 1996; Gamerschlag 2005), locative verbs (Kaufmann 1995a), causatives and resultatives (Wunderlich 1997a; Kaufmann 1995a), dative possessors (Wunderlich 2000), ergative case systems (Joppen & Wunderlich 1995), and nominal linking (Stiebels 2006).
7.4. Evaluation

In contrast to some other decompositional approaches, LDG adheres to a compositional approach to meaning and tries to define its relation to current syntactic theories. In more recent publications (Stiebels 2002; Gamerschlag 2005), LDG has been reformulated within an optimality theoretic framework. Lexical Decomposition Grammar is criticized by Taylor (2000), in particular for its division between semantic and conceptual knowledge. LDG, based on Two-level Semantics, accounts for the different readings of a lexical item within conceptual structure, leaving lexical entries largely monosemous. Taylor argues that lexical usage is to a large degree conventionalized and that the particular readings a word does or does not have cannot be construed entirely from conceptual knowledge. Bierwisch (2002) presents a number of arguments against the removal of cause from SF decompositions. Further problems, emerging from structural stipulations, are discussed in article 7 (Engelberg) Lexical decomposition, section 3.5.
8. Natural Semantic Metalanguage

8.1. Origins and motivation

The theory of Natural Semantic Metalanguage (NSM) originated in the early 1970s. Its main proponents have been Anna Wierzbicka (1972; 1980; 1985; 1992; 1996) and Cliff Goddard (1998; 2006; 2008a; Goddard & Wierzbicka 2002). NSM theory has been developed as an attempt to construct a semantic metalanguage (i) that is expressive enough to cover all the word meanings in natural languages, (ii) that allows non-circular reductive paraphrases, (iii) that avoids metalinguistic elements that are not part of the natural language it describes, (iv) that is not ethnocentric, and (v) that makes it possible to uncover the universal properties of word meanings (for an overview, cf. Goddard 2002a; Durst 2003). In order to achieve this, Wierzbicka suggested that the lexicon of a language can be divided into a small set of indefinable words (semantic primes) and a large set of words that can be defined in terms of these indefinables.
8.2. Structure and location of decompositions

The term Natural Semantic Metalanguage is intended to reflect that the semantic primes used as a metalanguage are actual words of the object language. The indefinables constitute a finite set and, although they are language-specific, each language-specific set “realizes, in its own way, the same universal and innate alphabet of human thought” (Wierzbicka 1992: 209). More precisely, this implies that the set of semantic primes of a particular language and their combinatorial potential have the expressive power of a full natural language and that the sets of semantic primes of all languages are isomorphic to each other. The set of semantic primes consists of 60 or so elements including such words as you, this, two, good, know, see, word, happen, die, after, near, if, very, kind of, and like, each disambiguated by a canonical context (cf. Goddard 2008b). These primes are claimed to be indefinable and indispensable (cf. Goddard & Wierzbicka 1994b; Wierzbicka 1996; Goddard 2002b; Wierzbicka 2009). Meaning descriptions within NSM theory look like the following (Wierzbicka 1992: 133):

(26) a. (X is embarrassed)
     b. X thinks something like this:
            something happened to me now
            because of this, people here are thinking about me
            I don’t want this
            because of this, I would want to do something
            I don’t know what I can do
            I don’t want to be here now
        because of this, X feels something bad

The defining decomposition and the defined term are required to be identical in meaning. This is connected to substitutability; the definiens and the definiendum are supposed to be replaceable by each other without change of meaning (Wierzbicka 1988: 12).
8.3. Linguistic phenomena

More than any other decompositional theory, NSM theory resembles basic lexicographic approaches to meaning, in particular those traditions of English learner lexicography in which definitions of word meanings are restricted to the non-circular use of a limited “controlled” defining vocabulary (e.g., Summers 1995). Thus, it is not surprising that NSM theory tackles word meanings in many semantic fields that have not been at the centre of attention within other decompositional approaches, for example, pragmatically complex domains like speech act verbs (Wierzbicka 1987). Other investigations focus on the cultural differences reflected in words and their alleged equivalents in other languages, for example, Wierzbicka’s (1999) study on emotion words. NSM theory also claims to be able to render the meaning of syntactic constructions and grammatical categories by decompositions. An example is given in (27).

(27) a. [‘first person plural exclusive’]
     b. I’m thinking of some people
        I am one of these people
        you are not one of these people
        (Goddard 1998: 299)

The claim of NSM theory to be particularly apt as a means to detect subtle cross-linguistic differences is reflected in Goddard & Wierzbicka (1994a), where studies on a fairly large number of typologically and genetically diverse languages are presented.
8.4. Evaluation

While many of its critics acknowledge that NSM theory has provided many insights into particular lexical phenomena, its basic theoretical assumptions have often been subject to criticism. It has been called into question whether the emphasis on giving dictionary-style explanations of word meanings amounts to uncovering the native speaker’s knowledge about word meaning. NSM theory has also been criticized for not putting much effort into providing a foundation for the theory in basic semantic concepts (cf. Riemer 2006: 352). The lack of a theory of truth, reference, and compositionality within NSM theory has raised severe doubts about whether it can adequately deal with phenomena like quantification, anaphora, proper names, and presuppositions (Geurts 2003; Matthewson 2003; Barker 2003). This criticism also affects the claim of the theory to be able to cover the semantics of the entire lexicon of a language.
9. Lexical Relational Structures

9.1. Origins and motivation

With the decline of Generative Semantics in the 1970s, lexical approaches to decomposition began to dominate the field. These approaches enriched our understanding of the complexity of lexical meaning as well as the possibility of generalizations across verb classes. Then, in the late 1980s, syntactic developments within the Principles & Parameters framework suggested more complex structures within the VP. The assumption of VP-internal subjects and, in particular, Larson’s (1988) theory of VP-shells as layered VP-internal structures suggested the possibility of aligning certain bits of verb-internal semantic structure with structural positions in layered VPs. With these developments underway, the time was ripe for new syntactic approaches to decomposition (cf. also the summary in Levin & Rappaport Hovav 2005: 131ff).
9.2. Structure and location of decompositions

On the basis of data from binding, quantification, and conjunction with respect to double object constructions, Larson (1988: 381) argues for the Single Argument Hypothesis, according to which a head can only have one argument. This forces a layered structure with multiple heads within VP. Fig. 17.15 exhibits the structure of Mary gave a box to Tom within this VP-shell. The verb moves to the higher V node by head-movement. The mapping of a verb’s arguments onto the nodes within the VP-shell is determined by a theta hierarchy ‘agent > theme > goal > obliques’ such that the lowest role of a verb is assigned to the lowest argument position, the next lowest role to the next lowest argument position, and so on (Larson 1988: 382). Thus, there is a weak correspondence between structural positions and verb semantics in the sense that high argument positions
are associated with a comparatively high thematic value. However, structural positions within VP shells are not linked to any stable semantic interpretation.
Fig. 17.15: VP-shell and theta role assignment
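The mapping itself amounts to sorting the roles by the hierarchy and zipping them with the positions, from the bottom up. A minimal Python sketch (the position labels are illustrative, not Larson’s notation):

THETA_HIERARCHY = ['agent', 'theme', 'goal', 'oblique']   # high to low

def assign(roles, positions):
    """positions are ordered from lowest to highest; the lowest-ranked
    role fills the lowest position, and so on (Larson 1988: 382)."""
    lowest_first = sorted(roles, key=THETA_HIERARCHY.index, reverse=True)
    return dict(zip(positions, lowest_first))

# 'Mary gave a box to Tom': three positions in the VP-shell, bottom-up.
print(assign(['agent', 'theme', 'goal'],
             ['Comp of lower V', 'Spec of lower VP', 'Spec of higher VP']))
# {'Comp of lower V': 'goal', 'Spec of lower VP': 'theme',
#  'Spec of higher VP': 'agent'}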
Larsonian shells inspired research on the syntactic representation of argument structure. Particularly influential was the approach pursued by Hale & Keyser (1993; 1997; 2002). They assume that argument structure is handled within a lexicon component called l-syntax, which is an integral part of syntax as it obeys syntactic principles. The basic assumption is that argument structure, also called “lexical relational structure”, is defined in reference to two possible relations between a head and its arguments, namely, the head-complement and the head-specifier relation (Hale & Keyser 1999: 454). Each verb projects an unambiguous structure in l-syntax. In Fig. 17.16, the lexical relational structure of put as in put the books on the shelf is illustrated.
Fig. 17.16: Lexical relational structure of put and head movement of the verb
Primary evidence for this approach is taken from verbs that are regarded as denominal. Locational verbs of this sort such as to shelve, to box, or to saddle receive a representation similar to that of to put. The Lexical Relational Structure of shelve consists of Larsonian VP-shells with the noun shelf as complement of the embedded prepositional head. From there, the noun incorporates into an abstract V head by head movement (cf. Fig. 17.17; Hale & Keyser 1993: 55ff).
Fig. 17.17: Lexical relational structure of shelve and incorporation of the noun
In a similar way, unergative verbs like sneeze or dance (28b), which are assumed to have a structure parallel to expressions like make trouble or have puppies (28a), are derived by incorporation of a noun into a V head (Hale & Keyser 1993: 54f).

(28) a. [V′ [V have] [NP [N puppies]]]
     b. [V′ [V sneeze_i] [NP [N t_i]]]

Hale & Keyser (1993: 68) assume that “elementary semantic relations” are “associated” with these syntactic structures: The agent occurs in a Spec position above VP, the theme in a Spec position of a V that takes a PP/AP complement, and so forth. The argument structure of shelve is thus related to semantic relations as exhibited in Fig. 17.18. It is important to keep in mind that Hale and Keyser do not claim that argument structures are derived from semantics. On the contrary, they assume that “certain meanings can be assigned to certain structures” in the sense that they are fully determined by l-syntactic configurations (Hale & Keyser 1993: 68; 1999: 463). Fig. 17.18 also reflects two important points of Hale and Keyser’s theory. Firstly, they assume a central distinction between verb classes: Contrary to the early exploration of VP structures as in Fig. 17.17, unergatives and transitives, in contrast to unaccusatives, are assumed not to have a subject as part of their argument structure; their subjects are assigned in s-syntax (Hale & Keyser 1993: 76ff). Secondly, Hale and Keyser emphasize that their approach explains why the number of theta roles is (allegedly) so small, namely, because there is only a very restricted set of syntactic configurations with which they can be associated.
Fig. 17.18: Semantic relations associated with lexical relational structures (after Hale & Keyser 1993: 76ff)
9.3. Linguistic phenomena

Hale and Keyser’s approach aims to explain why certain argument structures are possible while others are not. For example, it is argued that sentences like (29a) are ungrammatical because incorporation of a subject argument violates the Empty Category Principle (Hale & Keyser 1993: 60). The ungrammaticality of (29b) is accounted for by the assumption that unergatives as in (28) do not project a specifier that would allow a transitivity alternation (Hale & Keyser 1999: 455). (29c) is argued to be ungrammatical because the Lexical Relational Structure would have to be parallel to she gave a church her money, in which church occupies Spec,VP, the “subject” position of the inner VP. Incorporation from this position violates the Empty Category Principle. By the same reasoning, she flattened the metal is well-formed, incorporating flat from an AP in complement position, while (29d) is not, since metal would have to incorporate from an inner subject position.

(29) a. *It cowed a calf. (with the meaning ‘a cow calved’ and it as expletive)
     b. *An injection calved the cow early.
     c. *She churched the money.
     d. *She metalled flat.
This incorporation approach allows Hale and Keyser to explore the parallels in syntactic behaviour between expressions like give a laugh and laugh, which, besides being near-synonymous, both fail to transitivize, as well as the differences between expressions like make trouble and thicken soups, where only the latter allows middles and inchoatives.
9.4. Evaluation

Hale and Keyser’s work has stimulated a growing body of research aiming at a syntactification of thematic roles and of decompositional and aspectual structures. Some prominent examples are Mateu (2001), Alexiadou & Anagnostopoulou (2004), Erteschik-Shir & Rapoport (2005; 2007), Zubizarreta & Oh (2007), and Ramchand’s (2008) first phase syntax. Some of this work places a strong emphasis on aspectual structure, for example, Ritter & Rosen (1998) and Travis (2000). As Travis (2000: 181f) argues, the new syntactic approaches to decompositions avoid many of the pitfalls of Generative Semantics, which is certainly due to a better understanding of restrictive principles in recent syntactic theories. However, Hale and Keyser’s work has also attracted heavy criticism from proponents of Lexical Conceptual Structure (e.g., Culicover & Jackendoff 2005; Rappaport Hovav & Levin 2005), Two-level Semantics (e.g., Bierwisch 1997; Kiparsky 1997) as well as from anti-decompositionalist positions (e.g., Fodor & Lepore 1999). The analyses themselves raise many questions. For example, it remains unexplained which principles exclude a lexical structure for a putative verb to church (29c) along the lines of she gave the money to the church, that is, a structure parallel to the one suggested for to shelve (cf. also Kiparsky 1997: 481). It has also been observed that the position allegedly vacated by the noun in structures as in Fig. 17.18 can actually show lexical material, as in Joe buttered the toast with rancid butter (Culicover & Jackendoff 2005: 102). Many other analyses and assumptions have also been under attack, among them assumptions about which verbs are denominal and how their meanings come about (Kiparsky 1997: 485ff; Culicover & Jackendoff 2005: 55) as well as predictions about possible transitivity alternations (Kiparsky 1997: 491). Furthermore, one can of course doubt that the number of different theta roles is as small as Hale and Keyser assume. More empirically oriented approaches to verb semantics come to dramatically different conclusions (cf. Kiparsky 1997: 478 or work on Frame Semantics like Ruppenhofer et al. 2006). Even some problems from Generative Semantics reemerge, such as that expressions like put on a shelf and shelve are not synonymous, the latter being more specific (Bierwisch 1997: 260). Overgeneralization is not accounted for, either. The fact that there is a verb to shelve but no semantically corresponding verb to basket points to a location of decomposition in the lexicon (Bierwisch 1997: 232f). Reacting to some of the criticism, Hale & Keyser (2005) later modified some assumptions of their theory; for example, they abandoned the idea of incorporation in favour of a locally operating selection mechanism.
10. Distributed Morphology

10.1. Origins and motivation

Hale & Keyser (1993) and much research inspired by them have attempted to reduce the role of the lexicon in favour of syntactic representations. An even more radical
anti-lexicalist approach is pursued by Distributed Morphology (DM), which started out in Halle & Marantz (1993) and since then has been elaborated by a number of DM proponents (Marantz 1997; Harley 2002; Harley & Noyer 1999; 2000; Embick 2004; Embick & Noyer 2007) (cf. also article 81 (Harley) Semantics in Distributed Morphology).
10.2. Structure and location of decompositions

According to Distributed Morphology, syntax does not combine words but generates structures by combining morphosyntactic features. Terminal nodes, so-called “morphemes”, are bundles of these morphosyntactic features. DM distinguishes f-nodes from l-nodes. F-nodes correspond to what is traditionally known as functional, closed-class categories; their insertion at spell-out is deterministic. L-nodes correspond to lexical, open-class categories; their insertion is not deterministic. Vocabulary items are only inserted at spell-out. These vocabulary items are minimally specified in that they only consist of a phonological string and some information about where this string can be inserted (cf. Fig. 17.19).
Fig. 17.19: The architecture of Distributed Morphology (cf. Harley & Noyer 2000: 352; Embick & Noyer 2007)
Neither a syntactic category nor any kind of argument structure representation is included in vocabulary entries, as can be seen in example (30a) from Harley & Noyer (1999: 3). The distribution information in the vocabulary item replaces what is usually done by theta-roles and selection. In addition to the Vocabulary, there is a component called Encyclopaedia where vocabulary items are linked to those aspects of meaning that are not completely predictable from morphosyntactic structure (30b).

(30) a. Vocabulary item:
        /dog/: [Root] [+count] [+animate] ...
     b. Encyclopaedia item:
        dog: four legs, canine, pet, sometimes bites etc. ...
             chases balls
             in environment “let sleeping ____s lie”, refers to discourse entity who is better left alone ...

While the formal information in vocabulary items in part determines grammatical well-formedness, the encyclopaedic information guides the appropriate use of expressions. For example, the oddness of (31a) is attributed to encyclopaedic knowledge. The sentence is pragmatically anomalous but interpretable: It could refer to some unusual telepathic transportation event. (31b), on the other hand, is considered ungrammatical because put is
not properly licensed and, therefore, uninterpretable under any circumstances (Harley & Noyer 2000: 354).

(31) a. Chris thought the book to Mary.
     b. *James put yesterday.

Part of speech is reflected in DM by the constellation in which a root morpheme occurs. For example, a root is a noun if its nearest c-commanding f-node is a determiner, and a verb if its nearest c-commanding f-nodes are v, aspect, and tense. Not only are lexical entries more reduced than in approaches based on Hale & Keyser (1993), but there is also no particular part of syntax corresponding to l-syntax (cf. for this overview Harley & Noyer 1999; 2000; Embick & Noyer 2007).
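The division of labour in (30) amounts to two look-up tables, one formal and one interpretive. A small sketch, with feature names taken from (30a) and free-text glosses standing in for encyclopaedic content (the helper interpret is our own):

# Vocabulary: purely formal insertion information, cf. (30a).
VOCABULARY = {"/dog/": ["Root", "+count", "+animate"]}

# Encyclopaedia: interpretive information, keyed to items and, where
# relevant, to special environments, cf. (30b).
ENCYCLOPAEDIA = {
    "dog": {
        None: "four legs, canine, pet, sometimes bites, chases balls",
        "let sleeping ___s lie": "discourse entity who is better left alone",
    },
}

def interpret(item, environment=None):
    """Return the environment-specific gloss if there is one, else the
    default encyclopaedic content."""
    entry = ENCYCLOPAEDIA[item]
    return entry.get(environment, entry[None])

print(interpret("dog"))
print(interpret("dog", "let sleeping ___s lie"))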
10.3. Linguistic phenomena

Distributed Morphology has been applied to all kinds of phenomena in the domain of inflectional and derivational morphology. Some work has also been done with respect to the argument structure of verbs and nominalizations. One of the main topics in this area is the explanation of the range of possible argument structure alternations. A typical set of data is given in (32) and (33) (taken from Harley & Noyer 2000: 362).

(32) a. John grows tomatoes.
     b. Tomatoes grow.
     c. The insects destroyed the crop.
     d. *The crops destroyed.

(33) a. the growth of the tomatoes
     b. the tomatoes’ growth
     c. *John’s growth of the tomatoes
     d. the crop’s destruction
     e. the insects’ destruction of the crop
It has to be explained why grow, but not destroy, has an intransitive variant and why destruction, but not growth, allows the realization of the causer argument in Spec,DP (for the following, cf. Harley & Noyer 2000: 356ff). Syntactic structures are based on VP-shells. For each node in these structures, there is a set of possible items that can fill this position (with LP corresponding approximately to VP):

(34)    node        possible filling
     a. Spec,vP     DP, ∅
     b. v head      happen/become, cause, be
     c. Spec,LP     DP, ∅
     d. L head      l-node
     e. Comp,LP     DP, ∅
Picking from this menu, one can create a number of different syntactic configurations:
(35)    Spec,vP    v         Spec,LP    L    Comp,LP    (example)
     a. DP         cause     ∅          l    DP         grow (tr.)
     b. ∅          become    ∅          l    DP         grow (itr.)
     c. DP         cause     ∅          l    DP         destroy
     d. DP         cause     DP         l    DP         give
     e. ∅          become    ∅          l    DP         arrive
     f. ∅          be        DP         l    DP         know
The items filling the v head are the only ones conceived of as having selectional properties: cause, but not become or be, selects an external argument. The grammaticality of the configurations in (35) is also determined by the licensing environment specified in the Vocabulary (cf. Fig. 17.20).

VOCABULARY                                      ENCYCLOPAEDIA
Phonology    Licensing environment
destroy      [+v],[+DP],[+cause]                what we mean by destroy
grow         [+v],[+DP],[±cause]                what we mean by grow
sink         [±v],[+DP],[±cause]                what we mean by sink
open         [±v],[+DP],[±cause]                what we mean by open
arrive       [+v],[+DP],[–cause]                what we mean by arrive

Fig. 17.20: Vocabulary and encyclopaedic entries of verbs (after Harley & Noyer 2000: 361)
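The interaction of the menu in (34), the configurations in (35), and the licensing environments in Fig. 17.20 can be simulated with a small filter. The Python sketch below uses a deliberately coarse feature encoding of our own and reproduces the judgements in (32):

# Licensing environments from Fig. 17.20, reduced to the cause feature.
LICENSING = {
    "destroy": {"+cause"},              # must be embedded under cause
    "grow":    {"+cause", "-cause"},    # underspecified for cause
    "arrive":  {"-cause"},
}

# Two configuration types from (35): transitive (v = cause, DP in
# Spec,vP) and intransitive (v = become, no external argument).
CONFIGS = {"transitive": "cause", "intransitive": "become"}

def licensed(root, config):
    """A root can be inserted if the configuration's v head is
    compatible with the root's licensing environment."""
    value = "+cause" if CONFIGS[config] == "cause" else "-cause"
    return value in LICENSING[root]

for root in ("grow", "destroy"):
    for config in ("transitive", "intransitive"):
        star = "" if licensed(root, config) else "*"
        print(f"{star}{root} ({config})")
# grow (transitive), grow (intransitive), destroy (transitive),
# *destroy (intransitive), matching the judgements in (32a-d)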
Thus, the transitive and intransitive uses of the roots in (32a) through (32c) are reflected in the syntactic structures in (36), where cause and become are realized as zero morphemes (cf. article 81 (Harley) Semantics in Distributed Morphology) and the LP head is assumed to denote a resulting state:

(36) a. [vP [DP John] [v′ cause [LP grown [DP tomatoes]]]]
     b. [v′ become [LP grown [DP tomatoes]]]
     c. [vP [DP the insects] [v′ cause [LP destroyed [DP the crop]]]]

The fact that destroy does not allow an intransitive variant is due to the fact that its licensing environment requires embedding under cause, while grow is underspecified in this respect. The explanation for the nominalization data in (33) relies on the assumption that Spec,DP is not as semantically loaded as Spec,vP. It is further assumed that by encyclopaedic knowledge destroy always requires external causation while grow refers inherently to an internally caused spontaneous activity, which is optionally facilitated by some agent. Since cause is only implied with destroy but not with grow, only the causer of destroy can be interpreted in a semantically underspecified Spec,DP position. The fact that some verbs like explode behave partly like grow, in allowing the transitive-intransitive alternation, and partly like destroy, in allowing the realization of the causer in nominalizations, is explained by the assumption that the events denoted by such roots can occur spontaneously (internal causation) but can also be directly brought about by some agent (external causation) (cf. also Marantz 1997). In summary, the phenomena in (32) are traced back to syntactic regularities, those in (33) to encyclopaedic, that is, pragmatic conditions.
10.4. Evaluation

While some other approaches to argument structure share a number of assumptions with DM – for example, they also operate on category-neutral roots (e.g., Borer 2005; Arad 2002) – of course the radical theses of Distributed Morphology have also drawn some criticism. It has been doubted that all the differences in the syntactic behaviour of verbs can be accounted for with a syntax-free lexicon (cf. e.g., Ramchand 2008). Cross-linguistic differences might also pose some problems. For example, it is assumed that verbs allowing the unaccusative-transitive alternation are distinguished on the basis of encyclopaedic semantic knowledge from those that do not (Embick 2004: 139). The causative variant of the showcase example grow in (32a) is grammatically licensed and pragmatically acceptable because of encyclopaedic knowledge. However, it is not clear why German wachsen ‘grow’ does not have a causative variant. The alternation would be expected since wachsen does not seem to differ from grow in its encyclopaedic properties. Moreover, many other verbs like trocknen ‘dry’ (37) demonstrate that German does not show any kind of structural aversion to alternations of this sort.

(37) a. Der Salat trocknet / wächst.
        ‘The lettuce dries / grows.’
     b. Peter trocknet Salat / *wächst Salat.
        ‘Peter dries lettuce / grows lettuce.’

The way the line is drawn between grammatical and pragmatic (un)acceptability also poses some problems. If the use of put with only one argument is considered ungrammatical, then how can similar uses of three-place verbs like German stellen ‘put (in upright position)’ and geben ‘give’ be explained (38)?

(38) a. Er gibt.
        ‘He deals (in a card game).’
     b. Sie stellt.
        ‘She plays a volleyball such that somebody can smash it.’

Since they are ruled out by grammar, encyclopaedic knowledge cannot save these examples by assigning them an idiomatic meaning. Thus, it might turn out that sometimes argument-structure flexibility is not as general as DM’s encyclopaedia suggests, and grammatical restrictions are not as strict as syntax and DM’s vocabulary predict.
11. Outlook

The overview has shown that stances on lexical decomposition still differ widely, in particular with respect to the questions of where to locate lexical decompositions, how to interpret them, and how to justify them. It has to be noted that most work on lexical decompositions has not been accompanied by extensive empirical research. With the rise of new methods in the domain of corpus analysis, grammaticality judgements, and psycholinguistics (cf. article 7 (Engelberg) Lexical decomposition, section 3.6), the empirical basis for further decompositional theories will alter dramatically. It remains to be seen how theories of the sort presented here will cope with the empirical turn in contemporary linguistics.
12. References

Alexiadou, Artemis & Elena Anagnostopoulou 2004. Voice morphology in the causative-inchoative alternation: Evidence for a non-unified structural analysis of unaccusatives. In: A. Alexiadou, E. Anagnostopoulou & M. Everaert (eds.). The Unaccusativity Puzzle. Explorations of the Syntax-Lexicon Interface. Oxford: Oxford University Press, 114–136. Arad, Maya 2002. Universal features and language-particular morphemes. In: A. Alexiadou (ed.). Theoretical Approaches to Universals. Amsterdam: Benjamins, 15–29. Bach, Emmon 1968. Nouns and noun phrases. In: E. Bach & R. T. Harms (eds.). Universals in Linguistic Theory. New York: Holt, Rinehart & Winston, 90–122. Barker, Chris 2003. Paraphrase is not enough. Theoretical Linguistics 29, 201–209. Bartsch, Renate & Theo Vennemann 1972. Semantic Structures. A Study in the Relation between Semantics and Syntax. Frankfurt/M.: Athenäum. Bierwisch, Manfred 1982. Formal and lexical semantics. Linguistische Berichte 30, 3–17. Bierwisch, Manfred 1989. Event nominalizations: Proposals and problems. In: W. Motsch (ed.). Wortstruktur und Satzstruktur. Berlin: Akademie Verlag, 1–73. Bierwisch, Manfred 1997. Lexical information from a minimalist point of view. In: C. Wilder, H.-M. Gärtner & M. Bierwisch (eds.). The Role of Economy Principles in Linguistic Theory. Berlin: Akademie Verlag, 227–266. Bierwisch, Manfred 2002. A case for cause. In: I. Kaufmann & B. Stiebels (eds.). More than Words: A Festschrift for Dieter Wunderlich. Berlin: Akademie Verlag, 327–353. Bierwisch, Manfred & Ewald Lang (eds.) 1989. Dimensional Adjectives. Grammatical Structure and Conceptual Interpretation. Berlin: Springer. Binnick, Robert I. 1968. On the nature of the ‘lexical item’. In: B. J. Darden, C.-J. N. Bailey & A. Davison (eds.). Papers from the Fourth Regional Meeting of the Chicago Linguistic Society (= CLS). Chicago, IL: Chicago Linguistic Society, 1–13. Binnick, Robert I. 1972. Zur Entwicklung der generativen Semantik. In: W. Abraham & R. I. Binnick (eds.). Generative Semantik. Frankfurt/M.: Athenäum, 1–48. Borer, Hagit 2005. Structuring Sense, vol. 2: The Normal Course of Events. Oxford: Oxford University Press. Carter, Richard J. 1976. Some constraints on possible words. Semantikos 1, 27–66. Carter, Richard 1988. Arguing for semantic representations. In: B. Levin & C. Tenny (eds.). On Linking: Papers by Richard Carter. Lexicon Project Working Papers 25. Cambridge, MA: Center for Cognitive Science, MIT, 139–166. Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press. Chomsky, Noam 1970. Some empirical issues in the theory of transformational grammar. In: P. S. Peters (ed.). Goals of Linguistic Theory. Englewood Cliffs, NJ: Prentice Hall, 63–130. Croft, William 1998. Event structure in argument linking. In: M. Butt & W. Geuder (eds.). The Projection of Arguments: Lexical Compositional Factors. Stanford, CA: CSLI Publications, 21–63. Culicover, Peter W. & Ray Jackendoff 2005. Simpler Syntax. Oxford: Oxford University Press. Davidson, Donald 1967. The logical form of action sentences. In: N. Rescher (ed.). The Logic of Decision and Action. Pittsburgh, PA: University of Pittsburgh Press, 81–95. Dowty, David R. 1972. Studies in the Logic of Verb Aspect and Time Reference in English. Ph.D. dissertation. University of Texas, Austin, TX. Dowty, David R. 1976. Montague Grammar and the lexical decomposition of causative verbs. In: B. Partee (ed.). Montague Grammar. New York: Academic Press, 201–246. Dowty, David R. 1979.
Word Meaning and Montague Grammar. The Semantics of Verbs and Times in Generative Semantics and in Montague’s PTQ. Dordrecht: Reidel. Dowty, David R. 1991. Thematic proto-roles and argument selection. Language 67, 547–619. Durst, Uwe 2003. The Natural Semantic Metalanguage approach to linguistic meaning. Theoretical Linguistics 29, 157–200.
Embick, David 2004. Unaccusative syntax and verbal alternations. In: A. Alexiadou, E. Anagnostopoulou & M. Everaert (eds.). The Unaccusativity Puzzle. Explorations of the Syntax-Lexicon Interface. Oxford: Oxford University Press, 137–158. Embick, David & Rolf Noyer 2007. Distributed Morphology and the syntax/morphology interface. In: G. Ramchand & C. Reiss (eds.). The Oxford Handbook of Linguistic Interfaces. Oxford: Oxford University Press, 289–324. Engelberg, Stefan 2006. A theory of lexical event structures and its cognitive motivation. In: D. Wunderlich (ed.). Advances in the Theory of the Lexicon. Berlin: de Gruyter, 235–285. Erteschik-Shir, Nomi & Tova Rapoport 2005. Path predicates. In: N. Erteschik-Shir & T. Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 65–86. Erteschik-Shir, Nomi & Tova Rapoport 2007. Projecting argument structure. The grammar of hitting and breaking revisited. In: E. Reuland, T. Bhattacharya & G. Spathas (eds.). Argument Structure. Amsterdam: Benjamins, 17–35. Fillmore, Charles J. 1968a. Lexical entries for verbs. Foundations of Language 4, 373–393. Fillmore, Charles J. 1968b. The case for case. In: E. Bach & R. T. Harms (eds.). Universals in Linguistic Theory. New York: Holt, Rinehart & Winston, 1–88. Fodor, Jerry A. 1970. Three reasons for not deriving ‘kill’ from ‘cause to die’. Linguistic Inquiry 1, 429–438. Fodor, Jerry A., Merrill F. Garrett, Edward C. T. Walker & Cornelia H. Parkes 1980. Against definitions. Cognition 8, 263–367. Fodor, Jerry A. & Ernie Lepore 1999. Impossible words? Linguistic Inquiry 30, 445–453. Gamerschlag, Thomas 2005. Komposition und Argumentstruktur komplexer Verben. Eine lexikalische Analyse von Verb-Verb-Komposita und Serialverbkonstruktionen. Berlin: Akademie Verlag. Geurts, Bart 2003. Semantics as lexicography. Theoretical Linguistics 29, 223–226. Goddard, Cliff 1998. Semantic Analysis. A Practical Introduction. Oxford: Oxford University Press. Goddard, Cliff 2002a. Lexical decomposition II: Conceptual axiology. In: A. D. Cruse et al. (eds.). Lexikologie – Lexicology. Ein internationales Handbuch zur Natur und Struktur von Wörtern und Wortschätzen – An International Handbook on the Nature and Structure of Words and Vocabularies (HSK 21.1). Berlin: de Gruyter, 256–268. Goddard, Cliff 2002b. The search for the shared semantic core of all languages. In: C. Goddard & A. Wierzbicka (eds.). Meaning and Universal Grammar, vol. 1: Theory and Empirical Findings. Amsterdam: Benjamins, 5–40. Goddard, Cliff 2006. Ethnopragmatics: A new paradigm. In: C. Goddard (ed.). Ethnopragmatics. Understanding Discourse in Cultural Context. Berlin: de Gruyter, 1–30. Goddard, Cliff 2008a. Natural Semantic Metalanguage: The state of the art. In: C. Goddard (ed.). Cross-Linguistic Semantics. Amsterdam: Benjamins, 1–34. Goddard, Cliff 2008b. Towards a systematic table of semantic elements. In: C. Goddard (ed.). Cross-Linguistic Semantics. Amsterdam: Benjamins, 59–81. Goddard, Cliff & Anna Wierzbicka (eds.) 1994a. Semantic and Lexical Universals – Theory and Empirical Findings. Amsterdam: Benjamins. Goddard, Cliff & Anna Wierzbicka 1994b. Introducing lexical primitives. In: C. Goddard & A. Wierzbicka (eds.). Semantic and Lexical Universals – Theory and Empirical Findings. Amsterdam: Benjamins, 31–54. Goddard, Cliff & Anna Wierzbicka 2002. Semantic primes and universal grammar. In: C. Goddard & A. Wierzbicka (eds.).
Meaning and Universal Grammar, vol. 1: Theory and Empirical Findings. Amsterdam: Benjamins, 41–85. Grimshaw, Jane 1990. Argument Structure. Cambridge, MA: The MIT Press. Gruber, Jeffrey S. 1965. Studies in Lexical Relations. Ph.D. dissertation. MIT, Cambridge, MA. Guerssel, Mohamed, Kenneth Hale, Mary Laughren, Beth Levin & Josie White Eagle 1985. A cross-linguistic study of transitivity alternations. In: W. H. Eilfort, P. D. Kroeber & K. L. Peterson
(eds.). Papers from the Parasession on Causatives and Agentivity at the 21st Regional Meeting of the Chicago Linguistic Society (= CLS), April 1985. Chicago, IL: Chicago Linguistic Society, 48–63. Hale, Ken & Samuel Jay Keyser 1987. A View from the Middle. Lexicon Project Working Papers 10. Cambridge, MA: Center for Cognitive Science, MIT. Hale, Ken & Samuel Jay Keyser 1993. On argument structure and the syntactic expression of lexical relations. In: K. Hale & S. J. Keyser (eds.). The View from Building 20. Essays in Linguistics in Honor of Sylvain Bromberger. Cambridge, MA: The MIT Press, 53–109. Hale, Ken & Samuel Jay Keyser 1997. On the complex nature of simple predicators. In: A. Alsina, J. Bresnan & P. Sells (eds.). Complex Predicates. Stanford, CA: CSLI Publications, 29–65. Hale, Ken & Samuel Jay Keyser 1999. A response to Fodor and Lepore, “Impossible words?” Linguistic Inquiry 30, 453–466. Hale, Ken & Samuel Jay Keyser 2002. Prolegomenon to a Theory of Argument Structure. Cambridge, MA: The MIT Press. Hale, Ken & Samuel Jay Keyser 2005. Aspect and the syntax of argument structure. In: N. Erteschik-Shir & T. Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 11–41. Halle, Morris & Alec Marantz 1993. Distributed Morphology and the pieces of inflection. In: K. Hale & S. J. Keyser (eds.). The View from Building 20. Essays in Linguistics in Honor of Sylvain Bromberger. Cambridge, MA: The MIT Press, 111–176. Harley, Heidi 2002. Possession and the double object construction. In: P. Pica & J. Rooryck (eds.). Linguistic Variation Yearbook, vol. 2. Amsterdam: Benjamins, 31–70. Harley, Heidi & Rolf Noyer 1999. Distributed Morphology. GLOT International 4, 3–9. Harley, Heidi & Rolf Noyer 2000. Formal versus encyclopedic properties of vocabulary: Evidence from nominalizations. In: B. Peeters (ed.). The Lexicon-Encyclopedia Interface. Amsterdam: Elsevier, 349–375. Immler, Manfred 1974. Generative Syntax – Generative Semantik. München: Fink. Jackendoff, Ray 1972. Semantic Interpretation in Generative Grammar. Cambridge, MA: The MIT Press. Jackendoff, Ray 1976. Toward an explanatory semantic representation. Linguistic Inquiry 7, 89–150. Jackendoff, Ray 1983. Semantics and Cognition. Cambridge, MA: The MIT Press. Jackendoff, Ray 1987. The status of thematic relations in linguistic theory. Linguistic Inquiry 18, 369–411. Jackendoff, Ray 1990. Semantic Structures. Cambridge, MA: The MIT Press. Jackendoff, Ray 2002. Foundations of Language. Brain, Meaning, Grammar, Evolution. Oxford: Oxford University Press. Joppen, Sandra & Dieter Wunderlich 1995. Argument linking in Basque. Lingua 97, 123–169. Joshi, Aravind 1974. Factorization of verbs. In: C. H. Heidrich (ed.). Semantics and Communication. Amsterdam: North-Holland, 251–283. Kandiah, Thiru 1968. Transformational Grammar and the layering of structure in Tamil. Journal of Linguistics 4, 217–254. Kaufmann, Ingrid 1995a. Konzeptuelle Grundlagen semantischer Dekompositionsstrukturen. Die Kombinatorik lokaler Verben und prädikativer Komplemente. Tübingen: Niemeyer. Kaufmann, Ingrid 1995b. O- and D-predicates: A semantic approach to the unaccusative-unergative distinction. Journal of Semantics 12, 377–427. Kaufmann, Ingrid 1995c. What is an (im-)possible verb? Restrictions on semantic form and their consequences for argument structure. Folia Linguistica 29, 67–103. Kiparsky, Paul 1997. Remarks on denominal verbs. In: A. Alsina, J. Bresnan & P. Sells (eds.). Complex Predicates.
Stanford, CA: CSLI Publications, 473–499. Lakoff, George 1965. Irregularity in Syntax. Ph.D. dissertation. Indiana University, Bloomington, IN. Reprinted: New York: Holt, Rinehart & Winston, 1970. Lakoff, George & John Robert Ross 1972. A note on anaphoric islands and causatives. Linguistic Inquiry 3, 121–125.
Larson, Richard K. 1988. On the double object construction. Linguistic Inquiry 19, 335–391. Levin, Beth 1985. Lexical semantics in review: An introduction. In: B. Levin (ed.). Lexical Semantics in Review. Cambridge, MA: The MIT Press, 1–62. Levin, Beth 1993. English Verb Classes and Alternations. A Preliminary Investigation. Chicago, IL: The University of Chicago Press. Levin, Beth 1995. Approaches to lexical semantic representation. In: D. E. Walker, A. Zampolli & N. Calzolari (eds.). Automating the Lexicon: Research and Practice in a Multilingual Environment. Oxford: Oxford University Press, 53–91. Levin, Beth & Malka Rappaport Hovav 2005. Argument Realization. Cambridge: Cambridge University Press. Lewis, David 1973. Causation. The Journal of Philosophy 70, 556–567. Maienborn, Claudia 2003. Event-internal modifiers: Semantic underspecification and conceptual interpretation. In: E. Lang, C. Maienborn & C. Fabricius-Hansen (eds.). Modifying Adjuncts. Berlin: Mouton de Gruyter, 475–509. Marantz, Alec 1997. No escape from syntax: Don’t try morphological analysis in the privacy of your own lexicon. In: A. Dimitriadis, L. Siegel, C. Surek-Clark & A. Williams (eds.). Proceedings of the 21st Annual Penn Linguistics Colloquium. Philadelphia, PA: University of Pennsylvania Press, 201–225. Mateu, Jaume 2001. Unselected objects. In: N. Dehé & A. Wanner (eds.). Structural Aspects of Semantically Complex Verbs. Frankfurt/M.: Lang, 83–104. Matthewson, Lisa 2003. Is the meta-language really natural? Theoretical Linguistics 29, 263–274. McCawley, James D. 1968. Lexical insertion in a Transformational Grammar without deep structure. In: B. J. Darden, C.-J. N. Bailey & A. Davison (eds.). Papers from the Fourth Regional Meeting of the Chicago Linguistic Society. Chicago, IL: Chicago Linguistic Society, 71–80. McCawley, James D. 1971. Prelexical syntax. In: R. J. O’Brien (ed.). Report of the 22nd Annual Round Table Meeting on Linguistics and Language Studies. Washington, DC: Georgetown University Press, 19–33. McCawley, James D. 1994. Generative Semantics. In: R. E. Asher & J. M. Y. Simpson (eds.). The Encyclopedia of Language and Linguistics. Oxford: Pergamon, 1398–1403. Montague, Richard 1973. The proper treatment of quantification in ordinary English. In: J. Hintikka, J. M. E. Moravcsik & P. Suppes (eds.). Approaches to Natural Language. Proceedings of the 1970 Stanford Workshop on Grammar and Semantics. Dordrecht: Reidel, 221–242. Morgan, Jerry L. 1969. On arguing about semantics. Papers in Linguistics 1, 49–70. Postal, Paul M. 1971. On the surface verb ‘remind’. In: C. J. Fillmore & D. T. Langendoen (eds.). Studies in Linguistic Semantics. New York: Holt, Rinehart & Winston, 180–270. Pustejovsky, James 1988. The geometry of events. In: C. Tenny (ed.). Studies in Generative Approaches to Aspect. Cambridge, MA: The MIT Press, 19–39. Pustejovsky, James 1991a. The syntax of event structure. Cognition 41, 47–81. Pustejovsky, James 1991b. The generative lexicon. Computational Linguistics 17, 409–441. Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press. Pustejovsky, James & Pierrette Bouillon 1995. Aspectual coercion and logical polysemy. Journal of Semantics 12, 133–162. Ramchand, Gillian C. 2008. Verb Meaning and the Lexicon. A First-Phase Syntax. Cambridge: Cambridge University Press. Rappaport, Malka 1985. Review of Joshi’s “Factorization of verbs”. In: B. Levin (ed.). Lexical Semantics in Review.
Cambridge, MA: The MIT Press, 137–148. Rappaport, Malka, Mary Laughren & Beth Levin 1987. Levels of Lexical Representation. Lexicon Project Working Papers 20. Cambridge, MA: MIT. Rappaport, Malka & Beth Levin 1988. What to do with θ-roles. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 7–36. Rappaport, Malka, Beth Levin & Mary Laughren 1993. Levels of lexical representation. In: J. Pustejovsky (ed.). Semantics and the Lexicon. Dordrecht: Kluwer, 37–54.
Rappaport Hovav, Malka & Beth Levin 1998. Building verb meanings. In: M. Butt & W. Geuder (eds.). The Projection of Arguments: Lexical Compositional Factors. Stanford, CA: CSLI Publications, 97–134. Rappaport Hovav, Malka & Beth Levin 2005. Change-of-state verbs: Implications for theories of argument projection. In: N. Erteschik-Shir & T. Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 274–286. Reinhart, Tanya 2002. The theta system – an overview. Theoretical Linguistics 28, 229–290. Riemer, Nick 2006. Reductive paraphrase and meaning: A critique of Wierzbickian semantics. Linguistics & Philosophy 29, 347–379. Ritter, Elizabeth & Sara Thomas Rosen 1998. Delimiting events in syntax. In: M. Butt & W. Geuder (eds.). The Projection of Arguments: Lexical Compositional Factors. Stanford, CA: CSLI Publications, 135–164. Ross, John Robert 1972. Act. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 70–126. Rozwadowska, Bozena 1988. Thematic restrictions on derived nominals. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 147–165. Ruppenhofer, Josef, Michael Ellsworth, Miriam R. L. Petruck, Christopher R. Johnson & Jan Scheffczyk 2006. FrameNet II: Extended Theory and Practice. http://framenet.icsi.berkeley.edu/index.php?option=com_wrapper&Itemid=126. December 30, 2006. Shibatani, Masayoshi 1976. The grammar of causative constructions: A conspectus. In: M. Shibatani (ed.). The Grammar of Causative Constructions. New York: Academic Press, 1–40. Stiebels, Barbara 1996. Lexikalische Argumente und Adjunkte. Zum semantischen Beitrag von verbalen Präfixen und Partikeln. Berlin: Akademie Verlag. Stiebels, Barbara 1998. Complex denominal verbs in German and the morphology-semantics interface. In: G. Booij & J. van Marle (eds.). Yearbook of Morphology 1997. Dordrecht: Kluwer, 265–302. Stiebels, Barbara 2002. Typologie des Argumentlinkings. Ökonomie und Expressivität. Berlin: Akademie Verlag. Stiebels, Barbara 2006. From rags to riches. Nominal linking in contrast to verbal linking. In: D. Wunderlich (ed.). Advances in the Theory of the Lexicon. Berlin: de Gruyter, 167–234. Summers, Della (ed.) 1995. Longman Dictionary of Contemporary English. München: Langenscheidt-Longman. Taylor, John R. 2000. The network model and the two-level model in comparison. In: B. Peeters (ed.). The Lexicon-Encyclopedia Interface. Amsterdam: Elsevier, 115–141. Tenny, Carol 1987. Grammaticalizing Aspect and Affectedness. Ph.D. dissertation. MIT, Cambridge, MA. Tenny, Carol 1988. The aspectual interface hypothesis: The connection between syntax and lexical semantics. In: C. Tenny (ed.). Studies in Generative Approaches to Aspect. Cambridge, MA: The MIT Press, 1–18. Travis, Lisa 2000. Event structure in syntax. In: C. Tenny (ed.). Events as Grammatical Objects. The Converging Perspectives of Lexical Semantics and Syntax. Stanford, CA: CSLI Publications, 145–185. van Valin, Robert D. Jr. 1993. A synopsis of Role and Reference Grammar. In: R. D. van Valin (ed.). Advances in Role and Reference Grammar. Amsterdam: Benjamins, 1–164. Vendler, Zeno 1957. Verbs and times. The Philosophical Review LXVI, 143–160. van Voorst, Jan 1988. Event Structure. Amsterdam: Benjamins. Wierzbicka, Anna 1972. Semantic Primitives. Frankfurt/M.: Athenäum. Wierzbicka, Anna 1980. Lingua Mentalis. The Semantics of Natural Language. Sydney: Academic Press. Wierzbicka, Anna 1985.
Wierzbicka, Anna 1987. English Speech Act Verbs. A Semantic Dictionary. Sydney: Academic Press.
Wierzbicka, Anna 1988. The Semantics of Grammar. Amsterdam: Benjamins.
Wierzbicka, Anna 1992. Semantics, Culture, and Cognition. Universal Human Concepts in Culture-Specific Configurations. Oxford: Oxford University Press.
Wierzbicka, Anna 1996. Semantics. Primes and Universals. Oxford: Oxford University Press.
Wierzbicka, Anna 1999. Emotions across Languages and Cultures. Cambridge: Cambridge University Press.
Wierzbicka, Anna 2009. The theory of the mental lexicon. In: S. Kempgen et al. (eds.). Die slavischen Sprachen – The Slavic Languages. Ein internationales Handbuch zu ihrer Geschichte, ihrer Struktur und ihrer Erforschung – An International Handbook of their History, their Structure and their Investigation. (HSK 32.1). Berlin: de Gruyter, 848–863.
von Wright, Georg Henrik 1963. Norm and Action. London: Routledge & Kegan Paul.
Wunderlich, Dieter 1991. How do prepositional phrases fit into compositional syntax and semantics? Linguistics 25, 283–331.
Wunderlich, Dieter 1994. Towards a lexicon-based theory of agreement. Theoretical Linguistics 20, 1–35.
Wunderlich, Dieter 1996. Models of lexical decomposition. In: E. Weigand & F. Hundsnurscher (eds.). Lexical Structures and Language Use. Proceedings of the International Conference on Lexicology and Lexical Semantics, Münster, September 13–15, 1994. Vol. 1: Plenary Lectures and Session Papers. Tübingen: Niemeyer, 169–183.
Wunderlich, Dieter 1997a. cause and the structure of verbs. Linguistic Inquiry 28, 27–68.
Wunderlich, Dieter 1997b. Argument extension by lexical adjunction. Journal of Semantics 14, 95–142.
Wunderlich, Dieter 2000. Predicate composition and argument extension as general options – a study in the interface of semantic and conceptual structure. In: B. Stiebels & D. Wunderlich (eds.). Lexicon in Focus. Berlin: Akademie Verlag, 247–270.
Wunderlich, Dieter 2006. Towards a structural typology of verb classes. In: D. Wunderlich (ed.). Advances in the Theory of the Lexicon. Berlin: de Gruyter, 57–166.
Zubizarreta, Maria Luisa & Eunjeong Oh 2007. On the Syntactic Composition of Manner and Motion. Cambridge, MA: The MIT Press.
Stefan Engelberg, Mannheim (Germany)
18. Thematic roles

1. Introduction
2. Historical and terminological remarks
3. The nature of thematic roles
4. Thematic role systems
5. Concluding remarks
6. References
Abstract

Thematic roles provide one way of relating situations to their participants. Thematic roles have been widely invoked both within lexical semantics and in the syntax-semantics interface, in accounts of a wide range of phenomena, most notably the mapping between semantic and syntactic arguments (argument realization). This article addresses two sets of issues. The first concerns the nature of thematic roles in semantic theories: what are thematic roles, are they specific to individual predicates or more general, how do they figure in semantic representations, and what interactions do they have with event and object individuation and with the semantics of plurality and aspect? The second concerns properties of systems of thematic roles: what is the inventory of thematic roles, and what relationships, such as an ordering of roles in a thematic hierarchy, or the consistency of roles across semantic domains posited by the thematic relations hypothesis, exist among roles? Various applications of thematic roles will be noted throughout these two sections, in some cases briefly mentioning alternative accounts that do not rely on them. The conclusion notes some skepticism about the necessity for thematic roles in linguistic theory.
1. Introduction

Thematic roles provide one way of relating situations to their participants. Somewhat informally, we can paraphrase this by saying that participant x plays role R in situation e. Still more informally, the linguistic expression denoting the participant is said to play that role. Thematic roles have been widely invoked both within lexical semantics and in the syntax-semantics interface, in accounts of a wide range of phenomena, including the mapping between semantic and syntactic arguments (argument realization), controller choice of infinitival complements and anaphors, constraints on relativization, constraints on morphological or syntactic phenomena such as passivization, determinants of telicity and distributivity, patterns of idiom frequencies, and generalizations about the lexicon and lexical acquisition. Automatic role labeling is now an active area of investigation in computational linguistics as well (Màrquez et al. 2008). After a brief terminological discussion, this article addresses two sets of issues. The first concerns the nature of thematic roles in semantic theories: what are thematic roles, are they specific to individual predicates or more general, how do they figure in semantic representations, and what interactions do they have with event and object individuation and with the semantics of plurality and aspect? The second concerns properties of systems of thematic roles: what is the inventory of thematic roles, and what relationships, such as an ordering of roles in a thematic hierarchy, or the consistency of roles across semantic domains posited by the thematic relations hypothesis, exist among roles? Various applications of thematic roles will be noted throughout these two sections, in some cases briefly mentioning alternative accounts that do not rely on them. The conclusion notes some skepticism about the necessity for thematic roles in linguistic theory.
2. Historical and terminological remarks

Pāṇini's kārakas are frequently noted as forerunners of thematic roles in modern linguistics. Gruber (1965) and Fillmore (1968) are widely credited with initiating the discourse on thematic roles within generative grammar, and research relating to thematic roles has blossomed since the 1980s in conjunction with growing interest in the lexicon and in semantics. A variety of terms appear in the literature to refer to essentially the same notion of thematic role: thematic relation, theta-role, (deep) case role, and participant role. A distinction between “broad”, general roles and “narrow”, predicate-specific roles is worth noting as well. General roles (also termed absolute roles (Schein 2002) or thematic role types (Dowty 1989)) apply to a wide range of predicates or events; they include the well-known Agent, Patient, Theme, Goal, and so on. Predicate-specific roles (also
termed relativized (Schein 2002), individual thematic roles (Dowty 1989), or “relation-specific roles”) apply only to a specific event, situation, or predicate type; such roles as Devourer or Explainer are examples. There need not be a sharp distinction between broad and narrow roles; indeed, roles of various degrees of specificity, situated in a subsumption hierarchy, can be posited. In this article the term thematic role will cover both broad and narrow roles, though some authors restrict its use to broad roles.
3. The nature of thematic roles

Thematic roles have been formally defined as relations, functions, and sets of entailments or properties. As Rappaport & Levin (1988: 17) state: “Theta-roles are inherently relational notions; they label relations of arguments to predicators and therefore have no existence independent of predicators.” Within model-theoretic semantics, an informal definition like the one that begins this article needs to be made explicit in several respects. First, what are the entities to be related? Are thematic roles present in the semantic representations of verbs and other predicators (i.e., lexical items that take arguments), or added compositionally? Are they best viewed as relations or as more complex constructs such as sets of entailments, bundles of features, or merely epiphenomena defined in terms of other linguistic representations that need not be reified at all?
3.1. Defining thematic roles in model-theoretic semantics

Attempts to characterize thematic roles explicitly within model-theoretic semantics begin with works such as Chierchia (1984) and Carlson (1984). Both Chierchia and Carlson treat thematic roles as relations between an event and a participant in it. Dowty (1989: 80) provides the following version of Chierchia's formulation. Events are regarded as tuples of individuals, and thematic roles are therefore defined thus:

(1) A θ-role θ is a partial function from the set of events into the set of individuals such that for any event k, if θ(k) is defined, then θ(k) ∈ k.

For example, given (2a), an event tuple in which ^kill′ is the intension of the verb kill and x is the killer of y, the functions Agent and Patient yield the values in (2b) and (2c):

(2) a. 〈^kill′, x, y〉
    b. Agent(〈^kill′, x, y〉) = x
    c. Patient(〈^kill′, x, y〉) = y

This shows how thematic roles can be defined within an ordered argument system for representing event types. The roles are distinguished by the order of arguments of the predicate, though there is no expectation that the same roles appear in the same order for all predicates (some predicates may not have an Agent role at all, for example). Another representation, extending Davidson's (1967) insights on the representation of events, is the neo-Davidsonian one, in which thematic roles appear explicitly as relations between events and participants; see article 34 (Maienborn) Event semantics. The roles are labeled, but not ordered with respect to one another. In one form of such a representation, the equivalent of (2) would be (3):
(3) ∃e [killing(e) & Agent(e, x) & Patient(e, y)]

Here, thematic roles are binary predicates, taking an eventuality (event or state) as first argument and a participant in that eventuality as second argument. For a discussion of other possibilities for the type of the first argument, see Bayer (1997: 24–28). Furthermore, as Bayer (1997: 5) notes, there are two options for indexing arguments in a neo-Davidsonian representation: lexical and compositional. In the former, used, e.g., in Landman (2000), the lexical entry for a verb includes the thematic roles that connect the verb and its arguments. This is also effectively the analysis implicit in structured lexical semantic representations outside model-theoretic semantics, such as Jackendoff (1987, 1990), Rappaport & Levin (1988), Foley & van Valin (1984), and van Valin (2004). But it is also possible to pursue a compositional approach, in which the lexical entry contains only the event type, with thematic roles linking the arguments through some other process, as does Krifka (1992, 1998), who integrates the thematic role assignments of the verb's arguments through its subcategorization. Bayer (1997: 127–132) points out some difficulties with Krifka's system, including coordination of VPs that assign different roles to a shared NP. A position intermediate between the lexical and compositional views is advocated by Kratzer (1996), who claims that what has been regarded as a verb's external argument is in fact not an argument at all, although its remaining arguments are present in its lexical entry. Thus the representation of kill in (3) would lack the clause Agent(e, x), this role being assigned by a VoiceP (or “little v”) above VP. This, Kratzer argues, accounts for subject/object asymmetries in idiom frequencies and the lack of a true overt agent argument in gerunds. Svenonius (2007) extends Kratzer's analysis to adpositions. However, Wechsler (2005) casts doubt on Kratzer's prediction of idiom asymmetries and notes that some mechanism must still select which role is external to the verb's lexical entry, particularly in cases where there is more than one agentive participant, such as a commercial transaction involving both a buyer and a seller. Characterizing a thematic role as a partial function leaves open the question of how to determine the function's domain and range; that is, what events is a role defined for, and which participant does the role pick out? In practice, this problem of determining which roles are appropriate for a predicate plagues every system of broad thematic roles, but it is less serious for roles defined on individual predicates. Dowty (1989: 76) defines the latter in terms of a set of entailments, as in (4):

(4) Given an n-place predicate δ and a particular argument xᵢ, the individual thematic role 〈δ, i〉 is the set of all properties α such that the entailment □[δ(x₁, …, xᵢ, …, xₙ) → α(xᵢ)] holds.

What counts as a “property” must be specified, of course. And as Bayer (1997: 119–120) points out, nothing in this kind of definition tells us which of these properties are important for, say, argument realization. Formulating such cross-lexicon generalizations demands properties or relations that are shared across predicates. Dowty defines cross-predicate roles, or thematic role types, as he terms them, as “the intersection of all the individual thematic roles” (Dowty 1989: 77).
Therefore, as stated by Dowty (1991: 552): “From the semantic point of view, the most general notion of thematic role (type) is A SET OF ENTAILMENTS OF A GROUP OF PREDICATES WITH RESPECT TO ONE OF THE ARGUMENTS OF EACH. (Thus a thematic role type is a kind of second-order
property, a property of multiplace predicates indexed by their argument positions.).” Thematic role types are numerous; however, he argues, linguists will generally be interested in identifying a fairly small set of these that play some vital role in linguistic theory. This definition of thematic role types imposes no restrictions on whether an argument can bear multiple roles to a predicate, whether roles can share entailments in their defining sets and thus overlap or subsume one another, whether two arguments of a predicate can bear the same role (though this is ruled out by the functional requirement in (1)), and whether every argument must be assigned a role. These constraints, if desired, must be independently specified (Dowty 1989: 78–79). As the discussion of thematic role systems below indicates, various models answer these questions differently. Whether suitable entailments for linguistically significant broad thematic roles can be found at all is a further issue that leads Rappaport & Levin (1988, 2005), Dowty (1991), Wechsler (1995), Croft (1991, 1998), and many others to doubt the utility of positing such roles at all. Finally, note that definitions such as (1) and (4) above, or Dowty's thematic role types, make no reference to morphological, lexical, or syntactic notions. Thus they are agnostic as to which morphemes, words, or constituents have thematic roles associated with them; that depends on the semantics assigned to these linguistic entities and whether roles are defined only for event types or over a broader range of situation types. Verbs are the prototypical bearers of thematic roles, but nominalizations are typically viewed as role-bearing, and other nouns, adjectives, and adpositions (Gawron 1983, Wechsler 1995, Svenonius 2007), as predicators denoting event or situation types, will have roles associated with them, too.
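For concreteness, the contrast between the ordered-argument treatment in (1)–(2) and the neo-Davidsonian treatment in (3) can be given a minimal computational sketch; the encoding and all names below (kill_event, AGENT_POSITION, and so on) are hypothetical illustrations, not drawn from the works cited.

```python
# A toy model of the two representations; all names are illustrative.

# Ordered-argument system, as in (2a): an event is a tuple whose first
# member stands in for the predicate intension.
kill_event = ("kill", "x", "y")

# Roles as partial functions, per (1): each maps an event to one of its
# members, and is undefined for event types lacking that role.
AGENT_POSITION = {"kill": 1, "eat": 1}        # event types with an Agent
PATIENT_POSITION = {"kill": 2, "eat": 2}

def agent(event):
    pred = event[0]
    if pred not in AGENT_POSITION:
        raise KeyError(f"Agent undefined for {pred!r}")
    return event[AGENT_POSITION[pred]]

# Neo-Davidsonian system, per (3): roles are labeled binary relations
# between an event and a participant, unordered with respect to each other.
neo_davidsonian = {("killing", "e1"): {"Agent": "x", "Patient": "y"}}

assert agent(kill_event) == "x"
assert neo_davidsonian[("killing", "e1")]["Patient"] == "y"
```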
3.2. Thematic role uniqueness

Many researchers invoking thematic roles have explicitly or implicitly adopted a criterion of thematic role uniqueness. Informally, this means that only one participant in a situation bears a given role. Carlson (1984: 271) states this constraint at a lexical level: “one of the more fundamental constraints is that of ‘thematic uniqueness’ – that no verb seems to be able to assign the same thematic role to two or more of its arguments.” This echoes the θ-criterion of Chomsky (1981), discussed below. Parsons (1990: 74) defines thematic uniqueness with respect to events and their participants: “No event stands in one of these relations to more than one thing.” Note that successfully connecting the lexical-level constraint and the event-level constraints requires the Davidsonian assumption that there is a single event variable in the semantic representation of a predicator, as Carlson (1998: 40) points out. In some models of lexical representation (e.g., Jackendoff's lexical decomposition analyses and related work), this is not the case, as various subevents are represented, each of which can have a set of roles associated with it. The motivations for role uniqueness are varied. One is that it simplifies accounts of mapping from thematic roles to syntactic arguments of predicators (which are also typically regarded as unique). It is implicit in hypotheses such as the Universal Alignment Hypothesis (Rosen 1984) and the Uniformity of Theta Assignment Hypothesis (Baker 1988, 1997), described in greater detail in the section on argument realization below. Role uniqueness also provides a tool to distinguish situations – if two different individuals appear to bear the same role, then there must be two distinct situations involved (Landman 2000: 39). A further motivation, emphasized in the work of Krifka (1992, 1998),
is that role uniqueness and related conditions are crucial in accounting for the semantics of events in which a participant is incrementally consumed, created, or traversed. Role uniqueness does not apply straightforwardly to all types of situations. Krifka (1998: 209) points out that a simple definition of uniqueness – “it should not be the case that one and the same event has different participants” – is “problematic for see and touch.” If one sees or touches an orange, for example, then one also typically sees or touches the peel of the orange. But the orange and its peel are distinct; thus the seeing or touching event appears to have multiple entities bearing the same role. A corollary of role uniqueness is exhaustivity, the requirement that whatever bears a given thematic role to a situation is the only thing bearing that role. Thus if some group is designated as the Agent of an event, then there can be no larger group that is also the Agent of that same event. Exhaustivity is equivalent to role uniqueness under the condition that only one role may be assigned to an individual; as Schein (2002: 272) notes, however, if thematic roles are allowed to be “complex” – that is, if multiple roles can be assigned conjunctively to individuals – then exhaustivity is a weaker constraint than uniqueness.
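As a concrete illustration of the Parsons-style constraint, the following sketch checks a set of role facts for uniqueness violations; the triple-based representation is a hypothetical encoding, and the orange/peel case shows why see and touch are problematic.

```python
# A sketch checking Parsons-style thematic uniqueness: no event may
# stand in the same role relation to more than one participant.
from collections import defaultdict

def uniqueness_violations(role_facts):
    """role_facts: iterable of (role, event, participant) triples."""
    participants = defaultdict(set)
    for role, event, participant in role_facts:
        participants[(role, event)].add(participant)
    return {k: v for k, v in participants.items() if len(v) > 1}

facts = [
    ("Agent", "e1", "Kim"),
    ("Theme", "e1", "the orange"),
    ("Theme", "e1", "the peel of the orange"),  # the see/touch problem
]
print(uniqueness_violations(facts))
# {('Theme', 'e1'): {'the orange', 'the peel of the orange'}}
```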
3.3. How fine-grained should thematic roles be?

As noted in the introduction, thematic roles in the broad sense are frequently postulated as crucial mechanisms in accounts of phenomena at the syntax-semantics interface, such as argument realization, anaphoric binding, and controller choice. For thematic roles to serve many of these uses, they must be sufficiently broad, in the sense noted above, that they can be used in stating generalizations covering classes of lexical items or constructions. To be useful in this sense, therefore, a thematic role should be definable over a broad class of situation types. Bach (1989: 111) articulates this view: “Thematic roles seem to represent generalizations that we make across different kinds of happenings in the world about the participation of individuals in the eventualities that the various sentences are about.” Schein (2002: 265) notes that if “syntactic positions [of arguments given their thematic roles] are predictable, we can explain the course of acquisition and our understanding of novel verbs and of familiar verbs in novel contexts.” Within model-theoretic semantics, two types of critiques have been directed at the plausibility of broad thematic roles. One is based on the difficulty of formulating definitional criteria for such roles, arguing that after years of effort, no rigorous definitions have emerged. The other critique examines the logical implications of positing such roles, bringing out difficulties for semantic representations relying on broad thematic roles given basic assumptions, such as role uniqueness. It is clear that the Runner role of run and the Jogger role of jog share significant entailments (legs in motion, body capable of moving along a path, and so on). Indeed, to maintain that these two roles are distinct merely because there are two distinct verbs run and jog in English seemingly determines a semantic issue on the basis of a language-particular lexical accident. But no attempt to provide necessary, sufficient, and comprehensive criteria for classifying most or all of the roles of individual predicates into a few broad roles has yet met with consensus. This problem has been noted by numerous researchers; see, for example, Dowty (1991: 553–555), Croft (1991: 155–158), Wechsler (1995: 9–11), and Levin & Rappaport Hovav (2005: 38–41) for discussion. Theme in particular is notoriously defined in vague and differing ways, but it is not an isolated case; many partially overlapping criteria for Agenthood have been proposed, including volitionality, causal
involvement, and control or initiation of an event; see Kiparsky (1997: 476) for a remark on cross-linguistic variation in this regard and Wechsler (2005: 187–193) for a proposal on how to represent various kinds of agentive involvement. Moreover, there are arguments against the possibility of broad roles even for situations that seem semantically quite close, in cases where two predicators would appear to denote the same situation type. The classic case of this, examined from various perspectives by numerous authors, involves the verbs buy and sell. As Parsons (1990: 84) argues, given the two descriptions of a commercial transaction in (5):

(5) a. Kim bought a tricycle from Sheehan.
    b. Sheehan sold a tricycle to Kim.

and the assumptions that “Kim is the Agent of the buying, and Sheehan is the Agent of the selling”, then to “insist that the buying and the selling are one and the same event, differently described” entails that “Kim sold a tricycle to Kim, and Sheehan bought a tricycle from Sheehan.” Parsons concludes that the two sentences must therefore describe different, though “intimately related”, events; see article 29 (Gawron) Frame Semantics. Landman (2000: 31–33) and Schein (2002) make similar arguments, the import of which is that one must either individuate fine-grained event types or distinguish thematic roles in a fine-grained fashion. Schein (2002) suggests, as a possible alternative to fine-grained event distinctions, a ternary view of thematic roles, in which the third argument is essentially an index to the predicate, as in (6), rather than the dyadic thematic roles in, e.g., (3):

(6) Agent(e, x, kill′)

The Agent role of kill is thereby distinguished from the Agent of murder, throw, explain, and so on. This strategy, applied to buy and sell, blocks the invalid inference above. However, this would allow the valid inference from one sentence in (5) to the other only if we separately ensure that the Agent of buy corresponds to the Recipient of sell, and vice versa. Moreover, other problems remain even under this relativized view of thematic roles, in particular concerning symmetric predicates, the involvement of parts of individuals or of groups of individuals in events, whether they are to be assigned thematic roles, and whether faulty inferences would result if they are. Assuming role uniqueness (or exhaustivity), and drawing a distinction between an individual car and the group of individuals constituting its parts, one is seemingly forced to conclude that (7a) and (7b) describe different events (Carlson 1984, Schein 2002). The issue is even plainer in (8), as the skin of an apple is not the same as the entire apple.

(7) a. I weighed the Volvo.
    b. I weighed (all) the parts of the Volvo.

(8) a. Kim washed the apple.
    b. Kim washed the skin of the apple.

Similar problems arise with symmetric predicators such as face or border, as illustrated in (9) (based on Schein 2002) and (10).
(9) a. The Carnegie Deli faces Carnegie Hall.
    b. Carnegie Hall faces the Carnegie Deli.

(10) a. Rwanda borders Burundi.
     b. Burundi borders Rwanda.

If two distinct roles are ascribed to the arguments in these sentences, and the mapping of roles to syntactic positions is consistent, then the two sentences in each pair must describe distinct situations. Schein remarks that the strategy of employing fine-grained roles indexed to a predicator fails for cases like (7) through (10), since the verb in each pair is the same. Noting that fine-grained events seem to be required regardless of the availability of fine-grained roles, Schein (2002) addresses these issues by introducing additional machinery into semantic representations, including fine-grained scenes as perspectives on events, to preserve absolute (broad) thematic roles. Each sentence in (9) or (10) involves a different scene, even though the situation described by the two sentences is the same. “In short, scenes are fine-grained, events are coarser, and sentences rely on (thematic) relations to scenes to convey what they have to say about events.” (Schein 2002: 279). Similarly, the difficulties posed by (7) and (8) are dealt with through “a notion of resolution to distinguish a scene fine-grained enough to resolve the Volvo's parts from one which only resolves the whole Volvo.” (Schein 2002: 279).
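The effect of predicate-indexing as in (6) can be shown with a small sketch (the encoding is hypothetical): dyadic role facts collapse Kim's and Sheehan's agenthood onto a single shared event, whereas ternary facts keyed to the predicate remain distinct.

```python
# Dyadic roles, as in (3): identifying the buying with the selling
# forces one event to have two Agents, licensing the bad inference.
dyadic = {
    ("Agent", "e1"): {"Kim", "Sheehan"},      # uniqueness violated
}

# Ternary roles, as in (6): the predicate index keeps the two Agent
# facts apart, so no single role relation is doubly filled.
ternary = {
    ("Agent", "e1", "buy"): {"Kim"},
    ("Agent", "e1", "sell"): {"Sheehan"},
}

# Uniqueness now holds key by key:
assert all(len(bearers) == 1 for bearers in ternary.values())
```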
3.4. Thematic roles and plurality

Researchers with an interest in the semantics of plurals have studied the interaction of plurality and thematic roles. Although Carlson (1984: 275) claims that “it appears to be necessary to countenance groups or sets as being able to play thematic roles”, he does not explore the issues in depth. Landman (2000: 167), contrasting collective and distributive uses of plural subjects, argues that “distributive predication is not an instance of thematic predication.” Thus, in the distributive reading of a sentence like The boys sing, the semantic properties of Agents hold only of the individual boys, so “no thematic implication concerning the sum of the boys itself follows.” For “on the distributive interpretation, not a single property that you might want to single out as part of agenthood is predicated of the denotation of the boys” (Landman 2000: 169). Therefore, Landman argues, “the subject the boys does not fill a thematic role” of sing, but rather a “nonthematic role” that is derived from the role that sing assigns to an individual subject. He then develops a theory of plural roles, derived from singular roles (which are defined only on atomic events), as follows (Landman 2000: 184), where E is the domain of events (both singular and plural):

(11) If e is an event in E, and for every atomic part a of e, thematic role R is defined for a, then plural role *R is defined for e, and maps e onto the sum of the R-values of the atomic parts of e.

Thus *R subsumes R (if e is itself atomic then R is defined for e). Although Landman does not directly address the issue, his treatment of distributive readings and plural roles seems to imply that a role-based theory of argument realization – indeed, any theory of
argument realization based on entailments of singular arguments – must be modified to extend to distributive plural readings.
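A small sketch of the lifting operation in (11) follows; sums are modeled here as sets of atoms, which is an assumption of this illustration rather than Landman's mereology.

```python
# Sketch of Landman's plural role *R: lift a role R, defined on atomic
# events, to plural events, mapping each plural event to the sum (here,
# a frozenset) of the R-values of its atomic parts. Illustrative only.

def star(R, atomic_parts):
    """R: dict from atomic events to individuals.
    atomic_parts: dict from events (atomic or plural) to their atoms."""
    def plural_R(e):
        atoms = atomic_parts[e]
        if not all(a in R for a in atoms):
            raise KeyError("*R undefined: R undefined on some atomic part")
        return frozenset(R[a] for a in atoms)
    return plural_R

sing_agent = {"e1": "boy1", "e2": "boy2"}        # atomic singing events
parts = {"e1": ["e1"], "e2": ["e2"], "e1+e2": ["e1", "e2"]}
agent_star = star(sing_agent, parts)

assert agent_star("e1+e2") == frozenset({"boy1", "boy2"})
assert agent_star("e1") == frozenset({"boy1"})   # *R subsumes R on atoms
```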
3.5. Thematic roles and aspectual phenomena

Krifka (1992, 1998) explores the ways in which entailments associated with participants can account for aspectual phenomena. In particular, he aims to make precise the notions of incremental theme and of the relationships of incremental participants to events and subevents, formalizing properties similar to some of the proto-role properties of Dowty (1991). Krifka (1998: 211–212) defines some of these properties as follows, where ⊕ denotes a mereological sum of objects or events, ≤ denotes a part or subevent relation, and ∃!x means that there exists a unique entity x of which the property following x holds:

(12) a. A role θ shows uniqueness of participants, UP(θ), iff: θ(x, e) & θ(y, e) → x = y
     b. A role θ is cumulative, CUM(θ), iff: θ(x, e) & θ(y, e′) → θ(x ⊕P y, e ⊕E e′)
     c. A role θ shows uniqueness of events, UE(θ), iff: θ(x, e) & y ≤P x → ∃!e′[e′ ≤E e & θ(y, e′)]
     d. A role θ shows uniqueness of objects, UO(θ), iff: θ(x, e) & e′ ≤E e → ∃!y[y ≤P x & θ(y, e′)]

The first of these is a statement of role uniqueness, and the second is a weak property that holds of a broad range of participant roles, which somewhat resembles Landman's definition of plural roles in (11). The remaining two are ingredients of incremental participant roles, including those borne by entities gradually consumed or created in an event. Krifka's analysis treats thematic roles as the interface between aspectual characteristics of events, such as telicity, and the mereological structure of entities (parts and plurals). In the case of event types with incremental roles, the role's properties establish a homomorphism between parts of the participant and subevents of the event. Slightly different properties are required for a parallel analysis of objects in motion and the paths they traverse.
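For illustration, here is a finite-model sketch of the first two properties in (12), with mereological sums modeled as set union; this modeling choice and the data are assumptions of the sketch, not Krifka's formalization.

```python
# A finite-model sketch of UP and CUM from (12), with sums modeled as
# union over frozensets. Data and names are illustrative.
from itertools import combinations

def mereological_sum(a, b):
    return frozenset(a) | frozenset(b)          # x (+) y as union

def satisfies_UP(theta):
    """UP: theta(x, e) & theta(y, e) -> x = y."""
    return all(x == y for (x, e) in theta for (y, f) in theta if e == f)

def satisfies_CUM(theta):
    """CUM: theta(x, e) & theta(y, e') -> theta(x (+) y, e (+) e')."""
    return all((mereological_sum(x, y), mereological_sum(e, f)) in theta
               for (x, e), (y, f) in combinations(theta, 2))

# Two eating events with their incremental themes, plus their sum:
a1, a2 = frozenset({"apple1"}), frozenset({"apple2"})
e1, e2 = frozenset({"e1"}), frozenset({"e2"})
theta = {(a1, e1), (a2, e2), (a1 | a2, e1 | e2)}

assert satisfies_UP(theta) and satisfies_CUM(theta)
```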
4. Thematic role systems

This section examines some proposed systems of thematic roles; that is, inventories of roles and relationships among them, if any. One simple version of such a system is an unorganized set of broad roles, such as Agent, Instrument, Experiencer, Theme, Patient, etc., each assumed to be atomic, primitive, and independent of one another. In other systems, roles may be treated as non-atomic, being defined in terms of more basic features, derived from positional criteria within structured semantic representations, or situated within a hierarchy of roles and subroles (for example, the Location role might have Source and Goal subroles). Dependencies among roles are also posited; it is reasonable to claim that predicates allowing an Instrument role should also require an Agent role, or that Source or Goal are meaningless without a Theme in motion from one to the other. Another type of dependency among roles is a thematic hierarchy, a global
ordering of roles in terms of their prominence, as reflected in, for instance, argument realization or anaphoric binding phenomena. These applications of thematic roles at the syntax-semantics interface are examined at the end of this section.
4.1. Lists of primitive thematic roles

Fillmore (1968) was one of the earliest in the generative tradition to present a system of thematic roles (which he terms “deep cases”). This foreshadows the hypotheses of later researchers about argument realization, such as the Universal Alignment Hypothesis (Perlmutter & Postal 1984, Rosen 1984), and the Uniformity of Theta Assignment Hypothesis (Baker 1988, 1997), discussed in the section on argument realization below. Fillmore's cases are intended as an account of argument realization at deep structure, and are, at least implicitly, ranked. A version of role uniqueness is also assumed, and roles are treated as atomic and independent of one another. A simple use of thematic roles that employs a similar conception of them is to ensure a one-to-one mapping from semantic roles of a predicate to its syntactic arguments. This is exemplified by the θ-criterion, a statement of thematic uniqueness formulated by Chomsky (1981: 36), some version of which is assumed in most syntactic research in Government and Binding, Principles and Parameters, early Lexical-Functional Grammar, and related frameworks.

(13) Each argument bears one and only one θ-role, and each θ-role is assigned to one and only one argument.

This use of thematic roles as “OK marks”, in Carlson's (1984) words, makes no commitments as to the semantic content of θ-roles, nor to the ultimate syntactic effects. As Ladusaw & Dowty (1988: 62) remark, “the θ-criterion and θ-roles are a principally diacritic theory: what is crucial in their use in the core of GB is whether an argument is assigned a θ-role or not, which limits possible structures and thereby constrains the applications of rules.” The θ-criterion as stated above is typically understood to apply to coreference chains; thus “each chain is assigned a θ-role.” and “the θ-criterion must be reformulated in the obvious way in terms of chains and their members.” (Chomsky 1982: 5–6). Other researchers have continued along the lines suggested by Fillmore (1968), furnishing each lexical item with a list of labeled thematic roles. These representations are typically abstracted away from the issue of whether roles should be represented in a Davidsonian fashion; an example would look like (14), where the verb cut is represented as requiring the two roles Agent and Patient and allowing an optional Instrument:

(14) cut (Agent, Patient, (Instrument))
This is one version of the “θ-grid” in the GB/P&P framework. Although the representation in (14) uses broad roles, the granularity of roles is an independent issue. For an extensive discussion of such models, see chapter 2 of Levin & Rappaport Hovav (2005). They note several problems with thematic list approaches: difficulties in determining which role to assign “symmetric” predicates like face and border with two arguments seemingly bearing the same role, “the assumption that semantic roles are taken to be discrete
and unanalyzable” (Levin & Rappaport Hovav 2005: 42) rather than exhibiting relations and dependencies amongst themselves, and the failure to account for restrictions on the range of possible case frames (Davis & Koenig 2000: 59–60). An additional feature of some thematic list representations is the designation of one argument as the external argument. Belletti & Rizzi (1988: 344), for example, annotate this external argument, if present, by italicizing it, as in (15a), as opposed to (15b) (where lexically specified case determines argument realization, and there is no external argument).

(15) a. temere (‘fear’) (Experiencer, Theme)
     b. preoccupare (‘worry’) (Experiencer, Theme)

This leaves open the question of how the external argument is to be selected; Belletti & Rizzi address this issue only in passing, suggesting the possibility that a thematic hierarchy is involved.
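To illustrate how a θ-grid plus the θ-criterion in (13) constrains role assignment, here is a minimal sketch; the dictionary encoding of grids and assignments is a hypothetical device of this illustration.

```python
# A toy theta-grid lexicon and a theta-criterion check per (13): each
# argument bears exactly one role (guaranteed by the dict encoding),
# no role is assigned twice, and required roles are filled.

LEXICON = {
    "cut": {"required": ["Agent", "Patient"], "optional": ["Instrument"]},
}

def satisfies_theta_criterion(verb, assignment):
    """assignment: dict from syntactic arguments (e.g. 'NP1') to roles."""
    grid = LEXICON[verb]
    roles = list(assignment.values())
    no_role_twice = len(roles) == len(set(roles))
    required_filled = all(r in roles for r in grid["required"])
    all_licensed = all(r in grid["required"] + grid["optional"] for r in roles)
    return no_role_twice and required_filled and all_licensed

assert satisfies_theta_criterion("cut", {"NP1": "Agent", "NP2": "Patient"})
assert not satisfies_theta_criterion("cut", {"NP1": "Agent", "NP2": "Agent"})
```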
4.2. Thematic hierarchies

It is not an accident that thematic list representations, though ostensibly consisting of unordered roles, typically list the Agent role first if it is present. Apart from the intuitive sense of Agents being the most “prominent”, the widespread use of thematic role lists in argument realization leads naturally to a ranking of thematic roles in a thematic hierarchy, in which prominence on the hierarchy corresponds to syntactic prominence, whether configurationally or in terms of grammatical functions. As Levin & Rappaport Hovav (2005: chapters 5 and 6) make clear, there are several distinct bases on which a thematic hierarchy might be motivated, independently of its usefulness in argument realization: prominence in lexical semantic representations (Jackendoff 1987, 1990), event structure, especially causal structure (Croft 1991, Wunderlich 1997), and topicality or salience (Fillmore 1977). However, many authors motivate a hierarchy primarily by argument realization, adopting some version of a correspondence principle to ensure that the thematic roles defined for a predicate are mapped to syntactic positions (or grammatical functions) in prominence order. The canonical thematic hierarchy is a total ordering of all the thematic roles in a theory's inventory (many hierarchies do not distinguish an ordering among Source, Goal, and Location, however). While numerous variants have been proposed – Levin & Rappaport Hovav (2005: 162–163) list over a dozen – they agree on ranking Agent/Actor topmost, Theme and/or Patient near the bottom, and Instrument between them. As with any model invoking broad thematic roles, thematic hierarchy approaches face the difficulty of defining the roles they use, and addressing the classification of roles of individual verbs that do not fit well. A thematic hierarchy of fine-grained roles would face the twin drawbacks of a large number of possible orderings and a lack of evidence for establishing a relative ranking of many roles. Some researchers emphasize that the sole function of the thematic hierarchy is to order the roles; the role labels themselves are invisible to syntax and morphology, which have access only to the ordered list of arguments. Thus Grimshaw (1990: 10) states that though she will “use thematic role labels to identify arguments ... the theory gives no status to this information.” Williams (1994) advocates a similar view of the visibility of
roles at the syntactic level. Wunderlich (1997) develops an approach that is similar in this regard, where depth of embedding in a lexical semantic structure determines the prominence of semantic arguments. It is worth noting that these kinds of rankings can be achieved by means other than a thematic hierarchy, or even thematic roles altogether. Rappaport & Levin's (1988) predicate argument structures consist of an ordered list of argument variables derived from lexical conceptual structures like those in (17) below, but again no role information is present. Fillmore (1977) provides a saliency hierarchy of criteria for ascertaining the relative rank of two arguments; these include active elements outranking inactive ones, causal arguments outranking noncausal ones, and changed arguments outranking unchanged ones. Gawron (1983) also makes use of this system in his model of argument realization. Wechsler (1995) similarly relies on entailments between pairs of participants, such as one having a notion of the other, to determine a partial ordering amongst arguments. Dowty's (1991) widely-cited system of comparing numbers of proto-agent and proto-patient entailments is a related though distinct approach, discussed in greater detail below. Finally, Grimshaw (1990) posits an aspectual hierarchy in addition to a thematic hierarchy, on which causers outrank other elements. Li (1995) likewise argues for a causation-based hierarchy independent of a thematic hierarchy, based on argument realization in Mandarin Chinese resultative compounds. Primus (1999, 2006) presents a system of two role hierarchies, corresponding to involvement and causal dependency. “Morphosyntactic linking, i.e., case in the broader sense, corresponds primarily to the degree and kind of involvement ... while structural linking responds to semantic dependency” (Primus 2006: 54). These multiple-hierarchy systems bear an affinity to systems that assign multiple roles to participants and to the structured semantic representations of Jackendoff, discussed in the following section.
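A minimal sketch of the relative, hierarchy-driven mapping just described follows; both the hierarchy and the subject/object/oblique scheme are deliberate simplifications for illustration, not any one author's proposal.

```python
# Map a predicate's roles to grammatical functions in prominence order,
# per a (simplified) thematic hierarchy. Hierarchy is illustrative.

HIERARCHY = ["Agent", "Experiencer", "Instrument", "Theme", "Patient"]

def realize(roles):
    ranked = sorted(roles, key=HIERARCHY.index)    # most prominent first
    mapping = {"subject": ranked[0]}               # highest role -> subject
    if len(ranked) > 1:
        mapping["object"] = ranked[-1]             # lowest role -> object
        mapping["obliques"] = ranked[1:-1]         # any roles in between
    return mapping

print(realize(["Patient", "Instrument", "Agent"]))
# {'subject': 'Agent', 'object': 'Patient', 'obliques': ['Instrument']}
```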
4.3. Multiple role assignment

Thematic role lists and hierarchies generally assume some principle of thematic uniqueness. However, there are various arguments for assigning multiple roles to a single argument of a predicator, as well as cases where it is hard to distinguish the roles of two arguments. Symmetric predicates such as those in (9) and (10) exemplify the latter situation. As for the former, Jackendoff (1987: 381–382) suggests buy, sell, and chase as verbs with arguments bearing more than one role, and any language with morphologically productive causative verbs furnishes examples such as ‘cause to laugh’, in which the laugher exhibits both Agent and Patient characteristics. Williams (1994) argues that the subject of a small clause construction like John arrived sad is best analyzed as bearing two roles, one from each predicate. Broadwell (1988: 123) offers another type of evidence for multiple role assignment in Choctaw (a Muskogean language of the southeastern U.S.). Some verbs have suppletive forms for certain persons and numbers, and “1 [= subject] agreement is tied to the θ-roles Agent, Effector, Experiencer, and Source/Goal. Since the Choctaw verb ‘arrive’ triggers 1 agreement, its subject must bear one of these θ-roles. But I have also argued that suppletion is tied to the Theme, and so the subject must bear the role Theme.” Similarly, one auxiliary is selected if the subject is a Theme, another otherwise, and arrive in Choctaw selects the Theme-subject auxiliary. Cases like these have led many researchers to pursue representations in which roles are derived, not primitive. Defining roles in terms of features, or positions in lexical
representations based on semantic decomposition, can more readily accommodate those predicators that appear to violate role uniqueness within a system of broad thematic roles.
4.4. Structural and featural analyses of thematic roles

Many researchers, perhaps dissatisfied with the seemingly arbitrary set of descriptions of the meanings of thematic roles such as Agent, Patient, Goal, and, notoriously, Theme, have sought organizing principles that would characterize the range of broad thematic roles. These efforts can be loosely divided into two, somewhat overlapping types. The first might be called structural or relational, situating roles within structures typically representing lexical entries, including properties of the event type they denote. The second approach is featural, analyzing roles in terms of another set of more fundamental features that provides some structure for the set of possible roles. Both approaches are compatible with viewing thematic roles as sets of entailments. Under the structural approach, the entailments are associated with positions in a lexical or event structure, while under the featural approach, the entailments can be regarded as features. Both approaches are also compatible with thematic hierarchies or other prominence schemes. However, a notion of prominence within event structures can do the work of a thematic hierarchy in argument selection within a structural approach, and prominence relations between features can do the same within a featural approach. While Fillmore's cases are presented as an unstructured list, Gruber (1965) and Jackendoff (1983) develop a model in which the relationships between entities in motion and location situations provide an inventory of roles: Theme, Source, Goal, and Location. Through analogy, these extend to a wide range of semantic domains, as phrased by Jackendoff (1983: 188):

(16) Thematic Relations Hypothesis
In any semantic field of [events] and [states], the principal event-, state-, path-, and place-functions are a subset of those used for the analysis of spatial location and motion. Fields differ in only three possible ways:
a. what sorts of entities may appear as theme;
b. what sorts of entities may appear as reference objects;
c. what kind of relation assumes the role played by location in the field of spatial expressions.

This illustrates one form of lexical decomposition (see article 17 (Engelberg) Frameworks of decomposition) allowing thematic roles to be defined in terms of their positions within the structures representing the semantics of lexical items. Such an approach is compatible with the entailment-based treatments of thematic roles discussed above, but developing a model-theoretic interpretation of these structures that would facilitate this has not been a priority for those advocating this kind of analysis. However, lexical decomposition does reflect the internal complexity of natural language predicators. For example, causative verbs are plausibly analyzed as denoting two situation types standing in a causal relationship to one another, and transactional verbs such as buy, sell, and rent as denoting two oppositely-directed transfers. The Lexical Conceptual Structures of Rappaport & Levin (1988) (see article 19 (Levin & Rappaport Hovav)
Lexical Conceptual Structure) illustrate decomposition into multiple subevents of the two alternants of “spray/load” verbs; the LCS for the Theme-object alternant is shown in (17a) and that of the Location-object alternant in (17b), in which the LCS of the former is embedded as a substructure:

(17) a. [x cause [y to come to be at z]]
     b. [x cause [z to come to be in STATE] BY MEANS OF [x cause [y to come to be at z]]]

As noted above, in such representations, thematic roles are not primitive, but derived notions. And because LCSs – or their counterparts in other models – can be embedded as in (17b), there may be multiple occurrences of the same role type within the representation of a single predicate. A causative or transactional verb can then be regarded as having two Agents, one in each subevent. Furthermore, participants in more than one subevent can accordingly be assigned more than one role; thus the buyer and seller in a transaction are at once Agents and Recipients (or Goals). This leads to a view of thematic roles that departs from the formulations of role uniqueness developed within an analysis of predicators as denoting a unitary, undecomposed event, though uniqueness can still be postulated for each subevent in a decompositional representation. It also can capture some of the dependencies among roles; for example Jackendoff (1987: 398–402) analyzes the Instrument role in terms of conceptual structures representing intermediate causation, and this role exists only in relation to others in the chain of causation. Croft (1991, 1998) has taken causation as the fundamental framework for defining relationships between participants in events, with the roles more closely matching the Agent, Patient, and Instrument of Fillmore (1968). The causal ordering is plainly correlated with the ordering of roles found in thematic hierarchies and with the Actor-Undergoer cline of Role and Reference Grammar (Foley & van Valin 1984, van Valin 2004), which orders thematic roles according to positions in a structure intended to represent causal and aspectual characteristics of situations. In Jackendoff (1987, 1990) the causal and the motion/location models are combined in representations of event structures, some quite detailed and elaborate. Thematic roles within these systems are also derived notions, defined in terms of positions within these event representations. As the systems become more elaborate, one crucial issue is: how can we tell what the correct representation should be? Both types of systems exploit metaphorical extensions of space, motion, and causation to other, more abstract domains, and it is often unclear what the “correct” application of the metaphors should be. The implication for thematic roles defined within such systems is that their semantic foundations are not always sound. Notable featural analyses of thematic roles include Ostler (1979), whose system uses eight binary features that characterize 48 roles. His features are further specifications of the four generalized roles Theme, Source, Goal, and Path from the Thematic Relations Hypothesis, and include some that resemble entailments (volitional) and some that specify a semantic domain (positional, cognitive). Somers (1987) puts forward a similar system, again based on four broad roles that appear in various semantic domains, and Sowa (2000) takes this as a point of departure for a hierarchy of role types within a knowledge representation system.
Sowa decomposes the four types of Somers into two pairs of roles characterized by features or entailments: Source (“present at the beginning of the process”) and Goal (“present at the end of the process”), and Determinant
(“determines the direction of the process”) and Immanent (“present throughout the process”, but “does not actively control what happens”). Sowa argues that this system allows for useful underspecification; in the dog broke the window, the dog might be involved volitionally or nonvolitionally as initiator, or used as an instrument by some other initiator, but Source covers all of these possibilities. This illustrates another characteristic of Sowa's system; he envisions an indefinitely large number of roles, of varying degrees of specificity, but all are subtypes of one of these four. In this kind of system, the set of features will also grow indefinitely, so that it is more naturally viewed as a hierarchy of roles induced from the hierarchy of event types, at least below the most general roles. Similar ideas have been pursued in Lehmann (1996) and the FrameNet project (http://framenet.icsi.berkeley.edu); see article 29 (Gawron) Frame Semantics. Rozwadowska (1988) pursues a somewhat different analysis, with three binary features: ±sentient, ±cause, and ±change, intended to characterize broad thematic roles. This system permits natural classes of roles to be defined, which Rozwadowska argues are useful in describing restrictions on the interpretation of English and Polish specifiers of nominalizations. For example, the distinction between *the movie's shock of the audience vs. the audience's shock at the movie is accounted for by a requirement that the specifier's role not be Neutral; that is, not have negative values of all three features. Thus either an Agent or an Experiencer NP (both marked as +change) can appear in specifier position, but not a Neutral one. Featural analyses of thematic roles are close in spirit to entailment-based frameworks that dispense with reified roles altogether. If the features can be defined with sufficient clarity and consistency, then a natural question to ask is whether the work assigned to thematic roles can be accomplished simply by direct reference to these definitions of participant properties. The most notable exponent of this approach is Dowty (1991), with the two sets of proto-Agent and proto-Patient entailments. Wechsler (1995) is similar in spirit but employs entailments as a means of partially ordering a predicator's arguments. Davis & Koenig (2000) and Davis (2001) combine elements of these with a limited amount of lexical decomposition. While these authors eschew the term “thematic role” for their characterizations of a predicate's arguments, such definitions do conform to Dowty's (1989) definition of thematic role types noted above, though they do not impose any thematic uniqueness requirements.
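A small sketch of the featural idea, using Rozwadowska's three features, follows; the particular feature values assigned below are illustrative guesses except where the text above fixes them (Agent and Experiencer as +change, Neutral as all-negative).

```python
# Roles as bundles of binary features (sentient, cause, change), per
# Rozwadowska's analysis. Values are illustrative except as noted.

ROLES = {
    "Agent":       {"sentient": True,  "cause": True,  "change": True},
    "Experiencer": {"sentient": True,  "cause": False, "change": True},
    "Neutral":     {"sentient": False, "cause": False, "change": False},
}

def may_specify_nominalization(role):
    """The restriction described above: the specifier's role must not
    be Neutral, i.e. must have at least one positive feature value."""
    return any(ROLES[role].values())

assert may_specify_nominalization("Experiencer")  # the audience's shock...
assert may_specify_nominalization("Agent")
assert not may_specify_nominalization("Neutral")  # *the movie's shock of...
```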
4.5. Thematic roles and argument realization

Argument realization, the means by which semantic roles of predicates are realized as syntactic arguments of verbs, nominalizations, or other predicators, has been a consistent focus of linguists seeking to demonstrate the utility of thematic roles. This section examines some approaches to argument realization, and the following one briefly notes other syntactic and morphological phenomena that thematic roles have been claimed to play a role in. Levin & Rappaport Hovav (2005) provide an extensive discussion of diverse approaches to argument realization, including those in which thematic roles play a crucial part. Such accounts can involve a fixed, “absolute” rule, such as “map the Agent to the subject (or external argument)”, or relative rules, which establish a correspondence between an ordering (possibly partial) among a predicate's semantic roles and an ordering among
grammatical relations or configurationally-defined syntactic positions. The role ordering may correspond to a global ordering of roles, such as a thematic hierarchy, or be derived from relative depth of semantic roles or from relationships, such as entailments, holding among sets of roles. One example is the Universal Alignment Hypothesis developed within Relational Grammar; the version here is from Rosen (1984: 40):

(18) There exists some set of universal principles on the basis of which, given the semantic representation of a clause, one can predict which initial grammatical relation each nominal bears.

Another widely employed principle of this type is the Uniformity of Theta Assignment Hypothesis (UTAH), in Baker (1988: 46), which assumes a structural conception of thematic roles:

(19) Identical thematic relationships between items are represented by identical structural relationships between those items at the level of D-structure.

Baker (1997) examines a relativized version of this principle, under which the ordering of a predicator's roles, according to the thematic hierarchy, must correspond to their relative depth in D-structure. In Lexical Mapping Theory (Bresnan & Kanerva 1989, Bresnan & Moshi 1990, Alsina 1992), the correspondence is more complex, because grammatical relations are decomposed into features, which in turn interface with the thematic hierarchy. A role's realization is thus underspecified in some cases until default assignments fill in the remaining feature. For example, an Agent receives an intrinsic classification of –O(bjective), and the highest of a predicate's roles is assigned the feature –R(estricted) by default, with the result that Agents are by default realized as subjects. These general principles typically run afoul of the complexities of argument realization, including cross-linguistic and within-language variation among predicators and diathesis alternations. A very brief mention of some of the difficulties follows. First, to avoid an account that is essentially stipulative, the inventory of roles cannot be too large or too specific; once again this leads to the problem of assigning roles to the large range of verbs whose arguments appear to fit poorly in any of the roles. Second, cross-linguistic variation in realization patterns poses problems for principles like (18) and (19) that claim universally consistent mapping; some languages permit a wide range of ditransitive constructions, for example, while others entirely lack them (Gerdts 1992). Third, there are some cases where argument realization appears to involve information outside what would normally be ascribed to lexical semantic representations; some semantically similar verbs require different subcategorizations (wish for vs. desire, look at vs. watch, appeal to vs. please) or display differing diathesis alternations (hide and dress permit an intransitive alternant with reflexive meaning, while conceal and clothe are only transitive) (Jackendoff 1987: 405–406, Davis 2001: 171–173). Fourth, there are apparent cases of the same roles being mapped differently, as in the Italian verbs temere and preoccupare in (15) above; these cases prove problematic particularly for a model with a thematic hierarchy and a role list for each predicate. These difficulties have been addressed in large degree through entailment-based models described in the following section, and through models that employ more
elaborate semantic decomposition than what is assumed by principles such as (18) and (19) (Jackendoff 1987, 1990, Alsina 1992, Croft 1991, 1998).
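The Lexical Mapping Theory feature logic described above lends itself to a compact sketch; the following is a deliberate simplification covering only the intrinsic classifications and defaults mentioned in the text, with a hypothetical feature encoding.

```python
# Grammatical functions decomposed into +/-r(estricted), +/-o(bjective);
# roles carry partial intrinsic classifications, and defaults fill in
# unspecified features. A rough simplification for illustration.

GF = {("-r", "-o"): "SUBJ", ("-r", "+o"): "OBJ",
      ("+r", "-o"): "OBL",  ("+r", "+o"): "OBJ-theta"}

def map_roles(roles):
    """roles: list of (name, intrinsic) pairs in thematic-hierarchy
    order, highest first; intrinsic is a partial feature dict."""
    result = {}
    for i, (name, intrinsic) in enumerate(roles):
        feats = dict(intrinsic)
        feats.setdefault("r", "-r" if i == 0 else "+r")  # highest role: -r
        feats.setdefault("o", "-o" if i == 0 else "+o")
        result[name] = GF[(feats["r"], feats["o"])]
    return result

# Agent intrinsically -o; Patient/Theme intrinsically -r:
print(map_roles([("Agent", {"o": "-o"}), ("Patient", {"r": "-r"})]))
# {'Agent': 'SUBJ', 'Patient': 'OBJ'}
print(map_roles([("Theme", {"r": "-r"})]))   # sole role -> SUBJ
# {'Theme': 'SUBJ'}
```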
4.6. Lexical entailments as alternatives to thematic roles

Dowty (1991) is an influential proposal for argument realization avoiding reified thematic roles in semantic representations. The central idea is that subject and object selection relies on a set of proto-role entailments, grouped into two sets, proto-agent properties and proto-patient properties, as follows (Dowty 1991: 572):

(20) Contributing properties for the Agent Proto-Role
a. volitional involvement in the event or state
b. sentience (and/or perception)
c. causing an event or change of state in another participant
d. movement (relative to the position of another participant)
(e. exists independently of the event named by the verb)

(21) Contributing properties for the Patient Proto-Role
a. undergoes change of state
b. incremental theme
c. causally affected by another participant
d. stationary relative to the movement of another participant
(e. does not exist independently of the event, or not at all)

Each of these properties may or may not be entailed of a participant in a given type of event or state. Dowty’s Argument Selection Principle, in (22), characterizes how transitive verbs may realize their semantic arguments (Dowty 1991: 576):

(22) In predicates with grammatical subject and object, the argument for which the predicate entails the greatest number of Proto-Agent properties will be lexicalized as the subject of the predicate; the argument having the greatest number of Proto-Patient properties will be lexicalized as the direct object.

This numerical comparison across semantic roles accounts well for the range of attested transitive verbs in English, but fares less well with other kinds of data (Primus 1999, Davis & Koenig 2000, Davis 2001, Levin & Rappaport Hovav 2005). In the Finnish causative example in (23), for example (Davis 2001: 69), the subject argument bears fewer proto-agent entailments than the direct object.

(23) Uutinen puhu-tt-i nais-i-a pitkään.
news-item talk-CAUS-PAST woman-PL-PART long-ILL
‘The news made the women talk for a long time.’
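To make the numerical comparison in (22) concrete, the following is a minimal sketch in Python; it is our illustration rather than part of Dowty’s proposal, and the property names are shorthand labels for the entailments in (20) and (21), as are the sample entailment sets for a build-type verb.

```python
# A sketch of argument selection by counting proto-role entailments,
# following the Argument Selection Principle in (22). The property
# names abbreviate (20a-e) and (21a-e); they are illustrative labels.

PROTO_AGENT = {"volition", "sentience", "cause", "movement",
               "independent_existence"}
PROTO_PATIENT = {"change_of_state", "incremental_theme",
                 "causally_affected", "stationary",
                 "dependent_existence"}

def select_arguments(entailments):
    """entailments maps each argument to the set of proto-role
    properties the predicate entails of it; returns (subject, object)."""
    subject = max(entailments,
                  key=lambda a: len(entailments[a] & PROTO_AGENT))
    direct_object = max(entailments,
                        key=lambda a: len(entailments[a] & PROTO_PATIENT))
    return subject, direct_object

# Illustrative entailment sets for a verb like 'build': x is volitional,
# sentient, a causer, and moves; y changes and is an incremental theme.
build = {
    "x": {"volition", "sentience", "cause", "movement"},
    "y": {"change_of_state", "incremental_theme", "causally_affected"},
}
print(select_arguments(build))  # ('x', 'y'): x as subject, y as object
```

On a pure counting scheme of this kind, an example like (23) is predicted incorrectly unless causation is given special weight, which is precisely the point taken up next.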
Causation appears to override the other entailments in (23) and similar cases. In addition, (22) makes no prediction regarding predicators other than transitive verbs, but their argument realization is not unconstrained; as with transitive verbs, a causally affecting argument or an argument bearing a larger number of proto-agent entailments is realized
as the subject of verbs such as English prevail (on), rely (on), hope (for), apply (to), and many more.

Wechsler (1995) presents a model of argument realization that resembles Dowty’s in eschewing thematic roles and hierarchies in favor of a few relational entailments amongst participants in a situation, such as whether one participant necessarily has a notion of another, or is part of another. Davis & Koenig (2000) and Davis (2001) borrow elements of Dowty’s and Wechsler’s work but, in response to the difficulties noted above, posit reified “proto-role attributes” in semantic representations that reflect the flow of causation.
4.7. Other applications of thematic roles in syntax and morphology

Thematic roles and thematic hierarchies have been invoked in accounts of various other syntactic phenomena, and the following provides only a sample.

Some accounts of anaphoric binding make explicit reference to thematic roles, as opposed to structural prominence in lexical semantic representations. Typically, such accounts involve a condition that the antecedent of an anaphor must outrank it on the thematic hierarchy. Wilkins (1988) is one example of this approach. Williams (1994) advocates recasting the principles of binding theory in terms of role lists defined on predicators (including many nouns and adjectives). This allows “implicit”, syntactically unrealized arguments to participate in binding conditions. For example, the contrast between admiration of him (admirer and admiree must be distinct) and admiration of himself (admirer and admiree must be identical) suggests that even arguments that are not necessarily syntactically realized play a crucial role in binding. Thematic role labels, however, play no part in this system; rather, the arguments of a predicator are simply in an ordered list, as in the argument structures of Rappaport & Levin (1988) and Grimshaw (1990), though ordering on the list might be determined by a thematic hierarchy. And Jackendoff (1987) suggests that indexing argument positions within semantically decomposed lexical representations can address these binding facts, without reference to thematic hierarchies. Everaert & Anagnostopoulou (1997) argue that local anaphors in Modern Greek display a dependence on the thematic hierarchy: a Goal or Experiencer antecedent can bind a Theme, for example, but not the reverse. This holds even when the lower thematic role is realized as the subject, resulting in a subject anaphor.

Nishigauchi (1984) argues for a thematic role-based effect in controller selection for infinitival complements and purpose clauses, a view defended by Jones (1988). For example, the controller of a purpose clause is said to be a Goal. Ladusaw & Dowty (1988) counter that the data are better handled by verbal entailments and by general principles of world knowledge about human action and responsibility.

Donohue (1996) presents data on relativization in Tukang Besi (an Austronesian language of Indonesia) that suggest a distinct relativization strategy for Instruments, regardless of their grammatical relation.

Mithun (1984) proposes an account of noun incorporation in which Patient is preferred over other roles for incorporation, though in some languages arguments bearing other roles (Instrument or Location) may incorporate as well. But alternatives based on underlying syntactic structure (Baker 1988) and depth of embedding in lexical semantic representations (Kiparsky 1997) have also been advanced. Evans (1997) examines
noun incorporation in Mayali (a non-Pama-Nyungan language of northern Australia) and finds a thematic role-based account inadequate to deal with the range of incorporated nominals. He instead suggests that constraints based on animacy and prototypicality in the denoted event are crucial in selecting the incorporating argument.
5. Concluding remarks

Dowty (1989: 108–109) contrasts two positions on the utility of (broad) thematic roles. Those advocating that thematic roles are crucially involved in lexical, morphological, and syntactic phenomena have consequently tried to define thematic roles and develop thematic role systems. But, even 20 years later, the position Dowty states in (24) can also be defended:

(24) Thematic roles per se have no priviledged [sic] status in the conditioning of syntactic processes by lexical meaning, except insofar as certain semantic distinctions happen to occur more frequently than others among natural languages.

Given the range of alternative accounts of argument realization, lexical acquisition, and other phenomena for which the “traditional”, broad thematic roles have sometimes been considered necessary, and the additional devices required even in those approaches that do employ them, it is unclear how much is gained by introducing them as reified elements of linguistic theory. There do appear to be some niche cases of phenomena that depend on such notions, some of which are noted above, and there are stronger motivations for entailment-based approaches to argument realization, diathesis alternations, aspect, and complement control. Such entailments can certainly be viewed as thematic roles, some even as roles in a broad sense that apply to a large class of predicates. But the overall picture is not one that lends support to the “traditional” notion of a small inventory of broad roles, with each of a predicate’s arguments uniquely assigned one of them.

Fine-grained roles serve a somewhat different function in model-theoretic semantics, one not dependent on the properties of a thematic role system but on the use of roles in individuating events and how they are related to their participants. But in this realm, too, they do not come without costs; as Landman and Schein have argued, dyadic thematic roles, coupled with principles of role uniqueness, lead both to unwelcome inferences that can be blocked only with additional mechanisms and to requiring events that are intuitively too fine-grained. These difficulties arise in connection with symmetric predicators, transactional verbs, and other complex event types that may warrant a more elaborated treatment than thematic roles defined on a unitary predicate can offer.

The author gratefully acknowledges detailed comments on this article from Cleo Condoravdi and from the editors.
6. References

Alsina, Alex 1992. On the argument structure of causatives. Linguistic Inquiry 23, 517–555.
Bach, Emmon 1989. Informal Lectures on Formal Semantics. Albany, NY: State University of New York Press.
Baker, Mark C. 1988. Incorporation. A Theory of Grammatical Function Changing. Chicago, IL: The University of Chicago Press.
Baker, Mark C. 1997. Thematic roles and syntactic structure. In: L. Haegeman (ed.). Elements of Grammar. Dordrecht: Kluwer, 73–137.
Bayer, Samuel L. 1997. Confessions of a Lapsed Neo-Davidsonian. Events and Arguments in Compositional Semantics. New York: Garland.
Belletti, Adriana & Luigi Rizzi 1988. Psych-verbs and θ-Theory. Natural Language and Linguistic Theory 6, 291–352.
Bresnan, Joan & Jonni Kanerva 1989. Locative inversion in Chichewa. A case study of factorization in grammar. Linguistic Inquiry 20, 1–50.
Bresnan, Joan & Lioba Moshi 1990. Object asymmetries in comparative Bantu syntax. Linguistic Inquiry 21, 147–185.
Broadwell, George A. 1988. Multiple θ-role assignment in Choctaw. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 113–127.
Carlson, Greg 1984. On the role of thematic roles in linguistic theory. Linguistics 22, 259–279.
Carlson, Greg 1998. Thematic roles and the individuation of events. In: S. Rothstein (ed.). Events and Grammar. Dordrecht: Kluwer, 35–51.
Chierchia, Gennaro 1984. Topics in the Syntax and Semantics of Infinitives and Gerunds. Ph.D. dissertation. University of Massachusetts, Amherst, MA.
Chomsky, Noam 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam 1982. Some Concepts and Consequences of the Theory of Government and Binding. Cambridge, MA: The MIT Press.
Croft, William 1991. Syntactic Categories and Grammatical Relations: The Cognitive Organization of Information. Chicago, IL: The University of Chicago Press.
Croft, William 1998. Event structure in argument linking. In: M. Butt & W. Geuder (eds.). The Projection of Arguments. Lexical and Compositional Factors. Stanford, CA: CSLI Publications, 21–63.
Davidson, Donald 1967. The logical form of action sentences. In: N. Rescher (ed.). The Logic of Decision and Action. Pittsburgh, PA: University of Pittsburgh Press, 81–95.
Davis, Anthony R. 2001. Linking by Types in the Hierarchical Lexicon. Stanford, CA: CSLI Publications.
Davis, Anthony R. & Jean-Pierre Koenig 2000. Linking as constraints on word classes in a hierarchical lexicon. Language 76, 56–91.
Donohue, Mark 1996. Relative clauses in Tukang Besi. Grammatical functions and thematic roles. Linguistic Analysis 26, 159–173.
Dowty, David 1989. On the semantic content of the notion of ‘thematic role’. In: G. Chierchia, B. Partee & R. Turner (eds.). Properties, Types, and Meaning, vol. 2. Semantic Issues. Dordrecht: Kluwer, 69–129.
Dowty, David 1991. Thematic proto-roles and argument selection. Language 67, 547–619.
Everaert, Martin & Elena Anagnostopoulou 1997. Thematic hierarchies and Binding Theory. Evidence from Greek. In: F. Corblin, D. Godard & J.-M. Marandin (eds.). Empirical Issues in Formal Syntax and Semantics. Selected Papers from the Colloque de Syntaxe et de Sémantique de Paris (CSSP 1995). Bern: Lang, 43–59.
Evans, Nick 1997. Role or cast. In: A. Alsina, J. Bresnan & P. Sells (eds.). Complex Predicates. Stanford, CA: CSLI Publications, 397–430.
Fillmore, Charles J. 1968. The case for case. In: E. Bach & R. T. Harms (eds.). Universals of Linguistic Theory. New York: Holt, Rinehart & Winston, 1–88.
Fillmore, Charles J. 1977. Topics in lexical semantics. In: R. W. Cole (ed.). Current Issues in Linguistic Theory. Bloomington, IN: Indiana University Press, 76–138.
Foley, William A. & Robert D. van Valin, Jr. 1984. Functional Syntax and Universal Grammar. Cambridge: Cambridge University Press.
Gawron, Jean Mark 1983. Lexical Representations and the Semantics of Complementation. Ph.D. dissertation. University of California, Berkeley, CA.
Gerdts, Donna B. 1992. Morphologically-mediated relational profiles. In: L. A. Buszard-Welcher, L. Wee & W. Weigel (eds.). Proceedings of the 18th Annual Meeting of the Berkeley Linguistics Society (=BLS). Berkeley, CA: Berkeley Linguistics Society, 322–337.
Grimshaw, Jane 1990. Argument Structure. Cambridge, MA: The MIT Press.
Gruber, Jeffrey S. 1965. Studies in Lexical Relations. Ph.D. dissertation. MIT, Cambridge, MA.
Jackendoff, Ray 1983. Semantics and Cognition. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1987. The status of thematic relations in linguistic theory. Linguistic Inquiry 18, 369–411.
Jackendoff, Ray 1990. Semantic Structures. Cambridge, MA: The MIT Press.
Jones, Charles 1988. Thematic relations in control. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 75–89.
Kiparsky, Paul 1997. Remarks on denominal verbs. In: A. Alsina, J. Bresnan & P. Sells (eds.). Complex Predicates. Stanford, CA: CSLI Publications, 473–499.
Kratzer, Angelika 1996. Severing the external argument from its verb. In: J. Rooryck & L. Zaring (eds.). Phrase Structure and the Lexicon. Dordrecht: Kluwer, 109–137.
Krifka, Manfred 1992. Thematic relations as links between nominal reference and temporal constitution. In: I. A. Sag & A. Szabolcsi (eds.). Lexical Matters. Stanford, CA: CSLI Publications, 29–53.
Krifka, Manfred 1998. The origins of telicity. In: S. Rothstein (ed.). Events and Grammar. Dordrecht: Kluwer, 197–235.
Ladusaw, William A. & David R. Dowty 1988. Towards a nongrammatical account of thematic roles. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 61–73.
Landman, Fred 2000. Events and Plurality. The Jerusalem Lectures. Dordrecht: Kluwer.
Lehmann, Fritz 1996. Big posets of participatings and thematic roles. In: P. W. Eklund, G. Ellis & G. Mann (eds.). Proceedings of the 4th International Conference on Conceptual Structures. Knowledge Representation as Interlingua. Berlin: Springer, 50–74.
Levin, Beth & Malka Rappaport Hovav 2005. Argument Realization. Cambridge: Cambridge University Press.
Li, Yafei 1995. The thematic hierarchy and causativity. Natural Language and Linguistic Theory 13, 255–282.
Màrquez, Lluís et al. 2008. Semantic role labeling. An introduction to the special issue. Computational Linguistics 34, 145–159.
Mithun, Marianne 1984. The evolution of noun incorporation. Language 60, 847–894.
Nishigauchi, Taisuke 1984. Control and the thematic domain. Language 60, 215–250.
Ostler, Nicholas 1979. Case Linking. A Theory of Case and Verb Diathesis Applied to Classical Sanskrit. Ph.D. dissertation. MIT, Cambridge, MA.
Parsons, Terence 1990. Events in the Semantics of English. A Study in Subatomic Semantics. Cambridge, MA: The MIT Press.
Perlmutter, David M. & Paul Postal 1984. The 1-advancement exclusiveness law. In: D. M. Perlmutter & C. Rosen (eds.). Studies in Relational Grammar, vol. 2. Chicago, IL: The University of Chicago Press, 81–125.
Primus, Beatrice 1999. Cases and Thematic Roles. Ergative, Accusative and Active. Tübingen: Niemeyer.
Primus, Beatrice 2006. Mismatches in semantic-role hierarchies and the dimensions of Role Semantics. In: I. Bornkessel et al. (eds.). Semantic Role Universals and Argument Linking. Theoretical, Typological, and Psycholinguistic Perspectives. Berlin: Mouton de Gruyter, 53–88.
Rappaport, Malka & Beth Levin 1988. What to do with θ-roles. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 7–36.
Rosen, Carol 1984. The interface between semantic roles and initial grammatical relations. In: D. M. Perlmutter & C. Rosen (eds.). Studies in Relational Grammar, vol. 2. Chicago, IL: The University of Chicago Press, 38–77.
Rozwadowska, Bożena 1988. Thematic restrictions on derived nominals. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 147–165.
Schein, Barry 2002. Events and the semantic content of thematic relations. In: G. Preyer & G. Peter (eds.). Logical Form and Language. Oxford: Oxford University Press, 263–344.
Somers, Harold L. 1987. Valency and Case in Computational Linguistics. Edinburgh: Edinburgh University Press.
Sowa, John F. 2000. Knowledge Representation. Logical, Philosophical, and Computational Foundations. Pacific Grove, CA: Brooks Cole Publishing Co.
Svenonius, Peter 2007. Adpositions, particles and the arguments they introduce. In: E. Reuland, T. Bhattacharya & G. Spathas (eds.). Argument Structure. Amsterdam: Benjamins, 63–103.
van Valin, Robert D., Jr. 2004. Semantic macroroles in Role and Reference Grammar. In: R. Kailuweit & M. Hummel (eds.). Semantische Rollen. Tübingen: Narr, 62–82.
Wechsler, Stephen 1995. The Semantic Basis of Argument Structure. Stanford, CA: CSLI Publications.
Wechsler, Stephen 2005. What is right and wrong about little v. In: M. Vulchanova & T. A. Åfarli (eds.). Grammar and Beyond. Essays in Honour of Lars Hellan. Oslo: Novus Press, 179–195.
Wilkins, Wendy 1988. Thematic structure and reflexivization. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 191–213.
Williams, Edwin 1994. Thematic Structure in Syntax. Cambridge, MA: The MIT Press.
Wunderlich, Dieter 1997. CAUSE and the structure of verbs. Linguistic Inquiry 28, 27–68.
Anthony R. Davis, Washington, DC (USA)
19. Lexical Conceptual Structure

1. Introduction
2. The introduction of LCSs into linguistic theory
3. Components of LCSs
4. Choosing primitive predicates
5. Subeventual analysis
6. LCSs and syntax
7. Conclusion
8. References
Abstract

The term “Lexical Conceptual Structure” was introduced in the 1980s to refer to a structured lexical representation of verb meaning designed to capture those meaning components which determine grammatical behavior, particularly with respect to argument realization. Although the term is no longer much used, representations of verb meaning which share many of the properties of LCSs are still proposed in theories which maintain many of the aims and assumptions associated with the original work on LCSs. As LCSs and the representations that are their descendants take the form of predicate decompositions, the article reviews criteria for positing the primitive predicates that make up LCSs. Following an overview of the original work on LCS, the article traces the developments in the representation of verb meaning that characterize the descendants of the early LCSs. The more recent work exploits the distinction between root and event structure implicit in even the earliest LCS in the determination of grammatical behavior. This work
also capitalizes on the assumption that predicate decompositions incorporate a subeventual analysis which defines hierarchical relations among arguments, allowing argument realization rules to be formulated in terms of the geometry of the decomposition.
1. Introduction

Lexical Conceptual Structure (LCS) is a term that was used in the 1980s and 1990s to refer to a structured lexical representation of verb meaning. Although the term “LCS” is no longer widely used, structured representations of verb meaning which share many of the properties of LCSs are still often proposed, in theories which maintain many of the aims and assumptions associated with those originally involving an LCS. These descendants of the original LCSs go by various names, including lexical relational structures (Hale & Keyser 1992, 1993), event structures (Rappaport Hovav & Levin 1998a, Levin & Rappaport Hovav 2005), semantic structures (Pinker 1989), L-syntax (Mateu 2001a, Travis 2000), l-structure (Zubizarreta & Oh 2007) and first phase syntax (Ramchand 2008); representations called Semantic Forms (Wunderlich 1997a, 1997b) and Semantic Representations (van Valin 1993, 2005, van Valin & LaPolla 1997) are also close in spirit to LCSs.

Here we provide an overview of work that uses a construct called LCS, and we then trace the developments which have taken place in the representation of verb meaning in descendants of this work. We stress, however, that we are not presenting a single coherent or unified theory, but rather a synthetic perspective on a collection of related theories.
2. The introduction of LCSs into linguistic theory

In the early 1980s, the idea emerged that major facets of the syntax of a sentence are projected from the lexical properties of the words in it (e.g. Chomsky 1981, Farmer 1984, Pesetsky 1982, Stowell 1981; see Fillmore 1968 for an earlier proposal of this sort), and over the course of that decade its consequences were explored. Much of this work assumes that verbs are associated with predicate-argument structures (e.g. Bresnan 1982, Grimshaw 1990), often called theta-grids (Stowell 1981, Williams 1981). The central idea is that the syntactic structure that a verb appears in is projected from its predicate-argument structure, which indicates the number of syntactic arguments a verb has, and some information about how the arguments are projected onto syntax, for example, as internal or external arguments (Marantz 1984, Williams 1981).

One insight arising from the closer scrutiny of the relationship between the lexical properties of verbs and the syntactic environments in which they appear is that a great many verbs display a range of what have been called argument – or diathesis – alternations, in which the same verb appears with more than one set of morphosyntactic realization options for its arguments, as in the causative and dative alternations, in (1) and (2), respectively.

(1) a. Pat dried the clothes.
b. The clothes dried.

(2) a. Pat sold the rare book to Terry.
b. Pat sold Terry the rare book.

Some argument alternations seem to involve two alternate realizations of the same set of arguments (e.g. the dative alternation), while others seem to involve real changes in
the meaning of the verb (e.g. the causative alternation) (Rappaport Hovav & Levin 1998b). Researchers who developed theories of LCS assumed that in addition to a verb’s argument structure, it is possible to isolate a small set of recurring meaning components which determine the range of argument alternations a particular verb can participate in. These meaning components are embodied in the primitive predicates of predicate decompositions such as LCSs. Thus, LCSs are used both to represent systematic alternations in a verb’s meaning and to define the set of verbs which undergo alternate mappings to syntax, as we now illustrate.

A case study which illustrates this line of investigation is presented by Guerssel et al. (1985). Their study attempts to isolate those facets of meaning which determine a verb’s participation in several transitivity alternations in four languages: Berber, English, Warlpiri, and Winnebago. Guerssel et al. compare the behavior of verbs corresponding to English break (as a representative of the class of change of state verbs) and cut (as a representative of the class of motion-contact-effect verbs) in several alternations, including the causative and conative alternations in these languages (cf. Fillmore 1970). They suggest that participation in the causative alternation is contingent on the LCS of a verb containing a constituent of the form ‘[come to be STATE]’ (represented via the predicate BECOME or CHANGE in some other work), while participation in the conative alternation requires an LCS with components of contact and effect. The LCSs suggested for the intransitive and transitive uses of break, which together make up the causative alternation, are given in (3), and the LCSs for the transitive and intransitive uses of cut, which together make up the conative alternation, illustrated in (4), are presented in (5).

(3) a. break: y come to be BROKEN (Guerssel et al. 1985: 54, ex. (19))
b. break: x cause (y come to be BROKEN) (Guerssel et al. 1985: 55, ex. (21))

(4) a. I cut the rope around his wrists.
b. I cut at the rope around his wrists.

(5) a. cut: x produce CUT in y, by sharp edge coming into contact with y (Guerssel et al. 1985: 51, ex. (11))
b. cut Conative LCS: x causes sharp edge to move along path toward y, in order to produce CUT on y, by sharp edge coming into contact with y (Guerssel et al. 1985: 59, ex. (34))

We cite semantic representations in the forms given in the source, even though this leads to inconsistencies in notation; where we formulate representations for the purposes of this article, we adopt the representations used by Rappaport Hovav & Levin (1998a) and subsequent work. Although the LCSs for cut in (5) include semantic notions not usually encountered in predicate decompositions, central to them are the notions ‘move’ and ‘produce’, which have more common analogues: go and the combination of cause and become, respectively.

The verb cut does not have an intransitive noncausative use, as in (6), since its LCS does not have an isolatable constituent of the form ‘[come to be STATE]’, while the verb break lacks a conative variant, as in (7), because its LCS does not include a contact
component. Finally, verbs like touch, whose meaning does not involve a change of state and simply involves contact with no necessary effect, display neither alternation, as in (8). (A schematic sketch of this licensing logic is given at the end of this section.)

(6) *The bread cut.

(7) *We broke at the box.

(8) a. We touched the wall.
b. *The wall touched.
c. *We touched at the wall.

For other studies along these lines, see Hale & Keyser (1987), Laughren (1988), and Rappaport, Levin & Laughren (1988).

Clearly, the noncausative and causative uses of a verb satisfy different truth conditions, as do the conative and nonconative uses of a verb. As we have just illustrated, LCSs can capture these modulations in the meaning of a verb which, in turn, have an effect on the way a verb’s arguments are morphosyntactically realized. As we discuss in sections 5 and 6, subsequent work tries to derive a verb’s argument realization properties in a principled way from the structure of its LCS. However, as mentioned above, verbs with certain LCSs may also simply allow more than one syntactic realization of their arguments without any change in meaning. Rappaport Hovav & Levin (2008) argue that this possibility is instantiated by the English dative alternation as manifested by verbs that inherently lexicalize caused possession such as give, rent, and sell. They propose that these verbs have a single LCS representing the causation of possession, as in (9), but differ from each other with respect to the specific type of possession involved. The verb give lexicalizes nothing more than caused possession, while other verbs add further details about the event: it involves the exchange of money for sell and is temporary and contractual for rent.

(9) Caused possession LCS: [ x cause [ y have z ] ]

According to Rappaport Hovav & Levin (2008), the dative alternation arises with these verbs because the caused possession LCS has two syntactic realizations. (See Harley 2003 and Goldberg 1995 for an alternative view which takes the dative alternation to be a consequence of attributing both caused motion and caused possession LCSs to all alternating verbs; Rappaport Hovav & Levin only attribute both LCSs to verbs such as send and throw, which are not inherently caused possession verbs.)

As these case studies illustrate, the predicate decompositions that fall under the rubric “LCS” are primarily designed to capture those facets of meaning which determine grammatical facets of behavior, including argument alternations. This motivation sets LCSs apart from other predicate decompositions, which are primarily posited on the basis of other forms of evidence, such as the ability to capture various entailment relations between sets of sentences containing morphologically related words and the ability to account for interactions between event types and various tense operators and temporal adverbials; cf. article 17 (Engelberg) Frameworks of decomposition. To give one example, it has been suggested that verbs which pass tests for telicity all have a state predicate
in their predicate decomposition (Dowty 1979, Parsons 1990). Nevertheless, LCS representations share many of the properties of other predicate decompositions used as explications of lexical meaning, including those proposed by Dowty (1979), Jackendoff (1976, 1983, 1990), and more recently in Role and Reference Grammar, especially in van Valin & LaPolla (1997), based in large part on the work of generative semanticists such as Lakoff (1968, 1970), McCawley (1968, 1971), and Ross (1972). These similarities, of course, raise the question of whether the same representation can be the basis for capturing both kinds of generalizations.

LCSs, however, are not intended to provide an exhaustive representation of a verb’s meaning, as mentioned above. Positing an LCS presupposes that it is possible to distinguish those facets of meaning that are grammatically relevant from those which are not; this assumption is not uncontroversial; see, for example, the debate between Taylor (1996) and Jackendoff (1996a). In addition, the methodology and aims of this form of “componential” analysis of verb meaning differ in fundamental ways from the type of componential analysis proposed by the structuralists (e.g. Nida 1975). For the structuralists, meaning components were isolatable insofar as they were implicated in semantic contrasts within a lexical field (e.g. “adult” to distinguish parent from child); the aim of a componential analysis, then, was to provide a feature analysis of the words in a particular semantic field that distinguishes every word in that field from every other. In contrast, the goal of the work assuming LCS is not to provide an exhaustive semantic analysis, but rather to isolate only those facets of meaning which recur in significant classes of verbs and determine key facets of the linguistic behavior of verbs. This approach makes the crucial assumption that verbs may be different in significant respects, while still having almost identical LCSs; for example, freeze and melt denote “inverse” changes of state, yet they would both share the LCS of change of state verbs.

Although all these works assume the value of positing predicate decompositions (thus differing radically from the work of Fodor & Lepore 1999), the nature of the predicate decomposition and its place in grammar and syntactic structure varies quite radically from theory to theory. Here we review the work which takes the structured lexical representation to be a specifically linguistic representation and, thus, to be distinct from a general conceptual structure which interfaces with other cognitive domains. Furthermore, this work assumes that the information encoded in the LCSs is a small subset of the information encoded in a fully articulated explication of lexical meaning. In this respect, this work is different from the work of Jackendoff (1983, 1990), who assumes that there is a single conceptual representation, used for linguistic and nonlinguistic purposes; cf. article 30 (Jackendoff) Conceptual Semantics.
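As a compact summary of the licensing logic behind (3)–(8), here is a minimal sketch in Python; the component labels are our own shorthand, not Guerssel et al.’s notation, and cut is listed without a change-of-state component because its result is not an isolatable constituent of the required form.

```python
# Toy LCS component sets and the licensing conditions discussed above:
# the causative alternation requires an isolatable [come to be STATE]
# constituent; the conative alternation requires contact and effect.

LCS_COMPONENTS = {
    "break": {"come_to_be_state"},
    "cut": {"contact", "effect"},   # its result is not an isolatable
                                    # [come to be STATE] constituent
    "touch": {"contact"},           # contact with no necessary effect
}

def licenses_causative(verb):
    return "come_to_be_state" in LCS_COMPONENTS[verb]

def licenses_conative(verb):
    return {"contact", "effect"} <= LCS_COMPONENTS[verb]

for v in LCS_COMPONENTS:
    print(v, licenses_causative(v), licenses_conative(v))
# break: causative only; cut: conative only; touch: neither --
# matching the judgments in (6)-(8).
```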
3. Components of LCSs

Since verbs individuate and name events, LCS-style representations are taken to specify the limited inventory of basic event types made available by language for describing happenings in the world. Thus, our use of the term “event” includes all situation types, including states, similar to the notion of “eventuality” in some work on event semantics (Bach 1986). For this reason, such representations are often currently referred to as “event structures”. In this section, we provide an overview of the representations of the lexical meaning of verbs which are collectively called event structures and identify the properties which are common to the various instantiations of these representations.
In section 6, we review theories which differ in terms of how these representations are related to syntactic structures.

All theories of event structure, either implicitly or explicitly, recognize a distinction between the primitive predicates which define the range of event types available and a component which represents what is idiosyncratic in a verb’s meaning. For example, all noncausative verbs of change of state have a predicate decomposition including a predicate representing the notion of change of state, as in (10); however, these verbs differ from one another with respect to an attribute of an entity whose value is specified as changing: the attribute relevant to cool involves temperature, while that relevant to widen involves a dimension. One way to represent these components of meaning is to allow the predicate representing the change to take an argument which represents the attribute, and this argument position can then be associated with distinct attributes. This idea is instantiated in the representations for the three change of state verbs in (11) by indicating the attribute relevant to each verb in capital italics placed within angle brackets.

(10) [ become [ y <STATE> ] ]

(11) a. dry: [ become [ y <DRY> ] ]
b. widen: [ become [ y <WIDE> ] ]
c. dim: [ become [ y <DIM> ] ]

As this example shows, LCSs are constructed so that common substructures in the representations of verb meanings can be taken to define grammatically relevant classes of verbs, such as those associated with particular argument alternations. Thus, the structure in (10), which is shared by all change of state verbs, can then be associated with displaying the causative alternation. Being associated with this LCS substructure is a necessary, but not sufficient, condition for participating in the causative alternation, since only some change of state verbs alternate in English. The precise conditions for licensing the alternation require further investigation, as does the question of why languages vary somewhat in their alternating verbs; see Alexiadou, Anagnostopoulou & Schäfer (2006), Doron (2003), Haspelmath (1993), Koontz-Garboden (2007), and Levin & Rappaport Hovav (2005) for discussion.

The idiosyncratic component of a verb’s meaning has received several names, including “constant”, “root”, and even “verb”. We use the term “root” (Pesetsky 1995) in the remainder of this article, although we stress that it should be kept distinct from the notion of root used in morphology (e.g. Aronoff 1993). Roots may be integrated into LCSs in two ways: a root may fill an argument position of a primitive predicate, as in the change of state example (10), or it may serve as a modifier of a predicate, as with various types of activity verbs, as in (12) and (13). (Modifier status is indicated by subscripting the root to the predicate being modified.)

(12) Casey ran.
[ x act<RUN> ]

(13) Tracy wiped the table.
[ x act<WIPE> y ]
Although early work on the structured representation of verb meaning paid little attention to the nature and contribution of the root (the exception being Grimshaw 2005), more recent work has taken seriously the idea that the elements of meaning lexicalized in the root determine the range of event structures that a root can be associated with (e.g. Erteschik-Shir & Rapoport 2004, 2005, Harley 2005, Ramchand 2008, Rappaport Hovav 2008, Rappaport Hovav & Levin 1998a, Zubizarreta & Oh 2007). Thus, Rappaport Hovav & Levin (1998a) propose that roots are of different ontological types, with the type determining the associated event structures. Two of the major ontological types of roots are manner and result (Levin & Rappaport Hovav 1991, 1995, Rappaport Hovav & Levin 2010; see also Talmy 1975, 1985, 2000). These two types of roots are best introduced through an examination of verbs apparently in the same semantic field which differ as to the nature of their root: the causative change of state verb clean, for example, has a result root that specifies a state that often results from some activity, as in (14), while the verb scrub has a manner root that specifies an activity, as in (15); in this and many other instances, the activity is one conventionally carried out to achieve a particular result. With scrub the result is “cleanness”, which explains the intuition of relatedness between the manner verb scrub and the result verb clean.

(14) [ [ x act ] cause [ become [ y <CLEAN> ] ] ]

(15) [ x act<SCRUB> ]

Result verbs specify the bringing about of a result state – a state that is the result of some sort of activity; it is this state which is lexicalized in the root. Thus, the verbs clean and empty describe two different result states that are often brought about by removing material from a place; neither verb is specific about how the relevant result state comes about. Result verbs denote externally caused eventualities in the sense of Levin & Rappaport Hovav (1995). Thus, while a cave can be empty without having been emptied, something usually becomes empty as a result of some causing event. Result verbs, then, are associated with a causative change of state LCS; see also Hale & Keyser (2002) and Koontz-Garboden (2007), and for slightly different views Alexiadou, Anagnostopoulou & Schäfer (2006) and Doron (2003).

A manner root is associated with an activity LCS; such roots describe actions, which are identified by some sort of means, manner, or instrument. Thus, the manner verbs scrub and wipe both describe actions that involve making contact with a surface, but differ in the way the hand or some implement is moved against the surface and the degree of force and intensity of this movement. Often such activities are characterized by the instrument used in performing them and the verbs themselves take their names from the instruments. Again among verbs describing making contact with a surface, there are the verbs rake and shovel, which involve different instruments, designed for different purposes and, thus, manipulated in somewhat different ways. Despite the differences in the way the instruments are used, linguistically all these verbs have a basic activity LCS. In fact, all instrument verbs have this LCS even though there is apparent diversity among them: Thus, the verb sponge might be used in the description of removing events (e.g. Tyler sponged the stain off the fabric) and the verb whisk in the description of adding events (e.g.
Cameron whisked the sugar into the eggs), while the verbs rake and shovel might be used for either (e.g. Kelly shoveled the snow into the truck, Kelly shoveled the snow off the drive).
According to Rappaport Hovav & Levin (1998b), this diversity has a unified source: English allows the LCSs of all activity verbs to be “augmented” by the addition of a result state, giving rise to causative LCSs, such as those involved in the description of adding and removing events, via a process they call Template Augmentation (sketched schematically at the end of this section). This process resembles Wunderlich’s (1997a, 2000) notion of argument extension; cf. article 84 (Wunderlich) Operations on argument structure; see also Rothstein (2003) and Ramchand (2008). Whether an augmented instrument verb receives an adding or removing interpretation depends on whether the instrument involved is typically used to add or remove stuff.

In recent work, Rappaport Hovav & Levin (2010) suggest an independent characterization of manner and result roots by appealing to the notions of scalar and nonscalar change – notions which have their origins in Dowty (1979, 1991) and McClure (1994), as well as the considerable work on the role of scales in determining telicity (e.g. Beavers 2008, Borer 2005, Hay, Kennedy & Levin 1999, Jackendoff 1996b, Kennedy & Levin 2008, Krifka 1998, Ramchand 1997, Tenny 1994). As dynamic verbs, manner and result verbs all involve change, though crucially not the same type of change: result roots specify scalar changes, while manner roots do not. Verbs denoting events of scalar change in one argument lexically entail a scale: a set of degrees – points or intervals indicating measurement values – ordered on a particular dimension representing an attribute of an argument (e.g. height, temperature, cost) (Bartsch & Vennemann 1972, Kennedy 1999, 2001); the degrees indicate the possible values of this attribute. A scalar change in an entity involves a change in the value of the relevant attribute in a particular direction along the associated scale. The change of state verb widen is associated with a scale of increasing values on a dimension of width, and a widening event necessarily involves an entity showing an increase in the value along this dimension. A nonscalar change is any change that cannot be characterized in terms of a scale; such changes are typically complex, involving a combination of many changes at once. They are characteristic of manner verbs. For example, the verb sweep involves a specific movement of a broom against a surface that is repeated an indefinite number of times. See Rappaport Hovav (2008) for extensive illustration of the grammatical reflexes of the scalar/nonscalar change distinction.

As this section makes clear, roots indirectly influence argument realization as their ontological type determines their association with a particular event structure. We leave open the question of whether roots can more directly influence argument realization. For example, the LCS proposed for cut in (5) includes elements of meaning which are normally associated with the root since “contact” or a similar concept has not figured among proposals for the set of primitive predicates constituting an LCS. Yet, this element of meaning is implicated by Guerssel et al. in the conative alternation. (In contrast, the notion “effect” more or less reduces to a change of state of some type.)
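To visualize the division of labor between roots and templates, and the effect of Template Augmentation, here is a minimal sketch under an encoding of our own; the class names and state labels are illustrative and are not Rappaport Hovav & Levin’s notation.

```python
# A toy encoding of event structure templates with roots, and of
# Template Augmentation: an activity template with a manner root is
# extended with a caused result-state subevent, as in the adding and
# removing uses of instrument verbs discussed above.

from dataclasses import dataclass

@dataclass
class Activity:              # [ x act<ROOT> ]
    root: str                # manner root, e.g. "SCRUB", "SHOVEL"

@dataclass
class CausedChangeOfState:   # [ [ x act<ROOT> ] cause [ become [ y <STATE> ] ] ]
    root: str
    result_state: str

def template_augmentation(base: Activity, result_state: str) -> CausedChangeOfState:
    """Augment a simple activity template with a result-state subevent."""
    return CausedChangeOfState(root=base.root, result_state=result_state)

shovel = Activity(root="SHOVEL")
# 'Kelly shoveled the snow off the drive': the removal reading adds a
# result state; which state is added reflects how the instrument is used.
print(template_augmentation(shovel, result_state="NOT-AT-LOCATION"))
```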
4. Choosing primitive predicates

LCSs share with other forms of predicate decomposition the properties that are said to make such representations an improvement over lists of semantic roles, whether Fillmore’s (1968) cases or Gruber (1965/1976) and Jackendoff’s (1972) thematic relations, as structured representations of verb meaning. There is considerable discussion of the problems with providing independent, necessary and sufficient definitions of semantic roles (see e.g. Dowty 1991 and article 18 (Davis) Thematic roles), and one suggestion for dealing with this problem, first found in Jackendoff (1972), is that semantic
roles can be identified with particular open positions in predicate decompositions. For example, the semantic role “agent” might be identified with the first argument of a primitive predicate cause. There is a perception that the set of primitive predicates used in a verb’s LCS or event structure is better motivated than the set of semantic role labels for its arguments, and for this reason predicate decompositions might appear to be superior to a list of semantic role labels as a structured representation of a verb’s meaning. However, there is surprisingly little discussion of the explicit criteria for positing a particular primitive predicate, although see the discussion in Carter (1978), Jackendoff (1983: 203–204), and Joshi (1974).

The primitive predicates which surface repeatedly in studies using LCSs or other forms of predicate decomposition are act or do, be, become or change, cause, and go, although the predicates have, move, stay, and, more recently, result are also proposed. Jackendoff (1990) posits a significantly greater number of predicates than in his previous work, introducing the predicates configure, extend, exchange, form, inch(oative), orient, and react. Article 32 (Hobbs) Word meaning and world knowledge discusses how some of these predicates may be grounded in an axiomatic semantics. Once predicates begin to proliferate, theories of predicate decomposition face many of the well-known problems facing theories of semantic roles (cf. Dowty 1991). The question is whether it is possible to identify a small, comprehensive, universal, and well-motivated set of predicates accepted by all. It is worthwhile, therefore, to scrutinize the motivation for proposing a predicate in the first place and to try to make explicit when the introduction of a new predicate is justified.

In positing a set of predicates, researchers have tried to identify recurring elements of verb meaning that figure in generalizations holding across the set of verbs within (and, ultimately, across) languages. Often these generalizations involve common entailments or common grammatical properties. Wilks (1987) sets out general desiderata for a set of primitive predicates that are implicit in other work. For instance, the set of predicates should be finite in size and each predicate in the set should indeed be “primitive” in that it should not be reducible to other predicates in the set, nor should it even be partially definable in terms of another predicate. Thus, in positing a new predicate, it is important to consider its effect on the overall set of predicates. Wilks also proposes that the set of predicates should be able to exhaustively describe and distinguish the verbs of each language, but LCS-style representations, by adopting the root–event structure distinction, simply require that the set of primitives should be able to describe all the grammatically relevant event types. It is the role of the root to distinguish between specific verbs of the same event type, and there is a general, but implicit assumption that the roots themselves cannot be reduced to a set of primitive elements. As Wilks (1987: 760) concludes, the ultimate justification for a set of primitives is in their “special organizing role in a language system”. We now briefly present several case studies chosen to illustrate the type of reasoning used in positing a predicate.
One way of arriving at a set of primitive predicates is to adopt a hypothesis that circumscribes the basic inventory of event types, while allowing for all events to be analyzed in terms of these types. This approach is showcased in the work of Jackendoff (1972, 1976, 1983, 1987, 1990), who develops ideas proposed by Gruber (1965/1976). Jackendoff adopts the localist hypothesis: motion and location events are basic and all other events should be construed as such events. There is one basic type of motion event, represented
by the primitive predicate go, which takes as arguments a theme and the path (e.g. The cart went from the farm to the market). There are two basic types of locational events, represented by the predicate be (for stative events) and stay (for non-stative events); these predicates also take a theme and a location as arguments (e.g. The coat was/stayed in the closet). In addition, Jackendoff introduces the predicates cause and let, which are used to form complex events taking as arguments a causer and a motion or location event. Events that are not obviously events of motion or location are construed in terms of some abstract form of motion or location. For example, with events of possession, possessums can be taken as themes and possessors as locations in an abstract possessional “field” or domain. The verb give is analyzed as describing a causative motion event in the possessional field in which a possessum is transferred from one possessor to another. Physical and mental states and changes of state can be seen as involving an entity being “located” in a state or “moving” from one state to a second state in an identificational field; the verb break, for instance, describes an entity moving from a state of being whole to a state of being broken. Generalizing, correspondences are set up between the components of motion and location events and the components of other event domains or “semantic fields” in Jackendoff’s terms; this is what Jackendoff (1983: 188) calls the Thematic Relations Hypothesis; see article 30 (Jackendoff) Conceptual Semantics.

In general on this view, predicates are most strongly motivated when they figure prominently in lexical organization and in cross-field generalizations. This kind of cross-field organization can be illustrated in a number of ways. First, many English verbs have uses based on the same predicate in more than one field (Jackendoff 1983: 203–204). Thus, the predicate stay receives support from the English verb keep, whose meaning presumably involves the predicates cause and stay, combined in a representation as in (16), because it shows uses involving the three fields just introduced.

(16) [cause (x, (stay (y, z)))]

(17) a. Terry kept the bike in the shed. (Positional)
b. Terry kept the bike basket. (Possessional)
c. Terry kept the bike clean. (Identificational)

Without the notion of semantic field, there would be no reason to expect English to have verbs which can be used to describe events which on the surface seem quite different from one another, as in (17). Second, rules of inference hold of shared predicates across fields (Jackendoff 1976). One example is that “if an event is caused, it takes place” (Jackendoff 1976: 110), so that the entailment The bike stayed in the shed can be derived from (17a), the entailment The bike basket stayed with Terry can be derived from (17b), and the entailment The bike stayed clean can be derived from (17c). This supports the use of the predicate cause across fields. Finally, predicates are justified when they explain cross-field generalizations in the use of prepositions. The use of the allative preposition to is taken to support the analysis of give as a causative verb of motion.

These very considerations lead Carter (1978: 70–74) to argue against the primitive predicate stay. Carter points out that few English words have meanings that include the notion captured by stay, yet if stay were to number among the primitive predicates,
such words would be expected to be quite prevalent. So, while the primitives cause and become are motivated because languages often contain a multitude of lexical items differentiated by just these predicates (e.g. the various uses of cool in The cook cooled the cake, The cake cooled, The cake was cool), there is no minimal pair differentiated by the existence of stay; for example, the verb cool cannot also mean ‘stay cool’. Carter also notes that if not is included in the set of predicates, then the predicate stay becomes superfluous, as it could be replaced by not plus change, a predicate which is roughly an analogue to Jackendoff’s go. Carter further notes that as a result simpler statements of certain inference rules and other generalizations might be possible.

However, the primitive predicates which serve best as the basis for cross-field generalizations are not necessarily the ones that emerge from efforts to account for argument alternations – the efforts that lead to LCS-style representations and their descendants. This point can be illustrated by examining another predicate whose existence has been controversial, have. Jackendoff posits a possessional field that is modeled on the locational field: being possessed is taken to be similar to being at a location – existing at that location (see also Lyons 1967, 1968: 391–395). This approach receives support since many entailments involving location also apply to possession, such as the entailment described for stay. Furthermore, in some languages, including Hindi-Urdu, the same verb is used in basic locational and possessive sentences, suggesting that possession can be reduced to location (though the facts are often more complicated than they appear on the surface; see Harley 2003).

Nevertheless, Pinker (1989: 189–190) and Tham (2004: 62–63, 74–85, 100–104) argue that an independent predicate have is necessary; see also Harley (2003). Pinker points out that in terms of the expression of its arguments, it is belong and not have which resembles locational predicates. The verb belong takes the possessum, which would be analyzed as a theme (i.e. located entity), as its subject and the possessor, which would be analyzed as a location, as an oblique. Its argument realization, then, parallels that of a locational predicate; compare (18a) and (18b). In contrast, have takes the possessor as subject and the possessum as object, as in (19), so its arguments show the reverse syntactic prominence relations – a “marked” argument realization, which would need an explanation, on the localist analysis.

(18) a. One of the books belongs to me.
b. One of the books is on the table.

(19) I have one of the books.

Pinker points out that an analysis which takes have to be a marked possessive predicate is incompatible with the observation that it is a high-frequency verb, which is acquired early and unproblematically by children. Tham (2004: 100–104) further points out that it is belong that is actually the “marked” verb from other perspectives: it imposes a referentiality condition on its possessum and it is used in a restricted set of information structure contexts – all restrictions that have does not share. Taking all these observations together, Tham concludes that have shows the unmarked realization of arguments for possessive predicates, while belong shows a marked realization of arguments.
Thus, she argues that the semantic prominence relations in unmarked possessive and locative sentences are quite different and, therefore, warrant positing a predicate have.
5. Subeventual analysis

One way in which more recent work on event structure departs from earlier work on LCSs is that it begins to use the structure of the semantic representation itself, rather than reference to particular predicates in this representation, in formulating generalizations about argument realization. In so doing, this work capitalizes on the assumption, present in some form since the generative semantics era, that predicate decompositions may have a subeventual analysis. Thus, it recognizes a distinction between two types of event structures: simple event structures and complex event structures, which themselves are constituted of simple event structures. The prototypical complex event structure is a causative event structure, in which an entity or event causes another event, though Ramchand (2008) takes some causative events to be constituted of three subevents, an initiating event, a process, and a result.

Beginning with the work of the generative semanticists, the positing of a complex event structure was supported using evidence from scope ambiguities involving various adverbial phrases (McCawley 1968, 1971, Morgan 1969, von Stechow 1995, 1996). Specifically, a complex event structure may afford certain adverbials, such as again, more scope-taking options than a simple event structure, and thus adverbials may show interpretations in sentences denoting complex events that are unavailable in those denoting simple events. Thus, (20) shows both so-called “restitutive” and “repetitive” readings, while (21) has only a “repetitive” reading.

(20) Dale closed the door again.
Repetitive: the action of closing the door was performed before.
Restitutive: the door was previously in the state of being closed, but there is no presupposition that someone had previously closed the door.

(21) John kicked the door again.
Repetitive: the action of kicking the door was performed before.

The availability of two readings in (20) and one reading in (21) is explained by attributing a complex event structure to (20) and a simple event structure to (21).

A recent line of research argues that the architecture of event structure also matters to argument realization, thus motivating complex event structures based on argument realization considerations. This idea is proposed by Grimshaw & Vikner (1993), who appeal to it to explain certain restrictions on the passivization of verbs of creation (though the pattern of acceptability that they are trying to explain turns out to have been mischaracterized; see Macfarland 1995). This idea is further exploited by Rappaport Hovav & Levin (1998a) and Levin & Rappaport Hovav (1999) to explain a variety of facts related to objecthood. In this work, the notion of event complexity gains explanatory power via the assumption that there must be an argument in the syntax for each subevent in an event structure (Rappaport Hovav & Levin 1998a, 2001; see also Grimshaw & Vikner 1993 and van Hout 1996 for similar conditions). Given this assumption, a verb with a simple event structure may be transitive or intransitive, while a verb with a complex event structure,
say a causative verb, must necessarily be transitive. Rappaport Hovav & Levin (1998a) attribute the necessary transitivity of break and melt, which contrasts with the “optional” transitivity of sweep and wipe, to this constraint; the former, as causative verbs, have a complex event structure.

(22) a. *Blair broke/melted.
b. Blair wiped and swept.

Levin & Rappaport Hovav (1999) use the same assumption to explain why a resultative based on an unergative verb can only predicate its result state of the subject via a “fake” reflexive object.

(23) My neighbor talked *(herself) hoarse.

Resultatives have a complex, causative event structure, so there must be a syntactically realized argument representing the argument of the result state subevent; in (23) it is a reflexive pronoun as it is the subject which ends up in the result state. Levin (1999) uses the same idea to explain why agent-act-on-patient verbs are transitive across languages, while other two-argument verbs vary in their transitivity: only the former are required to be transitive. As in other work on LCSs and event structure, the use of subeventual structure is motivated by argument realization considerations.

Pustejovsky (1995) and van Hout (1996) propose an alternative perspective on event complexity: they take telic events, rather than causative events, to be complex events. Since most causative events are telic events, the two views of event complexity assign complex event structures in many of the same instances. A major difference is in the treatment of telic uses of manner of motion verbs such as Terry ran to the library and telic uses of consumption verbs such as Kerry ate the peach; these are considered complex predicates on the telicity approach, but not on the causative approach. See Levin (2000) for some discussion.

The subeventual analysis also defines hierarchical relations among arguments, allowing rules of argument realization to be formulated in terms of the geometry of the LCS. We now discuss advantages of such a formulation over direct reference to semantic roles.
6. LCSs and syntax

LCSs, as predicate decompositions, include the embedding of constituents, giving rise to a hierarchical structure. This hierarchical structure, which includes the subeventual structure discussed in section 5, allows a notion of semantic prominence to be defined, which mirrors the notion of syntactic prominence. For instance, Wunderlich (1997a, 1997b, 2006) introduces a notion of a-command defined over predicate decompositions, which is an analogue of syntactic c-command. By taking advantage of the hierarchical structure of LCSs, it becomes possible to formulate argument realization rules in terms of the geometry of LCSs and, more importantly, to posit natural constraints on the nature of the mapping between LCS and syntax. As discussed at length in Levin & Rappaport Hovav (2005: chapter 5), many researchers assume that the mapping between lexical semantics and syntax obeys a constraint of prominence preservation: relations of semantic prominence in a semantic representation are preserved in the corresponding
syntactic representation, so that prominence in the syntax reflects prominence in semantics (Bouchard 1995). This idea is implicit in the many studies that use a thematic hierarchy – a ranking of semantic roles – to guide the semantics-syntax mapping and explain various other facets of grammatical behavior; however, most work adopting a thematic hierarchy does not provide independent motivation for the posited role ranking. Predicate decompositions can provide some substance to the notion of a thematic hierarchy by correlating the position of a role in the hierarchy with the position of the argument bearing that role in a predicate decomposition (Baker 1997, Croft 1998, Kiparsky 1997, Wunderlich 1997a, 1997b, 2000). (There are other ways to ground the thematic hierarchy; see article 18 (Davis) Thematic roles.) Researchers such as Bouchard (1995), Kiparsky (1997), and Wunderlich (1997a, 1997b) assume that predicate decompositions constitute a lexical semantic representation, but many other researchers now assume that predicate decompositions are syntactic representations, built from syntactic primitives and constrained by principles of syntax. This move obviates the need for prominence preservation in the semantics-syntax mapping since the lexical semantic representation and the syntactic representation are one. The idea that predicate decompositions motivated by semantic considerations are remarkably similar to syntactic structures, and thus should be taken to be syntactic structures, had already been made explicit in the generative semantics literature (e.g. Morgan 1969). Hale & Keyser were the first to articulate this position in the context of current syntactic theory; their LCS-style syntactic structures are called "lexical relational structures" in some of their work (1992, 1993, 1997). The proposal that predicate decompositions should be syntactically instantiated has gained currency recently and is defended or assumed in a range of work, including Erteschik-Shir & Rapoport (2004), Mateu (2001a, 2001b), Travis (2000), and Zubizarreta & Oh (2007), who all build directly on Hale & Keyser's work, as well as Alexiadou, Anagnostopoulou & Schäfer (2006), Harley (2003, 2005), Pylkkänen (2008), and Ramchand (2008), who calls the "lexical" part of syntax "first phase syntax". We now review the types of arguments adduced in support of this view. Hale & Keyser (1993) claim that their approach explains why there are few semantic role types (although this claim is not uncontroversial; see Dowty 1991 and Kiparsky 1997). For them, the argument structure of a verb is syntactically defined and represented. Furthermore, individual lexical categories (V, in particular) are constrained so as to project syntactic structures using just the syntactic notions of specifier and complement. These syntactic structures are associated with coarse-grained semantic notions, often corresponding to the predicates typical in standard predicate decompositions; the positions in these structures correspond to semantic roles, just as the argument positions in a standard predicate decomposition are said to correspond to semantic roles; see section 4. For example, the patient of a change of state verb is the specifier of a verbal projection in which a verb takes an adjective complement (i.e. the N position in '[V N [V V A]]'). Since the number of possible structural relations in syntax is limited, the number of expressible semantic roles is also limited.
Furthermore, Hale & Keyser suggest that the nature of these syntactic representations of argument structure also provides insight into Baker's (1988: 46, 1997) Uniformity of Theta Role Assignment, which requires that identical semantic relationships between items be represented by identical structural relations between those items at d-structure. On Hale & Keyser's approach, this principle must follow since semantic roles are defined over a hierarchical syntactic structure, with a particular role always having the same instantiation. These ideas are developed further
in Ramchand (2008), who assumes a slightly more articulated lexical syntactic structure than Hale & Keyser do. Hale & Keyser provide support for their proposal that verb meanings have internal syntactic structure from the syntax of denominal verbs. They observe that certain impossible types of denominal verbs parallel impossible types of noun incorporation. In particular, they argue that just as there is no noun incorporation of either agents or recipients in languages with noun incorporation (Baker 1988: 453–454, fn. 13, 1996: 291–295), so there is no productive denominal verb formation where the base noun is interpreted as an agent or a recipient (e.g. *I churched the money, where churchN is understood as a recipient; *It cowed a calf, where cowN is understood as the agent). However, denominal verbs are productively formed from nouns analyzed as patients (e.g. I buttered the pan) and containers or locations (e.g. I bottled the wine). Hale & Keyser argue that the possible denominal verb types follow on the assumption that these putative verbs are derived from syntactic structures in which the base noun occupies a position which reflects its semantic role. This point is illustrated using their representations for the verbs paint and shelve given in (24) and (25), respectively, which are presented in the linearized form given in Wechsler (2006: 651, ex. (17)–(18)) rather than in Hale & Keyser's (1993) tree representations.

(24) a. We painted the house.
     b. We [V′ V1 [VP house [V′ V2 [PP Pwith paint]]]].

(25) a. We shelved the books.
     b. We [V′ V1 [VP books [V′ V2 [PP Pon shelf]]]].

The verbs paint and shelve are derived through the movement of the base noun – the verb's root – into the empty V1 position in the structures in (b), after first merging it with the preposition Pwith or Pon and then with V2. Hale & Keyser argue that the movement of the root is subject to a general constraint on the movement of heads (Baker 1988, Travis 1984). Likewise, putative verbs such as bush or house, used in sentences such as *I bushed a trim (with the intended interpretation 'I gave the bush a trim') or *I housed a coat of paint (with the intended interpretation 'I gave the house a coat of paint') are also excluded by the same constraint. Hale & Keyser's approach is sharply criticized by Kiparsky (1997). He points out that the syntax alone will not prevent the derivation of sentences such as *I bushed some fertilizer from a putative source corresponding to I put some fertilizer on the bush (cf. the source for I bottled the wine or I corralled the horse). On his view, the association of a particular root with a specific lexical syntactic structure is governed by conceptual principles such as "If an action refers to a thing, it involves a canonical use of the thing" (Kiparsky 1997: 482). Such principles ensure that bush will not be inserted into a structure as a location, since unlike a bottle, its canonical use is not as a container or place. Denominal verbs in English, although they do not involve any explicit verb-forming morphology, have been said, then, to parallel constructions in other languages which do involve overt morphological or syntactic derivation (e.g. noun incorporation), and this parallel has been taken to support the syntactic derivation of these words. Comparable arguments can be made in other areas of the English lexicon. For example, in most languages of the world, manner of motion verbs cannot on their own license a directional
phrase, though in English all manner of motion verbs can. Specifically, sentences parallel to the English Tracy ambled into the room are derived through a variety of morphosyntactic means in other languages, including the use of directional suffixes or applicative morphemes on manner of motion verbs and the combination of manner and directed motion verbs in compounds or serial verb constructions (Schaefer 1985, Slobin 2004, Talmy 1991). Japanese, for example, must compound a manner of motion verb with a directed motion verb in order to express manner of motion to a goal, as the contrast in (26) shows.

(26) a. ?John-wa kishi-e oyoida.
        John-top shore-to swam
        'John swam to the shore.' (intended; Yoneyama 1986: 1, ex. (1b))
     b. John-wa kishi-e oyoide-itta.
        John-top shore-to swimming-went
        'John swam to the shore.' (Yoneyama 1986: 2, ex. (3b))

The association of manner of motion roots with a directed motion event type is accomplished in theories such as Rappaport Hovav & Levin (1998b, 2001) and Levin & Rappaport Hovav (1999) by processes which augment the event structure associated with the manner of motion verbs. But such theories never make explicit exactly where the derivation of these extended structures takes place. Ramchand (2008) and Zubizarreta & Oh (2007) argue that since these processes are productive and their outputs are to a large extent compositional, they should be assigned to the "generative computational" module of the grammar, namely, syntax. Moreover, as Ramchand explicitly argues, this syntacticized approach suggests that languages that might appear quite different are, in fact, underlyingly quite similar, once lexical syntactic structures are considered.

Finally, we comment on the relation between structured event representations and the lexical entries of verbs. Recognizing that many roots in English can appear as words belonging to several lexical categories and that verbs themselves can be associated with various event structures, Borer (2005) articulates a radically nonlexical position: she proposes that roots are category neutral. That is, there is no association specified in the lexicon between roots and the event structures they appear with. Erteschik-Shir & Rapoport (2004, 2005), Levin & Rappaport Hovav (2005), Ramchand (2008), and Rappaport Hovav & Levin (2005) all point out that even in English the flexibility of this association is still limited by the semantics of the root. Ramchand includes in the lexical entries of verbs the parts of the event structure that a verbal root can be associated with, while Levin & Rappaport Hovav make use of "canonical realization rules", which pair roots with event structures based on their ontological types, as discussed in section 3.
7. Conclusion

LCSs are a form of predicate decomposition intended to capture those facets of verb meaning which determine grammatical behavior, particularly in the realm of argument realization. Research on LCSs and the structured representations that are their descendants has contributed to our understanding of the nature of verb meaning and the
relation between verb syntax and semantics. This research has shown the importance of semantic representations that distinguish between root and event structure, as well as the importance of the architecture of the event structure to the determination of grammatical behavior. Furthermore, such developments have led some researchers to propose that representations of verb meaning should be syntactically instantiated.

This research was supported by Israel Science Foundation Grants 806–03 and 379–07 to Rappaport Hovav. We thank Scott Grimm, Marie-Catherine de Marneffe, and Tanya Nikitina for comments on an earlier draft.
8. References

Alexiadou, Artemis, Elena Anagnostopoulou & Florian Schäfer 2006. The properties of anticausatives crosslinguistically. In: M. Frascarelli (ed.). Phases of Interpretation. Berlin: Mouton de Gruyter, 187–211.
Aronoff, Mark 1993. Morphology by Itself. Stems and Inflectional Classes. Cambridge, MA: The MIT Press.
Bach, Emmon 1986. The algebra of events. Linguistics & Philosophy 9, 5–16.
Baker, Mark C. 1988. Incorporation. A Theory of Grammatical Function Changing. Chicago, IL: The University of Chicago Press.
Baker, Mark C. 1996. The Polysynthesis Parameter. New York: Oxford University Press.
Baker, Mark C. 1997. Thematic roles and syntactic structure. In: L. Haegeman (ed.). Elements of Grammar. Handbook of Generative Syntax. Dordrecht: Kluwer, 73–137.
Bartsch, Renate & Theo Vennemann 1972. The grammar of relative adjectives and comparison. Linguistische Berichte 20, 19–32.
Beavers, John 2008. Scalar complexity and the structure of events. In: J. Dölling & T. Heyde-Zybatow (eds.). Event Structures in Linguistic Form and Interpretation. Berlin: de Gruyter, 245–265.
Borer, Hagit 2005. Structuring Sense, vol. 2: The Normal Course of Events. Oxford: Oxford University Press.
Bouchard, Denis 1995. The Semantics of Syntax. A Minimalist Approach to Grammar. Chicago, IL: The University of Chicago Press.
Bresnan, Joan 1982. The passive in lexical theory. In: J. Bresnan (ed.). The Mental Representation of Grammatical Relations. Cambridge, MA: The MIT Press, 3–86.
Carter, Richard J. 1978. Arguing for semantic representations. Recherches Linguistiques de Vincennes 5–6, 61–92.
Chomsky, Noam 1981. Lectures on Government and Binding. Dordrecht: Foris.
Croft, William 1998. Event structure in argument linking. In: M. Butt & W. Geuder (eds.). The Projection of Arguments. Lexical and Compositional Factors. Stanford, CA: CSLI Publications, 21–63.
Doron, Edit 2003. Agency and voice. The semantics of the Semitic templates. Natural Language Semantics 11, 1–67.
Dowty, David R. 1979. Word Meaning and Montague Grammar. The Semantics of Verbs and Times in Generative Semantics and in Montague's PTQ. Dordrecht: Reidel.
Dowty, David R. 1991. Thematic proto-roles and argument selection. Language 67, 547–619.
Erteschik-Shir, Nomi & Tova R. Rapoport 2004. Bare aspect. A theory of syntactic projection. In: J. Guéron & J. Lecarme (eds.). The Syntax of Time. Cambridge, MA: The MIT Press, 217–234.
Erteschik-Shir, Nomi & Tova R. Rapoport 2005. Path predicates. In: N. Erteschik-Shir & T. Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 65–86.
Farmer, Ann K. 1984. Modularity in Syntax. A Study of Japanese and English. Cambridge, MA: The MIT Press.
Fillmore, Charles J. 1968. The case for case. In: E. Bach & R. T. Harms (eds.). Universals in Linguistic Theory. New York: Holt, Rinehart & Winston, 1–88.
Fillmore, Charles J. 1970. The grammar of hitting and breaking. In: R. A. Jacobs & P. S. Rosenbaum (eds.). Readings in English Transformational Grammar. Waltham, MA: Ginn, 120–133.
Fodor, Jerry & Ernest Lepore 1999. Impossible words? Linguistic Inquiry 30, 445–453.
Goldberg, Adele E. 1995. Constructions. A Construction Grammar Approach to Argument Structure. Chicago, IL: The University of Chicago Press.
Grimshaw, Jane 1990. Argument Structure. Cambridge, MA: The MIT Press.
Grimshaw, Jane 2005. Words and Structure. Stanford, CA: CSLI Publications.
Grimshaw, Jane & Sten Vikner 1993. Obligatory adjuncts and the structure of events. In: E. Reuland & W. Abraham (eds.). Knowledge and Language, vol. 2: Lexical and Conceptual Structure. Dordrecht: Kluwer, 143–155.
Gruber, Jeffrey S. 1965/1976. Studies in Lexical Relations. Ph.D. dissertation. MIT, Cambridge, MA. Reprinted in: J. S. Gruber. Lexical Structures in Syntax and Semantics. Amsterdam: North-Holland, 1976, 1–210.
Guerssel, Mohamed et al. 1985. A cross-linguistic study of transitivity alternations. In: W. H. Eilfort, P. D. Kroeber & K. L. Peterson (eds.). Papers from the Parasession on Causatives and Agentivity. Chicago, IL: Chicago Linguistic Society, 48–63.
Hale, Kenneth L. & Samuel J. Keyser 1987. A View from the Middle. Lexicon Project Working Papers 10. Cambridge, MA: Center for Cognitive Science, MIT.
Hale, Kenneth L. & Samuel J. Keyser 1992. The syntactic character of thematic structure. In: I. M. Roca (ed.). Thematic Structure. Its Role in Grammar. Berlin: Foris, 107–143.
Hale, Kenneth L. & Samuel J. Keyser 1993. On argument structure and the lexical expression of syntactic relations. In: K. L. Hale & S. J. Keyser (eds.). The View from Building 20. Essays in Linguistics in Honor of Sylvain Bromberger. Cambridge, MA: The MIT Press, 53–109.
Hale, Kenneth L. & Samuel J. Keyser 1997. On the complex nature of simple predicators. In: A. Alsina, J. Bresnan & P. Sells (eds.). Complex Predicates. Stanford, CA: CSLI Publications, 29–65.
Hale, Kenneth L. & Samuel J. Keyser 2002. Prolegomenon to a Theory of Argument Structure. Cambridge, MA: The MIT Press.
Harley, Heidi 2003. Possession and the double object construction. In: P. Pica & J. Rooryck (eds.). Linguistic Variation Yearbook, vol. 2. Amsterdam: Benjamins, 31–70.
Harley, Heidi 2005. How do verbs get their names? Denominal verbs, manner incorporation and the ontology of verb roots in English. In: N. Erteschik-Shir & T. Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 42–64.
Haspelmath, Martin 1993. More on the typology of inchoative/causative verb alternations. In: B. Comrie & M. Polinsky (eds.). Causatives and Transitivity. Amsterdam: Benjamins, 87–120.
Hay, Jennifer, Christopher Kennedy & Beth Levin 1999. Scalar structure underlies telicity in 'degree achievements'. In: T. Matthews & D. Strolovich (eds.). Proceedings of Semantics and Linguistic Theory (=SALT) IX. Ithaca, NY: Cornell University, 127–144.
van Hout, Angeliek 1996. Event Semantics of Verb Frame Alternations. A Case Study of Dutch and Its Acquisition. Doctoral dissertation. Katholieke Universiteit Brabant, Tilburg.
Jackendoff, Ray S. 1972. Semantic Interpretation in Generative Grammar. Cambridge, MA: The MIT Press.
Jackendoff, Ray S. 1976. Toward an explanatory semantic representation. Linguistic Inquiry 7, 89–150.
Jackendoff, Ray S. 1983. Semantics and Cognition. Cambridge, MA: The MIT Press.
Jackendoff, Ray S. 1987. The status of thematic relations in linguistic theory. Linguistic Inquiry 18, 369–411.
Jackendoff, Ray S. 1990. Semantic Structures. Cambridge, MA: The MIT Press.
Jackendoff, Ray S. 1996a. Conceptual semantics and cognitive linguistics. Cognitive Linguistics 7, 93–129.
Jackendoff, Ray S. 1996b. The proper treatment of measuring out, telicity, and perhaps even quantification in English. Natural Language and Linguistic Theory 14, 305–354.
Joshi, Aravind 1974. Factorization of verbs. In: C. H. Heidrich (ed.). Semantics and Communication. Amsterdam: North-Holland, 251–283.
Kennedy, Christopher 1999. Projecting the Adjective. The Syntax and Semantics of Gradability and Comparison. New York: Garland.
Kennedy, Christopher 2001. Polar opposition and the ontology of 'degrees'. Linguistics & Philosophy 24, 33–70.
Kennedy, Christopher & Beth Levin 2008. Measure of change. The adjectival core of degree achievements. In: L. McNally & C. Kennedy (eds.). Adjectives and Adverbs. Syntax, Semantics, and Discourse. Oxford: Oxford University Press, 156–182.
Kiparsky, Paul 1997. Remarks on denominal verbs. In: A. Alsina, J. Bresnan & P. Sells (eds.). Complex Predicates. Stanford, CA: CSLI Publications, 473–499.
Koontz-Garboden, Andrew 2007. States, Changes of State, and the Monotonicity Hypothesis. Ph.D. dissertation. Stanford University, Stanford, CA.
Krifka, Manfred 1998. The origins of telicity. In: S. Rothstein (ed.). Events and Grammar. Dordrecht: Kluwer, 197–235.
Lakoff, George 1968. Some verbs of change and causation. In: S. Kuno (ed.). Mathematical Linguistics and Automatic Translation. Report NSF-20. Cambridge, MA: Aiken Computation Laboratory, Harvard University.
Lakoff, George 1970. Irregularity in Syntax. New York: Holt, Rinehart & Winston.
Laughren, Mary 1988. Towards a lexical representation of Warlpiri verbs. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 215–242.
Levin, Beth 1999. Objecthood. An event structure perspective. In: S. Billings, J. Boyle & A. Griffith (eds.). Proceedings of the Chicago Linguistic Society (=CLS) 35, Part 1: Papers from the Main Session. Chicago, IL: Chicago Linguistic Society, 223–247.
Levin, Beth 2000. Aspect, lexical semantic representation, and argument expression. In: L. J. Conathan et al. (eds.). Proceedings of the 26th Annual Meeting of the Berkeley Linguistics Society (=BLS). General Session and Parasession on Aspect. Berkeley, CA: Berkeley Linguistic Society, 413–429.
Levin, Beth & Malka Rappaport Hovav 1991. Wiping the slate clean. A lexical semantic exploration. Cognition 41, 123–151.
Levin, Beth & Malka Rappaport Hovav 1995. Unaccusativity. At the Syntax-Lexical Semantics Interface. Cambridge, MA: The MIT Press.
Levin, Beth & Malka Rappaport Hovav 1999. Two structures for compositionally derived events. In: T. Matthews & D. Strolovich (eds.). Proceedings of Semantics and Linguistic Theory (=SALT) IX. Ithaca, NY: Cornell University, 199–223.
Levin, Beth & Malka Rappaport Hovav 2005. Argument Realization. Cambridge: Cambridge University Press.
Lyons, John 1967. A note on possessive, existential and locative sentences. Foundations of Language 3, 390–396.
Lyons, John 1968. Introduction to Theoretical Linguistics. Cambridge: Cambridge University Press.
Macfarland, Talke 1995. Cognate Objects and the Argument/Adjunct Distinction in English. Ph.D. dissertation. Northwestern University, Evanston, IL.
Marantz, Alec P. 1984. On the Nature of Grammatical Relations. Cambridge, MA: The MIT Press.
Mateu, Jaume 2001a. Small clause results revisited. In: N. Zhang (ed.). Syntax of Predication. ZAS Papers in Linguistics 26. Berlin: Zentrum für Allgemeine Sprachwissenschaft.
Mateu, Jaume 2001b. Unselected objects. In: N. Dehé & A. Wanner (eds.). Structural Aspects of Semantically Complex Verbs. Frankfurt/M.: Lang, 83–104.
McCawley, James D. 1968. Lexical insertion in a transformational grammar without deep structure. In: B. J. Darden, C.-J. N. Bailey & A. Davison (eds.). Papers from the Fourth Regional Meeting of the Chicago Linguistic Society (=CLS). Chicago, IL: Chicago Linguistic Society, 71–80.
McCawley, James D. 1971. Prelexical syntax. In: R. J. O'Brien (ed.). Report of the 22nd Annual Roundtable Meeting on Linguistics and Language Studies. Washington, DC: Georgetown University Press, 19–33.
McClure, William T. 1994. Syntactic Projections of the Semantics of Aspect. Ph.D. dissertation. Cornell University, Ithaca, NY.
Morgan, Jerry L. 1969. On arguing about semantics. Papers in Linguistics 1, 49–70.
Nida, Eugene A. 1975. Componential Analysis of Meaning. An Introduction to Semantic Structures. The Hague: Mouton.
Parsons, Terence 1990. Events in the Semantics of English. Cambridge, MA: The MIT Press.
Pesetsky, David M. 1982. Paths and Categories. Ph.D. dissertation. MIT, Cambridge, MA.
Pesetsky, David M. 1995. Zero Syntax. Experiencers and Cascades. Cambridge, MA: The MIT Press.
Pinker, Steven 1989. Learnability and Cognition. The Acquisition of Argument Structure. Cambridge, MA: The MIT Press.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press.
Pylkkänen, Liina 2008. Introducing Arguments. Cambridge, MA: The MIT Press.
Ramchand, Gillian C. 1997. Aspect and Predication. The Semantics of Argument Structure. Oxford: Clarendon Press.
Ramchand, Gillian C. 2008. Verb Meaning and the Lexicon. A First-Phase Syntax. Cambridge: Cambridge University Press.
Rappaport Hovav, Malka 2008. Lexicalized meaning and the internal temporal structure of events. In: S. Rothstein (ed.). Theoretical and Crosslinguistic Approaches to the Semantics of Aspect. Amsterdam: Benjamins, 13–42.
Rappaport Hovav, Malka & Beth Levin 1998a. Morphology and lexical semantics. In: A. Spencer & A. Zwicky (eds.). The Handbook of Morphology. Oxford: Blackwell, 248–271.
Rappaport Hovav, Malka & Beth Levin 1998b. Building verb meanings. In: M. Butt & W. Geuder (eds.). The Projection of Arguments. Lexical and Compositional Factors. Stanford, CA: CSLI Publications, 97–134.
Rappaport Hovav, Malka & Beth Levin 2001. An event structure account of English resultatives. Language 77, 766–797.
Rappaport Hovav, Malka & Beth Levin 2005. Change of state verbs. Implications for theories of argument projection. In: N. Erteschik-Shir & T. Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 276–286.
Rappaport Hovav, Malka & Beth Levin 2008. The English dative alternation. The case for verb sensitivity. Journal of Linguistics 44, 129–167.
Rappaport Hovav, Malka & Beth Levin 2010. Reflections on manner/result complementarity. In: E. Doron, M. Rappaport Hovav & I. Sichel (eds.). Syntax, Lexical Semantics, and Event Structure. Oxford: Oxford University Press, 21–38.
Rappaport, Malka, Beth Levin & Mary Laughren 1988. Niveaux de représentation lexicale. Lexique 7, 13–32. English translation in: J. Pustejovsky (ed.). Semantics and the Lexicon. Dordrecht: Kluwer, 1993, 37–54.
Ross, John Robert 1972. Act. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 70–126.
Rothstein, Susan 2003. Structuring Events. A Study in the Semantics of Aspect. Oxford: Blackwell.
Schaefer, Ronald P. 1985. Motion in Tswana and its characteristic lexicalization. Studies in African Linguistics 16, 57–87.
Slobin, Dan I. 2004. The many ways to search for a frog. Linguistic typology and the expression of motion events. In: S. Strömqvist & L. Verhoeven (eds.). Relating Events in Narrative, vol. 2: Typological and Contextual Perspectives. Mahwah, NJ: Erlbaum, 219–257.
von Stechow, Arnim 1995. Lexical decomposition in syntax. In: U. Egli et al. (eds.). Lexical Knowledge in the Organization of Language. Amsterdam: Benjamins, 81–118.
von Stechow, Arnim 1996. The different readings of wieder. A structural account. Journal of Semantics 13, 87–138.
Stowell, Timothy 1981. Origins of Phrase Structure. Ph.D. dissertation. MIT, Cambridge, MA.
Talmy, Leonard 1975. Semantics and syntax of motion. In: J. P. Kimball (ed.). Syntax and Semantics 4. New York: Academic Press, 181–238.
Talmy, Leonard 1985. Lexicalization patterns. Semantic structure in lexical forms. In: T. Shopen (ed.). Language Typology and Syntactic Description, vol. 3: Grammatical Categories and the Lexicon. Cambridge: Cambridge University Press, 57–149.
Talmy, Leonard 1991. Path to realization. A typology of event conflation. In: L. A. Sutton, C. Johnson & R. Shields (eds.). Proceedings of the 17th Annual Meeting of the Berkeley Linguistics Society (=BLS). General Session and Parasession. Berkeley, CA: Berkeley Linguistic Society, 480–519.
Talmy, Leonard 2000. Toward a Cognitive Semantics, vol. 2: Typology and Process in Concept Structuring. Cambridge, MA: The MIT Press.
Taylor, John R. 1996. On running and jogging. Cognitive Linguistics 7, 21–34.
Tenny, Carol L. 1994. Aspectual Roles and the Syntax-Semantics Interface. Dordrecht: Kluwer.
Tham, Shiao Wei 2004. Representing Possessive Predication. Semantic Dimensions and Pragmatic Bases. Ph.D. dissertation. Stanford University, Stanford, CA.
Travis, Lisa D. 1984. Parameters and Effects of Word Order Variation. Ph.D. dissertation. MIT, Cambridge, MA.
Travis, Lisa D. 2000. The l-syntax/s-syntax boundary. Evidence from Austronesian. In: I. Paul, V. Phillips & L. Travis (eds.). Formal Issues in Austronesian Linguistics. Dordrecht: Kluwer, 167–194.
van Valin, Robert D. 1993. A synopsis of role and reference grammar. In: R. D. van Valin (ed.). Advances in Role and Reference Grammar. Amsterdam: Benjamins, 1–164.
van Valin, Robert D. 2005. Exploring the Syntax-Semantics Interface. Cambridge: Cambridge University Press.
van Valin, Robert D. & Randy J. LaPolla 1997. Syntax. Structure, Meaning, and Function. Cambridge: Cambridge University Press.
Wechsler, Stephen 2006. Thematic structure. In: K. Brown (ed.). Encyclopedia of Language and Linguistics. 2nd edn. Amsterdam: Elsevier, 645–653. 1st edn. 1994.
Wilks, Yorick 1987. Primitives. In: S. C. Shapiro (ed.). Encyclopedia of Artificial Intelligence. New York: Wiley, 759–761.
Williams, Edwin 1981. Argument structure and morphology. The Linguistic Review 1, 81–114.
Wunderlich, Dieter 1997a. Argument extension by lexical adjunction. Journal of Semantics 14, 95–142.
Wunderlich, Dieter 1997b. cause and the structure of verbs. Linguistic Inquiry 28, 27–68.
Wunderlich, Dieter 2000. Predicate composition and argument extension as general options. A study in the interface of semantic and conceptual structure. In: B. Stiebels & D. Wunderlich (eds.). Lexicon in Focus. Berlin: Akademie Verlag, 247–270.
Wunderlich, Dieter 2006. Argument hierarchy and other factors determining argument realization. In: I. Bornkessel et al. (eds.). Semantic Role Universals and Argument Linking. Theoretical, Typological, and Psycholinguistic Perspectives. Berlin: Mouton de Gruyter, 15–52.
Yoneyama, Mitsuaki 1986. Motion verbs in conceptual semantics. Bulletin of the Faculty of Humanities 22. Tokyo: Seikei University, 1–15.
Zubizarreta, Maria Luisa & Eunjeong Oh 2007. On the Syntactic Composition of Manner and Motion. Cambridge, MA: The MIT Press.
Beth Levin, Stanford, CA (USA)
Malka Rappaport Hovav, Jerusalem (Israel)
20. Idioms and collocations

1. Introduction: Collocation, collocations, and idioms
2. Lexical properties of idioms
3. Compositionality
4. How frozen are idioms?
5. Syntactic flexibility
6. Morphosyntax
7. Idioms as constructions
8. Diachronic changes
9. Idioms in the lexicon
10. Idioms in the mental lexicon
11. Summary and conclusion
12. References
Abstract

Idioms constitute a subclass of multi-word units that exhibit strong collocational preferences and whose meanings are at least partially non-compositional. Drawing on English and German corpus data, we discuss a number of lexical, syntactic, morphological and semantic properties of Verb Phrase idioms that distinguish them from freely composed phrases. The classic view of idioms as "long words" admits of little or no variation of a canonical form. Fixedness is thought to reflect semantic non-compositionality: the non-availability of semantic interpretation for some or all idiom constituents and the impossibility of parsing syntactically ill-formed idioms block regular grammatical operations. However, corpus data testify to a wide range of discourse-sensitive flexibility and variation, weakening the categorical distinction between idioms and freely composed phrases. We cite data indicating that idioms are subject to the same diachronic developments as simple lexemes. Finally, we give a brief overview of psycholinguistic research into the processing of idioms and attempts to determine their representation in the mental lexicon.
1. Introduction: Collocation, collocations and idioms

Words in text and speech do not co-occur freely but follow rules and patterns. We draw an initial three-fold distinction for recurrent lexical co-occurrences: collocation, the patterning of words found in close neighborhood of one another; and collocations and idioms, multi-word units with lexical status that are often distinguished in terms of their fixedness and semantic non-compositionality. The remainder of the paper will focus on idioms, in particular verb phrase idioms. We describe their syntactic, morphosyntactic and lexical properties with respect to frozenness and variation. The data examined here show a sliding scale of fixedness and a concomitant range of semantic compositionality. We next consider the diachronic behavior of idioms, which does not seem to differ from that of simple lexemes. Finally, several theories concerning the mental representation of idioms are discussed.
1.1. Collocation: A statistical property of language

Collocation, the co-occurrence patterns of words observed in spoken and written language, is constrained by syntactic (grammatical), semantic, and lexical properties of words. At each level, linguists have attempted to formulate rules and constraints for co-occurrence. The syntactic constraints on lexical co-occurrence are specific to a word but constrained by its syntactic class membership. For example, the adjective prone, unlike hungry and tired, subcategorizes for a Prepositional Phrase headed by to; moreover, hungry and tired, but not prone, can occur pre-nominally. Firth (1957) called the syntactic constraints on a word's selection of neighboring words "colligation". Firth moreover recognized that words display collocational properties beyond those imposed by syntax. He coined the term "collocation" for the "habitual or customary places" of a word. Firth's statement that "you shall know a word by the company it keeps" has been given scientific validation by corpus studies establishing lexical profiles for words, which reflect their regular attraction to other words. Co-occurrence properties of individual lexical items can be expressed quantitatively (Halliday 1966, Sinclair 1991, Stubbs 2001, Krenn 2000, Evert 2005, inter alia). Church & Hanks (1990) and Church et al. (1991) proposed Mutual Information as a measure of the tendency of word forms to select or deselect one another in a context. Mutual Information compares the observed probability of two lexemes' co-occurrence with the probability expected if each occurred independently of the other, i.e., if their co-occurrence were due merely to chance (a minimal formalization is sketched at the end of this subsection). Mutual Information is a powerful tool for measuring the degree of fixedness, and thus the lexical status, of word groups. Highly fixed expressions that have been identified statistically are candidates for inclusion in the lexicon, regardless of their semantic transparency. Fixed expressions are also likely to be stored as units in speakers' mental lexicons, and retrieved as units rather than composed anew each time. Psycholinguists are aware of people's tendency to chunk (Miller 1956). Statistical analyses can also quantify the flexibility of fixed expressions, which are rarely completely frozen (see sections 3 and 4). Fazly & Stevenson (2006) propose a method for the automatic discovery of the set of syntactic variations that VP idioms can undergo and that should be included in their lexical representation. Fazly & Stevenson further incorporate this information into statistical measures that effectively predict the idiomaticity level of a given expression. Others have measured the distributional similarity between an expression and its constituents (McCarthy, Keller & Carroll 2003, Baldwin et al. 2003). The collocational properties of a language are perhaps best revealed in the errors committed by learners and non-native speakers. A fluent speaker may have sufficient command of the language to find the words that express the intended concept. However, only considerable competence allows him to select, from among several near-synonyms, the one word that is favored by its neighbors; such lexical preferences appear unmotivated and arbitrary.
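To make the measure concrete: the association score popularized by Church & Hanks (1990) is the pointwise form

    I(x; y) = log2 [ P(x, y) / (P(x) P(y)) ],

which is positive when two words co-occur more often than their individual frequencies would predict under independence. The following minimal Python sketch computes this score from raw counts; the function name and the counts are invented for illustration and are not drawn from the works cited:

import math

def pmi(count_xy, count_x, count_y, n):
    # Pointwise mutual information: log2 of the observed co-occurrence
    # probability over the probability expected under independence.
    p_xy = count_xy / n
    p_x = count_x / n
    p_y = count_y / n
    return math.log2(p_xy / (p_x * p_y))

# Hypothetical counts for a pair like 'strong tea' in a 14-million-token
# corpus: the pair occurs far more often than chance predicts.
print(pmi(count_xy=30, count_x=9000, count_y=500, n=14000000))  # about 6.5

In practice, estimates from low counts are unstable, which is why corpus studies typically impose frequency thresholds or use alternative association measures (cf. Evert 2005).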
1.2. Collocations: Phrasal lexical items

We distinguish collocation, the linguistic and statistically measurable phenomenon of co-occurrence, from collocations, specific instances of collocation that have lexical status.
Collocations like beach house, eat up, and blood-red are mappings of word forms and word meanings, much like simple lexemes such as house and eat and red. Collocations are "pre-fabricated" combinations of simple lexemes. Jackendoff (1997) collected multi-word units (MWUs) from the popular U.S. television show "Wheel of Fortune" and showed the high frequency of collocations and phrases, both in terms of types and tokens. He estimates that MWUs constitute at least half the entire lexicon; Cowie (1998) reports the percentage of verb phrase idioms and collocations in news stories and feature articles to be around forty percent. Such figures make it clear that MWUs constitute a significant part of the lexicon and collocating is a pervasive aspect of linguistic behavior. Collocations include a wide range of MWUs. A syntax-based typology (e.g., Moon 1998) distinguishes sentential proverbs (the early bird gets the worm), routine formulae (thanks a lot), and adjective phrases expressing similes (clean as a whistle). Frames or constructions (Fillmore, Kay & O'Connor 1988) consist of closed-class words with slots for open-class words, such as what's X doing Y? and the X-er the Y-er; these frames carry meaning independent of the particular lexemes that fill the open slots. A large class of collocations are support (or "light") verb constructions, where a noun collocates with a specific verbal head, such as make a decision and take a photograph (Storrer 2007). Under a semantically-based classification, phrasal lexemes form a continuous scale of fixedness and semantic transparency. Following Moon (1998), we say that collocations are decomposable multi-word units that often follow the paradigmatic patterns of free language. For example, a number of German verb phrase collocations show the causative/inchoative/stative alternation pattern, e.g., in Rage bringen/kommen/sein (make/become/be enraged).
1.3. Idioms

Idioms like hit the ceiling, lose one's head, and when the cows come home constitute another class of MWUs with lexical status. They differ from the kinds of collocations discussed in the previous section in several respects. To begin with, the lexical make-up of idioms is usually unpredictable and often highly idiosyncratic, violating the usual rules of selectional restrictions. Examples are English rain cats and dogs and talk turkey, and German unter dem Pantoffel stehen (lit. 'stand under the slipper', be dominated). Some idioms have a possible literal (non-idiomatic) interpretation, such as drag one's feet and not move a finger. Certain idioms are syntactically ill-formed, such as English trip the light fantastic and German an jemandem einen Narren gefressen haben (lit. 'have eaten a fool on/at somebody', be infatuated with somebody), or Bauklötze staunen (lit. 'be astonished toy blocks', be very astonished), which violate the verb's subcategorization properties. Others, like by and large, do not constitute syntactic categories. Perhaps the most characteristic feature ascribed to idioms is their semantic non-compositionality. Because the meaning of idioms is not composed of the meanings of their constituents (or only to a limited extent), their meanings are considered largely opaque. A common argument says that if the components of idioms are semantically nontransparent to speakers, they are not available for the kinds of grammatical operations found in free, literal, language. As a result, idioms are often considered fixed, "frozen" expressions, or "long words" (Swinney & Cutler 1979, Bobrow & Bell 1973).
We will consider each of these properties attributed to idioms, focusing largely on verb phrase (VP) idioms with a verbal head that selects for at least one noun phrase or prepositional phrase complement. Such idioms have generated much discussion, focused mostly on putative constraints on their syntactic frozenness. VP idioms were also found to be the most frequent type of MWUs in the British Hector Corpus (Moon 1998).
2. Lexical properties of idioms

Idioms are perhaps most recognizable by their lexical make-up. Selectional restrictions are frequently violated, and the idiom constituents tend not to be semantically related to words in the surrounding context, thus signaling the need for a non-literal interpretation. But there are many syntactically and semantically well-formed strings that are polysemous between an idiomatic and a literal, compositional meaning, such as play second fiddle and fall off the wagon. Here, context determines which meaning is likely the intended one. Of course, the ambiguity between literal and idiomatic reading is related to the plausibility of the literal reading. Thus, pull yourself up by your bootstraps, give/lend somebody a hand, and German mit der Kirche ums Dorf fahren (lit. 'drive with the church around the village', deal with an issue in an overly complicated or laborious manner) denote highly implausible events in their literal readings.
2.1. Polarity

Many idioms are negative polarity items: not have a leg to stand on, not give somebody the time of day, not lift a finger, be neither fish nor fowl, no love lost, not give a damn/hoot/shit. Without the negation, the idiomatic reading is lost; in specific contexts, it may be preserved in the presence of a marked stress pattern. Corpus data show that many idioms do not require a fixed negation component but can occur in other negative environments (questions, conditionals, etc.), just like their non-idiomatic counterparts (Söhn 2006, Sailer & Richter 2002, Stantcheva 2006 for German). Idioms as negative polarity items occur crosslinguistically, and negative polarity may be one universal hallmark of idiomatic language.
2.2. Idiom-specific lexemes

Many idioms contain lexemes that do not occur outside their idiomatic contexts. Examples are English thinking cap (in put on one's thinking cap), humble pie (in eat humble pie), and German Hungertuch (am Hungertuch nagen; lit. 'gnaw at the hunger-cloth', be destitute). Similarly, the noun Schlafittchen in the idiom am Schlafittchen packen (lit. 'grab someone by the wing', confront or catch somebody) is an obsolete word referring to "wing", and its meaning is opaque to contemporary speakers. Other idiom components have meaning outside their idiomatic context but may occur rarely in free language (gift horse in don't look a gift horse in the mouth), cf. article 80 (Olsen) Semantics of compounds.
2.3. Non-referential pronouns

Another lexical peculiarity of many English idioms is the obligatory presence of a non-referential it: have it coming, give it a shot, lose it, have it in for somebody, etc. These
pronouns are a fixed component of the idioms, and no number or gender variation can occur without loss of the idiomatic reading. They do not carry meaning, though they may have referred at an earlier stage before the idiom became fixed. Substituting a context-appropriate noun does not seem felicitous, either: have the punishment coming, give the problem a shot etc. seem odd at best. (A somewhat similar phenomenon is found in a number of German idioms containing eins, einen (lit. 'one'): sich eins lachen, einen draufmachen, 'have a good laugh/have a night on the town.')
3. Compositionality

Idioms are often referred to as non-compositional, i.e., their meaning is not composed of the meanings of their constituents, as is the case in freely composed language. In fact, idioms vary with respect to the degree of non-compositionality. One way to measure the extent to which an idiom violates the compositional norms of the language is to examine and measure statistically the collocational properties of its constituents. The verb fressen (eat like an animal) co-occurs with nouns that fill the roles of the eating and the eaten entities; among the latter, Narr ("fool") will stand out as not belonging to the class of entities included in the verb's selectional preferences; a thesaurus or WordNet (Fellbaum 1998) could firm up this intuition (see the sketch below). In highly opaque idioms, none of the constituents can be mapped onto a referent, as is the case with an jemandem einen Narren gefressen haben and the much-cited kick the bucket. The literal equivalents of these verbs do not select for arguments that the idiom constituents could be mapped to in a one-to-one fashion so as to allow a "translation". Sometimes an idiom's lexeme becomes obsolete outside its idiomatic use, with the result that its meaning becomes opaque. For example, Fittiche in the idiom unter die Fittiche nehmen seems not to be interpreted as "wings" by some contemporary speakers, as attested by corpus examples like unter der/seiner Fittiche, where the morphology shows that the speaker analyzed the noun as a feminine singular rather than a plural and possibly assigned it a new meaning. Lexical substitutions and adjectival modifications show that speakers often assign a meaning to one or more constituents of an idiom (see section 6.3 on lexical variation), though such variations from the "canonical forms" tend to be idiosyncratic and are often specific to the discourse in which they are found. Many idioms are partially decomposable; they may contain one or more lexemes with a conventional meaning or a metaphor.
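The kind of lexical check alluded to above (whether a noun such as Narr, 'fool', falls within a verb's selectional preferences) can be approximated with WordNet's hypernym hierarchy. The sketch below assumes NLTK's WordNet interface; the synset name food.n.02 and the sample words are our illustrative choices, not drawn from the sources cited:

from nltk.corpus import wordnet as wn

def falls_under(noun, class_synset):
    # True if some sense of `noun` lies below `class_synset` in
    # WordNet's hypernym hierarchy.
    target = wn.synset(class_synset)
    for sense in wn.synsets(noun, pos=wn.NOUN):
        if any(target in path for path in sense.hypernym_paths()):
            return True
    return False

# An 'eat'-type verb plausibly prefers objects classified as food;
# a noun with no food sense falls outside that preference class.
print(falls_under("peach", "food.n.02"))
print(falls_under("fool", "food.n.02"))

A corpus study would aggregate such checks over all of a verb's attested objects rather than inspect single nouns, but even a crude test of this kind flags a fool-type object of an eating verb as an outlier.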
3.1. Metaphors

Many idioms contain metaphors, and the notion of metaphor is closely linked to idioms; cf. article 26 (Tyler & Takahashi) Metaphors and metonymies. We use the term here to denote a single lexeme (most often a noun or noun phrase) with a figurative meaning. For example, jungle is often used as a metaphor for a complex, messy, competitive, and perhaps dangerous situation; this use of jungle derives from its literal meaning, "impenetrable forest", and preserves some of the salient properties of the original concept (Glucksberg 2001). A metaphor like jungle may become conventionalized and be readily interpreted in the appropriate contexts as referring to a competitive situation. Thus, the expression it's a jungle out there is readily understandable due to the interpretability of
the noun. Idioms that contain conventional or transparent metaphors are thus at least partly compositional. Many idioms arguably contain metaphors, though their use is bound to a particular idiom. For example, nettle in grasp the nettle and bull in take the bull by the horns can be readily interpreted as referring to specific entities in the context where the idioms are used. But such metaphors do not work outside the idioms; a sentence like this assignment involves numerous nettles/bulls would be difficult to interpret, despite the fact that nettle and bull seem like appropriate expressions referring to a difficult problem or a challenge, respectively, and could thus be considered "motivated". Arguments have been made for a cognitive, culturally-embedded basis of metaphors and idioms containing metaphors that shapes their semantic structure and makes motivation possible (Lakoff & Johnson 1980, Lakoff 1987, Gibbs & Steen 1999, Dobrovol'skij 2004, inter alia). Indeed, certain conceptual metaphors are crosslinguistically highly prevalent and productive. For example, the "time is money" metaphor is reflected in the many possession verbs (have, give, take, buy, cost) that take time as an argument. However, Burger (2007) points out that not all idioms are metaphorical and that not all metaphorical idioms can be traced back to general conceptual metaphors. For example, in the idiom pull strings, strings does not encode an established metaphor that is readily understandable outside the idiom and that is exploited in related idioms. Indeed, a broad claim to the universality and cultural independence of metaphor seems untenable. The English idiom pull strings structurally and lexically resembles the German idiom die Drähte ziehen (lit. 'pull the wires'), which, however, does not mean "to use one's influence in one's favor" but rather "to mastermind". Testing the conceptual metaphor hypothesis, Keysar & Bly (1995) asked speakers to guess the meanings of unfamiliar idioms involving such salient concepts as "high" and "up", which many cognitivists claim to be universally associated with increased quality and quantity. Subjects could not guess the meanings of many of the phrases, casting doubt on the power and absolute universality of conceptual metaphors.
4. How frozen are idioms?

A core property associated with idioms is lexical, morphosyntactic, and syntactic fixedness, which is said to be a direct consequence of semantic opacity. Many grammatical frameworks, beginning with early Transformational-Generative Grammar, explicitly classify idioms as "long words" that are exempt from regular and productive rules, much like strong verbs. Lexicographic resources implicitly reinforce the view of idioms as fixed strings by listing idioms in morphologically unmarked citation forms, or lemmata, usually based on the lexicographer's intuition. And experimental psycholinguists studying idiom processing commonly use only such citation forms. The few much-cited classic examples, including kick the bucket and trip the light fantastic, served well to exemplify the properties ascribed in a wholesale fashion to all idioms. But soon linguists began to note that not all idioms were alike. Fraser (1970) distinguished several degrees of syntactic frozenness among idioms. Where a mapping can be made from the idiom's constituents to those of its literal counterpart, syntactic
flexibility is possible; simply put, the degree of semantic opacity is reflected in the degree of syntactic frozenness. Echoing Weinreich (1969), Fraser states that there is no idiom that is completely free. In particular, he rules out focusing operations like clefting and topicalization, as these are conditional on the focused constituent bearing a meaning. Citing constructed data, Cruse (2004) also categorically rules out cleft constructions for idioms. Syntactic and lexical frozenness is commonly ascribed to the semantic noncompositionality of idioms. Indeed, it seems reasonable to assume that semantically unanalyzable strings cannot be subject to regular grammatical processes, as syntactic variations typically serve to (de-)focus or modify particular constituents; when these carry no meanings, such operations are unmotivated. But corpus studies show that a "standard", fixed form cannot be determined for many idioms, at least not on a quantitative basis. Speakers use many idioms in ways that deviate from the forms given in dictionaries in more than one way; some idioms exhibit the same degree of freedom as freely composed strings. We rely in this article on corpus examples to illustrate idiom variations. The English data come from Moon's (1998) extensive analysis of the Hector Corpus. The German data are based on the in-depth analysis of 1,000 German idioms in a one-billion-word corpus of German (Fellbaum 2006, 2007b). We are not aware of similar large-scale corpus analyses in other languages, and therefore the data in this article will be somewhat biased towards English and German.
5. Syntactic flexibility

As a wider range of idioms was considered, linguists realized that syntactic flexibility was not an exception. The influential paper by Nunberg, Sag & Wasow (1994) argued for the systematic relation between semantic compositionality and flexibility. Both Abeillé (1995), who based her analysis on a number of French idioms, and Dobrovol'skij (1999) situate idioms on a continuum of flexibility that interacts with semantic transparency. But while it seems true that constituents that are assigned a meaning by speakers are open to grammatical operations as well as modification and even lexical substitution, semantic transparency is not a requirement for variation. We consider as an example the German idiom kein Blatt vor den Mund nehmen (lit. 'take no leaf/sheet in front of one's mouth', be outspoken, speak one's mind). Blatt (leaf, sheet) has no obvious referent. Yet numerous attested corpus examples are found where Blatt is passivized, topicalized, relativized and pronominalized; moreover, this idiom need not always appear as a Negative Polarity Item:

(1) Bei BMW wird kein Blatt vor den Mund genommen.
    'At BMW, no leaf is taken in front of the mouth.' (passive)

(2) Ein Blatt habe er nie vor den Mund genommen.
    'A leaf, he (reportedly) never took in front of his mouth.' (topicalization)

(3) Das Blatt, das Eva vor ihr erregendes Geheimnis gehalten, ich nähme es nicht einmal vor den Mund.
    'The leaf that Eva held in front of her titillating secret, I would not even take it in front of my mouth.' (relativization, pronominalization)

(4) Ein Regierungssprecher ist ein Mann, der sich 100 Blätter vor den Mund nimmt.
    'A government spokesman is a man who takes 100 leaves in front of his mouth.' (quantification)
Even "cran-morphemes", i.e., words that do not usually occur outside the idiom, can behave like free lexemes. For example, Fettnäpfchen rarely occurs outside the idiom ins Fettnäpfchen treten (lit. 'step into the little grease pot', commit a social gaffe), and no meaning can be attached to it by a contemporary speaker that would map to the meaning of a constituent in the literal equivalent. Yet the corpus shows numerous uses of this idiom where the noun is relativized, topicalized, quantified, and modified.

(5) Das Fettnäpfchen, in das die Frau ihres jüngsten Sohnes gestiegen ist, ist aber auch riesig.
    'The grease pot that her youngest son's wife stepped into is, however, really huge.' (relativization)

(6) Ins Fettnäpfchen trete ich bestimmt mal und das ist gut so.
    'Into the grease pot I will surely step some time, and that is a good thing.' (topicalization)

(7) Silvio Berlusconi: Ein Mann, viele Fettnäpfchen.
    'Silvio Berlusconi: one man, many grease pots.' (quantification)

(8) Immer trat der New Yorker ins bereitstehende Fettnäpfchen.
    'The New Yorker always stepped into the grease pot standing at the ready.' (modification)

Syntactic variation is probably due to speakers' ad-hoc assignment of meanings to idiom components. Adjectival modification and lexical variations in particular indicate that the semantic interpretation of inherently opaque constituents is dependent on the particular context in which the idiom is embedded. For more examples and discussion see Moon (1998) and Fellbaum (2006, 2007b).
6. Morphosyntax

The apparent fixedness of many idioms extends beyond constituent order and lexical selection to their morphosyntactic make-up.
6.1. Modality, tense, aspect

The idiomatic reading of many strings requires a specific modality, tense, or aspect. Thus wild horses wouldn't get (NP) to (VP) is at best odd without the conditional: wild horses don't get (NP) to (VP). Similarly, the meaning of I couldn't care less is not preserved in I could care less. Interestingly, the negation in the idiom is often omitted even when the speaker does not intend a change of polarity, as in I could care less; perhaps speakers are uneasy about the double negation in such cases but do not decompose the idiom to realize that dropping the negation changes the meaning. Another English idiom requiring a specific modal is will not/won't hear of it. The German idioms in den Tuschkasten gefallen sein (lit. 'have fallen into the paint box', be overly made up) and nicht auf den Kopf gefallen sein (lit. 'not have fallen on one's head', be rather intelligent) denote states and cannot be interpreted as idioms in any tense other than the perfect (Fellbaum 2007a).
6.2. Determiner and number

The noun in many VP idioms is preceded by a definite determiner even though it lacks definiteness and, frequently, reference: kick the bucket, fall off the wagon, buy the farm. The determiner here may be a relic from the time when transparent phrases became lexicalized as idioms.
A regular and frequent grammatical process in German is the contraction of the determiner with a preceding preposition. Thus, the idiom jemanden hinters Licht führen (lit. 'lead someone behind the light', deceive someone) occurs most frequently with hinter (behind) and das (the) contracted to hinters. Eisenberg (1999, inter alia) asserts that the figurative reading of many German idioms precludes a change in the noun's determiner. This hypothesis seems appealing: if the noun phrase is semantically opaque, the choice of determiner does not follow the usual rules of grammar and is therefore arbitrary. As speakers cannot interpret the noun, the determiner is not subject to grammatical processes. Decontraction in cases like hinters Licht führen is thus ruled out by Eisenberg, as is contraction in cases where the idiom's citation form found in dictionaries shows the preposition separated from the determiner. However, Firenze (2007) cites numerous corpus examples of idioms with decontraction. In some cases, the decontraction is conditioned by the insertion of lexical material such as adjectives between the preposition and the determiner or by number variation of the noun; in other cases, contracted and decontracted forms alternate freely. Such data show that the NP is subject to the regular grammatical processes of free language, even when its meaning is not transparent: Licht in the idiom has no referent, like wool in the corresponding English idiom pull the wool over someone's eyes. Corpus data also show number variations for the nouns. For example, in the idiom den Bock zum Gärtner machen (lit. 'make the buck the gardener', put an incompetent person in charge), both nouns occur predominantly in the singular. But we find examples with plural nouns, Böcke zu Gärtnern machen, where the speaker refers to several specific incompetent persons.
6.3. Lexical variation

Variation is frequently attested for the nominal, verbal, and adjectival components of VP idioms. In most cases, this variation is context-dependent and shows conscious playfulness on the part of the speaker or writer. Fellbaum & Stathi (2006) discuss three principal cases of lexical variation: paradigmatic variation, adjectival modification, and compounding. In each case, the variation plays on the literal interpretation of the varied component. Paradigmatic substitution has occurred where the verb in jemandem die Leviten lesen (lit. 'read the Leviticus to someone', read someone the riot act) has been replaced by quaken (croak like a frog) and brüllen (scream). An adjective has been added to the noun in an economics text in die marktwirtschaftlichen Leviten lesen, 'read the market-economical riot act.' An example of compounding is er nimmt kein Notenblatt vor den Mund, where the idiom kein Blatt vor den Mund nehmen (lit. 'take no leaf/sheet in front of one's mouth', meaning be outspoken) occurs in the context of musical performance and Blatt becomes Notenblatt, 'sheet of music.'
Besides morphological and syntactic variation, corpus data show that speakers vary the lexical form of many idioms. Moon's (1998) observation that lexical variation is often humorous is confirmed by the German corpus data, as is Moon's finding that such variation is found most frequently in journalism. (See also Kjellmer 1991 for examples of playful lexical variations in collocations, such as not exactly my cup of tequila.) Ad-hoc lexical variation is dependent on the particular discourse and must play on the literal reading of the substituted constituent rather than on a metaphoric one. That is, spill
the secret would be hard to interpret, whereas rock the submarine would be interpretable in the appropriate context, as submarine invokes the paradigmatically related boat. Moon (1998) discusses another kind of lexical variation, called idiom schemas, exemplified by the group of idioms hit the deck/sack/hay. Idiom schemas correspond to Nunberg, Sag & Wasow's (1994) idiom families, such as don't give a hoot/damn/shit. The variation is limited, and each variant is lexicalized rather than ad-hoc. But in other cases, lexical variation is highly productive. An example is the German idiom hier tanzt der Bär (lit. 'here dances the bear', this is where the action is), which has spawned a family of new expressions with the same meaning, including hier steppt der Bär (lit. 'the bear does a step dance here') and even hier rappt der Hummer (lit. 'the lobster raps here') and hier boxt der Papst (lit. 'the Pope is boxing here'). There are no clear constraints on the lexical variations in terms of semantic sets, unlike in cases like don't give a hoot/damn/shit; this may account for the productivity of the "bear" idiom.
6.4. Semantic reanalysis

Motivation is undoubtedly a strong factor in the variability of idioms; if speakers can assign meaning to a component, even if only in a specific context, the component is available for modification and syntactic operations. Corpus examples with lexical and syntactic variations suggest that speakers attribute meaning to idiom components that are opaque to contemporary speakers and remotivate them. Gehweiler (2007) discusses the German idiom in die Röhre schauen (lit. 'look into the pipe', go empty-handed), which originated in the language of hunters, where it referred to dogs peering into foxholes. The meaning of the noun here is opaque to contemporary speakers, but the idiom has acquired more recent, additional senses in which the noun is re-interpreted.
7. Idioms as constructions

Fillmore, Kay & O'Connor (1988) discuss the idiomaticity of syntactic constructions like the X-er the Y-er, which carry meaning independent of the lexical items that fill the slots; cf. article 86 (Kay & Michaelis) Constructional meaning. Among VP idioms with a more regular phrase structure, some require the suppression of an argument for the idiomatic reading, resulting in a violation of the verb's subcategorization properties. For example, werfen ('throw') in the German idiom das Handtuch werfen, lit. 'throw the towel', does not co-occur with a Location (Goal) argument, which is required in the verb's non-idiomatic use. Other idioms require the presence of an argument that is optional in the literal language; this is the case for many ditransitive German VP idioms, including jemandem ein Bein stellen, lit. 'place a leg for someone', trip someone up, and jemandem eine Szene machen, lit. 'make a scene for someone', cause a scene that embarrasses someone. Here, the additional indirect object (indicated by the placeholder someone) is most often an entity that is negatively affected by the event, a Maleficiary. Whereas in free language use ditransitive constructions often denote the transfer of a Theme from a Source to a Goal or Recipient, this is rarely the case for ditransitive idioms (Moon 1998, Fellbaum 2007c). Instead, such idioms exemplify Green's (1974) "symbolic action": events intended by the Agent to have a specific effect on a Beneficiary, or, more often, a
Maleficiary. Ditransitives that do not denote the transfer of an entity are highly restricted in the literal language (Green 1974); it is interesting that so many idioms denoting an event with a negatively affected entity are expressed with a ditransitive construction. Structurally defined classes of idioms can be accounted for under Goldberg's (1995) Construction Grammar, where syntactic configurations carry meaning independent of lexical material; however, Goldberg (1995) and Goldberg & Jackendoff (2004) consider syntactically well-formed and lexically productive structures such as resultatives rather than idioms. The identification of classes of grammatically marked idioms points to idiom-specific syntactic configurations or frames that carry meaning.
8. Diachronic changes

The origins of specific idioms are a subject of much speculation and folk etymology. Among the idioms with an indisputable history are those found in the Bible (e.g., cast pearls before swine, fall from grace, give up the ghost from the King James Bible). These tend to be found in many of the languages into which the Bible was translated. Many other idioms originate in specific domains: pull one's punches, go the distance (boxing), have an ace up one's sleeve, let the chips fall where they may (gambling), and fall on one's sword, bite the bullet (warfare). Idioms are subject to the same diachronic processes that have been observed for lexemes with literal interpretation. Longitudinal corpus studies by Gehweiler, Höser & Kramer (2007) show how German VP idioms undergo extension, merging, and semantic splitting and may develop new, homonymic readings. Idioms may also change their usage over time. Thus, the German idiom unter dem Pantoffel stehen (lit. 'stand underneath somebody's slipper', be under someone's thumb) used to refer to domestic situations where a husband is dominated by his wife. But corpus data from the past few decades show that this idiom has been extended not only to female spouses but also to other social relations, such as employees in a workplace dominated by a boss (Fellbaum 2005). Idioms may also change their phrase structure over time. Kwasniak (2006) examines cases where a sentential idiom turns into a VP idiom when a fixed component becomes a "free" constituent that is no longer part of the idiom.
9. Idioms in the lexicon

As form-meaning pairs, idioms belong in the lexicon. A dual nature – semantic simplicity but structural complexity – is often ascribed to them. But the question arises as to why natural languages show complex encoding for concepts whose semantics are as straightforward as those of simple lexemes. It has often been observed that many idioms express concepts already covered by simple lexemes but with added connotational nuances or restrictions to specific contexts and social situations; classic examples are the many idioms meaning 'die', ranging from disrespectful to euphemistic. But many – perhaps most – idioms, including cut one's teeth on, live from hand to mouth, have eyes for, lend a hand, and lose one's head, are neutral with respect to register. Similar idioms are found across languages, for example idioms expressing a range of judgments of physical appearance, mental ability, and social aptitude.
Subtle register differences alone do not seem to warrant the structural and lexical extravagance of idioms, the constraints on their use, and the added burden on language acquisition and processing. An examination of how English VP idioms fit into the structure of the lexicon reveals that many lack non-idiomatic synonyms and express meanings not covered by simple lexemes, arguably filling "lexical gaps". Moreover, many idioms appear not to fit the regular lexicalization patterns of English. A typology of idioms based on semantic criteria is suggested in Fellbaum (2002, 2007a). It includes idioms expressing negations of events or states (miss the bus, fall through the cracks, go begging) and idioms expressing several events linked by a Boolean operator (fish or cut bait, have one's cake and eat it). Such structurally and semantically complex idioms can be found across languages. One function of idioms may be to encode pre-packaged complex messages that cannot be expressed by simple words and whose salience makes them candidates for lexical encoding.
10. Idioms in the mental lexicon

How are idioms represented in the mental lexicon and speakers' grammar? On the one hand, they are more or less fixed MWUs – long words – that speakers produce and recognize as such, which suggests that they are represented exactly like simple lexemes. On the other hand, attested idiom use shows a wide range of syntactic, morphological, and especially lexical variation, indicating that speakers access the internal structure of idioms and subject them to all the grammatical processes found in literal language. To determine whether idioms are represented and processed as unanalyzable units or as potentially (and often partially) decomposable strings, psycholinguistic experiments have investigated both the production and the comprehension of idioms. The materials used in virtually all experiments are constructed and not based on corpus examples, yet the findings and the hypotheses based on the results are fully compatible with naturally occurring data. Comprehension time studies show that familiar idioms like kick the bucket are processed faster in their idiomatic meaning ('die') than in a literal one (kicking a pail). Glucksberg (2001) asserts that the literal meaning may be inhibited by the figurative meaning of a string, though both may be accessed. However, the timing experiments argue against a processing model where the idiomatic reading kicks in only after a literal one has failed. Cutting & Bock (1997), based on a number of experiments involving idiom production, propose a "hybrid" theory of idiom representation. They argue for the existence of a lexical concept node for each idiom; at the same time, idioms are syntactically and semantically analyzed during production, independent of the idioms' degree of compositionality. This "hybrid account" is also supported by Sprenger, Levelt & Kempen (2006), who show that idioms can be primed with lexemes that are semantically related to constituents of the idioms. Sprenger, Levelt & Kempen propose the notion of a "superlemma" as a conceptual unit whose lexemes are bound both to their idiomatic use and their use in the free language. The superlemma theory is compatible with the Configuration Hypothesis that Cacciari & Tabossi (1988) formulated on the basis of idiom comprehension experiments. The Configuration Hypothesis maintains that speakers activate the literal meanings of words in a
phrase and recognize the idiomatic meaning of a polysemous string only when they recognize an idiom-specific configuration of lexemes or encounter a "key" lexeme. One such key in many idioms may be the definite article (kick the bucket, fall off the wagon, buy the farm), which suggests that a referent for the noun has been previously introduced into the discourse; when no matching antecedent can be found, another interpretation of the string must be attempted (Fellbaum 1993). The key hypothesis is compatible with Cutting & Bock's proposal concerning idioms' representation in the mental lexicon. Kuiper (2004) analyzed a collection of slips of the tongue involving idioms, comprising 1,000 errors. He proposes a taxonomy of sources for the errors from all levels of grammar. Kuiper's analysis of the data shows that idioms are not simply stored as frozen long words, consistent with the superlemma theory of idiom representation.
10.1. Idioms in natural language processing

If one inputs an idiom into a machine translation engine (such as Babelfish or Google Translate), it does not – in many cases – return a corresponding idiom or an adequate non-idiomatic translation in the target language. This is one indication that the recognition and processing especially of non-compositional idioms remain a challenge. One reason is that the lexical resources that many NLP applications rely on do not include many idioms and fixed collocations. When idioms are listed in computational lexicons, it is often in a fixed form; idioms exhibiting morphosyntactic flexibility and lexical variation make automatic recognition very challenging. A more promising approach than lexical look-up is to search for the co-occurrence of the components of an idiom within a specific window, regardless of syntactic configuration and morphological categories; a minimal sketch of this idea is given below. Lexical variation can be accounted for by searching for words that are semantically similar, as reflected in a thesaurus. Another difficulty for the automatic processing of idioms is polysemy. Many idioms are "plausible" and have a literal reading (keep the ball rolling, make a dent in, not lift a finger). To distinguish the literal and the idiomatic readings, a system would have to perform a semantic analysis of the wider context, a task similar to that performed by humans when disambiguating between literal and idiomatic meanings.
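To make the window-based strategy concrete, here is a minimal sketch in Python. It is an illustration only, not an implementation from the literature cited here: the toy idiom lexicon, the pre-lemmatized input, and the window size of five tokens are all assumptions made for the example; a realistic system would add POS tagging, lemmatization, and thesaurus-based matching of lexical variants.

```python
# Minimal sketch of window-based idiom candidate detection.
# Assumptions: a toy lexicon mapping idioms to their content-word
# lemmas, and input that has already been lemmatized.

IDIOM_LEXICON = {
    "spill_the_beans": {"spill", "bean"},
    "rock_the_boat": {"rock", "boat"},
}

def find_idiom_candidates(lemmas, window=5):
    """Report idioms whose component lemmas all co-occur within
    `window` tokens, regardless of order or intervening material."""
    hits = []
    for idiom, components in IDIOM_LEXICON.items():
        positions = [i for i, lemma in enumerate(lemmas) if lemma in components]
        found = {lemmas[i] for i in positions}
        # Every component must occur, and the occurrences must be close.
        if found == components and max(positions) - min(positions) < window:
            hits.append((idiom, min(positions), max(positions)))
    return hits

# "Who spilled all the beans?", lemmatized:
print(find_idiom_candidates(["who", "spill", "all", "the", "bean"]))
# -> [('spill_the_beans', 1, 4)]
```

Because the search ignores order, inflection, and intervening material, it would also catch passivized or modified uses such as the beans were finally spilled, which defeat fixed-form lexical look-up.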
11. Summary and conclusion

A prevailing view in linguistics represents idioms as "long words", largely non-compositional multi-word units with little or no room for deviation from a canonical form; any morphosyntactic flexibility is often thought to be directly related to semantic transparency. Corpus investigations show, first, that idioms are subject to far more variation than the traditional view would allow, and, second, that speakers use idioms in creative ways even in the absence of full semantic interpretation. The boundary between compositional and non-compositional strings appears to be soft, as speakers assign ad-hoc, discourse-specific meanings to idiom constituents that are opaque outside of certain contexts. Psycholinguistic experiments, too, indicate that idiomatic and non-idiomatic language are not strictly separated in our mental lexicon and grammar.
Perhaps the most important function of many idioms, which may account for their universality and ubiquity, is that they provide convenient, pre-fabricated, conventionalized encodings of often complex messages.
12. References

Abeillé, Anne 1995. The flexibility of French idioms. A representation with lexicalized tree adjoining grammar. In: M. Everaert et al. (eds.). Idioms. Structural and Psychological Perspectives. Hillsdale, NJ: Erlbaum, 15–42.
Baldwin, Timothy, Colin Bannard, Takaaki Tanaka & Dominic Widdows 2003. An empirical model of multiword expression decomposability. In: Proceedings of the ACL-03 Workshop on Multiword Expressions. Analysis, Acquisition and Treatment. Stroudsburg, PA: ACL, 89–96.
Bobrow, Daniel & Susan Bell 1973. On catching on to idiomatic expressions. Memory & Cognition 1, 343–346.
Burger, Harald 2007. Phraseologie. Eine Einführung am Beispiel des Deutschen. 3rd edn. Berlin: Erich Schmidt. 1st edn. 1998.
Cacciari, Cristina & Patrizia Tabossi 1988. The comprehension of idioms. Journal of Memory and Language 27, 668–683.
Church, Kenneth W. & Patrick Hanks 1990. Word association norms, mutual information, and lexicography. Computational Linguistics 16, 22–29.
Church, Kenneth W., William Gale, Patrick Hanks & Donald Hindle 1991. Using statistics in lexical analysis. In: U. Zernik (ed.). Lexical Acquisition. Exploiting On-Line Resources to Build a Lexicon. Hillsdale, NJ: Erlbaum, 115–164.
Cowie, Anthony P. 1998. Phraseology. Theory, Analysis, and Applications. Oxford: Oxford University Press.
Cruse, D. Alan 2004. Meaning in Language. An Introduction to Semantics and Pragmatics. 2nd edn. Oxford: Oxford University Press. 1st edn. 2000.
Cutting, J. Cooper & Kathryn Bock 1997. That's the way the cookie bounces. Syntactic and semantic components of experimentally elicited idiom blends. Memory & Cognition 25, 57–71.
Dobrovol'skij, Dmitrij 1999. Haben transformationelle Defekte der Idiomstruktur semantische Ursachen? In: N. Fernandez Bravo, I. Behr & C. Rozier (eds.). Phraseme und typisierende Rede. Tübingen: Stauffenburg, 25–37.
Dobrovol'skij, Dmitrij 2004. Lexical semantics and combinatorial profile. A corpus-based approach. In: G. Williams & S. Vesser (eds.). Euralex 11. Lorient: Université de Bretagne Sud, 787–796.
Eisenberg, Peter 1999. Grundriss der deutschen Grammatik, Vol. 2: Der Satz. Stuttgart: Metzler.
Evert, Stefan 2005. The Statistics of Word Cooccurrences. Word Pairs and Collocations. Doctoral dissertation. University of Stuttgart.
Fazly, Afsaneh & Suzanne Stevenson 2006. Automatically constructing a lexicon of verb phrase idiomatic combinations. In: Proceedings of the European Chapter of the Association for Computational Linguistics (EACL) 11. Stroudsburg, PA: ACL, 337–344.
Fellbaum, Christiane 1993. The determiner in English idioms. In: C. Cacciari & P. Tabossi (eds.). Idioms. Processing, Structure, and Interpretation. Hillsdale, NJ: Erlbaum, 271–295.
Fellbaum, Christiane 1998. WordNet. An Electronic Lexical Database. Cambridge, MA: The MIT Press.
Fellbaum, Christiane 2002. VP idioms in the lexicon. Topics for research using a very large corpus. In: S. Busemann (ed.). Proceedings of Konferenz zur Verarbeitung natürlicher Sprache (= KONVENS) 6. Kaiserslautern: DFKI, 7–11.
Fellbaum, Christiane 2005. Unter dem Pantoffel stehen. Circular der Berlin-Brandenburgischen Akademie der Wissenschaften 31, 18.
Fellbaum, Christiane (ed.) 2006. Corpus-Based Studies of German Idioms and Light Verbs. Special issue of the International Journal of Lexicography 19.
Fellbaum, Christiane 2007a. The ontological loneliness of idioms. In: A. Schalley & D. Zaefferer (eds.). Ontolinguistics. How Ontological Status Shapes the Linguistic Coding of Concepts. Berlin: Mouton de Gruyter, 419–434.
Fellbaum, Christiane (ed.) 2007b. Idioms and Collocations. Corpus-Based Linguistic and Lexicographic Studies. London: Continuum.
Fellbaum, Christiane 2007c. Argument selection and alternations in VP idioms. In: Ch. Fellbaum (ed.). Idioms and Collocations. Corpus-Based Linguistic and Lexicographic Studies. London: Continuum, 188–202.
Fellbaum, Christiane & Ekaterini Stathi 2006. Idiome in der Grammatik und im Kontext. Wer brüllt hier die Leviten? In: K. Proost & E. Winkler (eds.). Von Intentionalität zur Bedeutung konventionalisierter Zeichen. Festschrift für Gisela Harras zum 65. Geburtstag. Tübingen: Narr, 125–146.
Fillmore, Charles, Paul Kay & Mary O'Connor 1988. Regularity and idiomaticity in grammatical constructions. Language 64, 501–538.
Firenze, Anna 2007. 'You fool her' doesn't mean (that) 'you conduct her behind the light'. (Dis)agglutination of the determiner in German idioms. In: Ch. Fellbaum (ed.). Idioms and Collocations. Corpus-Based Linguistic and Lexicographic Studies. London: Continuum, 152–163.
Firth, John R. 1957. A synopsis of linguistic theory 1930–1955. In: Studies in Linguistic Analysis. Oxford: Blackwell, 1–32.
Fraser, Bruce 1970. Idioms within a transformational grammar. Foundations of Language 6, 22–42.
Gehweiler, Elke 2007. How do homonymic idioms arise? In: M. Nenonen & S. Niemi (eds.). Proceedings of Collocations and Idioms 1. Joensuu: Joensuu University Press.
Gehweiler, Elke, Iris Höser & Undine Kramer 2007. Types of changes in idioms. Some surprising results of corpus research. In: Ch. Fellbaum (ed.). Idioms and Collocations. Corpus-Based Linguistic and Lexicographic Studies. London: Continuum, 109–137.
Gibbs, Ray & Gerard J. Steen (eds.) 1999. Metaphor in Cognitive Linguistics. Amsterdam: Benjamins.
Glucksberg, Sam 2001. Understanding Figurative Language. From Metaphors to Idioms. New York: Oxford University Press.
Goldberg, Adele 1995. Constructions. A Construction Grammar Approach to Argument Structure. Chicago, IL: The University of Chicago Press.
Goldberg, Adele & Ray Jackendoff 2004. The English resultative as a family of constructions. Language 80, 532–568.
Green, Georgia M. 1974. Semantics and Syntactic Regularity. Bloomington, IN: Indiana University Press.
Halliday, Michael 1966. Lexis as a linguistic level. In: C. E. Bazell et al. (eds.). In Memory of J.R. Firth. London: Longman, 148–162.
Jackendoff, Ray 1997. Twistin' the night away. Language 73, 534–559.
Keysar, Boaz & Bridget Bly 1995. Intuitions of the transparency of idioms. Can one keep a secret by spilling the beans? Journal of Memory and Language 34, 89–109.
Kjellmer, Göran 1991. A mint of phrases. In: K. Aijmer & B. Altenberg (eds.). English Corpus Linguistics. Studies in Honor of Jan Svartvik. London: Longman, 111–127.
Krenn, Brigitte 2000. The Usual Suspects. Data-Oriented Models for Identification and Representation of Lexical Collocations. Doctoral dissertation. University of Saarbrücken.
Kuiper, Konrad 2004. Slipping on superlemmas. In: A. Häcki-Buhofer & H. Burger (eds.). Phraseology in Motion, vol. 1. Baltmannsweiler: Schneider, 371–379.
Kwasniak, Renata 2006. Wer hat nun den Salat? Now who's got the mess? Reflections on phraseological derivation. From sentential to verb phrase idiom. International Journal of Lexicography 19, 459–478.
Lakoff, George 1987. Women, Fire, and Dangerous Things. What Categories Reveal about the Mind. Chicago, IL: The University of Chicago Press.
Lakoff, George & Mark Johnson 1980. Metaphors We Live By. Chicago, IL: The University of Chicago Press.
McCarthy, Diana, Bill Keller & John Carroll 2003. Detecting a continuum of compositionality in phrasal verbs. In: Proceedings of the ACL-03 Workshop on Multiword Expressions. Analysis, Acquisition and Treatment. Stroudsburg, PA: ACL, 73–80.
Miller, George A. 1956. The magical number seven, plus or minus two. Some limits on our capacity for processing information. Psychological Review 63, 81–97.
Moon, Rosamund 1998. Fixed Expressions and Idioms in English. A Corpus-Based Approach. Oxford: Clarendon Press.
Nunberg, Geoffrey, Ivan Sag & Thomas Wasow 1994. Idioms. Language 70, 491–538.
Sailer, Manfred & Frank Richter 2002. Not for love or money. Collocations! In: G. Jäger et al. (eds.). Proceedings of Formal Grammar 7. Stanford, CA: CSLI Publications, 149–160.
Sinclair, John 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Söhn, Jan-Philipp 2006. On idiom parts and their contexts. Linguistik Online 27, 11–28.
Sprenger, Simone A., William J. M. Levelt & Gerard A. M. Kempen 2006. Lexical access during the production of idiomatic phrases. Journal of Memory and Language 54, 161–184.
Stantcheva, Diana 2006. The many faces of negation. German VP idioms with a negative component. International Journal of Lexicography 19, 397–418.
Storrer, Angelika 2007. Corpus-based investigations on German support verb constructions. In: Ch. Fellbaum (ed.). Idioms and Collocations. Corpus-Based Linguistic and Lexicographic Studies. London: Continuum, 164–187.
Stubbs, Michael 2001. Words and Phrases. Corpus Studies of Lexical Semantics. Oxford: Blackwell.
Swinney, David & Anne Cutler 1979. The access and processing of idiomatic expressions. Journal of Verbal Learning and Verbal Behavior 18, 523–534.
Weinreich, Uriel 1969. Problems in the analysis of idioms. In: J. Puhvel (ed.). Substance and Structure of Language. Berkeley, CA: University of California Press, 23–81.
Christiane Fellbaum, Princeton, NJ (USA)
21. Sense relations

1. Introduction
2. The basic sense relations
3. Sense relations and word meaning
4. Conclusion
5. References
Abstract

This article explores the definition and interpretation of the traditional paradigmatic sense relations such as hyponymy, synonymy, meronymy, antonymy, and syntagmatic relations such as selectional restrictions. A descriptive and critical overview of the relations is provided in section 2, and in section 3 the relation between sense relations and different theories of word meaning is briefly reviewed. The discussion covers early to mid twentieth century structuralist approaches to lexical meaning, with their concomitant view of the lexicon as being structured into semantic fields, leading to more recent work on decompositional approaches
to word meaning. The latter are contrasted with atomic views of lexical meaning and the capturing of semantic relations through the use of meaning postulates.
1. Introduction

Naive discussions of meaning in natural languages almost invariably centre around the meanings of content words, rather than the meanings of grammatical words or of phrases and sentences, as is normal in academic approaches to the semantics of natural languages. Indeed, at first sight, it might seem to be impossible to construct a theory of the meaning of sentences without first uncovering the complexity of meaning relations that hold between the words of a language that make them up. So, it might be argued, to know the meaning of the sentence Matthew rears horses we need also to know at least the meaning of Matthew rears animals or Matthew breeds horses, since horses are a kind of animal and rearing tends to imply breeding. It is in this context that the notion of sense relations, the meaning relations between words (and expressions) of a language, could be seen as fundamental to the success of the semantic enterprise. Indeed, the study of sense relations has a long tradition in the western grammatical and philosophical traditions, going back at least to Aristotle, with discussions of relevant phenomena appearing throughout the medieval and later literature. However, the systematisation and taxonomic classification of the system of sense relations was only taken up in the structuralist movements of the twentieth century, particularly in Europe following the swift developments in structuralist linguistics after de Saussure. This movement towards systematic analyses of word sense was then taken up in the latter part of that century and the early part of the twenty-first century in formal modelling of the sense relations and, in particular, the development of computational models of these for the purposes of natural language processing. The notion of 'sense' in this context may be variously interpreted, but is usually interpreted in contrast to the notion of reference (or, equivalently, denotation or extension). The latter expresses the idea that one aspect of word meaning is the relation between words and the things that they can be used properly to talk about. Thus, the reference/denotation of cat is the set of all cats (that are, have been and will be); that of run (on one theoretical approach), the set of all past, present and future events of running (or, on another view, the set of all things that ever have, are or will engage in the activity we conventionally refer to in English as running). Sense, on the other hand, abstracts away from the things themselves to the property that allows us to pick them out. The sense of cat is thus the property that allows us to identify on any occasion an object of which it can truthfully be said that it is a cat – 'catness' (however that might be construed: cognitively in terms of some notion of concept, see for instance Jackendoff 2002, or model-theoretically in terms of denotations at different indices, see Montague 1973). Sense relations are thus relations between the properties that words express, rather than between the things they can be used to talk about (although, as becomes clear very quickly, it is often very difficult to separate the two notions).
Whether or not the study of sense relations can provide a solid basis for the development of semantic theories (and there are good reasons for assuming it cannot, see for example Kilgarriff 1997), the elaboration and discussion of such meaning relations can nevertheless shed light on the nature of the problems we confront in providing such theories, not least in helping to illuminate features of meaning that are truly amenable to semantic analysis and those that remain mysterious.
2. The basic sense relations

There are two basic types of sense relation. The most commonly presented in introductory texts are the paradigmatic relations, which hold between words of the same general category or type and are characterised in terms of contrast and hierarchy. Typically, a paradigmatic relation holds between words (or word-forms) when there is a choice between them. So given the string John bought a, it is possible to substitute any noun that denotes something that can be bought: suit, T-shirt, cauliflower, vegetable, house, ... For some of these words, there is more to the choice between them than just the fact that they are nouns denoting commodities. So, for example, if John bought a suit is true then it follows that John bought a pair of trousers is also true, by virtue of the fact that pairs of trousers are parts of suits; and if John bought a cauliflower is true then John bought a vegetable is also true, this time by virtue of the fact that cauliflowers are vegetables. The second basic type comprises the syntagmatic relations, which hold between words according to their ability to co-occur meaningfully with each other in sentences. Typically, syntagmatic sense relations hold between words of different syntactic categories or (possibly) semantic types, such as verbs and nouns or adverbs and prepositional phrases. In general, the closer the syntactic relation between two words, such as between a head word and its semantic arguments or between a modifier and a head, the more likely it is that one word will impose conditions on the semantic properties the other is required to show. For example, in the discussion in the previous paragraph, the things that one can (non-metaphorically) buy are limited to concrete objects that are typically acceptable commodities in the relevant culture: in a culture without slavery, adding boy to the string would be highly marked. As we shall see below, there is a sense in which these two dimensions, of paradigm and syntagm, cannot be kept entirely apart, but it is useful to begin the discussion as if they do not share interdependencies. Among the paradigmatic sense relations there are three basic ones that can be defined between lexemes, involving sense inclusion, sense exclusion and identity of sense. Within these three groups, a number of different types of relation can be identified and, in addition to these, other sorts of sense relations, such as part-whole, have been identified and discussed in the literature. As with most taxonomic endeavours, researchers may be 'lumpers', preferring as few primary distinctions as possible, or 'splitters', who consider possibly small differences in classificatory properties as sufficient to identify a different class. With respect to sense relations, the problem of when to define an additional distinction within the taxonomy gives rise to questions about the relationship between knowledge of the world and knowledge of a word: where does one end and the other begin? (See article 32 (Hobbs) Word meaning and world knowledge.) In this article, I shall deal with only those relations that are sufficiently robust as to have become standard within lexical semantics: antonymy, hyponymy, synonymy and meronymy. In general, finer points of detail will be ignored and the discussion will be confined to the primary, and generally accepted, sense relations, beginning with hyponymy.
2.1. Hyponymy

Hyponymy involves specific instantiations of a more general concept, such as holds between horse and animal or vermilion and red or buy and get. In each case, one word provides a more specific type of concept than is displayed by the other. The more specific
word is called a hyponym and the more general word is the superordinate, which may also be referred to as a hyperonym or hypernym, although the latter is dispreferred as in non-rhotic dialects of English it is homophonic with hyponym. Where the words being classified according to this relation are nouns, one can test for hyponymy by replacing X and Y in the frame 'X is a kind of Y' and seeing if the result makes sense. So we have '(A) horse is a kind of animal' but not '(An) animal is a kind of horse' and so on. A very precise definition of the relation is not entirely straightforward, however. One obvious approach is to have recourse to class inclusion, so that the set of things denoted by a hyponym is a subset of the set of things denoted by the superordinate. So the class of buying events is a subset of the class of getting events. This works fine for words that describe concrete entities or events, but becomes metaphysically more challenging when abstract words like thought, emotion, belief, understand, think, etc. are considered. More importantly, there are words that may be said to have sense but no denotation, such as phoenix, hobbit, light sabre and so on. As such expressions do not pick out anything in the real world, they can be said to denote only the empty set, and yet, obviously, there are properties that such entities would possess if they existed that would enable us to tell them apart. A better definition of hyponymy therefore is to forego the obvious and intuitive reliance on class membership and define the relation in terms of sense inclusion rather than class inclusion: the sense of the superordinate being included in the sense of the hyponym. So a daffodil has the sense of flower included in it and more besides. If we replace 'sense', as something we are trying to define, with the (perhaps) more neutral term 'property', then we have:

(1) Hyponymy: X is a hyponym of Y if it is the case that if anything is such that it has the property expressed by X then it also has the property expressed by Y.

Notice that this characterisation in terms of a universally quantified implication statement does not require there actually to be anything that has a particular property, merely that if such a thing existed it would have that property. So unicorn may still be considered a hyponym of animal, because if such things did exist, they would partake of 'animalness'. Furthermore, in general if X and Y are hyponyms of Z they are called co-hyponyms, where two words can be defined as co-hyponyms just in case they share the same superordinate term and one is not a hyponym of the other. Co-hyponyms are generally incompatible in sense, unless they are synonymous (see below section 2.2). For example, horse, cat, bird, sheep, etc. are all co-hyponyms of animal and all mutually incompatible with each other: *That sheep is a horse. Hyponymy is a transitive relation, so that if X is a hyponym of Y and Y is a hyponym of Z then X is a hyponym of Z: foal is a hyponym of horse, horse is a hyponym of animal, and so foal is a hyponym of animal. Note that because of this transitivity, foal is treated as a co-hyponym of, and so incompatible with, not only filly and stallion, but also sheep, lamb and bull, etc. This sort of property indicates how hyponymy imposes partial hierarchical structure on a vocabulary. Such hierarchies may define a taxonomy of (say) natural kinds as in Fig. 21.1. Complete hierarchies are not common, however.
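Spelled out in first-order notation – purely as a restatement of (1) and of the transitivity observation above, not an addition to them – the definition and its chaining look as follows:

```latex
% Hyponymy as sense inclusion (definition (1)):
X \text{ is a hyponym of } Y \iff \forall x\,[X(x) \rightarrow Y(x)]

% Transitivity follows from the transitivity of implication:
\forall x\,[\mathit{foal}(x) \rightarrow \mathit{horse}(x)]
  \;\wedge\; \forall x\,[\mathit{horse}(x) \rightarrow \mathit{animal}(x)]
  \;\Rightarrow\; \forall x\,[\mathit{foal}(x) \rightarrow \mathit{animal}(x)]
```

It is the chaining of such implications that yields taxonomies of the kind shown in Fig. 21.1.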
Often in trying to define semantic fields of this sort, the researcher discovers that there may be gaps in the system where some expected superordinate term is missing. For example, in the lexical field defined by move we have hyponyms like swim, fly, roll and then a whole group of verbs involving movement using legs such as run, walk, hop, jump, skip, crawl, etc. There is, however, no word in English to express the concept that classifies the latter group together. Such gaps abound in any attempt to construct a fully
hierarchical lexicon based on hyponymy. Some of these gaps may be explicable through socio-cultural norms (for example, gaps in kinship terms in all languages), but many are simply random: languages do not require all hierarchical terms to be lexicalised. That is not to say, however, that languages cannot express such apparently superordinate concepts. As above, we can supply the missing superordinate concept by modifying the more general term, giving move using legs. Indeed, Lyons (1977) argues that hyponymy can in general be defined in terms of predicate modification of a superordinate. Thus, swim is move through fluid, mare is female horse, lamb is immature sheep and so on. This move pushes a paradigmatic relation onto some prior syntagmatic basis:
"Hyponymy is a paradigmatic relation of sense which rests upon the encapsulation in the hyponym of some syntagmatic modification of the sense of the superordinate relation." (Lyons 1977: 294)

Fig. 21.1: Hyponyms of animal
Such a definition does not work completely. For example, it makes no sense at the level of natural kinds (is horse to be defined as equine animal?) and there are other apparent syntagmatic definitions that are problematic in that precise definitions are not obvious (saunter is exactly what kind of walk?). Such considerations, of course, reflect the vagueness of the concepts that words express out of context and so we might expect any such absolute definition of hyponymy to fail. Hyponymy strictly speaking is definable only between words of the same (syntactic) category, but some groups of apparent co-hyponyms seem to be related to a word of some other category. This seems particularly true of predicate-denoting expressions like adjectives which often seem to relate to (abstract) nouns as superordinates rather than some other adjective. For example, round, square, tetrahedral, etc. all seem to be ‘quasi-hyponyms’ of the noun shape and hot, warm, cool, cold relate to temperature. Finally, the hierarchies induced by hyponymy may be cross-cutting. So the animal field also relates to fields involving maturity (adult, young) or sex (male, female) and perhaps other domains. This entails that certain words may be hyponyms of more than one superordinate, depending on different dimensions of relatedness. As we shall see below, such multiple dependencies have given rise to a number of theoretical approaches to word meaning that try to account directly for sense relations in terms of primitive sense components or inheritance of properties in some hierarchical arrangement of conceptual or other properties.
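Taxonomies of exactly this kind are implemented in computational lexicons such as WordNet (Fellbaum 1998). Purely as an illustration of the hierarchy and transitivity just discussed – and assuming a Python environment in which NLTK and its WordNet data have been installed; exact synset names and chains may differ across WordNet versions – the relations can be inspected directly:

```python
# Inspecting hyponymy chains in WordNet via NLTK.
# Assumes: pip install nltk, then nltk.download('wordnet') once.
from nltk.corpus import wordnet as wn

foal = wn.synset('foal.n.01')    # 'a young horse'
horse = wn.synset('horse.n.01')
sheep = wn.synset('sheep.n.01')

# Transitivity: one full hypernym path linking foal to the root node.
print([s.name() for s in foal.hypernym_paths()[0]])

# Co-hyponyms share a superordinate without one dominating the other:
print(horse.lowest_common_hypernyms(sheep))
```

Such inheritance hierarchies are one concrete realization of the "hierarchical arrangement of conceptual or other properties" mentioned above.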
2.2. Synonymy

Synonymy between two words involves sameness of sense, and two words may be defined as synonyms if they are mutually hyponymous. For example, sofa and settee are both
hyponyms of furniture and both mutually entailing, since if Bill is sitting on a settee is true, then it is true that Bill is sitting on a sofa, and vice versa. This way of viewing synonymy defines it as occurring between two words just in case they are mutually intersubstitutable in any sentence without changing the meaning (or truth conditions) of those sentences. So violin and fiddle both denote the same sort of musical instrument, so that Joan plays the violin and Joan plays the fiddle both have the same truth conditions (are both true or false in all the same circumstances). Synonyms are beloved of lexicographers, and thesauri contain lists of putative synonyms. However, true or absolute synonyms are very rarely attested and there is a significant influence of context on the acceptability of apparent synonyms. Take an example from Roget's thesaurus at random, 726, for combatant, in which some of the synonyms presented are:

(2) disputant, controversialist, litigant, belligerent; competitor, rival; fighter, assailant, aggressor, champion; swashbuckler, duellist, bully, fighting-man, boxer, gladiator, ...

Even putting to one side the dated expressions, it would be difficult to construct a single context in which all these words could be substituted for each other without altering the meaning or giving rise to pragmatic awkwardness. Context is the key to the acceptability of synonym construal. Even the clear synonymy of fiddle = violin mentioned above shows differences in acceptability in different contexts: if Joan typically plays violin in a symphony orchestra or as soloist in classical concerti, then someone might object that she does not play the fiddle, where that term may be taken to imply the playing of more popular styles of music. Even the sofa = settee example might be argued to show some differences, for example in terms of the register of the word or possibly in terms of possible differences in the objects the words pick out. It would appear, in fact, that humans typically don't entertain full synonymy and, when presented with particular synonyms in context, will try to provide explanations of (possibly imaginary) differences. Such an effect would be explicable in terms of some pragmatic theory such as Relevance Theory (Sperber & Wilson 1986/1995), in which the use of different expressions in the same contexts is expected to give rise to different inferential effects. A more general approach to synonymy allows there to be degrees of synonymy, where this may be considered to involve degrees of semantic overlap, and it is this sort of synonymy that is typically assumed by lexicographers in the construction of dictionaries. Kill and murder are strongly but not absolutely synonymous, differing perhaps in terms of the intentionality of the killer/murderer and also the sorts of objects such expressions may take (one may kill a cockroach but does not thereby murder it). Of course, there are conditions on the degree of semantic similarity that we consider to be definitional of synonymy. In the first place, it should in general be the case that the denial of one synonym implicitly denies the other. Mary is not truthful seems correctly to implicitly deny the truth of Mary is honest, and Joan didn't hit the dog implies that she didn't beat it. Such implications may only go one way, and that is often the case with near synonyms. The second condition on the amount of semantic overlap that induces near synonymy is that the terms should not be contrastive.
Thus, labrador and corgi have a large amount of semantic overlap in that they express breeds of dog, but there is an inherent contrast between these terms, and so they cannot in general be intersubstituted in a context while maintaining the truth of the sentence. Near-synonyms are often used to explain a word already used: John was dismissed, sacked in fact. But if the terms contrast in meaning in some way then the resulting expression
is usually nonsensical: #John bought a corgi, a labrador, in fact, where # indicates pragmatic markedness. The felicity of the use of particular words is strongly dependent on context. The reasons for this have to do with the ways in which synonyms may differ. At the very least, two synonymous terms may differ in style or register. So, for example, baby and neonate both refer to newborn humans, but while the neonate was born three weeks premature means the same as the baby was born three weeks premature, What a beautiful neonate! is distinctly peculiar. Some synonyms differ in terms of stylistic markedness. Conceal and hide, for example, are not always felicitously intersubstitutable. John hid the silver in the garden and John concealed the silver in the garden seem strongly synonymous, but John concealed Mary's gloves in the cupboard does not have the same air of normality about it as John hid Mary's gloves in the cupboard. Other aspects of stylistic variation involve expressiveness or slang. So while gob and mouth mean the same, the former is appropriately used in very informal contexts only, or for its shock value. Swear words in general often have acceptable counterparts, and euphemism and dysphemism thus provide a fertile ground for synonyms: lavatory = bathroom = toilet = bog = crapper, etc.; fuck = screw = sleep with, and so on. Less obvious differences in expressiveness come about through the use of synonyms that indicate familiarity with the object being referred to, such as the variants of kinship terms: mother = mum = mummy = ma. Regional and dialectal variations of a language may also give rise to synonyms that may or may not co-occur in the language at large: valley = dale = glen or autumn = fall. Sociolinguistic variation thus plays a very large part in the existence of near synonymy in a language.
2.3. Antonymy

The third primary paradigmatic sense relation involves oppositeness in meaning, often called antonymy (although Lyons 1977 restricts the use of this term to gradable opposites), and is defined informally in terms of contrast, such that if 'A is X' then 'A is not Y'. So, standardly, if John is tall is true then John is not small is also true. Unlike hyponymy, there are a number of ways in which the senses of words contrast. The basic distinction is typically made between gradable and ungradable opposites. Typically expressed by adjectives in English and other Western European languages, gradable antonyms form instances of contraries and implicitly or explicitly invoke a field over which the grading takes place, i.e. a standard of comparison. Assuming, for example, that John is human, then human size provides the scale against which John is tall is measured. In this way, John's being tall for a human does not mean that he is tall when compared to buildings. Note that the implicit scale has to be the same scale invoked for any antonym: John is tall contrasts with John is not small for a human, not John is not small for a building. Other examples of gradable antonyms are easy to identify: cold/hot, good/bad, old/young and so on.

(3) Gradable antonymy: Gradable antonyms form instances of contraries and implicitly or explicitly invoke a field over which the grading takes place, i.e. a standard of comparison.

Non-gradable antonyms are, as the name suggests, absolutes and divide up the domain of discourse into discrete classes. Hence, not only does the positive of one antonym imply
the negation of the other, but the negation of one implies the positive of the other. Such non-gradable antonyms are also called complementaries and include pairs such as male/female, man/woman, dead/alive.

(4) Complementaries (or binary antonyms) are all non-gradable and the sense of one entails the negation of the other and the negation of one sense entails the positive sense of the other.

Notice that there is, in fact, a syntagmatic restriction that is crucial to this definition. It has to be the case that the property expressed by some word is meaningfully predicable of the object to which it is applied. So, while that person is male implies that that person is not female and that person is not female implies that that person is male, that rock is not female does not imply that that rock is male. Rocks are things that do not have sexual distinctions. Notice further that binary antonyms are quite easily coerced into being gradable, in which case the complementarity of the concepts disappears. So we can say of someone that they are not very alive without committing ourselves to the belief that they are very dead. Amongst complementaries, some pairs of antonyms may be classed as privative, in that one member expresses a positive property that the other negates. These include pairs such as animate/inanimate. Others are termed equipollent when both properties express a positive concept, such as male/female. Such a distinction is not always easy to make: is the relation dead/alive equipollent or privative? Some relational antonyms differ in the perspective from which a relation is viewed: in other words, according to the order of their arguments. Pairs such as husband/wife, parent/child are of this sort and are called converses.

(5) Converses: involve relational terms where the argument positions involved with one lexeme are reversed with another and vice versa.

So if Mary is John's wife then John is Mary's husband. In general, the normal antonymic relation stands, provided that the relation expressed is strictly asymmetric: Mary is Hilary's child implies Mary is not Hilary's parent. Converses may involve triadic predicates as well, such as buy/sell, although it is only the agent and the goal that are reversed in such cases. If Mary buys a horse from Bill then it must be the case that Bill sells a horse to Mary. Note that the relations between the two converses here are not parallel: the agent subject of buy is related to the goal object of sell, whereas the agent subject of sell is related to the source object of buy. This may indicate that the converse relation, in this case at least, resides in the actual situations described by sentences containing these verbs rather than necessarily being an inherent part of the meanings of the verbs themselves. So far, we have seen antonyms that are involved in binary contrasts such as die/live, good/bad and so on, and the relation of antonymy is typically said only to refer to such binary contrasts. But contrast of sense is not per se restricted to binary contrasts. For example, co-hyponyms all have the basic oppositeness property of exclusion. So, if something is a cow, it is not a sheep, dog, lion or any other type of animal, and equally for all other co-hyponyms that are not synonyms. It is this exclusion of sense that makes corgi and labrador non-synonymous despite the large overlap in their semantic properties (as breeds of dogs). Lyons (1977) calls such a non-binary relation 'incompatibility'.
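The entailment patterns distinguishing the main types of opposite can be stated compactly. The following schemata are merely a restatement of (3)–(5) and the surrounding discussion in first-order notation, with the proviso noted above that the predicates must be meaningfully predicable of their arguments:

```latex
% Gradable antonyms (contraries): one-way entailment only
\mathit{tall}(x) \rightarrow \neg\mathit{small}(x), \qquad
\neg\mathit{tall}(x) \not\rightarrow \mathit{small}(x)

% Complementaries: entailment in both directions over the relevant domain
\mathit{male}(x) \leftrightarrow \neg\mathit{female}(x)

% Converses: reversal of argument order
\mathit{wife}(x,y) \leftrightarrow \mathit{husband}(y,x), \qquad
\mathit{buy}(x,z,y) \leftrightarrow \mathit{sell}(y,z,x)
```

(In the last schema, x buys z from y and y sells z to x; as noted above, the argument mappings of the two verbs are not parallel.)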
Some contrastive gradable antonyms form scales where there is an increase (or decrease) of some characteristic property from one extreme of the scale to the other. With respect to the property heat or temperature we have the scale {freezing, cold, cool, lukewarm, warm, hot, boiling}. These adjectives may be considered to be quasi-hyponyms of the noun heat and all partake of the oppositeness relation. Interestingly (at least for this scale), related points on the scale act like (gradable) antonyms: freezing/boiling, cold/hot, cool/warm. There are other types of incompatible relations, such as ranks (e.g. {private (soldier), corporal, sergeant, staff sergeant, warrant officer, lieutenant, major, ...}) and cycles (e.g. {monday, tuesday, wednesday, thursday, friday, saturday, sunday}). Finally on this topic, it is necessary again to point out the context-sensitivity of antonymy. Although within the colour domain red has no obvious antonym, in particular contexts it does. So with respect to wine, the antonym of red is white, and in the context of traffic signals, its opposite is green. Without a context, the obvious antonym to dry is wet, but again within the context of wine its antonym is sweet, in the context of skin it is soft, and for food, moist (examples taken from Murphy 2003). This contextual dependence is problematic for the definition of antonymy just over words (rather than concepts), unless it is assumed that the lexicon is massively homonymous (see below). (See also article 22 (Löbner) Dual oppositions.)
2.4. Part-whole relations

The final widely recognised paradigmatic sense relation is that involving 'part-of' relations, or meronymies.

(6) Meronymy: If X is part-of Y or Y has X then X is a meronym of Y and Y is a holonym of X.

Thus, toe is a meronym of foot and foot is a meronym of leg, which in turn is a meronym of body. Notice that there are some similarities between meronymy and hyponymy, in that a (normal) hand includes fingers and finger somehow includes the idea of hand; but, of course, they are not the same things and so do not always take part in the same entailment relations as hyponyms and superordinates. So while Mary hurt her finger (sort of) entails Mary hurt her hand, just as Mary hurt her lamb entails Mary hurt an animal, Mary saw her finger does not entail Mary saw her hand, whereas Mary saw her lamb does entail Mary saw an animal. Hence, although meronymy is like hyponymy in that the part-whole relations define hierarchical distinctions in the vocabulary, it is crucially different in that meronyms and holonyms define different types of object that may not share any semantic properties at all: a finger is not a kind of hand, but it does share properties with hands, such as being covered in skin and being made of flesh and bone; but a wheel shares very little with one of its holonyms, car, beyond being a manufactured object. Indeed, appropriate entailment relations between sentences containing meronyms and their corresponding holonyms are not easily stated and, while the definition given above is a reasonable approximation, it is not unproblematic. Cruse (1986) attempts to restrict the meronymy relation just to those connections between words that allow both the 'X is part of Y' and 'Y has X' paraphrases. He points out that the 'has a' relation does not always involve a 'part of' one, at least between the two words: a wife has a husband but not #a husband is part of a wife. On the other hand, the reverse
may also not hold: stress is part of the job does not mean that the job has stress, at least not in the sense of possession. However, even if one accepts that both paraphrases must hold of a meronymic pair, there remain certain problems. For example, while the pair of sentences a husband is part of a marriage and a marriage has a husband seems reasonably acceptable, it is not obvious that marriage is strictly a holonym of husband. It may be, therefore, that it is necessary to restrict the relation to words that denote things of the same general type, concrete or abstract, which will induce different 'part of' relations depending on the way some word is construed. So, a chapter is part of a book = Books have chapters if book is taken to be the abstract construal of structure, but not if it is taken to be the concrete object. Furthermore, it might be necessary to invoke notions like 'discreteness' in order to constrain the relation. For example, flesh is part of a hand and hands have flesh, but are these words thereby in a meronymic relationship? Flesh is a substance and so not individuated, and if meronymy requires parts and wholes to be discretely identifiable, then the relationship would not hold of these terms. Again we come to the problem of world knowledge, which tells us that fingers are prototypically parts of hands, versus word knowledge: is it the case that the meaning of 'finger' necessarily contains the information that it forms part of a hand, and thus that some aspect of the meaning of 'hand' is contained in the meaning of 'finger'? If that were the case, how do we account for the lack of any such inference in extensions of the word to cover (e.g.) emerging shoots of plants (cf. finger of asparagus)? (See article 32 (Hobbs) Word meaning and world knowledge.) This short paragraph does not do justice to the extensive discussions of meronymy, but it should be clear that it is by far the most problematic of the paradigmatic relations to pin down, a situation that has led some scholars to reject its existence as a different type of sense relation altogether. (See further Croft & Cruse 2004: 159–163, Murphy 2003: 216–235.)
2.5. Syntagmatic relations

Syntagmatic relations between words appear to be less amenable to the sort of taxonomies associated with paradigmatic relations. However, there is no doubt that some words 'go naturally' with each other, beyond what may be determined by general syntactic rules of combination. At one extreme, there are fixed idioms where the words must combine to yield a specific meaning. Hence we have idiomatic expressions in English meaning 'die' such as kick the bucket (a reference to death by hanging) or pass away or pass over (to the other side) (references to religious beliefs) and so on. There are also certain words that have only a very limited ability to appear with others, as is the case with addled, which can apply only to eggs or brains. Other collocational possibilities may be much freer, although none are constrained solely by syntactic category. For example, hit is a typical transitive verb in English that takes a noun phrase as object. However, it further constrains which noun phrases it acceptably collocates with by requiring the thing denoted by that noun phrase to have concrete substance. Beyond that, collocational properties are fairly free; see article 20 (Fellbaum) Idioms and collocations:

(7) The plane hit water/a building/#the idea.

That words do have semantic collocational properties can be seen by examining strings of words that are 'grammatical' (however defined) but make no sense. An extreme
example of this is Chomsky's (1965) ubiquitous colorless green ideas sleep furiously, in which the syntactic combination of the words is licit (as an example of a subject noun phrase containing modifiers and a verb phrase containing an intransitive verb and an adverb) but no information is expressed because of the semantic anomalies that result from this particular combination. There are various sources of anomaly. In the first place, there may be problems resulting from what Cruse (2000) calls collocational preferences. Where such preferences are violated, various degrees of anomaly can arise, ranging from the marginally odd through to the incomprehensible. For example, the sentence my pansies have passed away is peculiar because pass away is typically predicated of a human (or pet), not flowers. However, the synonymous sentence my pansies have died involves no such peculiarity, since the verb die is predicable of anything which is capable of life, such as plants and animals. A worse clash of meaning thus derives from the collocation of an inanimate subject and any verb or idiom meaning 'die'. My bed has died is thus worse than my pansies have passed away, although notice that metaphorical interpretations can be (and often are) attributed to such strings. For example, one could interpret my bed has died as indicating that the bed has collapsed or is otherwise broken, and such metaphorical extensions are common. Compare the use of die collocated with words such as computer, car, phone, etc. Notice further in this context that the antonym of die is not live but go or run. It is only when too many clashes occur that metaphorical interpretation breaks down and no information at all can be derived from a string of words. Consider again colorless green ideas sleep furiously. Parts of this sentence are less anomalous than the whole and we can assign (by whatever means) some interpretation to them:
(8) a. Green ideas: 'environmentally friendly ideas' or 'young, untried ideas' (both via the characteristic property of young plant shoots);
b. Colorless ideas: 'uninteresting ideas' (via lacklustre, dull);
c. Colorless green ideas: 'uninteresting ideas about the environment' or 'uninteresting untried ideas' (via associated negative connotations of things without colour);
d. Ideas sleep: 'ideas are not currently active' (via inactivity associated with sleeping);
e. Green parrots sleep furiously: 'parrots determinedly asleep (?)' or 'parrots restlessly asleep'.
But in putting all the words together, the effort involved in resolving all the contradictions outweighs any possible effect on the context of the information content of the final proposition. The more contradictions that need to be resolved in processing some sentence, the greater the amount of computation required to infer a non-contradictory proposition from it, and the less information the inferred proposition will convey. A sentence may be said to be truly anomalous if there is no relevant proposition that can be deduced from it by pragmatic means. A second type of clash involving collocational preferences is induced when words are combined into phrases but add no new information to the string. Cruse calls this pleonasm and exemplifies it with examples such as John kicked the ball with his foot (Cruse 2000: 223). Since kicking involves contact with something by a foot, the string-final prepositional phrase adds nothing new and the sentence is odd (even though context may allow the apparent tautology to be acceptable).
Similar oddities arise with collocations
such as female mother or human author. Note that pleonasm does not always give rise to feelings of oddity. For example, pregnant female does not seem as peculiar as #female mother, even though the concept of 'female' is included in 'pregnant'. This observation indicates that certain elements in strings of words have a privileged status. For example, pleonastic anomaly appears to be worse when a semantic head, that is, a noun or verb that determines the semantic properties of its satellites (which does not always coincide with what may be identified as the syntactic head of a construction), appears with a modifier (or sometimes a complement, but not always) whose meaning is contained within that of the head: bovine mammal is better than #mammalian cow (what other sort of cow could there be?). Pleonastic anomaly can usually be obviated by substituting a hyponym for one expression or a superordinate for the other, since this will give rise to informativity with respect to the combination of words: female parent, He struck the ball with his foot and so on. Collocational preferences are often discussed with respect to the constraints imposed by verbs or nouns on their arguments, and sometimes these constraints have been incorporated into syntactic theories. In Chomsky (1965), for example, the subcategorisation of verbs was defined not just in terms of the syntactic categories of their arguments but also their semantic selectional properties. A verb like kick imposes a restriction on its direct object that it is concrete (in non-metaphorical uses) and on its subject that it is something with legs like an animal, whereas verbs like think require abstract objects and human subjects. Subsequent research into the semantic properties of arguments led to the postulation of participant (or case or thematic) roles which verbs 'assign' to their arguments, with the effect that certain roles constrain the semantic preferences of the verb to certain sorts of subjects and objects. A verb like fear, therefore, assigns to its subject the role of experiencer, thus limiting acceptable collocations to things that are able to fear, such as humans and other animals. Some roles, such as experiencer, agent, recipient, etc., are more tightly constrained by certain semantic properties, such as animacy, volition, and mobility, than others, such as theme and patient (Dowty 1991). We have seen that paradigmatic relations cannot always be separated from concepts of syntagmatic relatedness. Lyons, for example, attempts to define hyponymy in terms of the modification of a superordinate, while the basic relation of antonymy holds only if the word being modified denotes something of which the relevant properties can be appropriately predicated. Given that meanings are constructed in natural languages by putting words together, it would be unsurprising if syntagmatic relations were, in some sense, primary and paradigmatic relations were principally determined by collocational properties between words. Indeed, the primacy of syntagmatic relations is supported by psycholinguistic and acquisition studies. It is reported in Murphy (2003), for example, that in word association experiments, children under 7 tend to provide responses that reflect collocational patterns rather than paradigmatic ones. So a word like black is more likely to elicit from young children responses such as bird or board than the antonym white.
Older children and adults, on the other hand, tend to give paradigmatic responses. There is, furthermore, some evidence that knowledge of paradigmatic relations is associated with metalinguistic awareness: that is, awareness of the properties of, and interactions between, words. The primacy of syntagmatic relations over paradigmatic relations seems further to be borne out by corpus and computational studies of collocation and lexis. For example, there are many approaches to the automatic disambiguation of homonyms, identical
word forms that have different meanings (see below), that rely on syntagmatic context to determine which sense is the most likely on a particular occasion of the use of some word. Such studies also rely on corpus work from which collocational probabilities between expressions are calculated. In this regard, it is interesting also to consider experimental research in computational linguistics which attempts to induce automatic recognition of synonyms and hyponyms in texts. Erk (2009) reports research which uses vector spaces to define representations of word meanings, where such spaces are defined in terms of collocations between words in corpora. Without going into any detail here, she reports that (near) synonyms can be identified in this manner to a high degree of accuracy, but that hyponymic relations cannot easily be identified unless such information is directly encoded; the use of vector spaces, however, allows such encoding to be done and to yield good results. Although this is not her point, Erk's results are interesting with respect to the possible relation between syntagmatic and paradigmatic sense relations. Synonymy may be defined not just semantically, as involving sameness of sense, but syntactically, as allowing substitution in all the same linguistic contexts (an idealisation, of course, given the rarity of full synonymy, but probabilistic techniques may be used to define a degree of similarity between the contexts in which two words can appear). Hence, we might expect that defining vector spaces in terms of collocational possibilities in texts will yield a high degree of comparability between synonyms. But hyponymy cannot be defined easily in syntactic terms, as hyponyms and their superordinates will not necessarily collocate in the same way (for example, four-legged animal is more likely than four-legged dog, while Collie dog is fine but #Collie animal is marginal at best). Thus, just taking collocation into account, even in a sophisticated manner, will fail to identify words in hyponymous relations. This implies that paradigmatic sense relations are 'higher order' or 'metalexical' relations that do not emerge directly from syntagmatic ones. I leave this matter to one side from now on because, despite the strong possibility that syntagmatic relations are cognitively primary, paradigmatic relations remain the focus of studies of lexical semantics.
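To make the idea of collocation-based vector spaces concrete, the following is a minimal sketch of the general technique, not of Erk's actual model: word vectors are built from co-occurrence counts in a toy corpus, and cosine similarity is used as a proxy for distributional closeness. The corpus and all names in the sketch are invented for illustration.

```python
# Minimal sketch of collocation-based word vectors (illustrative only,
# not Erk's model). Vectors are co-occurrence counts within a window;
# cosine similarity approximates sameness of syntagmatic distribution.
from collections import Counter
from math import sqrt

corpus = [
    "the big dog barked at the postman".split(),
    "the big hound barked at the cat".split(),
    "the cat sat quietly on the mat".split(),
]

def vector(word, window=2):
    """Count the words occurring within `window` positions of `word`."""
    counts = Counter()
    for sentence in corpus:
        for i, w in enumerate(sentence):
            if w == word:
                lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        counts[sentence[j]] += 1
    return counts

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

# Near-synonyms such as dog/hound share contexts, so their vectors agree;
# nothing in the vectors, however, marks dog as a hyponym of animal:
# that asymmetric relation is not directly encoded in co-occurrence.
print(cosine(vector("dog"), vector("hound")))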
2.6. Homonymy and polysemy

Although they are not sense relations of the same sort as those reviewed above, in that they do not structure the lexicon in systematic ways, homonymy and polysemy nevertheless have an important place in considerations of word meaning, and they have played an important part in the development of theories of lexical semantics since the last two decades of the twentieth century. Homonymy involves formal identity between words with distinct meanings (i.e. interpretations with distinct extensions and senses), which Weinreich (1963) calls "contrastive ambiguity". Such formal identity may involve the way a word is spoken (homophony, 'same sound'), as with bank, line, taxi, can, lead (noun)/led (verb), and/or its orthography (homography, 'same writing'), as with bank, line, putting. It is often the case that the term homonymy is reserved only for those words that are both homophones and homographs, but equally often the term is used for either relation. Homonymy may be full or partial: in the former case, every form of the lexeme is identical for both senses, as holds for the noun punch (the drink or the action); in the latter, only some forms of the lexeme are identical for both senses, as between the verb punch and its corresponding noun. Homonymy leads to
the sort of ambiguity that is easily resolved in discourse context, whether locally through syntactic disambiguation (9a), the context provided within a sentence (9b) or from the topic of conversation (9c).
(9) a. His illness isn't terminal.
b. My terminal keeps cutting out on me.
c. I've just been through Heathrow Airport. The new terminal is rubbish.
In general, there is very little to say about homonymy. It is random and generally only tolerated when the meanings of the homonyms are sufficiently semantically differentiated as to be easily disambiguated. Of more interest is polysemy, in which a word has a range of meanings in different local contexts but in which the meaning differences are taken to be related in some way. While homonymy may be said to involve true ambiguity, polysemy involves some notion of vagueness or underspecification with respect to the meanings a polyseme has in different contexts (see article 23 (Kennedy) Ambiguity and vagueness). The classic example of a polysemous word is mouth, which can denote the mouth of a human or animal or various other types of opening, such as that of a bottle and, more remotely, of a river. Unlike homonymy, no notion of contrast in sense is involved and polysemes are considered to have an apparently unique basic meaning that is modified in context. The word bank is both a homonym and, in its meaning of 'financial establishment', a polyseme between its interpretation as the institution (The bank raised its interest rates yesterday) and its physical manifestation (The bank is next to the school). One of the things that differentiates polysemy from homonymy is that the different senses of polysemes are not 'suppressed' in context (as with homonyms); rather, one aspect of sense is foregrounded or highlighted. Other senses are available in the discourse and can be picked up by other words in the discourse:
(10) a. Mary tried to jump through the window (aperture), but it was closed (aperture/physical object) and she broke it (physical object).
b. *Mary walked along the bank of the river. It had just put up interest rates yet again.
Polysemy may involve a number of different properties: change of syntactic category (11); variation in valency (12); and subcategorisation properties (13).
(11) a. Rambo picked up the hammer (noun).
b. Rambo hammered (verb) the nail into the tree.
(12) a. The candle melted.
b. The heat melted the candle.
(13) a. Rambo forgot that he had buried the cat (clausal complement - factive interpretation)
b. Rambo forgot to bury the cat (infinitival complement - non-factive interpretation)
Some polysemy may be hidden and extensive, as often with gradable adjectives, where the adjective often picks out some typical property associated with the head noun that it modifies, a property which may vary considerably from noun to noun, as with the adjective good in
(15), where the interpretation varies considerably according to the semantics of the head noun, contrasting strongly with other adjectives like big, as illustrated in (14). Note that one might characterise the meaning of this adjective in terms of an underspecified semantics such as that given in (15e).
(14) big car/big computer/big nose: 'big for N'
(15) a. good meal: 'tasty, enjoyable, pleasant'
b. good knife: 'sharp, easy to handle'
c. good car: 'reliable, comfortable, fast'
d. good typist: 'accurate, quick, reliable'
e. good N: 'positive evaluation of some property associated with N'
There are also many common alternations in polysemy that one may refer to as constructional (or logical) polysemy, since they are regular and result from the semantic properties of what is denoted.
(16) a. Figure/Ground: window, door, room
b. Count/Mass: lamb, beer
c. Container/Contained: bottle, glass
d. Product/Producer: book, Kleenex
e. Plant/Food: apple, spinach
f. Process/Result: examination, merger
Such alternations depend to a large degree on the perspective that is taken with respect to the objects denoted. So a window may be viewed in terms of an aperture (e.g. when it is open) or in terms of what it is made of (glass, plastic in some sort of frame), while other nouns can be viewed in terms of their physical or functional characteristics, and so on. Polysemy is not exceptional but rather the norm for word interpretation in context. Arguably every content word is polysemous and may have its meaning extended in context, systematically or unsystematically. The sense extensions of mouth, for example, are clear examples of unsystematic metaphorical uses of the word, unsystematic because the metaphor cannot be extended to just any sort of opening: ?#mouth of a flask, #mouth of a motorway, ?#mouth of a stream, #mouth of a pothole. Of course, any of these collocations could become accepted, but it tends to be the case that until a particular collocation has become commonplace, the phrase will be interpreted as involving real metaphor rather than the use of a polysemous word. Unsystematic polysemy, therefore, may have a diachronic dimension, with true (but not extreme) metaphorical uses becoming interpreted as polysemy once established within a language. It is also possible for diachronically homonymous terms to be interpreted as polysemes at some later stage. This has happened with the word ear, where the two senses (of a head and of corn) derive from different words in Old English (ēare and ēar, respectively). More systematic types of metaphorical extension have been noted above, but may also result from metonymy: the use of a word in a non-literal way, often based on a part-whole or 'connected to' relationship. This may happen with respect to names of composers or authors where the use of the name may refer to the person or to what they
have produced. (17) may be interpreted as Mary liking the man or the music (and indeed listening to or playing the latter).
(17) Mary likes Beethoven.
Ad hoc types of metonymy may simply extend the concept of some word to some aspect of a situation that is loosely related to, but contextually determined by, what the word actually means. John has new wheels may be variously interpreted as John having a new car or, if he were known to be paraplegic, as him having a new wheelchair. A more extreme, but classic, example is one like (18), in which the actual meaning of the food lasagna is extended to the person who ordered it. (Such examples are also known as 'ham sandwich' cases after the examples found in Nunberg 1995.)
(18) The lasagna is getting impatient.
Obviously context is paramount here. In a classroom or on a farm, the example would be unlikely to make any sense, whereas in a restaurant, where the situation necessarily involves a relation between customers and food, a metonymic relation can be easily constructed. Some metonymic creations may become established within a linguistic community and thus become less context-dependent. For example, the word suit(s) may refer not only to the garment but also to people who wear suits, and thus the word gets associated with types of people who do jobs that involve the wearing of suits, such as business people.
3. Sense relations and word meaning

As indicated in the discussion above, the benefit of studying sense relations appears to be that it gives us an insight into word meaning generally. For this reason, such relations have often provided the basis for different theories of lexical semantics.
3.1. Lexical fields and componential analysis

One of the earliest modern attempts to provide a theory of word meaning using sense relations is associated with the European structuralists, developing out of the work of de Saussure in the first part of the twentieth century. Often associated with theories of componential analysis (see below), lexical field theory gave rise to a number of vying approaches to lexical meaning, all of which share the hypothesis that the meanings (or senses) of words derive from their relations to other words within some thematic/conceptual domain defining a semantic or lexical field. In particular, it is assumed that the hierarchical and contrastive relations between words sharing a conceptual domain are sufficient to define the meaning of those words. Early theorists such as Trier (1934) or Porzig (1934) were especially interested in the way such fields develop over time, with words shifting with respect to the part of a conceptual field that they cover as other words come into or leave that space. For Trier, the essential properties of a lexical field are that:
(i) the meaning of an individual word is dependent upon the meaning of all the other words in the same conceptual domain;
(ii) a lexical field has no gaps, so that the field covers some connected conceptual space (or reflects some coherent aspect of the world);
(iii) if a word undergoes a change in meaning, then the whole structure of the lexical field also changes.
One of the obvious weaknesses of such an approach is that a conceptual domain cannot be identified independently of the meaning of the expressions themselves, and so the approach appears somewhat circular. Indeed, such research presented little more than descriptions of diachronic semantic changes, as there was little or no predictive power in determining what changes are and are not possible within lexical fields, nor which lexical gaps are tolerated and which not. Indeed, it seems reasonable to suppose that no such theory could exist, given the randomness that the lexical development of contentive words displays, and so there is no reason to suppose that sense relations play any part in determining such change. (See Ullmann 1957, Geckeler 1971, Coseriu & Geckeler 1981, for detailed discussions of field theories at different periods of the twentieth century.) Although it is clear that 'systems of interrelated senses' (Lyons 1977: 252) exist within languages, it is not clear that they can usefully form the basis for explicating word meaning. The most serious criticism of lexical field theory as more than a descriptive tool is that it has unfortunate implications for how humans could ever know the meaning of a word: if a word's meaning is determined by its relation to other words in its lexical field, then to know that meaning someone has to know all the words associated with that lexical field. For example, to know the meaning of tulip, it would not be enough to know that it is a hyponym of (plant) bulb and a co-hyponym of daffodil, crocus, anemone, lily and dahlia, but also of trillium, erythronium, bulbinella, disia, brunsvigia and so on. But only a botanist specialising in bulbous plants is likely to know anything like the complete list of names, and even then this is unlikely. Of course, one might say that individuals may have a shallower or deeper knowledge of some lexical field, but the problem persists if one is trying to characterise the nature of word meaning within a language rather than within individuals. And it means that the structure of a lexical field, and thus the meaning of a word, will necessarily change with any and every apparent change in knowledge. But it is far from clear that the meaning of tulip would be affected if, for example, botanists decided that a disia is not bulbous but rhizomatous, and thus does not after all form part of the particular lexical field of plants that have bulbs as storage organs. It is obvious that someone can be said to know the meaning of tulip independently of whether they have any knowledge of any other bulb or even flowering plant. Field theory came to be associated in the nineteen-sixties with another theory of word meaning in which sense relations played a central part. This is the theory of componential analysis, which was adapted for linguistic meaning by Katz & Fodor (1963) from similar approaches in anthropology. In this theory, the meaning of a word is decomposed into semantic components, often conceived as features of some sort. Such features are taken to be cognitively real semantic primitives which combine to define the meanings of words in a way that automatically predicts their paradigmatic sense relations with other words.
For example, one might decompose the two meanings of dog as consisting of the primitive features [CANINE] and [CANINE, MALE, ADULT]. Since the latter contains the semantic structure of the former, it is directly determined to be a hyponym. Assuming that bitch has the componential analysis [CANINE, FEMALE, ADULT], the hyponym meaning of dog is easily identified as an antonym of bitch, as they differ in
just one semantic feature. So the theory provides a direct way of accounting for sense relations: synonymy involves identity of features; hyponymy involves extension of features; and antonymy involves difference in one feature. Although actively pursued in the nineteen-sixties and seventies, the approach fell out of favour in mainstream linguistics for a number of reasons. From an ideological perspective, the theory became associated with the Generative Semantics movement, which attempted to derive surface syntax from deep semantic meaning components. When this movement was discredited, the logically distinct semantic theory of componential analysis was mainly rejected too. More significantly, however, the theory came in for heavy criticism. In the first place, there is the problem of how primitives are to be identified, particularly if the assumption is that the set of primitives is universal. Although more recent work has attempted to give this aspect of decompositional theories as a whole a more empirically motivated foundation (Wierzbicka 1996), there nevertheless appears to be some randomness to the choice of primitives and the way they are said to operate within particular languages. Additionally, the theory has problems with things like natural kinds: what distinguishes [CANINE] from the meaning of dog, or [EQUINE] from horse? And does each animal (or plant) species have to be distinguished in this way? If so, then the theory achieves very little beyond adding information about sex and age to the basic concepts described by dog and horse. Using Latinate terms to indicate sense components for natural kinds simply obscures the fact that the central meanings of these expressions are not decomposable. An associated problem is that features were often treated as binary so that, for example, puppy might be analysed as [CANINE, −ADULT]. Likewise, instead of MALE/FEMALE one might have ±MALE, or [−ALIVE] for dead. The problem here is obvious: how does one choose a non-arbitrary property as the unmarked one? ±FEMALE and ±DEAD are just as valid as primitive features as their reverses, reflecting the fact that male/female and dead/alive are equipollent antonyms (see section 2.3). Furthermore, the restriction to binary values excludes the relational concepts that are necessary for any analysis of meaning in general. Overall, then, while componential analysis does provide a means of predicting sense relations, it does so at the expense of a considerable amount of arbitrariness.
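The predictions of the componential approach can be stated mechanically. The following sketch implements the three definitions just given (synonymy as feature identity, hyponymy as feature extension, antonymy as difference in exactly one feature); the feature sets are those of the dog/bitch example above, while the encoding and function names are mine.

```python
# Sketch of componential-analysis predictions over feature sets
# (features from the dog/bitch example; the encoding is illustrative).
LEXICON = {
    "dog_generic": frozenset({"CANINE"}),
    "dog_male":    frozenset({"CANINE", "MALE", "ADULT"}),
    "bitch":       frozenset({"CANINE", "FEMALE", "ADULT"}),
}

def synonymous(a, b):
    return LEXICON[a] == LEXICON[b]          # identity of features

def hyponym_of(a, b):
    return LEXICON[a] > LEXICON[b]           # proper extension of b's features

def antonymous(a, b):
    # share all features except exactly one contrasting pair
    return (len(LEXICON[a] - LEXICON[b]) == 1 and
            len(LEXICON[b] - LEXICON[a]) == 1)

print(hyponym_of("dog_male", "dog_generic"))   # True
print(antonymous("dog_male", "bitch"))         # True: MALE vs FEMALE
```

Note that the sketch also makes the arbitrariness palpable: nothing in it constrains which features may be posited, which is exactly the criticism rehearsed above.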
3.2. Lexical decomposition

Although the structuralist concept of lexical fields did not develop in the way that its proponents might have expected, it nevertheless reinforced the view that words are semantically related and that this relatedness can be identified and used to structure a vocabulary. It is this concept of a structured lexicon that persists in mainstream linguistics. In the same way, lexical decompositional analyses have developed in rather different ways than were envisaged at the time that componential semantic analysis was developed. See article 17 (Engelberg) Frameworks of decomposition. Decomposition of lexical meaning appears in Montague (1973), one of the earliest attempts to provide a formal analysis of a fragment of a natural language, as one of two different mechanisms for specifying the interpretations of words. Certain expressions with a logical interpretation, like be and necessarily, are decomposed, not into cognitive primitives, but into complex logical expressions reflecting their truth-conditional content. For example, necessarily receives the logical translation λp[⎕ˇp], where the abstracted variable, p, ranges over propositions and the decomposition has the effect
of equating the semantics of the adverb with that of the logical necessity operator, ⎕. Montague restricted decomposition of this sort to those grammatical expressions whose truth-conditional meaning can be given a purely logical characterisation. In a detailed analysis of word meaning within Montague Semantics, however, Dowty (1979) argued that certain entailments associated with content expressions are constant in the same way as those associated with grammatical expressions, and he extended the decompositional approach to analyse such words in order to capture such apparently independent entailments. Dowty's exposition is concerned primarily with inferences from verbs (and more complex predicates) that involve tense and modality. By adopting three operators, DO, BECOME and CAUSE, he is able to decompose the meanings of a range of different types of verbs, including activities, accomplishments, inchoatives and causatives, to account for the entailments that can be drawn from sentences containing them. For example, he provides decomposition rules for de-adjectival inchoative and causative verbs in English that modify the predicative interpretation of the base adjectives. Dowty uses the propositional operator BECOME for inchoative interpretations of (e.g.) cool: λx[BECOME cool′(x)], where cool′ is the semantic representation of the meaning of the predicative adjective and the semantics of BECOME ensures that the resulting predicate is true of some individual just in case it is now cool but just previously was not cool. The causative interpretation of the verb involves the CAUSE operator in addition: λyλx[x CAUSE BECOME cool′(y)], which guarantees the entailment between (e.g.) Mary cooled the wine and Mary caused the wine to become cool. Dowty also gives more complex (and less obviously logical) decompositions for other content expressions, such as kill, which may be interpreted as x causes y to become not alive. The quasi-logical decompositions suggested by Dowty have been taken up in theories such as Role and Reference Grammar (van Valin & LaPolla 1997), but primarily for accounting for syntagmatic relations such as argument realisation, rather than for paradigmatic sense relations. The same is not quite true of other decompositional theories of semantics, such as that put forward in the Generative Lexicon Theory of Pustejovsky (1995). Pustejovsky presents a theory designed specifically to account for structured polysemous relations such as those given in (11–13), utilising a complex internal structure for word meanings that goes a long way further than that put forward by Katz & Fodor (1963). In particular, words are associated with a number of different 'structures' that may be more or less complex. These include: argument structure, which gives the number and semantic type of logical arguments; event structure, specifying the type of event of the lexeme; and lexical inheritance structure, essentially hyponymous relations (which can be of a more general type showing the hierarchical structure of the lexicon). The most powerful and controversial structure proposed is the qualia structure. Qualia is a Latin term meaning 'of whatever sort' and is used for the Greek aitiai 'blame', 'responsibility' or 'cause' to link the current theory with Aristotle's modes of explanation.
Essentially the qualia structure gives semantic properties of a number of different sorts, concerning basic sense properties, prototypicality properties and certain kinds of encyclopaedic information. This provides an explicit model for how meaning shifts and polyvalency phenomena interact. The qualia structure provides the structural template over which semantic combinatorial devices, such as co-composition, type coercion and subselection, may apply to alter the meaning of a lexical item. Pustejovsky (1991: 417) defines qualia structure as:
– The relation between [a word's denotation] and its constituent parts
– That which distinguishes it within a larger domain (its physical characteristics)
– Its purpose and function
– Whatever brings it about
Such information is used in interpreting sentences such as those in (19):
(19) a. Bill uses the train to get to work.
b. This car uses diesel fuel.
The verb use is semantically underspecified, and the factors that allow us to determine which sense is appropriate for any instance of the verb are the qualia structures for each phrase in the construction and a rich mode of composition which is able to take advantage of this information. For example, in (19a) it is the function of trains to take people to places, so use here may be interpreted as 'catches', 'takes' or 'rides on'. Analogously, cars contain engines and engines require fuel to work, so the verb in (19b) can be interpreted as 'runs on', 'requires', etc. Using these mechanisms, Pustejovsky provides analyses of complex lexical phenomena, including coercion, polysemy and both paradigmatic and syntagmatic sense relations. Without going into detail, Pustejovsky's fundamental hypothesis is that the lexicon is generative and compositional, with complex meanings deriving from less complex ones in structured ways, so that the lexical representations of words should contain only as much information as they need to express a basic concept that allows as wide a range of combinatorial properties as possible. Additionally, lexical information is hierarchically structured, with rules specifying how phrasal representations can be built up from lexical ones as words are combined, a view which contrasts strongly with that discussed in the next section. See article 17 (Engelberg) Frameworks of decomposition.
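As a rough illustration of how qualia information might drive the interpretation of underspecified use in (19), consider the following sketch. The representation of qualia and the resolution rule are drastic simplifications of Pustejovsky's system, and the lexical entries are invented for the purpose.

```python
# Toy qualia lexicon and a crude resolution rule for underspecified 'use':
# the telic (purpose/function) quale of the object supplies the specific
# sense. A drastic simplification of Pustejovsky's machinery.
QUALIA = {
    "train":  {"telic": "transport people to places"},
    "diesel": {"telic": "power an engine"},
}

def interpret_use(obj):
    telic = QUALIA.get(obj, {}).get("telic")
    if telic is None:
        return "use = generic employment of " + obj
    return f"use = exploit the function of {obj} ('{telic}')"

print(interpret_use("train"))   # (19a): 'catches', 'takes', 'rides on'
print(interpret_use("diesel"))  # (19b): 'runs on', 'requires'
```

The point of the sketch is only that the specific sense is computed from structured lexical information rather than listed, which is what distinguishes the generative lexicon from the atomist position discussed next.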
3.3. Meaning postulates and semantic atomism

In addition to decomposition for logico-grammatical word meaning, Montague (1973) also adopted a second approach to accounting for the meaning of basic expressions, one that relates the denotations of words (analogously, concepts) to each other via logical postulates. Meaning postulates were introduced in Carnap (1956) and consist of universally quantified conditional or bi-conditional statements in the logical metalanguage which constrain the denotations of the constant that appears in the antecedent. For example, Montague provides an example that relates the denotations of the verb seek and the phrase try to find. (20) states (simplified from Montague) that for every instance of x seeking y there is an instance of x trying to find y:
(20) ⎕∀x∀y[seek′(x, y) ↔ try-to′(x, ∧find′(x, y))]
Note that the semantics of seek, on this approach, does not contain the content of try to find, as it does in the decompositional approach. The necessity operator, ⎕, ensures that the relation holds in all admissible models, i.e. in all states-of-affairs that we can talk about using the object language. This raises the bi-conditional statement to the status of a logical truth (an axiom), which ensures that on every occasion in which it is true to say of
someone that she is seeking something, then it is also true to say that she is trying to find that something (and vice versa). Meaning postulates provide a powerful tool for encoding detailed information about non-logical entailments associated with particular lexemes (or their translation counterparts). Note that within formal, model-theoretic semantics such postulates act as constraints not on the meanings of words but on their denotations. In other words, they reflect world knowledge, how situations are, not how word meanings relate to each other. While it is possible to use meaning postulates to capture word meanings within model-theoretic semantics, this requires a full intensional logic and the postulation of 'impossible worlds' to allow fine-grained differentiations between worlds in which certain postulates do not hold. (See Cann 1993 for an attempt at this and a critique of the notion of impossible worlds in Fox & Lappin 2005; cf. also article 33 (Zimmermann) Model-theoretic semantics.) A theory that utilises meaning postulates treats the meanings of words as atomic, with their semantic relations specified directly. So, although traditional sense relations, both paradigmatic and syntagmatic, can easily be reconstructed in the system (see Cann 1993 for an attempt at this), they do not follow from the semantics of the words themselves. For advocates of this theory, this is taken as an advantage. In the first place, it allows for conditional relations, as opposed to the bi-conditional relations forced in a decompositional approach. So while we might want to say that an act of killing involves an act of causing something to die, the reverse may not hold. If kill is decomposed as x CAUSE (BECOME(¬alive′(y))), then this fact cannot be captured. A second advantage of atomicity is that even if a word's meaning can be decomposed to a large extent, there is nevertheless often a 'residue of meaning' which cannot be decomposed into other elements. This is exactly what the feature CANINE is in the simple componential analysis given above: it is the core meaning of dog/bitch that cannot be further decomposed. In decomposition, therefore, one needs both some form of atomic concept and the decomposed elements, whereas in atomic approaches word meanings are individual concepts (or denotations), not further decomposed. What relations they have with the meanings of other words is a matter of the world (or of experience of the world), not of the meanings of the words themselves. Fodor & Lepore (1998) argue extensively against decompositionality, in particular against Pustejovsky's notion of the generative lexicon, in a way similar to the criticism made against field theories above. They argue that while it might be that dog (necessarily) denotes an animal, knowing that dogs are animals is not necessary for knowing what dog means. Given that knowing these inferences is not necessary for knowing the meaning of the word, such relations (including interlexical relations) should not be imposed on lexical entries, because they are not part of linguistic meaning. Criticisms can be made of atomicity and the use of meaning postulates (see Pustejovsky 1998 for a rebuttal of Fodor & Lepore's views). In particular, since meaning postulates are capable of defining any type of semantic relation, traditional sense relations form just arbitrary and unpredictable parts of the postulate system, impossible to generalise over.
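The conditional/bi-conditional contrast can be made explicit. The following is a sketch in the notation used above, not a quotation from any of the authors discussed: a one-way meaning postulate records that killing entails causing to become not alive without forcing the converse, whereas decomposition amounts to the bi-conditional.

```latex
% One-way meaning postulate for 'kill' (a sketch in the notation above):
\Box\,\forall x \forall y\; [\, \mathit{kill}'(x,y) \rightarrow
    x \,\mathrm{CAUSE}\, \mathrm{BECOME}(\neg \mathit{alive}'(y)) \,]

% Decomposing 'kill' as x CAUSE(BECOME(not alive'(y))) instead makes the
% two sides interchangeable, i.e. it enforces the bi-conditional
\Box\,\forall x \forall y\; [\, \mathit{kill}'(x,y) \leftrightarrow
    x \,\mathrm{CAUSE}\, \mathrm{BECOME}(\neg \mathit{alive}'(y)) \,]
% and thereby wrongly licenses the inference from 'cause to die' to 'kill'.
```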
Nevertheless, it is possible to define theories in which words have atomic meanings but the paradigmatic sense relations are used to organise the lexicon. One such is WordNet, developed by George A. Miller (1995) to provide a lexical database of English organised by grouping together words that are cognitive synonyms (a synset), each synset expressing a distinct concept, with the different concepts associated with a word being found in different synsets (much like a thesaurus). These synsets are then related
to each other by lexical and conceptual properties, including the basic paradigmatic sense relations. Although it remains true that the sense relations are stated independently of the semantics of the words themselves, it is nonetheless possible to claim that using them as an organisational principle of the lexicon provides them with a primitive status with respect to human cognitive abilities. WordNet was set up to reflect the apparent way that humans process expressions in a language, and so using the sense relations as an organisational principle is tantamount to claiming that they are the basis for the organisation of the human lexicon, even if the grouping of specific words into synsets and the relations defined between them are not determined directly by the meanings of the words themselves. (See article 110 (Frank & Padó) Semantics in computational lexicons.) A more damning criticism of the atomic approach is that context-dependent polysemy is impossible, because each meaning (whether treated as a concept or a denotation) is in principle independent of every other meaning. A consequence of this, as Pustejovsky points out, is that every polysemous interpretation of a word has to be listed separately, and the interpretation of a word in context is a matter of selecting the right concept/denotation a priori. It cannot be computed by combining aspects of the meaning of a word with those of the other words with which it appears. For example, the meanings of gradable adjectives such as good in (15) will need different concepts associated with each collocation, concepts that are in principle independent of each other. Given that new collocations between words are made all the time, and under the assumption that the number of slightly new senses that result is potentially infinite, this is a problem for storage, given the finite resources of the human brain. A further consequence is that, without some means of computing new senses, the independent concepts cannot be learned and so must be innate. While Fodor (1998) has suggested that this consequence may indeed be true, it is an extremely controversial and unpopular hypothesis that is not likely to help our understanding of the nature of word meaning.
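For readers who want to inspect the synset organisation described above directly, WordNet is accessible through, for example, the NLTK interface; the snippet below assumes that the nltk package and its wordnet corpus have been installed.

```python
# Browsing WordNet synsets via NLTK (assumes: pip install nltk, then
# nltk.download('wordnet')). Each synset is a distinct concept; relations
# such as hypernymy are stated between synsets, not computed from meanings.
from nltk.corpus import wordnet as wn

for synset in wn.synsets("dog")[:3]:           # distinct concepts of 'dog'
    print(synset.name(), "-", synset.definition())

dog = wn.synset("dog.n.01")
print([s.name() for s in dog.hypernyms()])     # links upwards towards 'animal'
```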
4. Conclusion

In the above discussion, I have not been able to do more than scratch the surface of the debates over the sense relations and their place in theories of word meaning. I have not discussed the important contributions of decompositionalists such as Jackendoff, or the problem of analyticity (Quine 1960), or the current debate between contextualists and semantic minimalists (Cappelen & Lepore 2005, Wedgwood 2007). Neither have I gone into any detail about the variations and extensions of sense relations themselves, such as are often found in Cognitive Linguistics (e.g. Croft & Cruse 2004). And much more besides. Are there any conclusions that we can currently draw? Clearly, sense relations are good descriptive devices, helping with the compilation of dictionaries and thesauri, as well as with the development of large-scale databases of words for use in various applications beyond the confines of linguistics, psychology and philosophy. It would, however, appear that the relation between sense relations and word meaning itself remains problematic. Given the overriding context dependence of the latter, it is possible that pragmatics will provide better explanations of the observed phenomena than explicitly semantic approaches (see, for example, Blutner 2002, Murphy 2003, Wilson & Carston 2006). Furthermore, the evidence from psycholinguistic and developmental studies, as well as the collocational sensitivity of sense, indicates that syntagmatic relations may be cognitively primary and that paradigmatic relations may be learned, either explicitly or through experience as
part of the development of inferential capability, rather than being a central part of the semantics of words themselves. (See articles 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure, 27 (Talmy) Cognitive Semantics, 30 (Jackendoff) Conceptual Semantics.)
5. References

Blutner, Reinhard 2002. Lexical semantics and pragmatics. In: F. Hamm & T. E. Zimmermann (eds.). Linguistische Berichte Sonderheft 10. Semantics, 27–58.
Cann, Ronnie 1993. Formal Semantics. An Introduction. Cambridge: Cambridge University Press.
Cappelen, Herman & Ernest Lepore 2005. Insensitive Semantics. A Defense of Semantic Minimalism and Speech Act Pluralism. Malden, MA: Blackwell.
Carnap, Rudolf 1956. Meaning and Necessity. A Study in Semantics and Modal Logic. 2nd edn. Chicago, IL: The University of Chicago Press. 1st edn. 1947.
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press.
Coseriu, Eugenio & Horst Geckeler 1981. Trends in Structural Semantics. Tübingen: Narr.
Croft, William & D. Alan Cruse 2004. Cognitive Linguistics. Cambridge: Cambridge University Press.
Cruse, D. Alan 1986. Lexical Semantics. Cambridge: Cambridge University Press.
Cruse, D. Alan 2000. Meaning in Language. An Introduction to Semantics and Pragmatics. Oxford: Oxford University Press.
Dowty, David R. 1979. Word Meaning and Montague Grammar. The Semantics of Verbs and Times in Generative Semantics and in Montague's PTQ. Dordrecht: Reidel.
Dowty, David R. 1991. Thematic proto-roles and argument selection. Language 67, 547–619.
Erk, Katrin 2009. Supporting inferences in semantic space. Representing words as regions. In: H. Bunt, O. Petukhova & S. Wubben (eds.). Proceedings of the Eighth International Conference on Computational Semantics (= IWCS). Tilburg: Tilburg University, 104–115.
Fodor, Jerry A. 1998. Concepts. Where Cognitive Science Went Wrong. Oxford: Clarendon Press.
Fodor, Jerry A. & Ernest Lepore 1998. The emptiness of the lexicon. Reflections on James Pustejovsky's 'The generative lexicon'. Linguistic Inquiry 29, 269–288.
Fox, Chris & Shalom Lappin 2005. Foundations of Intensional Semantics. Malden, MA: Blackwell.
Geckeler, Horst 1971. Strukturelle Semantik und Wortfeldtheorie. München: Fink.
Jackendoff, Ray 2002. Foundations of Language. Brain, Meaning, Grammar, Evolution. Oxford: Oxford University Press.
Katz, Jerrold J. & Jerry A. Fodor 1963. The structure of a semantic theory. Language 39, 170–210.
Kilgarriff, Adam 1997. I don't believe in word senses. Computers and the Humanities 31, 91–113.
Lyons, John 1977. Semantics. Cambridge: Cambridge University Press.
Miller, George A. 1995. WordNet. A lexical database for English. Communications of the ACM 38, 39–41.
Montague, Richard 1973. The proper treatment of quantification in ordinary English. In: J. Hintikka, J. M. E. Moravcsik & P. Suppes (eds.). Approaches to Natural Language. Dordrecht: Reidel, 221–242. Reprinted in: R. Thomason (ed.). Formal Philosophy. Selected Papers of Richard Montague. New Haven, CT: Yale University Press, 1974, 247–270.
Murphy, Lynne 2003. Semantic Relations and the Lexicon. Cambridge: Cambridge University Press.
Nunberg, Geoffrey 1995. Transfers of meaning. Journal of Semantics 12, 109–132.
Porzig, Walter 1934. Wesenhafte Bedeutungsbeziehungen. Beiträge zur Deutschen Sprache und Literatur 58, 70–97.
Pustejovsky, James 1991. The generative lexicon. Computational Linguistics 17, 409–441.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press.
Pustejovsky, James 1998. Generativity and explanation in semantics. A reply to Fodor and Lepore. Linguistic Inquiry 29, 289–311.
Quine, Willard van Orman 1960. Word and Object. Cambridge, MA: The MIT Press.
Sperber, Dan & Deirdre Wilson 1986/1995. Relevance. Communication and Cognition. Oxford: Blackwell. 2nd edn. (with postface) 1995.
Trier, Jost 1934. Das sprachliche Feld. Eine Auseinandersetzung. Neue Jahrbücher für Wissenschaft und Jugendbildung 10, 428–449. Reprinted in: L. Antal (ed.). Aspekte der Semantik. Zu ihrer Theorie und Geschichte 1662–1970. Frankfurt/M.: Athenäum, 1972, 77–104.
Ullmann, Stephen 1957. The Principles of Semantics. 2nd edn. Glasgow: Jackson. 1st edn. 1951.
van Valin, Robert D. & Randy J. LaPolla 1997. Syntax. Structure, Meaning, and Function. Cambridge: Cambridge University Press.
Wedgwood, Daniel 2007. Shared assumptions. Semantic minimalism and relevance theory. Journal of Linguistics 43, 647–681.
Weinreich, Uriel 1963. On the semantic structure of language. In: J. H. Greenberg (ed.). Universals of Language. Cambridge, MA: The MIT Press, 142–216.
Wierzbicka, Anna 1996. Semantics. Primes and Universals. Oxford: Oxford University Press.
Wilson, Deirdre & Robyn Carston 2006. Metaphor, relevance and the 'emergent property' issue. Mind & Language 21, 404–433.
Ronnie Cann, Edinburgh (United Kingdom)
22. Dual oppositions in lexical meaning

1. Preliminaries
2. Duality
3. Examples of duality groups
4. Semantic aspects of duality groups
5. Phase quantification
6. Conclusion
7. References
Abstract

Starting from well-known examples, a notion of duality is presented that overcomes the shortcomings of the traditional definition in terms of internal and external negation. Rather, duality is defined as a logical relation in terms of equivalence and contradiction. Based on this definition, the notion of duality groups, or squares, is introduced, along with examples from quantification, modality, aspectual modification and scalar predication (adjectives). The groups exhibit remarkable asymmetries as to the lexicalization of their four potential members. The lexical gaps become coherent if the members of duality groups are consistently assigned to four types, corresponding e.g. to some, all, no, and not all. Among these types, the first two are usually lexicalized, the third only rarely and the fourth almost never. Using the example of the German schon ("already") group, scalar adjectives and standard quantifiers, the notion of phase quantification is introduced as a general pattern of second-order predication which subsumes quantifiers as well as aspectual particles and scalar adjectives. Four interrelated types of phase quantifiers form a duality group. According to elementary monotonicity criteria, the four types rank on a scale of markedness that accounts for the lexical distribution within the duality groups.
1. Preliminaries

Duality of lexical expressions is a fundamental logical relation. However, unlike others such as antonymy, it has received much less attention. Duality relates all and some, must and can, possible and necessary, already and still, become and stay. Implicitly, it is even involved in ordinary antonymy such as that between big and small. Traditionally, duality is defined in terms of inner and outer negation: two operators are dual iff the outer negation of the one is equivalent to the inner negation of the other; alternatively, two operators are dual iff one is equivalent to the simultaneous inner and outer negation of the other. For example, some is equivalent to not all not. Duality, in fact, is a misnomer. Given the possibility of inner and outer negation, there are always four cases involved with duality: a given operator, its outer negation, its inner negation and its dual, i.e. inner plus outer negation. Gottschalk (1953) therefore proposed to replace the term duality by quaternality. In this article, a couple of representative examples are introduced before we proceed to a formal definition of the relation of duality. The general definition of duality is not as trivial as it might appear at first sight. Inner and outer negations are not always available for dual operators at the syntactic level, which makes it necessary to base the definition on a semantic notion of negation. Following the formal definition of duality, a closer look is taken at a variety of complete duality groups of four, their general structure and their relationship to Aristotle's so-called Square of Oppositions. Duality groups of four exhibit striking asymmetries: out of the four possible cases, two are almost always lexicalized, while the third is only occasionally lexicalized and the fourth almost never. (If the latter two are not lexicalized, they are expressed by using explicit negation with one of the other two cases.) Criteria will be offered for assigning the four members of a group to four types defined in terms of monotonicity and "tolerance". A general conceptual format is described that allows the analysis of dual operators as instances of the general pattern of "phase quantification". This is a pattern of second-order predication; a phase quantifier predicates about a given first-order predication that there is, or is not, a transition on some scale between the predication being false and being true, i.e. a switch in truth-value. Four possibilities arise out of this setting: (i) there is a transition from false to true; (ii) there is no transition from false to true; (iii) there is a transition from true to false; (iv) there is no transition from true to false. These four possibilities of phase quantification form a duality group of four. It can be argued that all known duality groups are semantically instances of this general scheme.
1.1. First examples

1.1.1. Examples from logic

Probably the best-known cases of duality are the quantifiers in standard predicate logic, ∃ and ∀. A quantifier is attached to a variable and combined with a sentence (formula, proposition) to yield a quantified sentence.
(1) a. ∀x P for every x P
b. ∃x P for at least one x P
Duality of the two quantifiers is stated in the logical equivalences in (2):
(2) a. ∃x P ≡ ¬∀x ¬P
b. ∀x P ≡ ¬∃x ¬P
c. ¬∃x P ≡ ∀x ¬P
d. ¬∀x P ≡ ∃x ¬P
Duality can be paraphrased in terms of EXTERNAL NEGATION and INTERNAL NEGATION (cf. article 63 (Herburger) Negation). External negation is the negation of the whole statement, as in the left-hand formula in (2c,d) and the right-hand formula in (2a,b). Internal negation concerns the part of the formula following the quantifier, i.e. the "scope" of the quantifier (cf. article 62 (Szabolcsi) Scope and binding). For example, according to (2c) the external negation of existential quantification is logically equivalent to the internal negation of universal quantification, and vice versa in (2d). If both sides in (2c) and (2d) are negated and double negation is eliminated, one obtains (2a) and (2b), respectively. In fact, the four equivalences in (2) are mutually equivalent: they all state that universal and existential quantification are duals. Dual operators are not necessarily operators on sentences. It is only required that at least one of their operands can undergo negation ("internal negation"), and that the result of combining the operator with its operand(s) can be negated, too ("external negation"). Another case of duality is constituted by conjunction ∧ and disjunction ∨; duality of the two connectives is expressed by De Morgan's Laws, for example:
(3) ¬(A ∧ B) ≡ (¬A ∨ ¬B)
These dual operators are two-place connectives, operating on two sentences, and internal negation is applied to both operands.
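Because the equivalences in (2) and (3) are finitely checkable once a domain is fixed, they can be verified mechanically. The following sketch (purely illustrative, not part of the article's formal apparatus) tests them over a three-element domain for every possible predicate:

```python
# Mechanical check of the duality equivalences (2a,b) and De Morgan (3)
# over a small finite domain; purely illustrative.
from itertools import product

DOMAIN = [0, 1, 2]
forall = lambda P: all(P(x) for x in DOMAIN)
exists = lambda P: any(P(x) for x in DOMAIN)

for values in product([True, False], repeat=len(DOMAIN)):
    P = dict(zip(DOMAIN, values)).__getitem__
    assert exists(P) == (not forall(lambda x: not P(x)))   # (2a)
    assert forall(P) == (not exists(lambda x: not P(x)))   # (2b)

for A, B in product([True, False], repeat=2):              # (3)
    assert (not (A and B)) == ((not A) or (not B))
print("duality equivalences hold on the finite domain")
```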
1.1.2. First examples from natural language

The duality relationship between ∃ and ∀ is analogously found with their natural language equivalents some and every. Note that sentences with some NPs as subject are properly negated by replacing some with no (cf. Löbner 2000: §1 for the proper negation of English sentences):
(4) a. some tomatoes are green ≡ not every tomato is not green
b. every tomato is green ≡ no tomato is not green
c. no tomato is green ≡ every tomato is not green
d. not every tomato is green ≡ some tomatoes are not green
The operand of the quantificational subject NP is its 'nuclear' scope, the VP. Modal verbs are another field where duality relations are of central importance. Modal verbs combine with infinitives. A duality equivalence for epistemic must and can is stated in (5):
(5) he must have lied ≡ he cannot have told the truth
Aspectual particles such as already and still are among the most thoroughly studied cases of dual operators. Their duality can be demonstrated by pairs of questions and negative answers as in (6). Let us assume that on and off are logically complementary, i.e. equivalent to the negations of each other:
(6) a. Is the light already on? – No, the light is still off.
b. Is the light still on? – No, the light is already off.
1.2. Towards a general notion of duality

The relationship of duality is based on logical equivalence. Duality therefore constitutes a logical relation. In model-theoretic semantics (cf. article 33 (Zimmermann) Model-theoretic semantics), meaning is equated with truth conditions; therefore logical relations are considered sense relations (cf. article 21 (Cann) Sense relations). However, in richer accounts of meaning that assume a conceptual basis for meanings, logical equivalence does not necessarily amount to equal meanings (cf. Löbner 2002, 2003: §§4.6, 10.5). It can therefore not be inferred from equivalences such as those in (4) to (6) that the meanings of the pairs of dual expressions match in a particular way. All one can say is that their meanings are such that they result in these equivalences. In addition, expressions which exhibit duality relations are rather abstract in meaning and, as a rule, can all be used in various different constructions and meanings. In general, the duality relationship only obtains when the two expressions are used in particular constructions and/or in particular meanings. For example, the dual of German schon "already" is noch "still" in cases like the one in (6), but in other uses the dual of schon is erst (temporal "only"); noch, on the other hand, has uses where it does not possess a dual at all (cf. Löbner 1989 for a detailed discussion). The duality relation crucially involves external negation of the whole complex of operator with operands, and internal operand negation. The duality relationship may concern one operand (out of possibly more), as in the case of the quantifiers, modal verbs or aspectual particles, or more than one (witness conjunction and disjunction). In order to permit internal negation, the operands have to be of sentence type or else of some type of predicate expression. For the need of external negation, the result of combining dual operators with their operands must itself be eligible for negation. A first definition of duality, in accordance with semantic tradition, would be:
(7) Let q and q' be two operators that fulfil the following conditions:
a. they can be applied to the same domain of operands
b. the operands can be negated (internal negation)
c. the results of applying the operators to appropriate operands can be negated (external negation)
Then q and q' are DUALS iff external negation of one is equivalent to internal negation of the other.
This definition, however, is in need of modification. First, "negation" must not be taken in a syntactic sense, as it usually is. If it were, English already and still would not be candidates for duality, as they allow neither external nor internal syntactic negation.
This is shown in Löbner (1999: 89f) for internal negation; as to external negation, already and still can only be negated by replacing them with not yet and no more/not anymore, respectively. The term 'negation' in (7) therefore has to be replaced by a proper logical notion.

A second inadequacy is hidden in the apparently harmless condition (a): dual operators need not be defined for the same domain of operands. For example, already and still have different domains: already presupposes that the state expressed did not obtain before, while still presupposes that it may not obtain later. Therefore, (8a) and (8b) are semantically odd if we assume that one cannot be not young before being young, or not old after being old:

(8) a. She's already young.
    b. She's still old.

These inadequacies of the traditional definition will be taken care of below.
1.3. Predicates, equivalence, and negation

1.3.1. Predicates and predicate expressions

For proper semantic considerations, it is very important to distinguish carefully between the levels of expression and of meaning. Unfortunately, there is a terminological tradition that conflates these two levels when talking of "predicates", "arguments", "operators", "operands", "quantifiers" etc.: these terms are very often used both for certain types of expressions and for their meanings. In order to avoid this type of confusion, the following terminological distinctions will be observed in this article.

A "predicate" is a meaning; what a meaning is depends on semantic theory (cf. article 1 (Maienborn, von Heusinger & Portner) Meaning in linguistics). In a model-theoretic approach, a predicate would be a function that assigns truth values to one or more arguments; in a cognitive approach, a predicate can be considered a concept that assigns truth values to arguments. For example, the meaning of has lied would be a predicate (function or concept) which in a given context (or possible world) assigns the truth value true to everyone who has lied and false to those who told the truth. Expressions, lexical or complex, with predicate meanings will be called PREDICATE EXPRESSIONS. ARGUMENTS which a predicate is applied to are neither expressions nor meanings; they are objects in the world (or universe of discourse); such objects may or may not be denoted by linguistic expressions; if they are, let us call these expressions ARGUMENT TERMS. (For a more comprehensive discussion of these distinctions see Löbner 2002, 2003: §6.2.) Sometimes, arguments of predicate expressions are not explicitly specified by means of an argument term. For example, sentences are usually considered as predicating about a time argument, the time of reference (cf. article 57 (Ogihara) Tense), but very often the time of reference is not specified by an explicit expression such as yesterday. The terms OPERATOR, OPERAND and QUANTIFIER will all be used for certain types of expressions.

If the traditional definition of duality given in (7) is inadequate, it is basically because it attempts to define duality at the level of expressions. Rather, it has to be defined at the level of meanings, because it is a logical relation, and logical relations between expressions originate from their meanings.
1.3.2. The operands of dual operators

The first prerequisite for an adequate definition of duality is a precise semantic characterization of the operands of dual operators ("d-operators", for short). Since the operands must be eligible for negation, their meanings have to be predicates, i.e. something that assigns a truth value to arguments. Negation ultimately operates on truth values; its effect on a predicate is the conversion of the truth value assigned. Predicate expressions range from lexical expressions such as verbs, nouns and adjectives to complex expressions like VPs, NPs, APs or whole sentences. In (4), the dual operators are the subject NPs some tomatoes and every tomato; their operands are the VPs is/are green and their respective negations is/are not green; they express predications about the tomatoes referred to. In (5), the operands of the dual modal verbs are the infinitives have lied and have told the truth; they express predications about the referent of the subject NP he. In (6), the operands of already and still are the remainders of the sentence, the light is on/off; in this case, these sentences are taken as predicates about the reference time implicitly referred to.

Predicates are never universally applicable, but only in a specific DOMAIN of cases. For a predicate P, its domain D(P) is the set of those tuples of arguments the predicate assigns a truth value to. The notion of domain carries over to predicate expressions: their domain is the domain of the predicate that constitutes their meaning. In the following, PRESUPPOSITIONS (cf. article 91 (Beaver & Geurts) Presupposition) of a sentence or other predicate expression are understood as conditions that simply restrict the domain of the predicate that is its meaning. For example, the sentence Is the light already on in (6) presupposes (among other conditions) (p1) that there is a uniquely determined referent of the NP the light (cf. article 41 (Heim) Definiteness and indefiniteness) and (p2) that this light was not on before. The predication expressed by the sentence about the time of reference is thus restricted to those times when (p1) and (p2) are fulfilled, i.e. those times where there is a unique light which was not on before. In general, a predicate expression p will yield a truth value for a given tuple of arguments if and only if the presuppositions of p are fulfilled. This classical Fregean view of presuppositions is adequate here, as we are dealing with the logical level exclusively.

It follows from this notion of presupposition that predicate expressions which are defined for the same domain of arguments necessarily carry identical presuppositions. In particular, that is the case if two predicate expressions are logically equivalent:

Definition 1: logical equivalence
Let p and p' be predicate expressions with identical domains. p and p' are LOGICALLY EQUIVALENT (p ≡ p') iff for every argument tuple in their common domain, p and p' yield identical truth values.
1.3.3. The negation relation

The crucial relation of logical contradiction can be defined analogously. It will be called 'neg', the 'neg(ation) relation'. Expressions in this relationship, too, have identical presuppositions.
Definition 2: negation relation
Let p and p' be predicate expressions with identical domains. p and p' are NEG-OPPOSITES, or NEGATIVES, of each other (p neg p') iff for every argument tuple out of their common domain, p and p' yield opposite truth values.
Note that this is a semantic definition of negation, as it is defined in terms of predicates, i.e. at the level of meaning. The tests for duality require the construction of pairs of neg-opposites, or NEG-PAIRS for short. This requires means of negation at the level of expressions, i.e. lexical or grammatical means of converting the meanings of predicate expressions.

For sentences, an obvious way of constructing a negative is the application (or de-application) of grammatical negation ('g-negation', in the following). In English, g-negation takes different forms depending on the structure of the sentence (cf. Löbner 2000: §1 for more detail). The normal form is g-negation by VP negation, with do auxiliarization in the case of non-auxiliary verbs. If the VP is within the scope of a higher-order operator such as a focus particle or a quantifying expression, either the higher-order operator is subject to g-negation or it is substituted by its neg-opposite. Such higher-order operators include many instances of duality. English all, every, always, everywhere, can and others can be directly negated, while some, sometimes, somewhere, must, already, still etc. are replaced for g-negation by no, never, nowhere, need not, not yet and no more, respectively.

For the construction of neg-pairs of predicate expressions other than sentences, g-negation can sometimes be used, e.g. for VPs. In other cases, lexical inversion may be available, i.e. the replacement of a predicate expression by a lexical neg-opposite such as on/off, to leave/to stay, member/non-member. Lexical inversion is not a systematic means of constructing neg-pairs because it is contingent on what the lexicon provides. But it is a valuable instrument for duality tests.
1.4. Second-order predicates and subnegation

1.4.1. D-operators

D-operators must be eligible for negation and are therefore predicate expressions themselves; as we saw, at least one of their arguments, their operand(s), must again be a predicate expression. For example, the auxiliary must in (5) is a predicate expression that takes another predicate expression, have lied, as its operand. In this sense, the d-operators are second-order predicate expressions, i.e. predicate expressions that predicate about predicates. D-operators may have additional predicate or non-predicate arguments. For an operator q and its operand p, let 'q(p)' denote the morpho-syntactic combination of q and p, whatever its form. In terms of the types of Formal Semantics, the simplest types of d-operators would be (t,t) (sentential operators) and ((α,t),t) (quantifiers); a frequent type, represented by focus particles, is (((α,t),α),t) (cf. article 33 (Zimmermann) Model-theoretic semantics for logical types).
1.4.2. Negation and subnegation

When the definition of the neg-relation is applied to d-operators, it captures external negation. The case of internal negation is taken care of by the 'subneg(ation) relation'.
IV. Lexical semantics Two operators are subneg-opposites if, loosely speaking, they yield the same truth values for neg-opposite operands. Definition 3: subnegation opposites Let q and q' be operators with a predicate type argument. Let the predicate domains of q and q' be such that q yields a truth value for a predicate expression p iff q' yields a truth value for the neg-opposites of p. q and q' are SUBNEG(ATION) OPPOSITES, or SUBNEGATIVES, of each other – q SUBNEG q' – iff >
for any predicate expressions p and p' eligible as operands of q and q', respectively: if p neg p' then q(p) ≡ q'(p').
(For operators with more than one predicate argument, such as conjunction and disjunction, the definition would have to be modified in an obvious way.) If two d-operators are subnegatives, their domains need not be identical. The definition only requires that if q is defined for p, any subnegative q' is defined for the negatives of p. If the predicate domain of q contains negatives for every predicate it contains, then q and q' have the same domain. Such are the domains of the logical quantifiers, but that does not hold for pairs of operators with different presuppositions, e.g. already and still (cf. §5.1). An example of subnegatives is the pair always/never:

(9) Max always is late ≡ Max never is on time

To be late and to be on time are neg-opposites, here in the scope of always and never, respectively. The two quantificational adverbials have the same domain, whence the domain condition in Definition 3 is fulfilled.
2. Duality

2.1. General definition

Definition 3 above paves the way for the proper definition of duality:

Definition 4: dual opposites
Let q and q' be operators with a predicate type argument. Let the predicate domains of q and q' be such that q yields a truth value for a predicate expression p iff q' yields a truth value for the neg-opposites of p. q and q' are DUAL (OPPOSITE)S of each other (q dual q') iff for any predicate expressions p and p' eligible as operands of q and q', respectively: if p neg p', then q(p) neg q'(p').
The problem with condition (a) in the traditional definition in (7) is taken care of by the domain condition here; and any mention of grammatical negation is replaced by relating to the logical relation neg. If q and q' and an operand p can all be subjected to g-negation neg, duality of q and q' amounts to the equivalence of neg q(p) and q'(neg p).
2.2. Duality groups

Any case of duality involves, in fact, not only two dual expressions, but also the negatives and subnegatives of the dual operators. In total, these are four cases, not more. First, a dual of a dual is equivalent to the operator itself; the analogue holds for negation and subnegation. Therefore, if q is a d-operator and N, S, D are any morpho-syntactic operations to the effect of creating a negative, subnegative or dual of q, respectively, we observe:

(10) a. NNq ≡ q
     b. SSq ≡ q
     c. DDq ≡ q

Furthermore, the joint application of any two of the operations N, S or D amounts to the third:

(11) a. NSq ≡ SNq ≡ Dq
     b. NDq ≡ DNq ≡ Sq
     c. DSq ≡ SDq ≡ Nq
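The laws in (10) and (11) can be checked by brute force on a finite model. The sketch below is an added illustration, not part of the original article; it implements semantic counterparts of N, S and D as higher-order operations on quantifiers and tests the identities against every possible operand over a small, arbitrarily chosen domain. (That the assertions go through reflects the fact that the four operations id, N, S, D compose like the Klein four-group.)

```python
# Semantic counterparts of N, S, D, tested exhaustively on a finite domain.
from itertools import product

domain = range(3)

def N(q):  # external negation
    return lambda p: not q(p)

def S(q):  # internal negation of the operand
    return lambda p: q(lambda x: not p(x))

def D(q):  # dual: external plus internal negation
    return lambda p: not q(lambda x: not p(x))

def all_preds():
    return [dict(zip(domain, ext)).__getitem__
            for ext in product([True, False], repeat=len(domain))]

def equivalent(q1, q2):
    # logical equivalence: identical truth values for every operand
    return all(q1(p) == q2(p) for p in all_preds())

some = lambda p: any(p(x) for x in domain)

for q in [some, N(some), S(some), D(some)]:
    assert equivalent(N(N(q)), q)      # (10a)
    assert equivalent(S(S(q)), q)      # (10b)
    assert equivalent(D(D(q)), q)      # (10c)
    assert equivalent(N(S(q)), D(q))   # (11a)
    assert equivalent(N(D(q)), S(q))   # (11b)
    assert equivalent(D(S(q)), N(q))   # (11c)
print("the laws in (10) and (11) hold for the duality group of 'some'")
```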
For the logical quantifiers, these laws can be read off the equivalences in (3); as for natural language, consider the following illustration for already and still. Let p be the light is on, and let us accept that the light is off is its negative, Np. Let q be already. q(p) is thus (12a). The subnegation of q(p), Sq(p), is obtained by replacing p with Np, as in (12b). The negation of already is expressed by replacing it with not yet (12c). The dual of already is still (12d).

(12) a. q(p) = the light is already on = already(the light is on)
     b. Sq(p) = q(Np) = the light is already off = already(the light is off)
     c. Nq(p) = the light is not yet on = not yet(the light is on)
     d. Dq(p) = the light is still on = still(the light is on)

The combination of S and N (the order does not matter) yields (13a), the application of Nq to Np; this is obviously equivalent to Dq. NDq(p) would be the negation of the dual of already(p), i.e. the negation of the light is still on; this is accomplished by replacing still with not anymore: the light isn't on anymore, which in turn is equivalent to the light is already off, i.e. Sq (13b). Finally, the combination of dual and subnegation is yielded by replacing already by its dual still and p by its negative. This is equivalent to applying the negative of already, i.e. not yet, to p (13c):

(13) a. NSq(p) = Nq(Np) = the light is not yet off ≡ the light is still on
     b. NDq(p) = N still p = the light isn't on anymore ≡ the light is already off
     c. DSq(p) = still Np = the light is still off ≡ the light is not yet on
Due to the equivalences in (10) and (11), with any d-operator q the operations N, S and D yield a group of exactly four cases: q, Nq, Sq, Dq (provided Nq, Sq and Dq each differ from q; see §2.3 for reduced groups of two members). Each of the four cases may be expressible in different, logically equivalent ways. Thus what is called a "case" here is basically a set of logically equivalent expressions. According to (10) and (11), any further application of the operations N, S and D just yields one of these four cases: the group is closed under these operations. Within such a group, no element is logically privileged: instead of defining the group in terms of q and its correlates Nq, Sq and Dq, we might, for example, just as well start from Sq and take its correlates NSq = Dq, SSq = q, and DSq = Nq.

Definition 5: duality group
A DUALITY GROUP is a group of up to four operators in the mutual relations neg, subneg and dual that contains at least one pair of dual operators.
Duality groups can be graphically represented as a square of the structure depicted in Fig. 22.1.
Fig. 22.1: Duality square
Although the underlying groups of four operators related by neg, subneg and dual are perfectly symmetrical, duality groups in natural, and even in formal, languages are almost always deficient, in that not all operators are lexicalized. This issue will be taken up in §§3, 4 and 5 below.
2.3. Reduced duality groups and self-duality

For the sake of completeness, we will briefly mention cases of reduced (not deficient) duality groups. The duality square may collapse into a constellation of only two cases, or even a single one, if the operations N, D, or S have no effect on the truth conditions, i.e. if Nq, Dq or Sq are equivalent to q itself.

Neutralization of N contradicts the Law of Contradiction: if q ≡ Nq and q were true for any predicate operand p, it would at the same time be false. Therefore, the domain of q must be empty. This might occur if q is an expression with contradictory presuppositions, whence the operator is never put to work. For such an "idle" operator, negatives, subnegatives and duals are necessarily idle, too. Hence the whole duality group collapses into one case.
Neutralization of S results in equivalence of N and D, since in this case Dq ≡ NSq ≡ Nq. One example is the quantifier some, but not all: if a predication p is true in some but not all cases, its negation, too, is true in some, but not all, cases. Quantificational expressions, nominal or adverbial, meaning "exactly half of" represent another case: if a predication is true of exactly half of its domain, its opposite is true of the other half; the quantification itself thus logically amounts to the subnegation of the quantification.

(14) a. q: half of the students failed
     b. Sq: half of the students didn't fail
When S is neutralized, the duality square melts down to q/Sq neg/dual Nq/Dq. More interesting is the case of D neutralization. If Dq ≡ q, q is its own dual, i.e. SELF-DUAL. For self-dual operators, N and S are equivalent. The domain of self-dual operators is generally closed under neg: since Nq ≡ Sq, q is defined for p iff it is defined for Np. The square reduces to q/Dq neg/subneg Nq/Sq. The phenomenon of self-duality encompasses a heterogeneous set of examples.

POLARITY. The simplest example is g-negation neg: N neg(p) ≡ NNp ≡ p and S neg(p) ≡ neg(Np) ≡ NNp ≡ p. Similarly, if there were a means of expressing just positive polarity, say pos, this would be self-dual, too, since pos(Np) ≡ Np ≡ N pos(p).

ARGUMENT INSERTION. Let D be a domain of a first-order predicate p and u an element of D. Let Iu be the operation of supplying p with u as its argument. Then Iu applied to p yields the truth value that p yields for u: Iu(p) = p(u). If we apply Iu to a neg-opposite of p (subnegation), we obtain the opposite truth value, and so we do if we negate the application of Iu to p (negation). Iu is applied, for example, when a definite NP is combined as an argument term with a predicate expression. This point is of importance for the discussion of NP semantics. It can be argued that all definite NPs are essentially individual terms; when combined with a predicate expression, they have the effect of Iu for their referent u. See Löbner (2000: §2) for an extensive discussion along this line. It is for that reason that (15b) and (15c) are logically equivalent:

(15) p = "is on", p' = "is off", u = "the light"
     a. Iu(p) = p(u): the light is on
     b. SIu(p) = Iu(p'): the light is off
     c. NIu(p): the light is not on

"MORE THAN HALF". If the scale of quantification is discrete, and the total number of cases is odd, "more than half" is self-dual, too. Under these circumstances, external negation "not more than half" amounts to "less than half".

(16) a. q: more than half of the seven dwarfs carry a shovel
     b. Nq: not more than half of the seven dwarfs carry a shovel
     c. Sq: more than half of the seven dwarfs don't carry a shovel

NEG-RAISING VERBS. So-called neg-raising verbs (NR verbs), such as "want", "believe", "hope", can be used with N to express S, as in (17). If this is not regarded as a displacement of negation, as the term 'neg-raising' suggests, but in fact as resulting from equivalence of N and S for these verbs, neg-raising is tantamount to self-duality:
(17) I don't want you to leave ≡ I want you not to leave

The question as to which verbs are candidates for the neg-raising phenomenon is, as far as I know, not finally settled (see Horn 1978 for a comprehensive discussion, also Horn 1989: §5.2). The fact that NR verbs are self-dual allows, however, a certain general characterization. The condition of self-duality has the consequence that if v is true for p, it is false for Np, and if v is false for p, it is true for Np. Thus NR verbs express propositional attitudes that in their domain "make up their mind" for any proposition as to whether the attitude holds for the proposition or its negation. For example, NR want applies only to such propositions as the subject has either a positive or a negative preference for. The claim of self-duality is tantamount to the presupposition that this holds for any possible operand. Thereby the domains of NR verbs are restricted to such pairs of p and Np for which the attitude either positively or negatively obtains. Among the modal verbs, those meaning "want to", like German wollen, do not partake in the duality groups listed below. I propose that these modal verbs are NR verbs, whence their duality groups collapse to a group of two [q/Dq, Nq/Sq], e.g. [wollen, wollen neg].

THE GENERIC OPERATOR. In mainstream accounts of characterizing (i-generic) sentences (cf. Krifka et al. 1995, also article 47 (Carlson) Genericity) it is commonly assumed that their meanings involve a covert genericity operator GEN. For example, men are dumb would be analyzed as (18a):

(18) a. GEN[x](x is a man; x is dumb)

According to this analysis, the meaning of the negative sentence men are not dumb would yield the GEN analysis (18b), with the negation within the scope of GEN:

(18) b. GEN[x](x is a man; ¬ x is dumb)

The sentence men are not dumb is the regular grammatical negation of men are dumb, whence it should also be analysed as (18c), i.e. as external negation w.r.t. GEN:

(18) c. ¬GEN[x](x is a man; x is dumb)
It follows immediately that GEN is self-dual. This fact has not been recognized in the literature on generics. In fact, it invalidates all available accounts of the semantics of GEN, which agree in analyzing GEN as some variant of universal quantification. Universal quantification, however, is not self-dual, as internal and external negation clearly yield different results (see Löbner 2000: §4.2 for elaborate discussion).

HOMOGENEOUS QUANTIFICATION. Ordinary restricted quantification may happen to be self-dual if the predicate quantified yields the same truth value for all elements of the domain of quantification, i.e. if it is either true of all cases or false of all cases. (As a special case, this condition obtains if the domain of quantification contains just one element u; both ∃ and ∀ are then equivalent to self-dual Iu.) The "homogeneous quantifier" ∃∀ (cf. Löbner 1989: 179, and Löbner 1990: 27ff for more discussion) is a two-place operator which takes one formula for the restriction and a second formula for the predication quantified. It will be used in §5.2.
Definition 6: homogeneous quantification
For arbitrary predicate logic sentences b and p: ∃∀x(b : p) =df ∃x(b ∧ p) if [∃x(b ∧ p)] = [∀x(b → p)]; otherwise undefined.
The colon in '∃∀x(b : p)' cannot be replaced by any logical connective; [A] represents the truth value of A. According to the definition, ∃∀x(b : p) presupposes that ∃x(b ∧ p) and ∀x(b → p) are either both true or both false. The presupposition makes sure that the truth of p in the domain defined by b is an all-or-nothing matter: if p is true for at least one "b", it is true for all, and if it is false for at least one "b", it is false for all, whence it cannot be true for some. ∃∀x(b : p) can be read essentially as "the b's are p". (See Löbner 2000: §2.6 for the all-or-nothing character of distributive predications with definite plural arguments.) ∃∀x(b : p) is self-dual: the b's are not-p iff the b's are not p. (A simple proof is given in Löbner 1990: 207f.)

(19) ¬∃∀x(b : p) ≡ ∃∀x(b : ¬p)

As a general trait of self-dual operators we may state the following: applying to a domain of predicates that necessarily is closed under negation, they cut the domain into two symmetric halves of mutually negative predicates; for every neg-pair of operands, they "make up their mind", i.e. they are true of one member of the pair and false of the other.
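As a computational illustration of Definition 6 and the self-duality law (19) (again a sketch added to this text, not from the original), the homogeneous quantifier can be modelled as a partial operator over a finite domain, with None encoding presupposition failure:

```python
from itertools import product

def homog(bs, p):
    """Homogeneous quantification over the cases in bs: True if p holds of
    all of them, False if of none, None (undefined) in the mixed case."""
    values = [p(x) for x in bs]
    if all(values):
        return True
    if not any(values):
        return False
    return None  # presupposition failure: p true of some b's, false of others

def neg(v):
    return None if v is None else not v

bs = range(5)
for ext in product([True, False], repeat=len(bs)):
    p = dict(zip(bs, ext)).__getitem__
    def not_p(x):
        return not p(x)
    # (19): external negation equals internal negation, i.e. self-duality
    assert neg(homog(bs, p)) == homog(bs, not_p)
print("self-duality (19) of homogeneous quantification verified")
```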
3. Examples of duality groups

3.1. Duality tests

The first thing to observe when testing for duality relations is the fact that they are highly sensitive to the constructions in which a d-operator can be used. Strictly speaking, duality relations are defined for more or less specific constructions. For example, there are different duality groups for German schon with stative IP focus, with focus on a scalar, time-dependent predicate, with focus on a temporal frame adverbial, and others (Löbner 1989). Similarly, the duality groupings of modal verbs differ for epistemic vs. deontic uses. Many operators belong to duality groups only in certain constructions, but not when used in others. For example, German werden ("become") is the dual of bleiben ("stay") in the copula uses. As was pointed out by Schlücker (2008), there are, however, several uses of werden where it is not the dual match of bleiben, often because bleiben cannot even be used in the construction in question.

For a given construction, the duality test involves the use of subneg-opposites for the operands and of neg-opposites for the whole. Often, even if available, g-negation is a problematic tool due to potential scope ambiguities and ambivalence between neg and subneg readings. For example, VP negation in sentences with universal quantifier subjects has ambiguous scope, unlike, of course, scopeless lexical inversion:

(20) every light wasn't on
     subneg reading: ≡ every light was off (lexical inversion on vs. off)
     neg reading: ≡ not every light was on
Similarly, for modal verbs g-negation sometimes yields a neg reading, sometimes a subneg reading:

(21) a. she may not stay (epistemic use) ≡ she may leave
     b. she may not stay (deontic use) ≡ she must leave (not: she may leave)

Apart from these problems, g-negation may not be available, either for the operands or for the operators. For forming subnegatives it is generally recommended to use lexical inversion. Although it is not generally available, there are usually some cases of lexical neg-opposites in the domain of the operator which can be employed for tests. Since the operators can be assumed to operate in a logically uniform way on their operands, the findings on such cases can be generalized to the whole domain. If g-negation is not available at the d-operator level, pairs of questions and negative answers can be used. The negative answer has to be carefully formed so as to exactly match the question. This is secured if any denial of the question entails that answer, i.e. if that answer is the weakest denial possible.

As mentioned above, already and still are operators that bar both internal and external g-negation. The duality relations can be proved by using lexical inversion for the assessment of subneg (22), and the negative-answer test plus lexical inversion for duality (23).

(22) a. the lights are already off ≡ the lights are not on anymore
     b. the lights are still off ≡ the lights are not yet on

(23) a. Are the lights already on? – No. ≡ the lights are still off
     b. Are the lights still on? – No. ≡ the lights are already off
3.2. Duality groups

3.2.1. Quantifiers

One group of instances of dual operators is constituted by various expressions of quantification. Different sets of quantifiers are used for quantifying over individuals, portions, times, places, and other types of cases. Assessing the duality relationships within the respective groups involves the distinction between count and mass reference, collective and distributive predication and generic or particular quantification (cf. Löbner 2000: §3, §4 for these distinctions). The groups include nominal and adverbial quantifiers. In the following (see Tab. 22.1), duality groups will be represented in the form '[operator, dual, negative, subnegative]' with non-lexical members in parentheses; '–' indicates a case that cannot be expressed. Throughout, the existential quantifier is chosen as the first member of each group. The additions 'pl' and 'sg' indicate the use with plural or singular, respectively.

All the cases in Tab. 22.1 are obvious instances of existential and universal quantification and their negations. In no group is the subnegative lexicalized. The partly groups exhibit a peculiar gap for the negative. The conjunctions and and or can be subsumed under quantification, as they serve to express that all or some of the conjuncts are true. The case of negated conjunction needs careful intonation; its status is certainly marginal; one would prefer to say, e.g., Mary and Paul are not both sick.
Tab. 22.1: Duality Groups of Quantifiers

type of quantification | q | dual | negative | subnegative
particular or generic nominal distributive q. | some pl | every sg | no sg | (neg every)
particular or generic nominal collective q. | some pl | all pl | no pl | (neg all)
particular nominal distributive q. | some pl | each sg | no sg | (??neg each)
particular nominal q. over two cases | either sg | both pl | neither sg | (neg both)
particular or generic nominal mass q. | some sg | all sg | no sg | (neg all)
particular or generic adverbial count q. | partly | all | – | (neg all)
particular or generic adverbial mass q. | partly | all or entirely | – | (neg all) or (neg entirely)
adverbial q. over times, adverbial generic q. over cases | sometimes | always | never | (neg always)
adverbial q. over places | somewhere | everywhere | nowhere | (neg everywhere)
truth conditional connectives | or | and | neither ... nor | (neg and)
3.2.2. Deontic modality

From the point of view of modal logic, modalities such as possibility and necessity, too, are instances of quantification. Necessity corresponds to truth, or givenness, in all cases out of a given set of alternatives, while possibility means truth in some such cases. The expressions of modality include grammatical forms such as causatives, potentials, imperatives etc. as well as modal verbs, adverbs, adjectives, verbs, and nouns. (For a survey of modality and mood, see Palmer 2001; also article 50 (Portner) Verbal mood, article 58 (Hacquard) Modality.)

Modal verbs such as those in English and other Germanic languages express various kinds of modality, among them deontic and epistemic. The composition of duality groups out of the same pool of verbs differs for different modalities. Although duality relations constitute basic semantic data for modal verbs, they are hardly taken into account in the literature (e.g., neither Palmer 2001 nor Huddleston & Pullum 2002 mention duality relations).

In the groups of modal verbs, may can often be replaced by can. The second group of modal verbs differs in that it has shall as the dual of may instead of must. Since the meanings of must and shall are not logically equivalent (they express different variants of deontic modality), may and need in the two duality groups have different meanings, too, since they are interrelated to shall and must by logical equivalence relations within their respective duality groups. Thus, the assessment of duality relations may serve as a means of distinguishing meaning variants of the expressions involved.

The vocabulary for the adjective group is rich, comprising several near-synonyms denoting necessity (obligatory, mandatory, imperative etc.) or possibility (permitted, allowed, admissible and others). Strictly speaking, each adjective potentially spans a duality group of its own. Again, the vocabulary of the subneg type is the most restricted one.
Tab. 22.2: Duality Groups of Deontic Expressions

type of expression | q | dual | negative | subnegative
modal verbs (deontic modality) | may/can | must | (must neg), (may neg) | (need neg)
modal verbs (2) | may | shall | (shall neg) | (need neg)
adjectives | possible | necessary | impossible | unnecessary
German causative deontic verbs | ermöglichen "render possible" | erzwingen "force" | verhindern "prevent" | erübrigen "render unnecessary"
imperative | imperative of permission | imperative of request | (neg imperative) | –
causative | causative of permission | causative of causation | (neg causative) | –
verbs of deontic and causal modality | accept, allow, let/admit | demand, request, let/make/force | refuse, forbid, prevent | (demand neg), (request neg), (force neg)
The imperative form has two uses, the prototypical one of request, or command, and a permissive use, corresponding to the first cell of the respective duality group. The negation of the imperative, however, only has the neg-reading of prohibition. The case of permitting not to do something cannot be expressed by a simple imperative and negation. Similarly, grammatical causative forms such as in Japanese tend to have a weak (permissive) reading and a strong reading (of causation), while their negation inevitably expresses prevention, i.e. causing not to. The same holds for English let and German lassen.
3.2.3. Epistemic modality

In the groups of modal verbs in epistemic use, g-negation of may yields a subnegative, unlike the neg-reading of the same form in the deontic group. Can, however, yields a neg-reading with negation. Thus, the duality groups exhibit remarkable inconsistencies, such as the near-equivalence of may and can along with a clear difference between their respective g-negations, or the equivalence of the g-negations of the non-equivalent may and must.

Tab. 22.3: Duality Groups of Epistemic Expressions

type of expression | q | dual | negative | subnegative
modal verbs (1) | can | must | can neg | need neg
modal verbs (2) | may | must | can neg | may neg
epistemic adjectives | possible | certain | impossible | questionable
adverbs | possibly | certainly | in no case | –
verbs of attitude | hold possible | believe | exclude | doubt
adjectives for logical properties | satisfiable | tautological | contradictory, unsatisfiable | (neg tautological)
verbs for logical relations | be compatible with | entail | exclude | (neg entail)
Logical necessity and possibility can be considered a variant of epistemic modality. Here the correspondence to quantification is obvious, as these properties and relations are defined in terms of existential and universal quantification over models (or worlds, or contexts).
3.2.4. Aspectual operators

Aspectual operators deal with transitions in time between opposite phases or, equivalently, with beginning, ending and continuation. Duality groups are defined by verbs such as begin, become and by focus particles such as already and still. The German particle schon will be analyzed in §5.1 in more detail, as a paradigm case of 'phase quantification'. The particles of the nur noch group have no immediate equivalents in English. See Löbner (1999: §5.3) for semantic explanations.
3.2.5. More focus particles, conjunctions

Further focus particles such as only, even, also are candidates for duality relations. A duality account of German nur ("only", "just") is proposed in Löbner (1990: §9). In some of the uses of only analyzed there, it functions as the dual of auch ("also"). An analysis of auch, however, is not offered there. König (1991a, 1991b) proposed to consider causal because and concessive although duals, due to the intuition that 'although p, not q' means something like 'not (because p, q)'. However, Iten (2005) argues convincingly against that view.
3.2.6. Scalar adjectives

Löbner (1990: §8) offers a detailed account of scalar adjectives which analyzes them as dual operators on an implicit predication of markedness. The analysis will be briefly sketched in §5.1. According to this view, pairs of antonyms, together with their negations, form duality groups such as [long, short, (neg long), (neg short)]. Independently of this analysis, certain functions of scalar adjectives exhibit logical relations that are similar to the duality relations. Consider the logical relationships between the positive with enough and too with the positive, as well as between equative and comparative:

(24) a. x is not too short ≡ x is long enough
        x is too long ≡ x is not short enough
     b. x is not as long as y ≡ x is shorter than y
        x is as long as y ≡ x is not shorter than y
The negation of too with positive is equivalent to the positive of the antonym with enough, and the negation of the equative is equivalent to the comparative of the antonym. Antonymy essentially means reversal of the underlying common scale. Thus the equivalences in (24) represent instances of a “duality” relation that is based on negation and scale reversal, another self-inverse operation, instead of being based on negation and subnegation. In view of such data, the notion of duality might be generalized to analogous logical relations based on two self-inverse operations.
3.3. Asymmetries within duality groups

The duality groups considered here exhibit remarkable asymmetries. Let us refer to the first element of a duality group as type 1, to its dual as type 2, to its negative as type 3, and to its subnegative as type 4. The first thing to observe is the fact that there are always lexical or grammatical means of directly expressing type 1 and type 2, sometimes type 3, and almost never type 4. This tendency has long been observed for the quantifier groups (see Horn 1972 for an early account, Döhmann 1974a,b for cross-linguistic data, and Horn to appear for a comprehensive recent survey). In addition to the lexical gaps, there is a considerable bias of negation towards type 3, even if the negated operand is type 2. Types 3 and 4 are not only less frequently lexicalized; if they are, the respective expressions are often derived from types 1 and 2, at least historically, by negative affixes, cf. n-either, n-ever, n-or, im-possible. The converse never occurs. Thus, type 4 appears to be heavily marked: on a scale of markedness we obtain 1, 2 < 3 < 4.

A closer comparison of type 1 and type 2 shows that type 2 is marked vs. type 1, too. As for nominal existential quantification, languages are somewhat at pains when it comes to an expression of neutral existential quantification. This might at first glance appear to indicate markedness of existential vs. universal quantification. However, nominal existential quantification can be considered practically "built into" mere predication under the most frequent mode of particular (non-generic) predication: particular predication entails reference, which in turn entails existence. Thus, existential quantification is so unmarked that it is even the default case, not in need of overt expression. A second point of preference compared to universal quantification is the degree of elaboration of existential quantification by more specific operators such as numerals and other quantity specifications. For the modal types, this argumentation does not apply. What distinguishes type 1 from type 2 in these cases is the fact that irregular negation, i.e. subneg readings of g-negation, only occurs with type 2 operators; in this respect, epistemic may constitutes an exception.

Tab. 22.4: Duality Groups of Aspectual Expressions
type of expression | q | dual | negative | subnegative
verbs of beginning etc. | begin, become | continue, stay | end, (become neg) | (neg begin), (neg stay)
German aspectual particles with stative operands (1) | schon "already" | noch "still" | (noch neg) "neg yet" | (neg mehr) "neg more"
German aspectual particles with focus on a specification of time | schon "already" | erst "only" | (noch neg) "neg yet" | (neg erst) "neg still"
German aspectual particles with stative operands (2) | endlich "finally" | noch immer "STILL" | noch immer neg "STILL neg" | (endlich neg mehr) "finally neg more"
German aspectual particles with scalar stative operands (3) | nur noch | noch | (neg nur noch) | (neg mehr)
The aspectual groups will be discussed later. This much may, however, be stated: in all languages there are very many verbs that incorporate type 1 'become' or 'begin', as opposed to very few verbs that would incorporate type 2 'continue' or 'stay' or type 3 'stop', and apparently no verbs incorporating 'not begin'.

These observations, which result in a scale of markedness of type 1 < type 2 < type 3 < type 4, are tendencies. In individual groups, type 2 may be unmarked vs. type 1. Conjunction is unmarked vs. disjunction, and so is the command use of the imperative as opposed to the permission use. Of course, the tendencies are contingent on which element is chosen as the first of the group. They emerge only if the members of the duality groups are assigned the types they are. Given the perfect internal symmetry of duality groups, the type assignment might seem arbitrary unless it can be motivated independently. What are needed, therefore, are independent criteria for the assignment of the four types. These will be introduced in §4.2 and §5.3.
4. Semantic aspects of duality groups

4.1. Duality and the Square of Opposition

The quantificational and modal duality groups (Tab. 22.1, 22.2, 22.3) can be arranged in a second type of logical constellation, the ancient Square of Opposition (SqO), established by the logical relations of entailment, contradiction (i.e. neg-opposition), contrariety and subcontrariety. The four relations are essentially entailment relations (cf. Löbner 2002, 2003: §4 for an introduction). They are defined for arbitrary, not necessarily second-order, predicates. The relation of contradictoriness is just neg.

Definition 7: entailment relations for predicate expressions
Let p and p' be arbitrary predicate expressions with the same domain D.
(i) p ENTAILS p' iff for every a in D, if p is true for a, then p' is true for a.
(ii) p and p' are CONTRARIES iff p entails Np'.
(iii) p and p' are SUBCONTRARIES iff Np entails p'.

The traditional SqO has four vertices, A, I, E and O, corresponding to ∀, ∃, ¬∃ and ¬∀, i.e. to types 2, 1, 3 and 4, respectively, although in a different arrangement than in the duality square, cf. Fig. 22.2:

∀     ¬∃
∃     ¬∀

Fig. 22.2: Square of Opposition
The duality square and the SqO depict different, in fact independent, logical relations. Unlike the duality square, the SqO is asymmetric in the vertical dimension: the entailment relations are unilateral, and the relation between A and E is different from the one between I and O. The relations in the SqO are basically second-order relations (between first-order predicates), while the duality relations dual and subneg are third-order relations (between second-order predicates). The entailment relations can be established between arbitrary first-order predicate expressions; the duality relations would then simply be unavailable due to the lack of predicate-type arguments. For example, any pair of contrary first-order predicate expressions such as frog and dog together with their negations span an SqO with, say, A = dog, E = frog, I = N frog, O = N dog.

There are also SqO's of second-order operators which are not duality groups. For example, let A be more than one and E no, with their respective negations O not more than one/at most one and I some/at least one. The SqO relations obtain, but A and I, i.e. more than one and some, are not duals: the dual of more than one is 'not (more than one not)', i.e. 'at most one not'. This is clearly not equivalent to some. On the other hand, there are duality groups which do not constitute SqO's, for example the schon group. An SqO arrangement of this group would have schon and noch nicht, and noch and nicht mehr, as diagonally opposite vertices. However, in this group duals have different presuppositions; in fact, this holds for all horizontal or vertical pairs of vertices. Therefore, since the respective relations of entailment, contrariety and subcontrariety require identical presuppositions, none of these obtains. In Löbner (1990: 210) an artificial example is constructed which shows that the duality relations do not entail the SqO relations even if all four operators share the same domain. There may be universal constraints for natural language which exclude such cases.
4.2. Criteria for the distinction among the four types

4.2.1. Monotonicity: types 1 and 2 vs. types 3 and 4

The most salient difference among the four types is that between types 1 and 2 on the one hand and types 3 and 4 on the other: types 1 and 2 are positive, types 3 and 4 negative. The difference can be captured by the criterion of monotonicity. Barwise & Cooper (1981) first introduced this property for quantifiers; see also article 43 (Keenan) Quantifiers.

Definition 8: monotonicity
a. An operator q is UPWARD MONOTONE (MON↑) iff for any operands p and p': if p entails p', then q(p) entails q(p').
b. An operator q is DOWNWARD MONOTONE (MON↓) iff for any operands p and p': if p entails p', then q(p') entails q(p).

All type 1 and type 2 operators of the groups listed are mon↑, while the type 3 and type 4 operators are mon↓. Negation inverts entailment: if p entails p', then Np' entails Np. Hence, both N and S invert the direction of monotonicity. To see this, consider (25):
(25) have a coke entails have a drink, therefore:
     a. every is mon↑: every student had a coke entails every student had a drink
     b. no is mon↓: no student had a drink entails no student had a coke

Downward monotonicity is a general trait of semantically negative expressions (cf. Löbner 2000: §1.3). It is generally marked vs. upward monotonicity. Mon↓ operators, including most prominently g-negation itself, license negative polarity items (cf. article 64 (Giannakidou) Polarity items) and are thus specially restricted. Negative utterances in general are heavily marked pragmatically, since they require special context conditions (Givón 1975).
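The monotonicity facts in (25) can be checked in the same brute-force fashion as above. This is an added illustration, with an arbitrary four-element domain standing in for the students:

```python
# Checking Definition 8 for 'every' (mon up) and 'no' (mon down).
from itertools import product

domain = range(4)

def all_preds():
    return [dict(zip(domain, ext)).__getitem__
            for ext in product([True, False], repeat=len(domain))]

def entails(p1, p2):
    # p1 entails p2: wherever p1 is true, p2 is true as well
    return all(p2(x) for x in domain if p1(x))

every = lambda p: all(p(x) for x in domain)
no    = lambda p: not any(p(x) for x in domain)

for p1 in all_preds():
    for p2 in all_preds():
        if entails(p1, p2):
            assert not every(p1) or every(p2)  # Def. 8a: every is mon up
            assert not no(p2) or no(p1)        # Def. 8b: no is mon down
print("every is upward monotone, no is downward monotone")
```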
4.2.2. Tolerance: types 1 and 4 vs. types 2 and 3

Intuitively, types 1 and 4 are weak, as opposed to the strong types 2 and 3. The weak types make weaker claims. For example, one positive or negative case is enough to verify existential or negated universal quantification, respectively, whereas for the verification of universal and negated existential quantification the whole domain of quantification has to be checked. This distinction can be captured by the property of (in)consistency, or (in)tolerance:

Definition 9: tolerance and intolerance
a. An operator q is TOLERANT iff for some neg-pair p and Np of operands, q is true for both p and Np.
b. An operator q is INTOLERANT iff it is not tolerant.

Intolerant operators are "strong", tolerant ones "weak".

(26) a. intolerant every: every light is off excludes every light is on
     b. tolerant some: some lights are on is compatible with some lights are off

An operator q is intolerant iff for all operands p, if q(p) is true then q(Np) is false, i.e. Nq(Np) is true. Hence an operator is intolerant iff it entails its dual. Unless q is self-dual, it is different from its dual; hence only one of two different duals can entail its dual. Therefore, of two different dual operators, one is intolerant and the other tolerant, or both are tolerant. In the quantificational and modal groups, type 2 generally entails type 1, whence type 2 is intolerant and type 1 tolerant. Since negation inverts entailment, type 3 entails type 4 if type 2 entails type 1. Thus, in these groups, type 4 is tolerant and type 3 intolerant. The criterion of (in)tolerance cannot be applied to the aspectual groups in Tab. 22.4. This gap will be taken care of in §5.3.

As a result, for those groups that enter the SqO (the quantificational and modal groups), the four types can be distinguished as follows.

Tab. 22.5: Type Distinctions in Duality Squares and the Square of Opposition

             | type 1 / I | type 2 / A | type 3 / E | type 4 / O
monotonicity | mon↑       | mon↑       | mon↓       | mon↓
tolerance    | tolerant   | intolerant | intolerant | tolerant
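The tolerance row of Tab. 22.5 can be computed in the same setting; the following sketch is illustrative only, with the four quantifiers standing in for types 1 to 4 of the quantifier groups:

```python
# Classifying the four quantifier types as tolerant or intolerant (Def. 9).
from itertools import product

domain = range(3)

def all_preds():
    return [dict(zip(domain, ext)).__getitem__
            for ext in product([True, False], repeat=len(domain))]

some    = lambda p: any(p(x) for x in domain)       # type 1 / I
every   = lambda p: all(p(x) for x in domain)       # type 2 / A
no      = lambda p: not any(p(x) for x in domain)   # type 3 / E
not_all = lambda p: not all(p(x) for x in domain)   # type 4 / O

def tolerant(q):
    # Definition 9a: true of both members of some neg-pair of operands
    return any(q(p) and q(lambda x, p=p: not p(x)) for p in all_preds())

for name, q in [("some", some), ("every", every),
                ("no", no), ("not all", not_all)]:
    print(name, "->", "tolerant" if tolerant(q) else "intolerant")
# expected output: some and not all are tolerant, every and no intolerant
```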
Horn (to appear: 14) directly connects (in)tolerance to the asymmetries of g-negation w.r.t. types 3 and 4. He states that 'intolerant q may lexically incorporate S, but tends not to lexicalize N; conversely, tolerant q may lexically incorporate N, but bars lexicalization of S' (the quotations are modified so as to fit the terminology and notation used here). Since intolerant operators are of type 2 and tolerant ones of type 1, both incorporations, of S or of N, lead to type 3.

The logical relations in the SqO can all be derived from the fact that ∀ entails ∃. They are logically equivalent to ∀ being intolerant, which, in turn, is equivalent to each of the (in)tolerance values of the other three quantifiers. The monotonicity properties cannot be derived from the SqO relations. They are inherent to the semantics of universal and existential quantification.
4.3. Explanations of the asymmetries

There is considerable discussion as to the reasons for the gaps observed in the SqO groups. Horn (1972) and Horn (to appear) suggest that type 4 is a conversational implicature of type 1 and hence in no need of extra lexicalization: some implicates not all, since if it were in fact all, one would have chosen to say so. Even so, type 4 is not altogether superfluous; some contexts genuinely require the expression of type 4; for example, only not all, not some, can be used with normal intonation as a refusal of all.

Löbner (1990: §5.7) proposes a speculative explanation in terms of possible differences in cognitive cost. The argument is as follows. Assume that the relative unmarkedness of type 1 indicates that type 1 is cognitively basic and that the other types are cognitively implemented as type 1 plus some cognitive equivalents of N, S, or D. If the duality group is built up as [q, Dq, Nq, DNq], this would explain the scale of markedness, if one assumes that application of D is less expensive than application of N, and simultaneous application of both is naturally even more expensive than either. Since we are accustomed to think of D as composed of S and N, this might appear implausible; however, an analysis is offered (see §5.2) where, indeed, D is simple (essentially presupposition negation) and S is the combined effect of N and D.

Jaspers (2005) discusses the asymmetries within the SqO, in particular the missing lexicalization of type 4 / O, in more breadth and depth than any account before. He, too, takes type 1 as basic, "pivotal" in his terminology, and types 2 and 3 as derived from type 1 by two different elementary relations, one of them N. The fourth type, he argues, does not exist at all (although it can be expressed compositionally). His explanation is therefore basically congruent with the one in Löbner (1990), although the argument is based not on speculations about cognitive effort, but on a reflection on the character of human logic. For details of a comparison between the two approaches see Jaspers (2005: §2.2.5.2 and §4).
5. Phase quantification

In Löbner (1987, 1989, 1990) a theory of "phase quantification" was developed which was first designed to provide uniform analyses of the various schon groups in a way that captures the duality relationships. The theory later turned out to be applicable also to "only" (German nur), to scalar adjectives and, in fact, to universal and existential quantification in a procedural approach. To the extent that the quantificational and modal groups are all derivative of universal and existential quantification, this theory can be considered a candidate for the analysis of all known cases of duality groups, including all cases of quantification. It is for that reason that, somewhat misleadingly, the notion 'phase quantifier' was introduced. The theory will be introduced in a nutshell here. The reader is referred to the publications mentioned for a more elaborate introduction.

Phase quantification is about some first-order predication p; the truth value of p depends on the location of its argument on some scale; for example, p may be true of t only if t is located beyond some critical point on the scale. For a given predicate p and a relevant scale, there are four possible phase quantifications:

(27) (i) p is true of t, but false for some cases lower on the scale
     (ii) p is true of t, as it is for the cases lower on the scale
     (iii) p is false of t, as it is for the cases lower on the scale
     (iv) p is false of t, but true of some cases lower on the scale

Alternatively, the four cases can be put in terms of transitions:

(28) (i) up to t on the scale, there is a transition from false to true w.r.t. p
     (ii) up to t on the scale, there is no transition from true to false w.r.t. p
     (iii) up to t on the scale, there is no transition from false to true w.r.t. p
     (iv) up to t on the scale, there is a transition from true to false w.r.t. p
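The four cases in (27)/(28) can be made concrete with a small sketch, added here for illustration and not part of the original analysis: a scale is modelled as a sequence of truth values for p, and positions are classified according to whether a polarity switch has occurred below them.

```python
def phase_case(scale, t):
    """Classify position t on a scale (a list of truth values for p),
    assuming at most one change of polarity up to t; cf. (27)/(28)."""
    earlier = scale[:t]
    if scale[t]:
        # p true at t: either a false-to-true transition (i) or none (ii)
        return "(i)" if not all(earlier) else "(ii)"
    else:
        # p false at t: either a true-to-false transition (iv) or none (iii)
        return "(iv)" if any(earlier) else "(iii)"

# 'the light is on' along a timeline: off, off, on, on
timeline = [False, False, True, True]
print(phase_case(timeline, 2))  # (i)   -- cf. schon 'already' in section 5.1
print(phase_case(timeline, 1))  # (iii) -- cf. noch nicht 'not yet'
```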
5.1. Instances of phase quantification

Schon. In the uses of the schon group considered here, the particles are associated with the natural focus of a stative sentence p, i.e. one of imperfective, perfect or prospective aspect (for the aspectual distinctions see Comrie 1976). Other uses of schon and noch are discussed in Löbner (1989, 1999). Such sentences predicate over an evaluation time te. Consequently, p can be considered a one-place predicate over times. Due to this function of p, types 1, 2, 3 and 4 are referred to as schon(te, p), noch(te, p), noch nicht(te, p) and nicht mehr(te, p). The operators are about possible transitions in time between p being true and p being false. schon(te, p) and noch nicht(te, p) share the presupposition that before te there was a period of p being false, i.e. Np. schon(te, p) states that this period is over and that at te, p is true; noch nicht(te, p) negates this: the Np-period is not over, p is still false at te. The other pair, noch(te, p) and nicht mehr(te, p), carries the presupposition that there was a period of p before. According to noch(te, p), this period still continues at te; nicht mehr(te, p) states that it is over, whence p is false at te.
Fig. 22.3: Phase diagrams for schon, noch, noch nicht and nicht mehr
Tab. 22.6: Presuppositions and Assertions of the schon Group

operator | relation to schon | presupposition: previous state | assertion: state at te
schon(te, p) | – | not-p | p
noch(te, p) | dual | p | p
noch nicht(te, p) | neg | not-p | not-p
nicht mehr(te, p) | subneg | p | not-p
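The content of Tab. 22.6 can be rendered as a small sketch (once more an added illustration): each operator is a partial predicate over situations characterized by whether p held in the preceding phase and whether it holds at te, with None again encoding presupposition failure.

```python
def schon(p_before, p_now):
    """'already': presupposes a previous not-p phase, asserts p at te."""
    return p_now if not p_before else None

def noch(p_before, p_now):
    """'still': presupposes a previous p phase, asserts p at te."""
    return p_now if p_before else None

def noch_nicht(p_before, p_now):
    """'not yet': neg of schon, same presupposition, opposite assertion."""
    return (not p_now) if not p_before else None

def nicht_mehr(p_before, p_now):
    """'not anymore': subneg of schon, noch's presupposition, opposite assertion."""
    return (not p_now) if p_before else None

# the light was off before and is on now:
print(schon(False, True), noch_nicht(False, True))  # True False
print(noch(False, True), nicht_mehr(False, True))   # None None (presupposition fails)
```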
Types 2, 3, and 4 can be directly analyzed as generated from type 1 by application of N and D, where D is just negation of the presupposition. Mittwoch (1993) and van der Auwera (1993) have questioned the presuppositions of the schon group. Their criticism, however, is essentially due to a confusion of types of uses (each use potentially comes with different presuppositions), or of presuppositions and conversational implicatures. Löbner (1999) offers an elaborate discussion, and refutation, of these arguments.

Scalar adjectives. Scalar adjectives frequently come in pairs of logically contrary antonyms such as long/short, old/young, expensive/cheap. They relate to a scale that is based on some ordering. They encode a dimension for their argument, such as size, length, age, price, and rank its degree on the respective scale. Pairs of antonyms consist of a positive element +A and a negative element –A (see Bierwisch 1989 for an elaborate general discussion). +A states for its argument that it occupies a high degree on the scale, a degree that is marked against lower degrees. –A states that the degree is low, i.e. marked against higher degrees. The respective negations predicate an unmarked degree on the scale as opposed to marked higher degrees (neg +A), or marked lower degrees (neg –A). The criteria for marked degrees are context-dependent and need not be discussed here.

Similar to the meanings of the schon group, +A can be seen as predicating of an argument t that on the given scale it is placed above a critical point where unmarkedness changes into markedness, and analogously for the other three cases. In the diagrams in Fig. 22.4, unmarkedness is denoted by 0 and markedness by 1, as (un)markedness coincides with the truth value that +A and –A assign to their argument.
Fig. 22.4: Phase diagrams for scalar adjectives
Other uses of scalar adjectives such as comparative, equative, positive with degree specification, enough or too can be analyzed analogously (Löbner 1990: §8). Existential and universal quantification. In order to determine the truth value of quantification restricted to some domain b and about a predicate expressed by p, the elements of b have to be checked in some arbitrary order as to whether p is true or
22. Dual oppositions in lexical meaning
503
false for them. Existential quantification can be regarded as involving a checking procedure which starts with the outcome 0 and switches to 1 as soon as a positive case of p is encountered in b. Universal quantification would start from the outcome 1 and switch to 0 if a negative case is encountered. This can be roughly depicted as in Fig. 22.5. In the diagrams, b marks the point where the domain of quantification is completely checked. It may be assumed without loss of generality that b is ordered in such a way that there is at most one change of polarity within the enumeration of the total domain.

Fig. 22.5: Phase diagrams for logical quantifiers (∃, ∀, ¬∃, ¬∀)
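The checking procedure just described can be spelled out directly. The sketch below is illustrative (the function and variable names are mine); it starts from the default outcome and switches at most once while enumerating the domain b:

```python
# Quantification as a phase-style checking procedure: the truth value
# starts at a default and may switch at most once while the (suitably
# ordered) domain b is enumerated.

def exists(b, p):
    value = 0                 # start with outcome 0
    for x in b:
        if p(x):
            value = 1         # switch to 1 at the first positive case
            break
    return bool(value)

def forall(b, p):
    value = 1                 # start with outcome 1
    for x in b:
        if not p(x):
            value = 0         # switch to 0 at the first negative case
            break
    return bool(value)

b = [1, 2, 3, 4]
print(exists(b, lambda x: x > 3))  # True: switches at x = 4
print(forall(b, lambda x: x > 3))  # False: switches at x = 1
```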
5.2. The general format of phase quantification

The examples mentioned can all be considered instances of the general format of phase quantification, which can be defined as follows. We first need the notion of an ‘admissible α-interval’. This is a section of the underlying scale with at most one positive and one negative subsection in terms of p, where α is the truth value of p for the first subsection. An admissible interval may or may not contain a switch of polarity, and this is what phase quantification is all about.

Definition 10: admissible α-intervals […]

[…] the two readings (3a), with wide scope of the universal quantifier (∀ > ∃; ‘>’ indicates scope of its left argument over the right one), and (3b), ‘one man for all women’ (∃ > ∀). Here and in (21) below, unary branching nodes are omitted. I ignore the discussion of whether indefinite quantifiers indeed introduce scope (see Kratzer 1998); my argumentation does not depend on this issue.

(1) Every woman loves a man.
(2) [S [DP every woman] [VP [V loves] [DP a man]]]
The arrangement of the formulae in (3) highlights the fact that they consist of the same three parts (roughly coinciding with the semantic contributions of the verb and its two arguments), and that the relation of loving as introduced by the verb always gets lowest scope. The only difference between the formulae is the ordering of the semantic contributions of the arguments of the verb.

(3) a. ∀x.woman′(x) → ∃y.man′(y) ∧ love′(x,y)
    b. ∃y.man′(y) ∧ ∀x.woman′(x) → love′(x,y)
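That the two orderings in (3) yield genuinely different truth conditions can be checked against a toy model; the following model and its names are mine and purely illustrative:

```python
# The two readings of (1) evaluated over a toy model in which every woman
# loves some man or other, but no single man is loved by all women.
women, men = {'w1', 'w2'}, {'m1', 'm2'}
loves = {('w1', 'm1'), ('w2', 'm2')}

# (3a): for every woman there is a (possibly different) man she loves
reading_a = all(any((w, m) in loves for m in men) for w in women)

# (3b): there is one man whom every woman loves
reading_b = any(all((w, m) in loves for w in women) for m in men)

print(reading_a, reading_b)  # True False: the readings come apart
```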
Since quantifier scope ambiguity is the prototypical domain of application for underspecification, even involved cases of it are handled in advanced underspecification formalisms. Some of these cases have developed into benchmark cases for underspecification formalisms, e.g., (4)–(6):
(4) Every researcher of a company saw most samples.
(5) [Every man]i read a book hei liked.
(6) Every linguist attended a conference, and every computer scientist did, too.

The subject in (4) exhibits nested quantification, where one quantifier-introducing DP comprises another one. (4) is challenging because the number of its readings is less than the number of the possible permutations of its quantifiers (3! = 6). The scope ordering that is ruled out in any case is ∀ > most′ > ∃ (Hobbs & Shieber 1987). (See section 3.1. for further discussion of this example.) In (5), the anaphoric dependency of a book he liked on every man restricts the quantifier scope ambiguity in that the DP with the anaphor must be in the scope of its antecedent (Reyle 1993). In (6), quantifier scope is ambiguous, but must be the same in both sentences (i.e., if every linguist outscopes a conference, every computer scientist does, too), which yields two readings. In a third reading, a conference receives scope over everything else, i.e., both linguists and computer scientists attend the same conference (Hirschbühler 1982; Crouch 1995; Dalrymple, Shieber & Pereira 1991; Shieber, Pereira & Dalrymple 1996; Egg, Koller & Niehren 2001).

Other scope-bearing items can also enter into scope ambiguity, e.g., negation and modal expressions, as in (7) and (8):

(7) Everyone didn't come. (∀ > ¬ or ¬ > ∀)
(8) A unicorn seems to be in the garden. (∃ > seem or seem > ∃)

Such cases can also be captured by underspecification. For instance, Minimal Recursion Semantics (Copestake et al. 2005) describes them by underspecifying the scope of the quantifiers and fixing the other scope-bearing items scopally. But cases of scope ambiguity without quantifiers show that underspecifying quantifier scope only is not general enough. For instance, cases of ‘neg raising’ (Sailer 2006) as in (9) have a reading denying that John believes that Peter will come, and one attributing to John the belief that Peter will not come:

(9) John doesn't think Peter will come.

Sailer analyses these cases as a scope ambiguity between the matrix verb and the negation (whose syntactic position is invariably in the matrix clause). Other such examples involve coordinated DPs, as in (10), (11), or (12) (Hurum 1988; Babko-Malaya 2004; Chaves 2005b):

(10) I want to marry Peggy or Sue.
(11) Every man and every woman solved a puzzle.
(12) Every lawyer and his secretary met.
(10) shows that in coordinated DPs scope ambiguity can arise between the conjunction and other scope-bearing material (here, want), even if there are no scope-bearing DPs. (10) is ambiguous in that the conjunction may have scope over want (I either have the wish to marry Peggy or the wish to marry Sue), or vice versa (my wish is to marry either Peggy or Sue). (11) has two readings: every man and every woman solving their own (possibly different) puzzle, or one puzzle being solved by every man and every woman. (There are no intermediate readings in which the indefinite DP intervenes scopally between the conjoined DPs.) Finally, (12) has a reading in which every lawyer meets his own secretary, and one in which all the lawyers with their secretaries meet together. This example can be analysed in terms of a scope ambiguity between the operator G that forms groups out of individuals (assuming that only such groups can be the argument of a predicate like meet) and the conjoined DPs (Chaves 2005b). If G has narrow scope with respect to the DPs, every lawyer and his secretary form a specific group that meets (13a); if the DPs end up in G's restriction (indicated by brackets in (13)), there is one big meeting group consisting of all lawyers and their secretaries (13b).

(13) a. ∀x.lawyer′(x) → ∃y.secr_of′(y, x) ∧ ∃Z.[x ∈ Z ∧ y ∈ Z] ∧ meet′(Z)
     b. ∃Z.[∀x.lawyer′(x) → ∃y.secr_of′(y, x) ∧ x ∈ Z ∧ y ∈ Z] ∧ meet′(Z)

Another group of scope ambiguities involves scope below the word level:

(14) beautiful dancer
(15) John's former car
(16) John almost died.

In (14), the adjective may pertain to the noun as a whole or to the stem only, which yields two readings that can roughly be glossed as ‘beautiful person characterised by dancing’ and ‘person characterised by beautiful dancing’, respectively (Larson 1998). This can be modelled as a scope ambiguity between the adjective and the nominal affix -er (Egg 2004). (15), as discussed in Larson & Cho (2003), is a case of scope ambiguity between the possessive relation introduced by the Anglo-Saxon genitive ’s and the adjective former, which yields the readings ‘car formerly in the possession of John’ or ‘ex-car in the possession of John’ (Egg 2007). Finally, the readings of (16), viz., ‘John was close to undergoing a change from being alive to being dead’ (i.e., in the end, nothing happened to him) and ‘John underwent a change from being alive to being close to death’ (i.e., something did happen), can be modelled as a scope ambiguity between a change-of-state operator like BECOME and the adverbial (Rapp & von Stechow 1999; Egg 2007).

The cases of semantically and syntactically homogeneous ambiguity discussed so far have readings that not only comprise the same semantic building blocks; each reading has in addition exactly one instance of each of these building blocks. This was highlighted e.g. for (1) in the representation of its readings in (3), where each semantic building block appears on a different line.
However, the definition of semantically and syntactically homogeneous ambiguity also includes cases where the readings consist of the same building blocks, but differ in that some of the readings exhibit more than one instance of specific building blocks. This kind of semantically and syntactically homogeneous ambiguity shows up in the ellipsis in (17). Its two readings ‘John wanted to greet everyone that Bill greeted’ and ‘John wanted to greet everyone that Bill wanted to greet’ differ in that there is one instance of the semantic contribution of the matrix verb want in the first reading, but two instances in the second reading (Sag 1976):

(17) John wanted to greet everyone that Bill did.

This is due to the fact that the pro-form did is interpreted in terms of a suitable preceding VP (its antecedent), and that there are two such suitable VPs in (17), viz., wanted to greet everyone that Bill did and greet everyone that Bill did. ((17) is a case of antecedent-contained deletion; see Shieber, Pereira & Dalrymple 1996 and Egg, Koller & Niehren 2001 for underspecified accounts of this phenomenon; see also article 70 (Reich) Ellipsis.)

Another example of this kind of semantically and syntactically homogeneous ambiguity is the Afrikaans example (18) (Sailer 2004). Both the inflected form of the matrix verb wou ‘wanted’ and the auxiliary het in the subordinate clause introduce a past tense operator, but the example has three readings:

(18) Jan wou gebel het.
     Jan want.PAST called have
     ‘Jan wanted to call/Jan wants to have called/Jan wanted to have called.’

The readings can be analysed (in the order given in (18)) as (19a-c); that is, the readings comprise one or two instances of the past tense operator:

(19) a. PAST(want′(j, ˆcall′(j)))
     b. want′(j, ˆPAST(call′(j)))
     c. PAST(want′(j, ˆPAST(call′(j))))

Finally, the criterion ‘syntactically and semantically homogeneous’ as defined in this subsection will be compared to similar classes of ambiguity from the literature. Syntactic and semantic homogeneity is sometimes referred to as structural ambiguity. But this term is itself ambiguous in that it is sometimes used in the broader sense of ‘semantically homogeneous’ (i.e., syntactically homogeneous or not). But then it would also encompass the group of semantically but not syntactically homogeneous ambiguities discussed in the next subsection.

The group of semantically and syntactically homogeneous ambiguities coincides by and large with Bunt's (2007) ‘structural semantic ambiguity’ class, excepting ambiguous compounds like math problem and the collective/distributive ambiguity of quantifiers, both of which are syntactically but not semantically homogeneous: Different readings of a compound each instantiate an unspecific semantic relation between the components in a unique way. Similarly, distributive and collective readings of a quantifier are distinguished in the semantics by the presence or absence of a distributive or collective operator, e.g., Link's (1983) distributive D-operator (see article 46 (Lasersohn) Mass nouns and plurals).
2.2. Semantically but not syntactically homogeneous ambiguities

The second kind of ambiguity is semantically but not syntactically homogeneous. The ambiguity has a syntactic basis in that the same syntactic material is arranged in different ways. Consequently, the meanings of the resulting syntactic structures all consist of the same semantic material (though differently ordered, depending on the respective syntactic structure), but no common syntactic structure can be postulated for the different interpretations. The notorious modifier attachment ambiguities as in (20) are a prime example of this kind of ambiguity:

(20) Max strangled the man with the tie.

There is no common phrase marker for the two readings of (20). In the reading in which the man is wearing the tie, the constituent the tie is part of a DP (or NP) the man with the tie. In the other reading, in which the tie is the instrument of Max's deed, the tie enters a verbal projection (as the syntactic sister of strangled the man) as a constituent of its own:

(21) a. ‘tie worn by victim’:
        [S [DP Max] [VP [V strangled] [DP [Det the] [NP [N man] [PP with the tie]]]]]
     b. ‘tie as instrument of crime’:
        [S [DP Max] [VP [VP [V strangled] [DP the man]] [PP with the tie]]]
There is an intuitive 1:1 relation between the two phrase markers in (21) and the two readings of (20); neither phrase marker would be suitable as the syntactic analysis for both readings. Semantically but not syntactically homogeneous ambiguity is usually not described in terms of semantic underspecification in the same fashion as semantically and syntactically homogeneous ambiguity; exceptions include Muskens (2001), Pinkal (1996), or Richter & Sailer (1996). In Bunt's classification, the group of semantically but not syntactically homogeneous ambiguities is called ‘syntactic ambiguity’.
2.3. Syntactically but not semantically homogeneous ambiguities

The third kind of ambiguity is instantiated by expressions whose readings share a single syntactic analysis but do not comprise the same semantic material. These expressions can be classified into four subgroups. The first subgroup comprises lexically ambiguous words, whose ambiguity is inherited by the whole expression. For instance, the ambiguity of the noun Schule ‘school’ (with readings like ‘building’, ‘institution’, ‘staff and pupils’, or ‘lessons’) makes expressions like die Schule begutachten ‘evaluate (the) school’ ambiguous, too. Polysemy belongs to this group but homonymy
does not: Different readings of polysemous words belong to the same lexeme and do not differ syntactically. In contrast, homonymous items are syntactically different lexemes.

Underspecified accounts of polysemy model the semantics of a polysemous item in terms of a core meaning common to all readings. This was worked out in Two-level Semantics (Bierwisch 1983; Bierwisch & Lang 1987; Bierwisch 1988), which distinguished a semantic level (where the core meanings reside) and relegated the specification of the individual readings to a conceptual level (see articles 16 (Bierwisch) Semantic features and primes and 31 (Lang & Maienborn) Two-level Semantics). In the case of Schule ‘school’, the ambiguity can be captured in terms of a core meaning S ‘related to processes of teaching and learning’. This meaning is then fully specified on the conceptual level in terms of operators that map S onto an intersection of S with properties like ‘building’, ‘institution’ etc. Underspecification formalisms covering polysemy include the semantic representation language in the PHLIQA question-answering system (Bronnenberg et al. 1979), Poesio's (1996) Lexically Underspecified Language, and Cimiano & Reyle's (2005) version of Muskens' (2001) Logical Description Grammar.

Lexical ambiguities were also spotted in sentences with quantifiers that have collective and distributive readings (Alshawi 1992; Frank & Reyle 1995; Chaves 2005a). For instance, in (22), the lawyers can act together or individually:

(22) The lawyers hired a secretary.

The distributive reading differs from the collective one in that there is a quantification over the set of lawyers whose scope is the property of hiring a secretary (instead of having this property apply to an entity consisting of all lawyers together). The collective reading lacks this quantification, which makes expressions like (22) semantically heterogeneous. The proposed analyses of such expressions locate the ambiguity differently. The Core Language Engine account (Alshawi 1992) and the Underspecified DRT (UDRT) account of Frank & Reyle (1995) suggest an underspecification of the DP semantics (they refer to DPs as NPs) that can be specified to a collective or a distributive interpretation. Chaves (2005a) notes that mixed readings as in (23) are wrongly ruled out if the ambiguity is attributed to the DP semantics:

(23) The hikers met in the train station and then left.

His UDRT analysis places the ambiguity in the verb semantics in the form of an underspecified operator, which can be instantiated as universal quantification in the spirit of Link's (1983) account of distributive readings.

Lexically based ambiguity also includes compounds like math problem. Their semantics comprises an unspecified relation between their components, which can be specified differently (e.g., for math problem, ‘problem from the domain of mathematics’ or ‘problem with understanding mathematics’).

Referential ambiguity constitutes the second subgroup of syntactically but not semantically homogeneous expressions: here different interpretations of a deictic expression are eventually responsible for the ambiguity. For a discussion of referential ambiguity and its underspecified representation, see e.g. Asher & Lascarides (2003) and Poesio et al. (2006).
Some cases of referential ambiguity are due to ellipses whose antecedents comprise anaphors, e.g., the antecedent VP walks his dog in (24):

(24) John walks his dog and Max does, too.

The interpretation of does in terms of walks his dog comprises an anaphor, too. This anaphor can refer to the same antecedent as the one in walks his dog (‘strict’ reading: Max walks John's dog), or to its own subject DP, in analogy to the way in which the anaphor in John walks his dog refers (‘sloppy’ reading: Max walks his own dog). For an extended discussion of this phenomenon, see Gawron & Peters (1990).

The third kind of syntactically but not semantically homogeneous ambiguity for which underspecification has been proposed is missing information (Pinkal 1999). In this case, parts of a message could not be decoded due to problems in production, transmission, or reception. These messages can be interpreted in different ways (depending on how the missing information is filled in), while the syntactic representation remains constant.

The fourth subgroup is reinterpretation (metonymy and aspectual coercion), if it is modelled in terms of underspecified operators that are inserted during semantic construction (Hobbs et al. 1993; Dölling 1995; Pulman 1997; de Swart 1998; Egg 2005). Such operators avoid impending clashes in semantic construction by being inserted between otherwise (mostly) incompatible semantic material during the construction process. (See articles 25 (de Swart) Mismatches and coercion and 26 (Tyler & Takahashi) Metaphors and metonymies.) This strategy can introduce ambiguity, e.g., in (25). Here a coercion operator is inserted between play the Moonlight Sonata and its modifier for some time, which cannot be combined directly; this operator can be specified to a progressive or an iterative operator (i.e., she played part of the sonata, or she played the sonata repeatedly):

(25) Amélie played the Moonlight Sonata for some time.

The readings of such expressions have a common syntactic analysis, but, due to the different specification of the underspecified reinterpretation operator, they no longer comprise the same semantic material.

Syntactically but not semantically homogeneous ambiguities (together with vagueness) encompass Bunt's (2007) classes ‘lexical ambiguity’ (except homonymy), ‘semantic imprecision’, and ‘missing information’ with the exception of ellipsis: In ellipsis (as opposed to incomplete utterances), the missing parts of the target sentences are recoverable from the preceding discourse (possibly in more than one way), while no such possibility is available for incomplete utterances (e.g., for the utterance Bill? in the sense of Where are you, Bill?).
2.4. Neither syntactically nor semantically homogeneous ambiguities

The group of ambiguous expressions that are neither syntactically nor semantically homogeneous consists of homonyms. Homonymy has not been a prime target of underspecification, because there is not enough common ground between the readings to support a sufficiently distinctive underspecified representation (one that would differ from the representations of other lexical items). Consider e.g. jumper in its textile and its electrical engineering sense: ‘concrete object’ as common denominator of the
readings would fail to distinguish jumper from a similarly underspecified representation of the homonym pen (‘writing instrument’ or ‘enclosure for animals’). This group does not show up in Bunt's (2007) taxonomy.
2.5. The focus of underspecified approaches to ambiguity

While underspecification can in principle be applied to all four groups of ambiguity, most of the work on underspecification focusses on semantically and syntactically homogeneous ambiguity. I see two reasons for this: First, it is more attractive to apply underspecification to semantically homogeneous (than to semantically heterogeneous) ambiguity: Suitable underspecified representations of a semantically homogeneous ambiguous expression can delimit the range of readings of the expression and specify them fully without disjunctively enumerating them (for a worked-out example, see (40) below). No such delimitation and specification are possible in the case of semantically heterogeneous ambiguity: Here semantic representations must restrict themselves to specifying the parts of the readings that are common to all of them and leave open those parts that distinguish the specific readings. Further knowledge sources are then needed to define the possible instantiations of these parts (which eventually delimits the set of readings and fully specifies them).

Second, syntactically heterogeneous ambiguity seems to be considered less of an issue for the syntax-semantics interface, because each reading is motivated by a syntactic structure of its own. Underspecified representations of these readings would cancel out the differences between the readings in spite of their independent syntactic motivation. Syntactically homogeneous ambiguity has no such syntactic motivation, which makes it a much greater challenge for the syntax-semantics interface (see section 4.1). I will follow this trend in underspecification research and focus on syntactically and semantically homogeneous ambiguities in the remainder of this article.
3. Approaches to semantic underspecification

This section offers a general description of underspecification formalisms. It outlines general properties that characterise these formalisms and distinguish subgroups of them (see also article 108 (Pinkal & Koller) Semantics in computational linguistics). I will first show that underspecification formalisms handle ambiguity either by describing it or by providing an algorithm for the derivation of the different readings of an ambiguous expression. Then I will point out that these formalisms may but need not distinguish different levels of representation, and that they implement compositionality in different ways. Finally, underspecification formalisms also differ with respect to their compactness (how efficiently they can delimit and specify the set of readings of an ambiguous expression) and their expressivity (whether they can also do this for arbitrary subsets of this set of readings).
3.1. Describing ambiguity

Underspecification is implemented in semantics in two different ways: The readings of an ambiguous expression can either be described or derived. This distinction also shows up in Robaldo (2007), who uses the terms ‘constraint-based’ and ‘enumerative’. An obsolete
version of Glue Language Semantics (Shieber, Pereira & Dalrymple 1996) mixed both approaches to handle ellipses like (17).

The first way of implementing semantic underspecification is to describe the meaning of an ambiguous expression directly: The set of semantic representations for its readings is characterised in terms of partial information. This characterisation by itself delimits the range of readings of the ambiguous expression and specifies them. That is, the way in which fully specified representations for the readings are derived from the underspecified representation does not contribute to the delimitation. This strategy is based on the fact that there are two ways of describing a set: a list of its elements, or a property that characterises all and only the elements of the set. In the second way, a set of semantic representations is delimited by describing only the common ground between the representations. Since the description deliberately leaves out everything that distinguishes the elements of the set, it can only be partial.

Most underspecification formalisms that follow this strategy distinguish an object level (semantic representations) and a meta-level (descriptions of these representations). The formalisms define the expressions of the meta-level and their relation to the described object-level representations.
3.1.1. A simple example

As an illustration, reconsider (26) [= (1)] and its readings (27a-b) [= (3a-b)]:

(26) Every woman loves a man.

(27) a. ∀x.woman′(x) → ∃y.man′(y) ∧ love′(x,y)
     b. ∃y.man′(y) ∧ ∀x.woman′(x) → love′(x,y)

A description of the common ground in (27) can look like this:

(28) [constraint graph: a top fragment consisting of a hole □ only; the fragments ∀x.woman′(x) → □ and ∃y.man′(y) ∧ □, both related to the top hole by dotted lines; and the fragment love′(x,y), related to the holes of both quantifier fragments]

In (28), we distinguish four fragments of semantic representations (here, λ-terms) which may comprise holes (parts of fragments that are not yet determined, indicated by ‘□’). Then there is a relation R between holes and fragments (depicted as dotted lines): if R holds for a hole h and a fragment F, F must be part of the material that determines h. R determines a partial scope ordering between fragments: A fragment F1 has scope over another fragment F2 iff F1 comprises a hole h such that R(h, F2), or R(h, F3), where F3 is a third fragment that has scope over F2 (cf. e.g. the definition of ‘qeq relations’ in Copestake et al. 2005). Furthermore, we assume that variable-binding operators in a fragment F bind occurrences of the respective variables in all fragments outscoped by F (ignoring the so-called variable capturing problem, see Egg, Koller & Niehren 2001), and that the description explicates all the fragments of the described object-level representations.
The description (28) can then be read as follows: The fragment at the top consists of a hole only, i.e., we do not yet know what the described representations look like. However, since the relation R relates this hole to both the right and the left fragment, they are both part of these representations – only their order is open. Finally, the holes in both the right and the left fragment are related to the bottom fragment in terms of R, i.e., the bottom fragment is in the scope of either quantifier. The only semantic representations compatible with this description are (27a-b).

To derive the described readings from such a constraint (its solutions), the relation R between holes and fragments is monotonically strengthened until all the holes are related to a fragment, and all the fragments except the one at the top are identified with a hole (this is called ‘plugging’ in Bos 2004). In (28), one can strengthen R by adding the pair consisting of the hole in the left fragment and the right fragment, as shown in (29) (where the relation between the hole in the universal fragment and the bottom fragment of (28) is omitted):
(29) [as (28), but with the hole of the fragment ∀x.woman′(x) → □ now also related to the fragment ∃y.man′(y) ∧ □, which in turn dominates love′(x,y)]
Identifying the hole-fragment pairs in R in (29) then yields (27a), one of the solutions of (28). The other solution, (27b), can be derived by first adding to R the pair consisting of the hole in the right fragment and the left fragment.

Underspecification formalisms that implement scope in this way comprise Underspecified Discourse Representation Theory (UDRT; Reyle 1993; Reyle 1996; Frank & Reyle 1995), Minimal Recursion Semantics (MRS; Copestake et al. 2005), the Constraint Language for Lambda Structures (CLLS; Egg, Koller & Niehren 2001), the language of Dominance Constraints (DC; subsumed by CLLS; Althaus et al. 2001), Hole Semantics (HS; Bos 1996; Bos 2004; Kallmeyer & Romero 2008), and Logical Description Grammar (Muskens 2001). Koller, Niehren & Thater (2003) show that expressions of HS can be translated into expressions of DC and vice versa; Fuchss et al. (2004) describe how to translate MRS expressions into DC expressions. Player (2004) claims that this is due to the fact that UDRT, MRS, CLLS, and HS are the same ‘modulo cosmetic differences’; however, his comparison does not pertain to CLLS but to the language of dominance constraints.

Scope relations like the one between a quantifying DP and the verb it is an argument of can also be expressed in terms of suitable variables. This is implemented e.g. in the Underspecified Semantic Description Language (USDL; Pinkal 1996; Niehren, Pinkal & Ruhrberg 1997; Egg & Kohlhase 1997 present a dynamic version of this language). In USDL, the constraints for (26) are expressed by the equations in (30):

(30) a. X0 = C1(every_woman@Lx1(C2(love@x2@x1)))
     b. X0 = C3(a_man@Lx2(C4(love@x2@x1)))
Here ‘every_woman’ and ‘a_man’ stand for the two quantifiers in the semantics of (26), ‘@’ denotes explicit functional application in the metalanguage, and ‘Lx’, λ-abstraction over x. These equations can now be solved by an algorithm like the one in Huet (1975). For instance, for the ∀∃-reading of (26), the variables would be resolved as in (31a-c). This yields (31d), whose right-hand side corresponds to (27a):

(31) a. C1 = C4 = λP.P
     b. C2 = λP.a_man@Lx2(P)
     c. C3 = λP.every_woman@Lx1(P)
     d. X0 = every_woman@Lx1(a_man@Lx2(love@x2@x1))
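Before turning to further formalisms, the plugging idea behind (28)–(29) can be made concrete for the simple two-quantifier case. The sketch below is mine and deliberately crude (fragments are format strings with a single hole); it is not part of any of the formalisms just cited:

```python
from itertools import permutations

# 'Plugging' for constraint (28): the two quantifier fragments each have
# one hole '{}', the love-fragment has none, and the top hole dominates
# everything. A solution stacks the quantifier fragments in some order
# and plugs the love-fragment into the lowest hole.
fragments = {
    'every': "∀x.woman'(x) → {}",
    'a':     "∃y.man'(y) ∧ {}",
}
bottom = "love'(x,y)"

def solutions():
    for order in permutations(fragments):   # possible scope orders
        term = bottom
        for name in reversed(order):        # plug from the inside out
            term = fragments[name].format(term)
        yield term

for s in solutions():
    print(s)
# ∀x.woman'(x) → ∃y.man'(y) ∧ love'(x,y)   = (27a)
# ∃y.man'(y) ∧ ∀x.woman'(x) → love'(x,y)   = (27b)
```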
Another way to express such scope relations is used in the version of the Quasi-Logical Form (QLF) in Alshawi & Crouch (1992), viz., list-valued metavariables in semantic representations whose specification indicates quantifier scope. Consider e.g. the (simplified) representation for (26) in (32a), which comprises an underspecified scoping list (the variable _s before the colon). Here the meanings of every woman and a man are represented as complex terms; such terms comprise term indices (+m and +w) and the restrictions of the quantifiers (man and woman, respectively). Specifying this underspecified reading to the reading with wide scope for the universal quantifier then consists in instantiating the variable _s to the list [+w,+m] in (32b), which corresponds to (27a):

(32) a. _s: love(term(+w,...,woman,...), term(+m,...,man,...))
     b. [+w,+m]: love(term(+w,...,woman,...), term(+m,...,man,...))

Even though QLF representations seem to differ radically from the ones that use dominance constraints, Lev (2005) shows how to translate them into expressions of Hole Semantics, which is based on dominance relations.

Finally, I will show how Glue Language Semantics (GLS; Dalrymple et al. 1997; Crouch & van Genabith 1999; Dalrymple 2001) handles scope ambiguity. Each lexical item introduces so-called meaning constructors that relate syntactic constituents (I abstract away from details of the interface here) and semantic representations. For instance, for the proper name John, the constructor is ‘DP ↝ john′’, which states that the DP John has the meaning john′ (‘↝’ relates syntactic constituents and their meanings). Such statements can be arguments of linear logic connectives like the conjunction ⊗ and the implication ⊸, as in the meaning constructor for love:

(33) ∀X,Y. DPsubj ↝ X ⊗ DPobj ↝ Y ⊸ S ↝ love′(X,Y)

In prose: Whenever the subject interpretation in a sentence S is X and the object interpretation is Y, then the S meaning is love′(X,Y). That is, these constructors specify how the meanings of smaller constituents determine the meaning of a larger constituent. The implication ⊸ is resource-sensitive: ‘A ⊸ B’ can be paraphrased as ‘use a resource A to derive (or produce) B’. The resource is ‘consumed’ in this process, i.e., no longer available for further derivations. Thus, from A and A ⊸ B one can deduce B, but no longer
A. For (33), this means that after deriving the S meaning, the two DP interpretations are no longer available for further processes of semantic construction (they are consumed).

The syntax-semantics interface collects these meaning constructors during the construction of the syntactic structure of an expression. For ambiguous expressions such as (26), the resulting collection is an underspecified representation of its different readings. Representations for the readings of the expression can then be derived from this collection by linear-logic deduction. In the following, the presentation is simplified in that DP-internal semantic construction is omitted and only the DP constructors are given:

(34) a. ∀H,P. (∀x. DP ↝ x ⊸ H ↝t P(x)) ⊸ H ↝ every′(woman′, P)
     b. ∀G,R. (∀y. DP ↝ y ⊸ G ↝t R(y)) ⊸ G ↝ a′(man′, R)

The semantics of every woman in (34a) can be paraphrased as follows: Look for a resource of the kind ‘use the resource that a DP's semantics is x to build the truth-valued (subscript t of ↝) meaning P(x) of another constituent H’. Then consume this resource and assume that the semantics of H is every′(woman′, P); here every′ abbreviates the usual interpretation of every. The representation for a man works analogously.

With these constructors for the verb and its arguments, the semantic representation of (26) in GLS is (35d), the conjunction of the constructors of the verb and its arguments. Note that semantic construction has identified the DPs that are mentioned in the three constructors:

(35) a. ∀H,P. (∀x. DPsubj ↝ x ⊸ H ↝t P(x)) ⊸ H ↝ every′(woman′, P)
     b. ∀G,R. (∀y. DPobj ↝ y ⊸ G ↝t R(y)) ⊸ G ↝ a′(man′, R)
     c. ∀X,Y. DPsubj ↝ X ⊗ DPobj ↝ Y ⊸ S ↝ love′(X,Y)
     d. (35a) ⊗ (35b) ⊗ (35c)

From such conjunctions of constructors, fully specified readings can be derived. For (26), the scope ambiguity is modelled in GLS in that two different semantic representations for the sentence can be derived from (35d). Either derivation starts with choosing one of the two possible specifications of the verb meaning in (35c), which determine the order in which the argument interpretations are consumed:

(36) a. ∀X. DPsubj ↝ X ⊸ (∀Y. DPobj ↝ Y ⊸ S ↝ love′(X,Y))
     b. ∀Y. DPobj ↝ Y ⊸ (∀X. DPsubj ↝ X ⊸ S ↝ love′(X,Y))

I will now illustrate the derivation of the ∀∃-reading of (26). The next step uses the general derivation rule (37) and the instantiations in (38):

(37) from A ⊸ B and B ⊸ C one can deduce A ⊸ C

(38) G ↦ S, Y ↦ y, and R ↦ λy.love′(X, y)

From specification (36a) and the object semantics (35b) we then obtain (39a); this then goes together with the subject semantics (35a) to yield (39b), a notational variant of (27a):
(39) a. ∀X. DPsubj ↝ X ⊸ S ↝ a′(man′, λy.love′(X, y))
     b. every′(woman′, λx.a′(man′, λy.love′(x, y)))

The derivation for the other reading of (26) chooses the other specification (36b) of the verb meaning and works analogously.
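The resource-sensitivity of meaning constructors can be illustrated with a toy bookkeeping sketch. The encoding below is mine (a multiset of atomic resources and implications, with an invented intermediate label 'S0') and abstracts away entirely from real linear-logic proof search:

```python
from collections import Counter

def apply_implication(resources, antecedent, consequent):
    """From A and A -o B, derive B; both A and A -o B are used up."""
    rule = (antecedent, consequent)
    if resources[antecedent] == 0 or resources[rule] == 0:
        raise ValueError("missing resource")
    resources = resources.copy()
    resources[antecedent] -= 1   # A is consumed ...
    resources[rule] -= 1         # ... and so is the implication itself
    resources[consequent] += 1
    return resources

r = Counter({'DPsubj': 1, 'DPobj': 1,
             ('DPsubj', 'S0'): 1, ('S0', 'S'): 1})
r = apply_implication(r, 'DPsubj', 'S0')
r = apply_implication(r, 'S0', 'S')
print(+r)   # Counter({'DPobj': 1, 'S': 1})
# A second use of the already consumed DPsubj resource would now raise
# an error, mirroring the consumption of DP interpretations in (35)-(39).
```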
3.1.2. A more involved example

After this expository account of the way that the simple ambiguity of (26) is captured in various underspecification formalisms, reconsider the more involved nested quantification in (40) [= (4)], whose constraint is given in (41):

(40) Every researcher of a company saw most samples.
(41) [constraint graph: a top hole □ dominating the fragments ∃y.company′(y) ∧ □, ∀x.(researcher′(x) ∧ □) → □, and most′(sample′, λz.□); the fragment of′(x,y) lies below the scope hole of ∃ and the restriction hole of ∀; the fragment see′(x,z) lies below the scope holes of ∀ and most′]
As expounded in section 2.1, not all scope relations between the quantifiers are possible in (40). I assume that (40) has exactly five readings; there is no reading with the scope ordering ∀ > most′ > ∃. (41) is a suitable underspecified representation of (40) in that it has exactly these five readings as solutions.

As a first step of disambiguation, we can order the existential and the universal fragment. Giving the former narrow scope yields (42):
(42) [constraint graph: a top hole □ dominating the fragments ∀x.(researcher′(x) ∧ □) → □ and most′(sample′, λz.□); the fragment ∃y.company′(y) ∧ □ now sits below the restriction hole of ∀, with of′(x,y) below its scope hole; see′(x,z) lies below the scope holes of ∀ and most′]
But once the existential fragment is outscoped by the universal fragment, it can no longer interact scopally with the most- and the see-fragment, because it is part of the restriction of the universal quantifier. That is, there are two readings to be derived from (42), with the most-fragment scoping below or above the universal fragment. This rules out a reading in which most scopes below the universal, but above the existential quantifier:

(43) a. ∀x.(researcher′(x) ∧ ∃y.company′(y) ∧ of′(x,y)) → most′(sample′, λz.see′(x,z))
     b. most′(sample′, λz.∀x.(researcher′(x) ∧ ∃y.company′(y) ∧ of′(x,y)) → see′(x,z))

The second way of fixing the scope of the existential w.r.t. the universal quantifier in (41) gives us (44):
(44) [constraint graph: a top hole □ dominating the fragments ∃y.company′(y) ∧ □ and most′(sample′, λz.□); the fragment ∀x.(researcher′(x) ∧ □) → □ now sits below the scope hole of ∃, with of′(x,y) below its restriction hole; see′(x,z) lies below the scope holes of ∀ and most′]
This constraint describes the three readings in (45), which differ in whether the most-fragment takes scope over, between, or below the other two quantifiers. In sum, constraint (41) encompasses the five desired interpretations.

(45) a. most′(sample′, λz.∃y.company′(y) ∧ ∀x.(researcher′(x) ∧ of′(x,y)) → see′(x,z))
     b. ∃y.company′(y) ∧ most′(sample′, λz.∀x.(researcher′(x) ∧ of′(x,y)) → see′(x,z))
     c. ∃y.company′(y) ∧ ∀x.(researcher′(x) ∧ of′(x,y)) → most′(sample′, λz.see′(x,z))

While most approaches follow Hobbs & Shieber in assuming five readings for examples like (40), Park (1995) and Kallmeyer & Romero (2008) claim that in cases of nested quantification no quantifier may intervene between those introduced by the embedding and the embedded DP, regardless of their ordering. For (40), this would block the reading (45b).

But even (40) is a comparatively simple case of nested quantification. Appropriate underspecification formalisms must be able to handle nested quantification in general and to cope with the fact that there are always fewer readings than the factorial of the number of the involved DPs, since some scoping options are ruled out. For instance, simple sentences consisting of a transitive verb with two arguments that together comprise n quantifying DPs have C(n) readings, where C(n) is the Catalan number of n: C(n) = (2n)!/((n+1)!·n!). For instance, example (46) has 5 nested quantifiers and thus C(5) = 42 readings (Hobbs & Shieber 1987):

(46) Some representative of every department in most companies saw a few samples of each product.

Nested quantification highlights the two main characteristics of this approach to semantic underspecification: Underspecified expressions (typically, of a meta-level formalism) describe a set of semantic representations so as to delimit the range of this set and to fully specify its elements. The derivation of solutions from such expressions thus does not add information, in that it does not restrict the number of solutions in any way.
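The Catalan-number claim is easy to verify; the following small computation (mine) reproduces the counts for (40) and (46):

```python
from math import factorial

# C(n) = (2n)! / ((n+1)! n!): the number of readings of a simple
# transitive sentence whose arguments contain n nested quantifying DPs.
def catalan(n):
    return factorial(2 * n) // (factorial(n + 1) * factorial(n))

print([catalan(n) for n in range(1, 6)])
# [1, 2, 5, 14, 42]: C(3) = 5 readings for (40), C(5) = 42 for (46)
```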
3.2. Deriving ambiguity

The second approach to semantic underspecification derives rather than describes object-level semantic representations on the basis of an initial representation. Consider e.g. the initial semantic representation of (26) in the formalism of Schubert & Pelletier (1982), which closely resembles its syntactic structure:
(47) love′(〈forall x woman′(x)〉, 〈exists y man′(y)〉)

(47) renders the semantics of DPs as terms, i.e., scope-bearing expressions whose scope has not been determined yet. Terms are triples of a quantifier, a bound variable, and a restriction. The set of fully specified representations encompassed by such a representation is then determined by a resolution algorithm. The algorithm ‘discharges’ terms, i.e., integrates them into the rest of the representation, which determines their scope. (Formally, a term is replaced by its variable; then quantifier, variable, and restriction are prefixed to the resulting expression.) For instance, to obtain the representation (27a) for the ‘∀∃’-reading of (26), the existential term is discharged first, which yields (48):

(48) ∃y.man′(y) ∧ love′(〈forall x woman′(x)〉, y)

Discharging the remaining term then yields (27a); to derive (27b) from (47), one would discharge the universal term first. Such an approach is adopted e.g. in the Core Language Engine version of Moran (1988) and Alshawi (1992).

Hobbs & Shieber (1987) show that a rather involved resolution algorithm is needed to prevent overgeneration in more complicated cases of scope ambiguity, in particular, for nested quantification. Initial representations for nested quantification comprise nested terms, e.g., the representation (49) for (40):

(49) see′(〈forall x researcher′(x) ∧ of′(x, 〈exists y company′(y)〉)〉, 〈most z sample′(z)〉)

Here the restriction on the resolution is that the inner term may never be discharged before the outer one, which in the case of (40) rules out the unwanted sixth possible permutation of the quantifiers. Otherwise, this permutation could be generated by discharging the terms in the order ‘∃, most′, ∀’. Such resolution algorithms lend themselves to a straightforward integration of preference rules such as ‘each outscopes other determiners’; see section 6.4.

Other ways of handling nested quantification by restricting resolution algorithms for underspecified representations have been discussed in the literature. First, one could block vacuous binding (even though vacuous binding does not make formulae ill-formed), i.e., request an appropriate bound variable in the scope of every quantifier. In Hobbs & Shieber's (1987) terms, the step from (51), the initial representation for (50), to (52) is blocked, because the discharged quantifier fails to bind an occurrence of a variable y in its scope (the only occurrence of y in its scope is inside a term, hence not accessible for binding). Thus, the unwanted solution (53) cannot be generated:

(50) Every researcher of a company came.

(51) come′(〈forall x researcher′(x) ∧ of′(x, 〈exists y company′(y)〉)〉)

(52) ∃y.company′(y) ∧ come′(〈forall x researcher′(x) ∧ of′(x,y)〉)

(53) ∀x.(researcher′(x) ∧ of′(x,y)) → ∃y.company′(y) ∧ come′(x)
But Keller (1988) shows that this strategy is not general enough: If there is a second instance of the variable that is not inside a term, as in the representation (55) for (54), the analogous step from (55) to (56) cannot be blocked, even though it would eventually lead to the structure (57), in which the variable y within the restriction of the universal quantifier is not bound:

(54) Every sister of [a boy]i hates himi.

(55) hate′(〈forall x sister-of′(x, 〈exists y boy′(y)〉)〉, y)

(56) ∃y.boy′(y) ∧ hate′(〈forall x sister-of′(x, y)〉, y)

(57) ∀x.sister-of′(x, y) → ∃y.boy′(y) ∧ hate′(x,y)

A second way of handling nested quantification (Nerbonne 1993) is to restrict the solutions of underspecified representations to closed formulae (without free variables), although free variables do not make formulae ill-formed. This approach can handle problems with sentences like (54), but is inefficient in that resolution steps must be performed before the result can be checked against the closedness requirement. It also calls for an (otherwise redundant) bookkeeping of free variables and bars the possibility of modelling the semantic contribution of non-anaphoric pronouns in terms of free variables.

Another way of deriving scope ambiguities is instantiated by Hendriks' (1993) Flexible Montague Grammar and Sailer's (2000) Lexicalized Flexible Ty2. Here scope ambiguity is put down to the polysemous nature of specific constituents (in particular, verbs and their arguments), which have an (in principle unlimited yet systematically related) set of interpretations. This ambiguity is then inherited by expressions that these constituents are part of, but this does not influence the constituent structure of the expression, because all readings of these constituents have the same syntactic category. Every lexical entry gets a maximally simple interpretation, which can be changed by general rules such as Argument Raising (AR). For instance, love would be introduced as a relation between two individuals, and twofold application of AR derives the λ-terms in (58), relations between properties of properties, whose difference is due to the different order of applying AR to the arguments:
(58) a. λYλX.X(λx.Y(λy.love′(x,y)))
     b. λYλX.Y(λy.X(λx.love′(x,y)))
Applying these λ-terms to the semantic representations of a man and every woman (in this order, which follows the syntactic structure in (2)) then derives the two semantic representations in (27).

Another formalism of this group is Ambiguous Predicate Logic (APL; Jaspars & van Eijck 1996). It describes scope underspecification in terms of so-called formulae, in which contexts (structured lists of scope-bearing operators) can be prefixed to expressions of predicate logic (or other formulae). For instance, (59a) indicates that the existential quantifier has wide scope over the universal one, since they form one list element together, whereas negation, as another
element of the same list, can take any scope w.r.t. the two quantifiers (wide, intermediate, or narrow). In contrast, (59b) leaves the scope of the existential quantifier and negation open, and states that the universal quantifier can have scope over or below (but not between) the other operators.

(59) a. (∃x □ ∀y □, ¬ □)Rxy
     b. ((∃x □, ¬ □)□, ∀y □)Rxy

Explicit rewrite rules serve to derive the set of solutions from these formulae. In a formula C(α), one can either take any simple list element from the context C and apply it to α, or take the last part of a complex list element, e.g., ∀y□ from ∃x□∀y□ in (59a). This would map (59a) onto (60a), which can then be rewritten as (60c) via the intermediate step (60b):

(60) a. (∃x □, ¬ □)∀y.Rxy
     b. (∃x □)¬∀y.Rxy
     c. ∃x.¬∀y.Rxy

In sum, the underspecification formalisms expounded in this subsection give initial underspecified representations for ambiguous expressions that do not by themselves fully delimit the range of intended representations; this delimitation is the joint effect of the initial representations and the resolution algorithm. The difference between underspecification formalisms that describe the readings of an ambiguous expression and those that derive the readings is thus not the existence of suitable algorithms to enumerate the readings (see section 6 for such algorithms for descriptive underspecification formalisms), but the question of whether such an algorithm is essential in determining the set of solutions.
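The basic discharge mechanism of (47)–(48) can be mimicked with strings. The sketch below is my own encoding, not Schubert & Pelletier's: terms are named placeholders, and the discharge order fixes scope, the term discharged last receiving widest scope:

```python
# Term discharge for (47): a term is replaced by its variable, and
# quantifier, variable, and restriction are prefixed to the result.

def discharge(expr, terms, name):
    quant, var, restr = terms.pop(name)
    body = expr.replace(f'<{name}>', var)       # term -> its variable
    connective = '→' if quant == 'forall' else '∧'
    prefix = '∀' if quant == 'forall' else '∃'
    return f'{prefix}{var}.{restr} {connective} {body}'

def reading(order):
    terms = {'univ': ('forall', 'x', "woman'(x)"),
             'exis': ('exists', 'y', "man'(y)")}
    expr = "love'(<univ>, <exis>)"
    for name in order:
        expr = discharge(expr, terms, name)
    return expr

print(reading(['exis', 'univ']))  # (27a): ∀x.woman'(x) → ∃y.man'(y) ∧ love'(x,y)
print(reading(['univ', 'exis']))  # (27b): ∃y.man'(y) ∧ ∀x.woman'(x) → love'(x,y)
```

For nested terms as in (49), discharge orders in which an inner term precedes its outer term would additionally have to be filtered out, along the lines of Hobbs & Shieber (1987).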
3.3. Levels of representation

In the previous sections, underspecification formalisms were introduced as distinguishing a meta-level and an object level of representation. This holds good for the majority of such formalisms, but in others both the underspecified and the fully specified representations are expressions of the same kind (what Cimiano & Reyle 2005 call ‘representational’ as opposed to ‘descriptive’ approaches).

UDRT (Reyle 1993, 1996) belongs to the second group. It separates information on the ingredients of a semantic representation (DRS fragments, see article 37 (Kamp & Reyle) Discourse Representation Theory) from information on the way that these fragments are combined. Consider e.g. (61) and its representation in (62):

(61) Everybody didn't pay attention.

(62) l⊤: 〈 l1: [x | human(x)] ⇒ [ ], l2: ¬[ ], l3: [ | x pay attention ], ORD 〉
In prose: The whole structure (represented by the label l⊤) consists of a set of labelled DRS fragments (for the semantic contributions of DP, negation, and VP, respectively) that are ordered in a way indicated by a relation ORD. For an underspecified representation of the two readings of (61), the scope relations between l1 and l2 are left open in ORD:

(63) ORD = 〈l⊤ ≥ l1, l⊤ ≥ l2, scope(l1) ≥ l3, scope(l2) ≥ l3〉

Here ‘≥’ means ‘has scope over’, and scope maps a DRS fragment onto the empty DRS box it contains. Fully specified representations for the readings of (61) can then also be expressed in terms of (62). In these cases, ORD comprises, in addition to the items in (63), a relation to determine the scope between the universal quantifier and negation, e.g., scope(l1) ≥ l2 for the reading with wide scope of the universal quantifier.

Another instance of such a ‘monostratal’ underspecification formalism is the (revised) Quasi-Logical Form (QLF) of Alshawi & Crouch (1992), which uses list-valued metavariables in semantic representations whose specification indicates quantifier scope; see the representation for (26) in (32a). Kempson & Cormack (1981) also assume a single level of semantic representation (higher-order predicate logic) for quantifier scope ambiguities.
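The role of ORD can be illustrated by brute force: enumerate the total scope orders of l1, l2, l3 and keep those consistent with (63). The encoding below is illustrative only:

```python
from itertools import permutations

# (63): l1 (the quantifier) and l2 (negation) must both outscope l3;
# their mutual order is left open.
fixed = {('l1', 'l3'), ('l2', 'l3')}

def completions(labels):
    for order in permutations(labels):   # candidate total scope orders
        pairs = {(a, b) for i, a in enumerate(order) for b in order[i+1:]}
        if fixed <= pairs:               # all fixed relations respected?
            yield order

print(list(completions(['l1', 'l2', 'l3'])))
# [('l1', 'l2', 'l3'), ('l2', 'l1', 'l3')]: the two readings of (61)
```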
3.4. Compositionality

Another distinction between underspecification formalisms centres upon the notion of resource: In most underspecification formalisms, the elements of a constraint show up at least once in all its solutions, in fact exactly once, except in special cases like ellipses. For instance, in UDRT constraints and their solutions share the same set of DRS fragments; in CLLS (Egg, Koller & Niehren 2001) the relation between constraints and their solutions is defined as an assignment function from node variables (in constraints) to nodes (in the solutions); and in Glue Language Semantics this resource-sensitivity is explicitly encoded in the representations (expressions of linear logic).

Due to this resource-sensitivity, every solution of an underspecified semantic representation of a linguistic expression preserves the semantic contributions of the parts of the expression. If different parts happen to introduce identical semantic material, then each instance must show up in each solution. For instance, any solution to a constraint for (64a) must comprise two universal quantifiers. The contributions of the two DPs may not be conflated in solutions, which rules out that (64a) and (64b) could share a reading ‘for every person x: x likes x’.

(64) a. Everyone likes everyone.
     b. Everyone likes himself.

While this strategy seems natural in that the difference between (64a) and (64b) need not be stipulated by additional mechanisms, there are cases where different instances of the same semantic material seem to merge in the solutions. Reconsider the case of Afrikaans past tense marking (65) [= (18)] from Sailer (2004). This example has two tense markers and three readings. Sailer points out that
the two instances of the past tense marker seem to merge in the first and the second reading of (65):

(65) Jan wou gebel het.
     Jan want.PAST called have
     ‘Jan wanted to call/Jan wants to have called/Jan wanted to have called.’

A direct formalisation of this intuition is possible if one relates fragments in terms of subexpressionhood, as in the underspecified analyses in the LRS framework (Richter & Sailer 2006; Kallmeyer & Richter 2006). If constraints introduce identical fragments as subexpressions of a larger fragment, these fragments can but need not coincide in the solutions of the constraints. For the readings of (18), the (simplified) constraint is (66a):
(66) a. 〈[PAST(γ)]β, [PAST(ζ)]ε, [want′(j, ˆη)]θ, [call′(j)]ι, β ◃ α, ε ◃ δ, θ ◃ δ, ι ◃ γ, ι ◃ ζ, ι ◃ η〉
     b. PAST(want′(j, ˆcall′(j)))
In prose: The two PAST- and the want-fragments are subexpressions of (relation ‘◃’) the whole expression (as represented by the variables α or δ), while the call-fragment is a subexpression of the arguments of the PAST operators and of the intensionalised second argument of want. This constraint describes all three semantic representations in (19); e.g., to derive (66b) [= (19a)], the following equations are needed: α = δ = β = ε, γ = ζ = θ, and η = ι. The crucial equation here is β = ε, which equates the two PAST-fragments. (Additional machinery is needed to block unwanted readings, see Sailer 2004.)

This approach is more powerful than resource-sensitive formalisms. The price one has to pay for this additional power is the need to control explicitly whether identical material may coincide or not, e.g., for the analyses of negative concord in Richter & Sailer (2006). (See also article 6 (Pagin & Westerståhl) Compositionality.)
3.5. Expressivity and compactness

The standard approach to evaluating an underspecification formalism is to apply it to challenging ambiguous examples, e.g., (67) [= (40)], and to check whether there is an expression of the formalism that expresses all and only the attested readings of the example:

(67) Every researcher of a company saw most samples.

However, what if these readings are contextually restricted, or if the sentence has only four readings, as claimed by Kallmeyer & Romero (2008) and others, lacking the reading (45b) with the scope ordering ∃ > most′ > ∀? Underspecification approaches that model scope in terms of a partial order between fragments of semantic representations run into problems already with the second of these possibilities: Any constraint set that encompasses the four readings in which most′ has highest or lowest scope also covers the fifth reading (45b) (Ebert 2005). This means
that such underspecification formalisms are not expressive in the sense of König & Reyle (1999) or Ebert (2005), since they cannot represent arbitrary subsets of the readings of an ambiguous expression. The formalisms are of different expressivity; e.g., approaches that model quantifier scope by lists (such as Alshawi 1992) are less expressive than those that use dominance relations, or scope lists together with an explicit ordering of list elements as in Fox & Lappin's (2005) Property Theory with Curry Typing.

Fully expressive is the approach of Koller, Regneri & Thater (2008), which uses Regular Tree Grammars (RTGs) for scope underspecification. Rules of these grammars expand nonterminals into tree fragments. For instance, the rule S → f(A, B) expands S into a tree whose mother is labelled by f, and whose children are the subtrees to be derived by expanding the nonterminals A and B. Koller, Regneri & Thater (2008) show that dominance constraints can be translated into RTGs; e.g., the constraint (68) [= (41)] for the semantics of (40) is translated into (69).
(68) [constraint graph, identical to (41): a top hole □ dominating the fragments ∃y.company′(y) ∧ □, ∀x.(researcher′(x) ∧ □) → □, and most′(sample′, λz.□); of′(x,y) below the scope hole of ∃ and the restriction hole of ∀; see′(x,z) below the scope holes of ∀ and most′]
(69) {1–5} {1–5} {1–5} {2–5} {2–5} {1–4}
→ → → → → →
∃comp({2–5}) ∀res({l–2},{4–5}) most({1–4}) ∀res({2},{4–5}) most({2–4}) ∀({l–2},{4})
{1–4} {1–2} {2–4} {4–5} {2} {4}
→ → → → → →
comp({1}, {2–4}) comp({2}) ∀res({2}, {3}) most({4}) of see
In (69), the fragments of (68) are addressed by numbers: 1, 3, and 5 are the fragments for the existential, the universal, and the most-DP, respectively, and 2 and 4 are the fragments for of and see. All nonterminals correspond to parts of constraints; they are abbreviated as sequences of fragments. For instance, {2–5} corresponds to the whole constraint except the existential fragment. Rules of the RTG specify on their right-hand side the root of the partial constraint introduced on the left-hand side; for instance, the first rule expresses wide scope of a company over the whole sentence. The RTG (69) yields the same five solutions as (68).

In (69), the reading ∃ > most′ > ∀ can be excluded easily, by removing the production rule {2–5} → most({2–4}): This leaves only one expansion rule for {2–5}. Since {2–5} emerges only as the child of ∃comp with widest scope, only ∀res can be the root of the tree below widest-scope ∃comp. This shows that RTGs are more expressive than dominance constraints. (In more involved cases, restricting the set of solutions can be less simple, however, which means that RTGs can get larger if specific solutions are to be excluded.)

This last observation points to another property of underspecification formalisms that is interdependent with expressivity, viz., compactness: A (sometimes tacit) assumption is
that underspecification formalisms should be able to characterise a set of readings of an ambiguous expression in terms of a representation that is shorter or more efficient than an enumeration (or disjunction) of all the readings (König & Reyle 1999). Ebert (2005) defines this intuitive notion of compactness in the following way: An underspecification formalism is compact iff the maximal length of its representations is at most polynomial (with respect to the number of scope-bearing elements). Ebert shows that there is a trade-off between expressivity and compactness, and that no underspecification formalism can be both expressive and compact in his sense at the same time.
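To see how an RTG enumerates readings and how removing a production removes exactly the readings it participates in, consider a sketch for the simpler constraint (28); the grammar and its rule names are mine, not the exact rules of (69):

```python
# Regular Tree Grammar sketch for constraint (28). Nonterminals name the
# sets of fragments still to be placed; each rule puts one fragment on
# top of the rest (1 = every woman, 2 = a man, 3 = love).
RULES = {
    '123': [('every', '23'), ('a', '13')],
    '23':  [('a', '3')],
    '13':  [('every', '3')],
    '3':   [('love', None)],
}

def derive(nt):
    """Enumerate all trees (readings) derivable from nonterminal nt."""
    for label, child in RULES[nt]:
        if child is None:
            yield label
        else:
            for sub in derive(child):
                yield f'{label}({sub})'

print(list(derive('123')))   # ['every(a(love))', 'a(every(love))']
# Dropping the rule ('a', '13') from '123' removes the ∃ > ∀ reading
# only, just as removing {2-5} -> most({2-4}) excludes one reading of (69).
```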
4. Motivation

This section outlines a number of motivations for the introduction and use of semantic underspecification formalisms.
4.1. Functionality of the syntax-semantics interface

The first motivation for semantic underspecification formalisms lies in the syntax-semantics interface: Semantic underspecification is one way of keeping the mapping from syntax to semantics functional in spite of semantically and syntactically homogeneous ambiguities like (26). These expressions can be analysed in terms of a single syntactic structure even though they have several readings. This seems in conflict with the functional nature of semantic interpretation, which associates one specific syntactic structure with only one single semantic structure (see Westerståhl 1998 and Hodges 2001). Competing approaches to the syntax-semantics interface multiply syntactic structures for semantically and syntactically homogeneous ambiguities (one for each reading) or relinquish the functionality of the syntax-semantics interface altogether to cover these ambiguities. (See article 82 (von Stechow) Syntax and semantics.)
4.1.1. Multiplying syntactic structures

Syntactic structures can be multiplied in two ways. First, one can postulate the functional relation between syntactic derivation trees (a syntactic structure and its derivation history) and semantic structures rather than between syntactic and semantic structures. This strategy shows up in Montague's (1974) account of quantifier scope ambiguity and in Hoeksema (1985). It is motivated by the definition of semantic interpretation as a homomorphism from the syntactic to the semantic algebra (every syntactic operation is translated into a semantic one), but demotes the semantic structure that results from this derivation by giving pride of place to the derivation itself. Second, one can model the ambiguous expressions as syntactically heterogeneous. This means that each reading corresponds to a unique syntactic structure (on a semantically relevant syntactic level). Syntactic heterogeneity can then emerge either through different ways of combining the parts of the expression (which themselves need not be ambiguous) or through systematic lexical ambiguity of specific parts of the expression, which enforces different ways of combining them syntactically. The first way of making the relevant expressions syntactically heterogeneous is implemented in Generative Grammar. Here syntactic structures unique to specific
readings show up on the level of Logical Form (LF). For instance, quantifier scope is determined by (covert) DP movement and adjunction (mostly to a suitable S node); relative scope between quantifiers can then be put down to relations of c-command between the respective DPs on LF (Heim & Kratzer 1998). (The standard definition of c-command is that a constituent A c-commands another constituent B if neither dominates the other and the lowest branching node that dominates A also dominates B.) This strategy is also used for scope ambiguities below the word level, which are reconstructed in terms of different syntactic constellations of sub-word constituents. These constituents can correspond to morphemes (as in the case of dancer or the Anglo-Saxon genitive), but need not (e.g., for the change-of-state operator in the semantics of die). The second way of inducing syntactic heterogeneity is to assume that specific lexical items are ambiguous because they occur in different syntactic categories. This means that depending on their reading they combine with other constituents in different ways syntactically. For instance, Combinatory Categorial Grammar (CCG) incorporates rules of type raising, which change the syntactic category and hence also the combinatory potential of lexical items. For instance, an expression of category X can become one of type T/(T\X), i.e., an expression that yields a T when combined to its right with a T that lacks an X to its left. If X = DP and T = S, a DP becomes a sentence lacking a following VP, since the VP is a sentence lacking a preceding DP (S\DP). Hendriks (1993) and Steedman (2000) point out that these rules could be used for modelling quantifier scope ambiguities in terms of syntactically heterogeneous ambiguity: Syntactic type raising changes the syntactic combinatory potential of the involved expressions, which may change the order in which the expressions are combined in the syntactic construction. This in turn affects the order of combining elements in semantic construction. In particular, if a DP is integrated later than another one (DP′), then DP gets wide scope over DP′: The semantics of DP is applied to a semantic representation that already comprises the semantic contribution of DP′. In an example such as (26), the two readings could thus emerge by either first forming a VP and then combining it with the subject (wide scope for the subject), or by forming a constituent out of subject and verb, which is then combined with the object (which consequently gets widest scope).
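The effect of combination order on scope can be emulated directly with generalised quantifiers as functions over a toy model; the following sketch (model and all names hypothetical, chosen purely for illustration) derives both readings of (26) in this way:

# Toy model: two women, two men, each woman loves a different man
women, men = ["w1", "w2"], ["m1", "m2"]
loves = {("w1", "m1"), ("w2", "m2")}

love = lambda x, y: (x, y) in loves
every_woman = lambda P: all(P(x) for x in women)   # type ((e,t),t)
a_man = lambda P: any(P(y) for y in men)           # type ((e,t),t)

# Derivation 1: form the VP first, then combine with the subject
# -> the subject (every woman) takes wide scope
vp = lambda x: a_man(lambda y: love(x, y))
print(every_woman(vp))   # True: each woman loves some man or other

# Derivation 2: combine (type-raised) subject and verb first, then the object
# -> the object (a man) takes wide scope
sv = lambda y: every_woman(lambda x: love(x, y))
print(a_man(sv))         # False: no single man is loved by every woman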
4.1.2. Giving up the functionality of the syntax-semantics interface

Other researchers give up the functionality of the syntax-semantics interface to handle syntactically and semantically homogeneous ambiguities. One syntactic structure may thus correspond to several readings, which is due to a less strict coupling of syntactic and semantic construction rules. This strategy is implemented in Cooper store approaches (Cooper 1983), in which specific syntactic operations are coupled to more than one corresponding semantic operation in the syntax-semantics interface. In particular, the syntactic combination of a DP with a syntactic structure S may lead to the immediate combination of the semantic contributions of both DP and S, or to appending the DP semantics to a list of DP interpretations (the 'store'). Subsequently, material can be retrieved from the store for any sentence constituent; this material is then combined with the semantic representation
of the sentence constituent. This gives the desired flexibility to derive scopally different semantic representations such as (27) from uniform syntactic structures like (2). The approach of Woods (1967, 1978) works similarly: Semantic contributions of DPs are collected separately from the main semantic representation; they can be integrated into this main representation immediately or later. Another approach of this kind is Steedman (2007). Here non-universal quantifiers and their scope with respect to universal quantifiers are modelled in terms of Skolem functions. (See Kallmeyer & Romero 2008 for further discussion of this strategy.) These functions can have arguments for variables bound by universal quantifiers to express the fact that they are outscoped by these quantifiers. Consider e.g. the two readings of (26) in Skolem notation:

(70) a. ∀x.woman′(x) → man′(sk1) ∧ love′(x,sk1) ('one man for all women')
b. ∀x.woman′(x) → man′(sk2(x)) ∧ love′(x,sk2(x)) ('a possibly different man per woman')

For the derivation of the different readings of a scopally underspecified expression, Steedman uses underspecified Skolem functions, which can be specified at any point in the derivation w.r.t. their environment, viz., the n-tuple of variables bound by universal quantifiers so far. For (26), the semantics of a man would be represented by λQ.Q(skolem′(man′)), where skolem′ is a function from properties P and environments E to generalised Skolem terms like f(E), where P holds of f(E). The term λQ.Q(skolem′(man′)) can be specified at different steps in the derivation, with different results: Immediately after the DP has been formed, specification returns a Skolem constant like sk1 in (70a), since the environment is still empty. After combining the semantics of the DPs and the verb, the environment is the 1-tuple with the variable x bound by the universal quantifier of the subject DP, and specification at that point yields a Skolem term like sk2(x). This sketch of competing approaches to the syntax-semantics interface shows that the functionality of this interface (or an attempt to uphold it in spite of semantically and syntactically homogeneous ambiguous expressions) can be a motivation for underspecification: Functionality is preserved for such an expression directly in that there is a function from its syntactic structure to its underspecified semantic representation that encompasses all its readings.
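To make the store mechanism sketched above concrete, here is a minimal Cooper-style retrieval sketch (formulas are built as strings; this is a drastic simplification of the actual formalism, and the helper names are my own):

from itertools import permutations

# Stored DP meanings as formula-wrapping functions ('store' entries)
store = [
    lambda body: f"∀x.(woman(x) → {body})",   # every woman
    lambda body: f"∃y.(man(y) ∧ {body})",     # a man
]
nucleus = "love(x,y)"   # sentence semantics with both DPs still in store

# Retrieving the stored quantifiers in different orders yields different
# scopes: the quantifier applied last ends up with widest scope.
for order in permutations(store):
    formula = nucleus
    for quantifier in order:
        formula = quantifier(formula)
    print(formula)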
4.2. Ambiguity and negation

Semantic underspecification also helps to avoid problems with disjunctive representations of the meaning of ambiguous expressions that show up under negation: Negating an ambiguous expression is intuitively interpreted as the disjunction of the negated readings, i.e., one of the readings of the expression is denied. However, if the meaning of the expression itself is modelled as the disjunction of its readings, the negated expression emerges as the negation of the disjunction, which is equivalent to the conjunction of the negated readings, i.e., every reading of the expression is denied, which runs counter to intuitions. For instance, for (26) such a semantic representation can be abbreviated as (71), which turns into (72) after negation:
(71) ∀∃ ∨ ∃∀

(72) ¬(∀∃ ∨ ∃∀) = ¬∀∃ ∧ ¬∃∀

However, if we model the meaning of the ambiguous expression as the set of its fully specified readings, and assume that understanding such an expression proceeds by forming the disjunction of this set, these interpretations follow directly. For (26), the meaning is {∀∃, ∃∀}. Asserting (26) is understood as the disjunction of its readings {∀∃, ∃∀}; its denial, as the disjunction of the negated readings {¬∀∃, ¬∃∀}, which yields the desired interpretation (van Eijck & Pinkal 1996). For examples more involved than (26), the most efficient strategy of describing these sets of readings would then be to describe the sets rather than to enumerate their elements, which then calls for underspecification.
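The contrast between the two treatments of denial can be spelled out in a few lines (the truth values assigned to the readings are hypothetical, standing in for one particular situation):

# Readings of (26), with hypothetical truth values in some situation
readings = {"∀∃": True, "∃∀": False}

assertion    = any(readings.values())                  # disjunction of the readings
denial       = any(not v for v in readings.values())   # disjunction of the negated readings
wrong_denial = not any(readings.values())              # negation of the disjunction, as in (72)

print(assertion, denial, wrong_denial)   # True True False: (72) is too strong a denial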
4.3. Underspecification in Natural Language Processing

One of the strongest motivations for semantic underspecification was its usefulness for Natural Language Processing (NLP). (See also article 108 (Pinkal & Koller) Semantics in computational linguistics.) The first issue for which underspecification is very useful is the fact that scope ambiguity resolution can be very hard. For instance, in a small corpus study on quantifier scope in the CHORUS project at the University of the Saarland (using the NEGRA corpus; Brants, Skut & Uszkoreit 2003), roughly 10% of the sentences with several scope-bearing elements were problematic, e.g., the slightly simplified (73):

(73) Alle Teilnehmer erhalten ein Handbuch
all participants receive a handbook
'All participants receive a handbook'

The interpretation of (73) is that the same kind of handbook is given to every participant, but that everyone gets his own copy. That is, the scope between the DPs interacts with a type-token ambiguity: an existential quantification over handbook types outscopes the universal quantification over participants, which in turn gets scope over an existential quantification over handbook tokens. For such examples, underspecification is useful because it makes a semantic representation available to NLP systems at all: it does not force the system to make arbitrary choices and nevertheless returns a semantic analysis of the examples. But the utility of underspecification for NLP is usually discussed with reference to efficiency, because this technique allows one to evade the problem of combinatorial explosion (Poesio 1990; Ebert 2005). The problem is that in many cases, the number of readings of an ambiguous expression gets too large to be generated and enumerated, let alone to be handled efficiently in further modules of an NLP system (e.g., for Machine Translation). Deriving an underspecified representation of an ambiguous expression that captures only the common ground between its readings and fully deriving a reading only when needed is less costly than generating all possible interpretations and then selecting the relevant one.
What is more, a complete disambiguation may not even be necessary. In these cases, postponing ambiguity resolution and resolving ambiguity only on demand makes NLP systems more efficient. For instance, scope ambiguities are often irrelevant for translation; it would therefore be useless to identify the intended reading of such an expression, since its translation into the target language would be ambiguous in the same way again. Therefore e.g. the Verbmobil project (machine translation of spontaneous spoken dialogue; Wahlster 2000) used a scopally underspecified semantic representation (Schiehlen 2000). The analyses of concrete NLP systems show clearly that combinatorial explosion is a problem for NLP that suggests the use of underspecification (pace Player 2004). The large number of readings attributed to linguistic expressions is due to the fact that, first, the number of scope-bearing constituents per expression is easily underestimated (there are many more such constituents besides DPs, e.g., negation, modal verbs, quantifying adverbials like three times or again), and, second and much worse, spurious ambiguities come in during syntactic and semantic processing of the expressions. For instance, Koller, Regneri & Thater (2008) found that 5% of the representations in the Rondane Treebank (underspecified MRS representations of sentences from the domain of Norwegian tourist information, distributed as part of the English Resource Grammar, Copestake & Flickinger 2000) have more than 650,000 solutions; the record holder is the rather innocuous-looking (74) with about 4.5 × 10¹² scope readings:

(74) Myrdal is the mountain terminus of the Flåm rail line (or Flåmsbana) which makes its way down the lovely Flåm Valley (Flåmsdalen) to its sea-level terminus at Flåm.

The median number of scope readings per sentence is 56 (Koller, Regneri & Thater 2008), so, short of applying specific measures to eliminate spurious ambiguities (see section 6.2), combinatorial explosion definitely is a problem for semantic analysis in NLP. In recent years, underspecification has turned out to be very useful for NLP in another way, viz., in that underspecified semantics provides an interface bridging the gap between deep and shallow processing. To combine the advantages of both kinds of processing (accuracy vs. robustness and speed), both can be combined in NLP applications (hybrid processing). The results of deep and shallow syntactic processing can straightforwardly be integrated on the semantic level (instead of combining the results of deep and shallow syntactic analyses). An example of an architecture for hybrid processing is the 'Heart of Gold' developed in the project 'DeepThought' (Callmeier et al. 2004). Since shallow syntactic analyses provide only a part of the information to be gained from deep analysis, the semantic information derivable from the results of a shallow parse (e.g., by a part-of-speech tagger or an NP chunker) can only be a part of that derived from the results of a deep parse. Underspecification formalisms can be used to model this kind of partial information as well. For instance, deep and shallow processing may yield different results with respect to argument linking: NP chunkers (as opposed to systems of deep processing) do not relate verbs and their syntactic arguments, e.g., experiencer and patient in (75). Any semantic analysis based on such a chunker will thus fail to identify individuals in NP and verb semantics as in (76):
(75) Max saw Mary

(76) named(x1, Max), see(x2, x3), named(x4, Mary)

Semantic representations of different depths must be compatible in order to combine results from parallel deep and shallow processing or to transform shallow into deep semantic analyses by adding further pieces of information. Thus, the semantic representation formalism must be capable of separating the semantic information from different sources appropriately. For instance, information on argument linking should be listed separately; thus, a full semantic analysis of (75) should look like (77) rather than (78). Robust MRS (Copestake 2003) is an underspecification formalism that was designed to fulfill this demand:

(77) named(x1, Max), see(x2, x3), named(x4, Mary), x1 = x2, x3 = x4

(78) named(x1, Max), see(x1, x4), named(x4, Mary)
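Discharging the equations of (77) amounts to identifying variables across the two analysis layers; the sketch below does this with a small union-find structure (my choice of mechanism for illustration, not a claim about any particular RMRS implementation):

# Shallow analysis: predications with unlinked variables, plus the equations of (77)
predications = [("named", "x1", "Max"), ("see", "x2", "x3"), ("named", "x4", "Mary")]
equations = [("x1", "x2"), ("x3", "x4")]

parent = {}

def find(v):
    # union-find representative lookup with path compression
    parent.setdefault(v, v)
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v

def union(a, b):
    parent[find(a)] = find(b)

for a, b in equations:
    union(a, b)

# Replace every variable by its class representative, yielding (78) up to renaming
resolved = [tuple(find(t) if t in parent else t for t in p) for p in predications]
print(resolved)   # [('named','x2','Max'), ('see','x2','x4'), ('named','x4','Mary')]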
4.4. Semantic construction

Finally, underspecification formalisms turn out to be interesting from the perspective of semantic construction in general, independently of the issue of ambiguity. This interest is based on two properties of these formalisms, viz., their portability and their flexibility. First, underspecification formalisms do not presuppose a specific syntactic analysis (which would do a certain amount of preprocessing for the mapping from syntax to semantics, like the mapping from surface structure to Logical Form in Generative Grammar). Therefore the syntax-semantics interface can be defined in a very transparent fashion, which makes the formalisms very portable in that they can be coupled with different syntactic formalisms. Tab. 24.1 lists some of the realised combinations of syntactic and semantic formalisms. Second, the flexibility of the interfaces that are needed to derive underspecified representations of ambiguous expressions is also available for unambiguous cases that pose a challenge for any syntax-semantics interface. For instance, semantic construction for the modification of modifiers and indefinite pronouns like everyone is a problem, because the types of functor (semantics of the modifier) and argument (semantics of the modified expression) do not fit: For instance, in (79) the PP semantics is a function from properties to properties, while the semantics of the pronoun as well as that of the whole modification structure are sets of properties.

Tab. 24.1: Realised couplings of underspecification formalisms and syntax formalisms
      HPSG                     LFG                           (L)TAG
MRS   Copestake et al. (2005)  Oepen et al. (2004)           Kallmeyer & Joshi (1999)
GLS   Asudeh & Crouch (2001)   Dalrymple (2001)              Frank & van Genabith (2001)
UDRT  Frank & Reyle (1995)     van Genabith & Crouch (1999)  Cimiano & Reyle (2005)
HS    Chaves (2002)            –                             Kallmeyer & Joshi (2003)
(79) everyone in this room

Interfaces for the derivation of underspecified semantic representations for examples like (26) can be reused to perform the semantic construction of (79) and of many more examples of that kind; see Egg (2004, 2006). Similarly, Richter & Sailer (2006) use their underspecification formalism to handle semantic construction for unambiguous cases of negative concord. For these unambiguous expressions, the use of underspecification formalisms requires careful control of the solutions of the resulting constraints: These constraints must have a single solution only (since the expressions are unambiguous), but underspecification constraints were designed primarily for the representation of ambiguous expressions, whose constraints have several solutions.
5. Semantic underspecification and the syntax-semantics interface

In this section, I will sketch the basic interface strategy to derive underspecified semantic structures from (surface-oriented) syntactic structures. The strategy consists in deliberately not specifying scope relations between potentially scopally ambiguous constituents of an expression, e.g., in the syntax-semantics interfaces described for UDRT (Frank & Reyle 1995), MRS (Copestake et al. 2005), CLLS (Egg, Koller & Niehren 2001) or Hole Semantics (Bos 2004). To derive underspecified semantic structures, explicit bookkeeping of specific parts of these structures is necessary. These parts have 'addresses' (e.g., the labels of UDRT or the handles of MRS) that are visible to the interface rules. This allows interface rules to address these parts in the subconstituents when they specify how the constraints of the subconstituents are to be combined in the constraint of the emerging new constituent. (The rules also specify these parts for the constraint of the new constituent.) Therefore, these interfaces are more powerful than interfaces that only combine the semantic contributions of the subconstituents as a whole. As an example, consider the (greatly simplified) derivation of the underspecified representation (28) of example (26) by means of the syntax-semantics interface rules (80)–(82). In the interface, each atomic or complex constituent C is associated with a constraint and has two special fragments, a top fragment [[Ctop]] (which handles scope issues) and a main fragment [[C]]. These two fragments are addressed in the interface rules as 'glue points' where the constraints of the involved constituents are put together; each interface rule determines these fragments anew for the emerging constituent. Furthermore, all fragments of the subconstituents are inherited by the emerging constituent. The first rule builds the DP semantics out of the semantic contributions of determiner and NP:

(80) [DP Det NP] ⇒ (SSI) [[Det]]([[NP]])(λz.□); [[DP]]: z (dominated by the hole); [[DPtop]] = [[Dettop]] = [[NPtop]]
In prose: Apply the main determiner fragment to the main NP fragment and a hole with a λ-abstraction over a variable that is dominated by the hole and constitutes by itself the main DP fragment. The top fragments (holes that determine the scope of the DP, because the top fragment of a constituent always dominates all its other fragments) of DP, determiner, and NP are identical. ('SSI' indicates that it is a rule of the syntax-semantics interface.) The main fragment of a VP (of a sentence) emerges by applying the main verb (VP) fragment to the main fragment of the object (subject) DP. The top fragments of the verb (VP) and its DP argument are identical to the one of the emerging VP (S):

(81) [VP V DP] ⇒ (SSI) [[VP]]: [[V]]([[DP]]); [[VPtop]] = [[Vtop]] = [[DPtop]]

(82) [S DP VP] ⇒ (SSI) [[S]]: [[VP]]([[DP]]); [[Stop]] = [[DPtop]] = [[VPtop]]
We assume that for all lexical entries, the main fragments are identical to the standard semantic representation (e.g., for every, we get [[Det]]: λQλP.∀x.Q(x) → P(x)), and the top fragments are holes dominating the main fragments. If in unary projections (like the one of man from N to N′ and NP) main and top fragments are merely inherited, the semantics of a man emerges as (83):

(83) DPtop: □
∃y.man′(y) ∧ □
DP: y
(each hole dominates the fragment on the line below it)

The crucial point is the decision to let the bound variable be the main fragment in the DP semantics. The intermediate DP fragment between top and main fragment is ignored in further processes of semantic construction. Combining (83) with the semantics of the verb yields (84):

(84) VPtop: □
∃y.man′(y) ∧ □
VP: love′(y)
Finally, the semantics of every woman, which is derived in analogy to (83), is combined with (84) through rule (82). According to this rule, the two top fragments are identified and the two main fragments are combined by functional application into the main S fragment, but the two intermediate fragments, which comprise the two quantifiers, are not addressed at all, and hence remain dangling in between. The result is the desired dominance diamond:
(85) Stop: □
∀x.woman′(x) → □    ∃y.man′(y) ∧ □
S: love′(x,y)
(the top hole dominates both quantifier fragments, and the hole of each quantifier fragment dominates S)

The technique of splitting the semantic contribution of a quantifying DP resurfaces in many underspecification approaches, among them CLLS, Muskens' Logical Description Grammar, and LTAG (Cimiano & Reyle 2005).
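To see how the dominance diamond (85) determines exactly the two attested readings, the following sketch plugs the fragments together in all possible ways and keeps those configurations that form a tree and satisfy the dominance edges (the encoding and all names are my own, chosen for illustration):

from itertools import permutations

# Fragments of (85), each with a list of holes; 'top' is the top fragment Stop
holes = {"top": ["h0"], "every": ["h1"], "a": ["h2"], "love": []}
sem = {"top": "{}", "every": "∀x.(woman(x) → {})",
       "a": "∃y.(man(y) ∧ {})", "love": "love(x,y)"}
# Dominance edges: (hole, fragment it must dominate)
dominance = [("h0", "every"), ("h0", "a"), ("h1", "love"), ("h2", "love")]

def below(hole, plug):
    # fragments at or below whatever is plugged into the hole
    frag = plug[hole]
    out = {frag}
    for h in holes[frag]:
        out |= below(h, plug)
    return out

def realize(frag, plug):
    # spell out the formula of a configuration
    s = sem[frag]
    for h in holes[frag]:
        s = s.format(realize(plug[h], plug))
    return s

for perm in permutations(["every", "a", "love"]):
    plug = dict(zip(["h0", "h1", "h2"], perm))
    # the plugging must form a tree rooted in 'top' that uses every fragment once
    seen, stack, tree = set(), ["top"], True
    while stack:
        f = stack.pop()
        if f in seen:
            tree = False
            break
        seen.add(f)
        stack.extend(plug[h] for h in holes[f])
    if tree and len(seen) == 4 and all(f in below(h, plug) for h, f in dominance):
        print(realize("top", plug))

Running this prints exactly the two solutions of (85), the ∀∃- and the ∃∀-reading of (26).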
6. Further processing of underspecified representations

The topic of this section is the further processing of underspecified representations. One can enumerate the set of solutions of a constraint, which has been the topic of much work in computational approaches to underspecification. Related to the enumeration of solutions is work on redundancy elimination, which tries to identify sets of equivalent readings. The third line of approach is the attempt to derive (fully specified) information from underspecified representations by reasoning with these representations. Finally, one can derive a solution (or a small set of solutions) in terms of preferences. This enterprise has been pursued both in computational linguistics and in psycholinguistics.
6.1. Resolution of underspecified representations

The first way of deriving fully specified semantic representations from underspecified representations is to enumerate the readings by resolving the constraints. For a worked-out example of such a resolution, reconsider the derivation of fully specified interpretations from the set of meaning constructors in Glue Language Semantics as expounded in section 3.1, or the detailed account of resolving USDL representations in Pinkal (1996). For a number of formalisms, specific systems, so-called solvers, are available for this derivation. For MRS representations, there is a solver in the LKB (Linguistic Knowledge Builder) system (Copestake & Flickinger 2000). Blackburn & Bos (2005) present a solver for Hole Semantics. For the language of dominance constraints, a number of solvers have been developed (see Koller, Regneri & Thater 2008 and Koller & Thater 2005 for an overview).
6.2. Redundancy elimination

In NLP applications that use underspecification, spurious ambiguities (which do not correspond to attested readings) are an additional complication, because they drastically enlarge the number of readings assigned to an ambiguous expression. For instance, Koller, Regneri & Thater (2008) found high numbers of spurious ambiguities in the Rondane Treebank (see section 4.3).
Hurum's (1988) algorithm, the CLE resolution algorithm (Moran 1988; Alshawi 1992), and Chaves' (2003) extension of Hole Semantics detect specific cases of equivalent solutions (e.g., when one existential quantifier immediately dominates another one) and block all but one of them. The blocking is only effective once the solutions are enumerated. In contrast, Koller & Thater (2006) and Koller, Regneri & Thater (2008) present algorithms to reduce spurious ambiguities that map underspecified representations onto (more restricted) underspecified representations.
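For the simplest case, permutations of existential quantifiers that immediately dominate one another, blocking after enumeration can be sketched as mapping each reading to a normal form (a simplification of my own; the algorithms cited above operate on richer structures or on the underspecified representation itself):

def normal_form(reading):
    # readings as top-down sequences of quantifiers; adjacent existentials
    # commute (∃x∃y.φ ≡ ∃y∃x.φ), so maximal runs of them are sorted
    out, run = [], []
    for q in reading:
        if q.startswith("∃"):
            run.append(q)
        else:
            out += sorted(run) + [q]
            run = []
    return tuple(out + sorted(run))

readings = [("∃a", "∃b", "∀c"), ("∃b", "∃a", "∀c"), ("∀c", "∃a", "∃b")]
unique = {normal_form(r) for r in readings}
print(len(readings), "enumerated,", len(unique), "distinct")   # 3 enumerated, 2 distinct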
6.3. Reasoning with underspecified representations

Sometimes fully specified information can be deduced from an underspecified semantic representation. For instance, if Amélie is a woman, then (26) allows us to conclude that she loves a man, because this conclusion is valid no matter which reading of (26) is chosen. For UDRT (König & Reyle 1999; Reyle 1992; Reyle 1993; Reyle 1996) and APL (Jaspars & van Eijck 1996), there are calculi for such reasoning with underspecified representations. Van Deemter (1996) discusses different kinds of consequence relations for this reasoning.
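The notion of consequence involved can be made explicit by brute force over a small finite signature: a conclusion follows from an ambiguous premise iff it is entailed by every reading. A sketch (domain and relation names hypothetical; real calculi of course do not enumerate models):

from itertools import product

women, men = ["amelie", "w2"], ["m1", "m2"]
pairs = [(w, m) for w in women for m in men]

def models():
    # all possible extensions of 'love' over the domain
    for bits in product([0, 1], repeat=len(pairs)):
        yield {p for p, b in zip(pairs, bits) if b}

# the two readings of (26) and the conclusion 'Amélie loves a man'
forall_exists = lambda L: all(any((w, m) in L for m in men) for w in women)
exists_forall = lambda L: any(all((w, m) in L for w in women) for m in men)
conclusion    = lambda L: any(("amelie", m) in L for m in men)

follows = all(conclusion(L) for r in (forall_exists, exists_forall)
              for L in models() if r(L))
print(follows)   # True: the conclusion holds under either reading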
6.4. Integration of preferences

In many cases of scope ambiguity, the readings are not on a par in that some are preferred over others. Consider e.g. a slight variation of (26); here the ∃∀-reading is preferred over the ∀∃-reading:

(86) A woman loves every man

One could integrate these preferences into underspecified representations of scopally ambiguous expressions to narrow down the number of their readings or to order the generation of solutions (Alshawi 1992).
6.4.1. Kinds of preferences

The preferences discussed in the literature can roughly be divided into three groups. The first group is based on syntactic structure, starting with Johnson-Laird's (1969) and Lakoff's (1971) claim that surface linear order or precedence introduces a preference for wide scope of the preceding scope-bearing item (but see e.g. Villalta 2003 for experimental counterevidence). Precedence can be interpreted in terms of a syntactic configuration such as c-command (e.g., VanLehn 1978), since in a right-branching binary phrase-marker preceding constituents c-command the following ones. However, these preferences are not universally valid: Kurtzman & MacDonald (1993) report a clear preference for wide scope of the embedded DP in the case of nested quantification as in (87). Here the indefinite article precedes (and c-commands) the embedded DP, but the ∀∃-reading is nevertheless preferred:

(87) I met a researcher from every university

Hurum (1988) and VanLehn (1978) make the preference of scope-bearing items to take scope outside the constituent they are directly embedded in also dependent on the
category of that constituent (e.g., much stronger for items inside PPs than for items inside infinitival clauses). The scope preference algorithm of Gambäck & Bos (1998) gives scope-bearing nonheads (complements and adjuncts) in binary-branching syntactic structures immediate scope over the respective head. The second group of preferences focuses on grammatical functions and thematic roles. Functional hierarchies that indicate a preference to take wide scope have been proposed by Ioup (1975) (88a) and VanLehn (1978) (88b):

(88) a. topic > deep and surface subject > deep subject or surface subject > indirect object > prepositional object > direct object
b. preposed PP, topic NP > subject > (complement in) sentential or adverbial PP > (complement in) verb phrase PP > object

While Ioup combines thematic and functional properties in her hierarchy (by including the notion of 'deep subject'), Pafel (2005) distinguishes grammatical functions (only subject and sentential adverb) and thematic roles (strong and weak patienthood) explicitly. There is a certain amount of overlap between structural preferences and the functional hierarchies, at least in a language like English: Here DPs higher on the functional hierarchy also tend to c-command DPs lower on the hierarchy, because they are more likely to surface as subjects. The third group of preferences addresses the quantifiers (or the determiners expressing them) themselves. Ioup (1975) and VanLehn (1978) introduce a hierarchy of determiners:

(89) each > every > all > most > many > several > some (plural) > a few

The CLE incorporates such preference rules, too (Moran 1988; Alshawi 1992), e.g., the rule that each outscopes other determiners, and that negation is outscoped by some and outscopes every. Some of these preferences can be put down to a more general preference for logically weaker interpretations, in particular, the tendency of universal quantifiers to outscope existential ones (recall that the ∀∃-reading of sentences like (26) is weaker than the ∃∀-reading; VanLehn 1978; Moran 1988; Alshawi 1992). Similarly, scope of the negation above every and below some returns existential statements, which are weaker than the (unpreferred) alternative scopings (universal statements) in that they do not make a claim about the whole domain. Pafel (2005) lists further properties, among them focus and discourse binding (whether a DP refers to an already established set of entities, as e.g. in few of the books as opposed to few books).
6.4.2. Interaction of preferences

It has been argued that the whole range of quantifier scope effects can only be accounted for in terms of an interaction of different principles. Fodor (1982) and Hurum (1988) assume an interaction between linear precedence and the determiner hierarchy, which is corroborated by experimental results of Filik,
Paterson & Liversedge (2004). They show that a conflict of these principles leads to longer reading times. The results of Filik, Paterson & Liversedge (2004) are also compatible with the predictions of Ioup (1975), who puts down scoping preferences to an interaction of the functional and the quantifier hierarchy. To get wider scope than another quantifier, a quantifier must score high on both hierarchies. Kurtzman & MacDonald (1993) present further empirical evidence for the interaction of preferences. They point to a clear contrast between sentences like (90a) [= (26)] and their passive version (90b), where the clear preference of (90a) for the ∀∃-reading is no longer there:

(90) a. Every woman loves a man
b. A man is loved by every woman

If preferences were determined by a single principle, one would expect a scope preference for the passive version, too, either one for its (new) subject or one for the by-PP (the former, demoted subject). Kurtzman & MacDonald (1993) argue that the interaction of syntax-oriented principles with the thematic role principle can account for these findings. Since most subjects have higher thematic roles, the principles agree on the scope preference for the subject in the active sentence, but conflict in the case of the passive sentence, which consequently exhibits no clear-cut scope preference. The interaction between surface ordering and the position of the indefinite article below the universally quantifying every and each on the quantifier hierarchy is explained by Fodor (1982) and Kurtzman & MacDonald (1993) by the fact that it is easier to interpret indefinite DPs in terms of a single referent than in terms of several ones. The second interpretation must be motivated, e.g., in the context of an already processed universal quantifier, which suggests several entities, one for each of the entities over which the universal quantifier ranges. The most involved model of interacting preferences for quantifier scope is the one of Pafel (2005). He introduces no fewer than eight properties of quantifiers that are relevant for scope preferences, among them syntactic position, grammatical function, thematic role, discourse binding and focus. The scores for the different properties are added up for each quantifier; the properties carry weights that were determined empirically.
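The additive architecture of such a model can be paraphrased in a few lines; the features and weights below are invented placeholders, not Pafel's empirically determined values:

# Hypothetical weights for (some of) the scope-relevant properties
WEIGHTS = {"subject": 2.0, "precedence": 1.5, "strong_quantifier": 1.0,
           "discourse_bound": 1.0, "focus": 0.5}

def score(quantifier):
    # additive model: sum the weights of the properties the quantifier has
    return sum(WEIGHTS[f] for f in quantifier["features"])

q1 = {"form": "every woman", "features": ["subject", "precedence", "strong_quantifier"]}
q2 = {"form": "a man",       "features": []}

wide = max((q1, q2), key=score)
print(wide["form"], "is predicted to take wide scope")   # every woman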
7. References

Alshawi, Hiyan (ed.) 1992. The Core Language Engine. Cambridge, MA: The MIT Press. Alshawi, Hiyan & Richard Crouch 1992. Monotonic semantic interpretation. In: Proceedings of the 30th Annual Meeting of the Association for Computational Linguistics (= ACL). Newark, DE: Association for Computational Linguistics, 32–39. Althaus, Ernst, Denys Duchier, Alexander Koller, Kurt Mehlhorn, Joachim Niehren & Sven Thiel 2001. An efficient algorithm for the configuration problem of dominance graphs. In: Proceedings of the 12th ACM-SIAM Symposium on Discrete Algorithms. New York: Association for Computing Machinery, 815–824. Asher, Nicholas & Tim Fernando 1999. Labeled representations, underspecification and disambiguation. In: H. Bunt & R. Muskens (eds.). Computing Meaning, vol. 1. Dordrecht: Kluwer, 73–95.
Asher, Nicholas & Alex Lascarides 2003. Logics of Conversation. Cambridge: Cambridge University Press. Asudeh, Ash & Richard Crouch 2002. Glue semantics for HPSG. In: F. van Eynde, L. Hellan & D. Beermann (eds.). Proceedings of the 8th International Conference on Head-Driven Phrase Structure Grammar. Stanford, CA: CSLI Publications, 1–19. Babko-Malaya, Olga 2004. LTAG semantics of NP-coordination. In: Proceedings of the 7th International Workshop on Tree Adjoining Grammars and Related Formalisms. Vancouver, 111–117. Bierwisch, Manfred 1983. Semantische und konzeptionelle Repräsentation lexikalischer Einheiten. In: R. Růžička & W. Motsch (eds.). Untersuchungen zur Semantik. Berlin: Akademie Verlag, 61–99. Bierwisch, Manfred 1988. On the grammar of local prepositions. In: M. Bierwisch, W. Motsch & I. Zimmermann (eds.). Syntax, Semantik und das Lexikon. Berlin: Akademie Verlag, 1–65. Bierwisch, Manfred & Ewald Lang (eds.) 1987. Grammatische und konzeptuelle Aspekte von Dimensionsadjektiven. Berlin: Akademie Verlag. Blackburn, Patrick & Johan Bos 2005. Representation and Inference for Natural Language. A First Course in Computational Semantics. Stanford, CA: CSLI Publications. Bos, Johan 1996. Predicate logic unplugged. In: P. Dekker & M. Stokhof (eds.). Proceedings of the 10th Amsterdam Colloquium. Amsterdam: ILLC, 133–142. Bos, Johan 2004. Computational semantics in discourse. Underspecification, resolution, and inference. Journal of Logic, Language and Information 13, 139–157. Brants, Thorsten, Wojciech Skut & Hans Uszkoreit 2003. Syntactic annotation of a German newspaper corpus. In: A. Abeillé (ed.). Treebanks. Building and Using Parsed Corpora. Dordrecht: Kluwer, 73–88. Bronnenberg, Wim, Harry Bunt, Jan Landsbergen, Remko Scha, Wijnand Schoenmakers & Eric van Utteren 1979. The question answering system PHLIQA1. In: L. Bolc (ed.). Natural Language Question Answering Systems. London: Macmillan, 217–305. Bunt, Harry 2007. Underspecification in semantic representations. Which technique for what purpose? In: H. Bunt & R. Muskens (eds.). Computing Meaning, vol. 3. Amsterdam: Springer, 55–85. Callmeier, Ulrich, Andreas Eisele, Ulrich Schäfer & Melanie Siegel 2004. The DeepThought core architecture framework. In: M.T. Lino et al. (eds.). Proceedings of the 4th International Conference on Language Resources and Evaluation (= LREC). Lisbon: European Language Resources Association, 1205–1208. Chaves, Rui 2002. Principle-based DRTU for HPSG. A case study. In: Proceedings of the 1st International Workshop on Scalable Natural Language Understanding (= ScaNaLu). Heidelberg: EML. Chaves, Rui 2003. Non-redundant scope disambiguation in underspecified semantics. In: B. ten Cate (ed.). Proceedings of the 8th ESSLLI Student Session. Vienna, 47–58. Chaves, Rui 2005a. DRT and underspecification of plural ambiguities. In: Proceedings of the 6th International Workshop in Computational Semantics. Tilburg University, 78–89. Chaves, Rui 2005b. Underspecification and NP coordination in constraint-based grammar. In: F. Richter & M. Sailer (eds.). Proceedings of the ESSLLI'05 Workshop on Empirical Challenges and Analytical Alternatives to Strict Compositionality. Edinburgh: Heriot-Watt University, 38–58. Cimiano, Philipp & Uwe Reyle 2005. Talking about trees, scope and concepts. In: H. Bunt, J. Geertzen & E. Thijse (eds.). Proceedings of the 6th International Workshop in Computational Semantics. Tilburg: Tilburg University, 90–102. Cooper, Robin 1983.
Quantification and Syntactic Theory. Dordrecht: Reidel. Copestake, Ann & Dan Flickinger 2000. An open-source grammar development environment and broad-coverage English grammar using HPSG. In: Proceedings of the 2nd International Conference on Language, Resources and Evaluation (= LREC). Athens: European Language Resources Association, 591–600. Copestake, Ann, Dan Flickinger, Carl Pollard & Ivan Sag 2005. Minimal recursion semantics. An introduction. Research on Language and Computation 3, 281–332.
Crouch, Richard 1995. Ellipsis and quantification. A substitutional approach. In: Proceedings of the 7th Conference of the European Chapter of the Association for Computational Linguistics (= EACL). Dublin: Association for Computational Linguistics, 229–236. Crouch, Richard & Josef van Genabith 1999. Context change, underspecification and the structure of Glue Language derivations. In: M. Dalrymple (ed.). Semantics and Syntax in Lexical Functional Grammar. The Resource Logic Approach. Cambridge, MA: The MIT Press, 117–189. Dalrymple, Mary 2001. Lexical Functional Grammar. New York: Academic Press. Dalrymple, Mary, John Lamping, Fernando Pereira & Vijay Saraswat 1997. Quantifiers, anaphora, and intensionality. Journal of Logic, Language and Information 6, 219–273. Dalrymple, Mary, Stuart Shieber & Fernando Pereira 1991. Ellipsis and higher-order unification. Linguistics & Philosophy 14, 399–452. van Deemter, Kees 1996. Towards a logic of ambiguous expressions. In: K. van Deemter & S. Peters (eds.). Semantic Ambiguity and Underspecification. Stanford, CA: CSLI Publications, 203–237. Dölling, Johannes 1995. Ontological domains, semantic sorts and systematic ambiguity. International Journal of Human-Computer Studies 43, 785–807. Duchier, Denys & Claire Gardent 2001. Tree descriptions, constraints and incrementality. In: H. Bunt, R. Muskens & E. Thijsse (eds.). Computing Meaning, vol. 2. Dordrecht: Kluwer, 205–227. Ebert, Christian 2005. Formal Investigations of Underspecified Representations. Ph.D. dissertation. King's College, London. Egg, Markus 2004. Mismatches at the syntax-semantics interface. In: S. Müller (ed.). Proceedings of the 11th International Conference on Head-Driven Phrase Structure Grammar. Stanford, CA: CSLI Publications, 119–139. Egg, Markus 2005. Flexible Semantics for Reinterpretation Phenomena. Stanford, CA: CSLI Publications. Egg, Markus 2006. Anti-Ikonizität an der Syntax-Semantik-Schnittstelle. Zeitschrift für Sprachwissenschaft 25, 1–38. Egg, Markus 2007. Against opacity. Research on Language and Computation 5, 435–455. Egg, Markus & Michael Kohlhase 1997. Dynamic control of quantifier scope. In: P. Dekker, M. Stokhof & Y. Venema (eds.). Proceedings of the 11th Amsterdam Colloquium. Amsterdam: ILLC, 109–114. Egg, Markus, Alexander Koller & Joachim Niehren 2001. The constraint language for lambda structures. Journal of Logic, Language and Information 10, 457–485. Egg, Markus & Gisela Redeker 2008. Underspecified discourse representation. In: A. Benz & P. Kühnlein (eds.). Constraints in Discourse. Amsterdam: Benjamins, 117–138. van Eijck, Jan & Manfred Pinkal 1996. What do underspecified expressions mean? In: R. Cooper et al. (eds.). Building the Framework. FraCaS Deliverable D 15, section 10.1. Edinburgh: University of Edinburgh, 264–266. Filik, Ruth, Kevin Paterson & Simon Liversedge 2004. Processing doubly quantified sentences. Evidence from eye movements. Psychonomic Bulletin and Review 11, 953–959. Fodor, Janet 1982. The mental representation of quantifiers. In: S. Peters & E. Saarinen (eds.). Processes, Beliefs, and Questions. Dordrecht: Reidel, 129–164. Fox, Chris & Shalom Lappin 2005. Underspecified interpretations in a Curry-typed representation language. Journal of Logic and Computation 15, 131–143. Frank, Annette & Josef van Genabith 2001. GlueTag. Linear logic based semantics for LTAG – and what it teaches us about LFG and LTAG. In: M. Butt & T. Holloway King (eds.). Proceedings of the LFG '01 Conference.
Stanford, CA: CSLI Publications, 104–126. Frank, Annette & Uwe Reyle 1995. Principle-based semantics for HPSG. In: Proceedings of the 7th Conference of the European Chapter of the Association for Computational Linguistics (= EACL). Dublin: Association for Computational Linguistics, 9–16. Fuchss, Ruth, Alexander Koller, Joachim Niehren & Stefan Thater 2004. Minimal recursion semantics as dominance constraints. Translation, evaluation, and analysis. In: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (= ACL). Barcelona: Association for Computational Linguistics, 247–254.
Gambäck, Björn & Johan Bos 1998. Semantic-head based resolution of scopal ambiguities. In: Proceedings of the 17th International Conference on Computational Linguistics and the 36th Annual Meeting of the Association for Computational Linguistics (= ACL). Montreal: Association for Computational Linguistics, 433–437. Gawron, Jean Mark & Stanley Peters 1990. Anaphora and Quantification in Situation Semantics. Menlo Park, CA: CSLI Publications. van Genabith, Josef & Richard Crouch 1999. Dynamic and underspecified semantics for LFG. In: M. Dalrymple (ed.). Semantics and Syntax in Lexical Functional Grammar. The Resource Logic Approach. Cambridge, MA: The MIT Press, 209–260. Harris, John 2007. Representation. In: P. de Lacy (ed.). The Cambridge Handbook of Phonology. Cambridge: Cambridge University Press, 119–137. Heim, Irene & Angelika Kratzer 1998. Semantics in Generative Grammar. Oxford: Blackwell. Hendriks, Herman 1993. Studied Flexibility. Categories and Types in Syntax and Semantics. Amsterdam: ILLC. Hirschbühler, Paul 1982. VP deletion and across the board quantifier scope. In: J. Pustejovsky & P. Sells (eds.). Proceedings of the North Eastern Linguistics Society (= NELS) 12. Amherst, MA: GLSA, 132–139. Hobbs, Jerry & Stuart Shieber 1987. An algorithm for generating quantifier scoping. Computational Linguistics 13, 47–63. Hobbs, Jerry, Mark Stickel, Douglas Appelt & Paul Martin 1993. Interpretation as abduction. Artificial Intelligence 63, 69–142. Hodges, Wilfrid 2001. Formal features of compositionality. Journal of Logic, Language and Information 10, 7–28. Hoeksema, Jack 1985. Categorial Morphology. New York: Garland. Huet, Gérard P. 1975. A unification algorithm for typed λ-calculus. Theoretical Computer Science 1, 27–57. Hurum, Sven 1988. Handling scope ambiguities in English. In: Proceedings of the 2nd Conference on Applied Natural Language Processing (= ANLP). Morristown, NJ: Association for Computational Linguistics, 58–65. Ioup, Georgette 1975. Some universals for quantifier scope. In: J. Kimball (ed.). Syntax and Semantics 4. New York: Academic Press, 37–58. Jaspars, Jan & Jan van Eijck 1996. Underspecification and reasoning. In: R. Cooper et al. (eds.). Building the Framework. FraCaS Deliverable D 15, section 10.2, 266–287. Johnson-Laird, Philip 1969. On understanding logically complex sentences. Quarterly Journal of Experimental Psychology 21, 1–13. Kallmeyer, Laura & Aravind K. Joshi 1999. Factoring predicate argument and scope semantics. Underspecified semantics with LTAG. In: P. Dekker (ed.). Proceedings of the 12th Amsterdam Colloquium. Amsterdam: ILLC, 169–174. Kallmeyer, Laura & Aravind K. Joshi 2003. Factoring predicate argument and scope semantics. Underspecified semantics with LTAG. Research on Language and Computation 1, 3–58. Kallmeyer, Laura & Frank Richter 2006. Constraint-based computational semantics. A comparison between LTAG and LRS. In: Proceedings of the 8th International Workshop on Tree-Adjoining Grammar and Related Formalisms. Sydney: Association for Computational Linguistics, 109–114. Kallmeyer, Laura & Maribel Romero 2008. Scope and situation binding in LTAG using semantic unification. Research on Language and Computation 6, 3–52. Keller, William 1988. Nested Cooper storage. The proper treatment of quantification in ordinary noun phrases. In: U. Reyle & Ch. Rohrer (eds.). Natural Language Parsing and Linguistic Theories. Dordrecht: Reidel, 432–447. Kempson, Ruth & Annabel Cormack 1981. Ambiguity and quantification.
Linguistics & Philosophy 4, 259–309. Koller, Alexander, Joachim Niehren & Stefan Thater 2003. Bridging the gap between underspecification formalisms. Hole semantics as dominance constraints. In: Proceedings of the
11th Conference of the European Chapter of the Association for Computational Linguistics (= EACL). Budapest: Association for Computational Linguistics, 195–202. Koller, Alexander, Michaela Regneri & Stefan Thater 2008. Regular tree grammars as a formalism for scope underspecification. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (= ACL). Columbus, OH: Association for Computational Linguistics, 218–226. Koller, Alexander & Stefan Thater 2005. The evolution of dominance constraint solvers. In: Proceedings of the ACL Workshop on Software. Ann Arbor, MI: Association for Computational Linguistics, 65–76. Koller, Alexander & Stefan Thater 2006. An improved redundancy elimination algorithm for underspecified descriptions. In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics (= ACL). Sydney: Association for Computational Linguistics, 409–416. König, Esther & Uwe Reyle 1999. A general reasoning scheme for underspecified representations. In: H.J. Ohlbach & U. Reyle (eds.). Logic, Language and Reasoning. Essays in Honour of Dov Gabbay. Dordrecht: Kluwer, 1–28. Kratzer, Angelika 1998. Scope or pseudoscope? Are there wide-scope indefinites? In: S. Rothstein (ed.). Events and Grammar. Dordrecht: Kluwer, 163–196. Kurtzman, Howard & Maryellen MacDonald 1993. Resolution of quantifier scope ambiguities. Cognition 48, 243–279. Lakoff, George 1971. On generative semantics. In: D. Steinberg & L. Jakobovits (eds.). Semantics. An Interdisciplinary Reader in Philosophy, Linguistics and Psychology. Cambridge: Cambridge University Press, 232–296. Larson, Richard 1998. Events and modification in nominals. In: D. Strolovitch & A. Lawson (eds.). Proceedings of Semantics and Linguistic Theory (= SALT) VIII. Ithaca, NY: CLC Publications, 145–168. Larson, Richard & Sungeun Cho 2003. Temporal adjectives and the structure of possessive DPs. Natural Language Semantics 11, 217–247. Lev, Iddo 2005. Decoupling scope resolution from semantic composition. In: H. Bunt, J. Geertzen & E. Thijse (eds.). Proceedings of the 6th International Workshop on Computational Semantics. Tilburg: Tilburg University, 139–150. Link, Godehard 1983. The logical analysis of plurals and mass terms. A lattice-theoretic approach. In: R. Bäuerle, Ch. Schwarze & A. von Stechow (eds.). Meaning, Use, and the Interpretation of Language. Berlin: Mouton de Gruyter, 303–323. Marcus, Mitchell, Donald Hindle & Margaret Fleck 1983. D-theory. Talking about talking about trees. In: Proceedings of the 21st Annual Meeting of the Association for Computational Linguistics (= ACL). Cambridge, MA: Association for Computational Linguistics, 129–136. Montague, Richard 1974. Formal Philosophy. Selected Papers of Richard Montague. Edited and with an introduction by Richmond H. Thomason. New Haven, CT: Yale University Press. Moran, Douglas 1988. Quantifier scoping in the SRI core language engine. In: Proceedings of the 26th Annual Meeting of the Association for Computational Linguistics (= ACL). Buffalo, NY: Association for Computational Linguistics, 33–40. Muskens, Reinhard 2001. Talking about trees and truth-conditions. Journal of Logic, Language and Information 10, 417–455. Nerbonne, John 1993. A feature-based syntax/semantics interface. Annals of Mathematics and Artificial Intelligence 8, 107–132. Niehren, Joachim, Manfred Pinkal & Peter Ruhrberg 1997.
A uniform approach to underspecification and parallelism. In: Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (= ACL). Madrid: Association for Computational Linguistics, 410–417. Oepen, Stephan, Helge Dyvik, Jan Tore Lønning, Erik Velldal, Dorothee Beermann, John Carroll, Dan Flickinger, Lars Hellan, Janni Bonde Johannessen, Paul Meurer, Torbjørn Nordgård & Victoria Rosén 2004. Som å kapp-ete med trollet? Towards MRS-based NorwegianEnglish machine translation. In: Proceedings of the 10th International Conference on Theoretical and Methodological Issues in Machine Translation (= TMI). Baltimore, MD, 11–20.
Pafel, Jürgen 2005. Quantifier Scope in German. Amsterdam: Benjamins. Park, Jong 1995. Quantifier scope and constituency. In: Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics (= ACL). Cambridge, MA: Association for Computational Linguistics, 205–212. Pinkal, Manfred 1995. Logic and Lexicon: The Semantics of the Indefinite. Dordrecht: Kluwer. Pinkal, Manfred 1996. Radical underspecification. In: P. Dekker & M. Stokhof (eds.). Proceedings of the 10th Amsterdam Colloquium. Amsterdam: ILLC, 587–606. Pinkal, Manfred 1999. On semantic underspecification. In: H. Bunt & R. Muskens (eds.). Computing Meaning, vol. 1. Dordrecht: Kluwer, 33–55. Player, Nickie 2004. Logics of Ambiguity. Ph.D. dissertation. University of Manchester. Poesio, Massimo 1996. Semantic ambiguity and perceived ambiguity. In: K. van Deemter & S. Peters (eds.). Semantic Ambiguity and Underspecification. Stanford, CA: CSLI Publications, 159–201. Poesio, Massimo, Patrick Sturt, Ron Artstein & Ruth Filik 2006. Underspecification and anaphora. Theoretical issues and preliminary evidence. Discourse Processes 42, 157–175. Pulman, Stephen G. 1997. Aspectual shift as type coercion. Transactions of the Philological Society 95, 279–317. Rambow, Owen, David Weir & Vijay K. Shanker 2001. D-tree substitution grammars. Computational Linguistics 27, 89–121. Rapp, Irene & Arnim von Stechow 1999. Fast 'almost' and the visibility parameter for functional adverbs. Journal of Semantics 16, 149–204. Regneri, Michaela, Markus Egg & Alexander Koller 2008. Efficient processing of underspecified discourse representations. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (= ACL), Short Papers. Columbus, OH: Association for Computational Linguistics, 245–248. Reyle, Uwe 1992. Dealing with ambiguities by underspecification. A first order calculus for unscoped representations. In: M. Stokhof & P. Dekker (eds.). Proceedings of the 8th Amsterdam Colloquium. Amsterdam: ILLC, 493–512. Reyle, Uwe 1996. Co-indexing labeled DRSs to represent and reason with ambiguities. In: K. van Deemter & S. Peters (eds.). Semantic Ambiguity and Underspecification. Stanford, CA: CSLI Publications, 239–268. Richter, Frank & Manfred Sailer 1996. Syntax für eine unterspezifizierte Semantik. PP-Anbindung in einem deutschen HPSG-Fragment. In: S. Mehl, A. Mertens & M. Schulz (eds.). Präpositionalsemantik und PP-Anbindung. Duisburg: Institut für Informatik, 39–47. Richter, Frank & Manfred Sailer 2006. Modeling typological markedness in semantics. The case of negative concord. In: S. Müller (ed.). Proceedings of the HPSG 06 Conference. Stanford, CA: CSLI Publications, 305–325. Robaldo, Livio 2007. Dependency Tree Semantics. Ph.D. dissertation. University of Torino. Sag, Ivan 1976. Deletion and Logical Form. Ph.D. dissertation. MIT, Cambridge, MA. Sailer, Manfred 2000. Combinatorial Semantics and Idiomatic Expressions in Head-Driven Phrase Structure Grammar. Doctoral dissertation. University of Tübingen. Sailer, Manfred 2004. Past tense marking in Afrikaans. In: C. Meier & M. Weisgerber (eds.). Proceedings of Sinn und Bedeutung 8. Konstanz: University of Konstanz, 233–248. Sailer, Manfred 2006. Don't believe in underspecified semantics. In: O. Bonami & P. Cabredo Hofherr (eds.). Empirical Issues in Formal Syntax and Semantics, vol. 6. Paris: CSSP, 375–403. Schiehlen, Michael 2000. Semantic construction. In: W. Wahlster (ed.). Verbmobil.
Foundations of Speech-to-Speech Translation. Berlin: Springer, 200–216. Schilder, Frank 2002. Robust discourse parsing via discourse markers, topicality and position. Natural Language Engineering 8, 235–255. Schubert, Lenhart & Francis Pelletier 1982. From English to logic. Context-free computation of ‘conventional’ logical translation. American Journal of Computational Linguistics 8, 26–44. Shieber, Stuart, Fernando Pereira & Mary Dalrymple 1996. Interaction of scope and ellipsis. Linguistics & Philosophy 19, 527–552. Steedman, Mark 2000. The Syntactic Process. Cambridge, MA: The MIT Press.
Steedman, Mark 2007. Surface-Compositional Scope-Alternation without Existential Quantifiers. Draft 5.2, September 2007. Ms. Edinburgh, ILCC, University of Edinburgh. http://www.iccs.informatics.ed.ac.uk/~steedman/papers.html, August 25, 2009. Steriade, Donka 1995. Underspecification and markedness. In: J. Goldsmith (ed.). The Handbook of Phonological Theory. Oxford: Blackwell, 114–174. de Swart, Henriëtte 1998. Aspect shift and coercion. Natural Language and Linguistic Theory 16, 347–385. VanLehn, Kurt 1978. Determining the Scope of English Quantifiers. MA thesis. MIT, Cambridge, MA. MIT Technical Report AI-TR-483. Villalta, Elisabeth 2003. The role of context in the resolution of quantifier scope ambiguities. Journal of Semantics 20, 115–162. Wahlster, Wolfgang (ed.) 2000. Verbmobil. Foundations of Speech-to-Speech Translation. Berlin: Springer. Westerståhl, Dag 1998. On mathematical proofs of the vacuity of compositionality. Linguistics & Philosophy 21, 635–643. Woods, William 1967. Semantics for a Question-Answering System. Ph.D. dissertation. Harvard University, Cambridge, MA. Woods, William 1978. Semantics and quantification in natural language question answering. In: M. Yovits (ed.). Advances in Computers, vol. 17. New York: Academic Press, 2–87.
Markus Egg, Berlin (Germany)
25. Mismatches and coercion

1. Enriched type theories
2. Type coercion
3. Aspect shift and coercion
4. Cross-linguistic implications
5. Conclusion
6. References
Abstract

The principle of compositionality of meaning is the foundation of semantic theory. With function application as the main rule of combination, compositionality requires complex expressions to be interpreted in terms of function-argument structure. Type theory lends itself to an insightful representation of many combinations of a functor and its argument. But not all well-formed linguistic combinations are accounted for within classical type theory as adopted by Montague Grammar. Conflicts between the requirements of a functor and the properties of its arguments are described as type mismatches. Three solutions to type mismatches have been proposed within enriched type theories, namely type raising, type shifting, and type coercion. This article focusses on instances of type coercion. We provide examples, propose lexical and contextual restrictions on type coercion, and discuss the status of coercion as a semantic enrichment operation. The paper includes a special application of the notion of coercion in the domain of tense and aspect. Aspectual coercion affects the relationship between predicative and grammatical aspect. We treat examples from English and Romance languages in a cross-linguistic perspective.
1. Enriched type theories

1.1. Type mismatches

According to the principle of compositionality of meaning, the meaning of a complex whole is a function of the meaning of its composing parts and the way these parts are put together. With function application as the main rule for combining linguistic expressions, compositionality requires that complex expressions be interpreted in terms of function-argument structure. Functors do not combine with just any argument. They look for arguments with particular syntactic and semantic properties. A type mismatch arises when the properties of the argument do not match the requirements of the functor. Such a type mismatch can lead to ungrammaticalities. Consider the ill-formed combination eat laugh, where the transitive verb eat wants to take an object of category NP, and does not combine with the intransitive verb laugh of category VP. In type-theoretical terms, eat is an expression of type (e,(e,t)), which requires an expression of type e as its argument. Given that laugh is of type (e,t), there is a mismatch between the two expressions, so they do not combine by function application, and the combination eat laugh is ungrammatical. In this article we do not focus on syntactic incompatibilities, but on semantic type mismatches and ways to resolve conflicts between the requirements of a functor and the properties of its argument. Sections 1.2 and 1.3 briefly refer to type raising and type shifting as operations defined within an enriched type-theoretical framework. Section 2 addresses type coercion, as introduced by Pustejovsky (1995). Section 3 treats aspectual coercion within a semantic theory of tense and aspect.
1.2. Type raising

As we saw with the ill-formed combination eat laugh, a type mismatch can lead to ungrammaticalities. Not all type mismatches have such drastic consequences. Within extended versions of type theory, mechanisms of type raising have been formulated that deal with certain type mismatches (cf. Hendriks 1993). Consider the VPs kiss Bill and kiss every boy, which are both well-formed expressions of type (e,t). The transitive verb kiss is an expression of type (e,(e,t)). In the VP kiss Bill, the transitive verb directly combines with the object of type e. However, the object every boy is not of type e, but of type ((e,t),t). The mismatch is resolved by raising the type of the argument of the transitive verb. If we assume that a transitive verb like kiss can take either proper names or generalized quantifiers as its argument, we interpret kiss as an expression of type (e,(e,t)) or of type (((e,t),t),(e,t)). Argument raising allows the transitive verb kiss to combine with the object every boy by function application.
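Argument raising can be sketched in the same style; the toy domain and denotations below are again our own invention:

```haskell
-- A sketch of argument raising (ours): kiss : (e,(e,t)) is lifted to
-- (((e,t),t),(e,t)) so that a generalized quantifier can occupy the
-- object slot.
data E = Bill | Boy1 | Boy2 deriving (Eq, Show)

type GQ = (E -> Bool) -> Bool        -- type ((e,t),t)

everyBoy :: GQ
everyBoy p = all p [Boy1, Boy2]      -- toy domain of boys

kiss :: E -> E -> Bool               -- type (e,(e,t)): object, then subject
kiss _obj _subj = True               -- a trivial toy denotation

-- The raised verb takes the quantifier and feeds it the object slot:
raiseArg :: (E -> E -> Bool) -> GQ -> E -> Bool
raiseArg verb gq subj = gq (\obj -> verb obj subj)

kissBill, kissEveryBoy :: E -> Bool
kissBill     = kiss Bill             -- direct application to a type-e object
kissEveryBoy = raiseArg kiss everyBoy
```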
1.3. Type shifting

Type raising affects the domain in which the expression is interpreted, but maintains function application as the rule of combination, so it preserves compositionality. For other type mismatches, rules of type shifting have been proposed, in particular in the interpretation of indefinites (Partee 1987). In standard Montague Grammar, NPs have a denotation in the domain of type e (e.g. proper names) and/or type ((e,t),t) (generalized quantifiers). In certain environments, the indefinite seems to behave more like a
predicate (type (e,t)) than like an argument. We find the predicative use of indefinites in predicative constructions (Sue is a lawyer, She considers him a fool), in measurement constructions (This baby weighs seven pounds), with certain 'light' verbs (This house has many windows), and in existential constructions (There is a book on the table; see article 69 (McNally) Existential sentences). An interpretation of indefinites as denoting properties makes it possible to treat predicative and existential constructions as functors taking arguments of type (e,t) (see article 85 (de Hoop) Type shifting, and references therein). The type-shifting approach raises problems for the role of indefinites in the object position of regular transitive verbs, though. Assuming that we interpret the verb eat as an expression of type (e,(e,t)), and the indefinite an apple as an expression of type (e,t), we would not know how to combine the two in eat an apple. Argument raising does not help in this case, for a functor of type (((e,t),t),(e,t)) does not combine with an argument of type (e,t) either. Two solutions present themselves. If we assign transitive verbs two types, the type (e,(e,t)) and the type ((e,t),(e,t)), we accept a lexical ambiguity of all verbs (Van Geenhoven 1998). If widespread lexical ambiguities are not viewed as an attractive solution, the alternative is to develop combinatory rules other than function application. File Change Semantics and Discourse Representation Theory (Heim 1982, Kamp 1981, Kamp & Reyle 1993) follow this route, but not in a strictly type-theoretical setting. The indefinite introduces a variable that is existentially closed at the discourse level if it is not embedded under a quantificational binder. De Swart (2001) shows that we can reinterpret the DRT rule of existential closure in type-theoretical terms. The price we pay is a fairly complex system of closure rules in order to account for monotone increasing, monotone decreasing and non-monotone quantifiers. These complexities favor a restriction of the property-type denotation of indefinites to special contexts where specific construction rules can be motivated by the particularities of the construction at hand.
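The first solution, the extra property-taking verb type of Van Geenhoven (1998), can be sketched as follows; the toy domain and lexical entries are our own illustration, not part of the cited analysis:

```haskell
-- A sketch (ours) of the property-taking type ((e,t),(e,t)) for eat, with
-- existential closure over the object (cf. Van Geenhoven 1998).
data E = Julie | Apple1 | Apple2 deriving (Eq, Show, Enum, Bounded)

domain :: [E]
domain = [minBound .. maxBound]

apple :: E -> Bool                   -- the indefinite as a property, type (e,t)
apple x = x == Apple1 || x == Apple2

eat :: E -> E -> Bool                -- type (e,(e,t))
eat obj subj = subj == Julie && apple obj

-- The property-taking variant: existential closure over the object.
eatProp :: (E -> Bool) -> E -> Bool
eatProp prop subj = any (\x -> prop x && eat x subj) domain

eatAnApple :: E -> Bool              -- 'eat an apple', type (e,t)
eatAnApple = eatProp apple
```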
2. Type coercion

In Section 1, we saw that enriched type theories have formulated rules of type raising and type shifting to account for well-formed natural language examples that present mismatches in stricter type theories. A third way of resolving type mismatches in natural language has been defined in terms of type coercion, and this route constitutes the focus of this article.
2.1. Type coercion as a strategy to resolve type mismatches

The use of coercion to resolve type mismatches goes back to the work on polysemy by James Pustejovsky (Pustejovsky 1995). Pustejovsky points out that the regular rules of type-theoretic combinatorics are unable to describe sentences such as (1):

(1) a. Mary finished the novel in January 2007. A week later, she started a new book.
    b. John wants a car until next week.

Verbs like start and finish in (1a) do not directly take expressions of type e or ((e,t),t), such as the novel or a new book, for they operate on processes or actions rather than on objects, as we see in start reading, finish writing. As Dowty (1979) points out, the temporal adverbial until next week modifies a hidden or understood predicate in (1b), as in John
wants to have a car until next week. The type mismatch in (1) does not lead to ungrammaticalities, but triggers a reinterpretation of the combination of the verb and its object. The hearer fills in the missing process-denoting expression, and interprets finish the novel as 'finish reading the novel' or 'finish writing the novel', and wants a car as 'wants to have a car'. This process of reinterpretation of the argument, triggered by the type mismatch between a functor (here: the verb finish or start) and its argument (here: the object the novel or a new book), is dubbed type coercion by Pustejovsky. Type coercion is a semantic operation that converts an argument to the type expected by the function, where it would otherwise result in a type error. Formally, function application with coercion is defined as in (2) (Pustejovsky 1995: 111), with α the argument and β the functor:

(2) Function application with coercion. If α is of type c, β of type (a,b), then:
    (i) if type c = a, then β(α) is of type b;
    (ii) if there is a σ ∈ Σα such that σ(α) results in an expression of type a, then β(σ(α)) is of type b;
    (iii) otherwise a type error is produced.

If the argument is of the right type to be the input of the functor, function application applies as usual (i). If the argument can be reinterpreted in such a way that it satisfies the input requirements of the function, function application with coercion applies (ii). Σα represents the set of well-defined reinterpretations of α, leading to the type a for σ(α) that is required as the input of the functor β. If such a reinterpretation is not available, the sentence is ungrammatical (iii). Type coercion is an extremely powerful process. If we assume that any type mismatch between an operator and its argument can be repaired by the hearer simply filling in the understood meaning, we would run the risk of inserting hidden meaningful expressions anywhere, thus admitting random combinations of expressions, and losing any ground for explaining ungrammaticalities. So the process of type coercion must be severely restricted.
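The three clauses of (2) can be read as a small procedure. The following Haskell sketch is our own toy rendering; the Arg representation and the contents of Σ are invented stand-ins for Pustejovsky's typed lexical entries with qualia structure:

```haskell
-- A toy rendering of definition (2), function application with coercion.
data Arg = Obj String                -- an object-denoting expression (type e)
         | Process String Arg        -- a process involving that object
         deriving Show

-- finish is a functor that only accepts process-denoting arguments:
finish :: Arg -> Maybe String
finish (Process how (Obj x)) = Just ("finish " ++ how ++ " the " ++ x)
finish _                     = Nothing

-- Sigma(alpha): well-defined reinterpretations, here supplied by the telic
-- (reading) and agentive (writing) qualia of the noun:
sigma :: Arg -> [Arg]
sigma o@(Obj _) = [Process "reading" o, Process "writing" o]
sigma _         = []

-- (i) apply directly if the types match; (ii) otherwise try the coerced
-- variants in Sigma; (iii) if none fits, a type error results.
applyWithCoercion :: (Arg -> Maybe String) -> Arg -> Either String String
applyWithCoercion beta alpha =
  case beta alpha of
    Just r  -> Right r
    Nothing -> case [r | a <- sigma alpha, Just r <- [beta a]] of
                 (r:_) -> Right r
                 []    -> Left "type error"

-- applyWithCoercion finish (Obj "novel") == Right "finish reading the novel"
```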
2.2. Restrictions on type coercion

There are three important restrictions that Pustejovsky (1995) imposes upon type coercion in order to avoid overgeneralization. First, coercion does not occur freely, but must always be triggered as part of the process of function application (clause (ii) of definition (2)). In (1), the operators triggering type coercion are the verbs finish, start and want. Outside of the context of such a trigger, the nominal expressions the novel, a new book and a car do not get reinterpreted as processes involving the objects described by these noun phrases (clause (i) of definition (2)). Second, type coercion always affects the argument, never the functor. In example (1), this means that the interpretation of finish, start and want is as usual, and only the interpretation of the nominal expressions the novel, a new book and a car is enriched by type coercion. Third, the interpretation process involved in type coercion requires a set of well-defined reinterpretations for any expression α (Σα in clause (ii) of definition (2)). Pustejovsky (1995) emphasizes the role of the lexicon in the reinterpretation process. He develops a generative lexicon, in which not only the referential interpretation of lexical items is spelled out, but an extended interpretation is developed in terms of argument structure, extended event structure and qualia
structure (involving roles). In (1), the relevant qualia of the lexical items novel and book include the fact that these nouns denote artifacts which are produced in particular ways (agentive role: process of writing), and which are used for particular purposes (telic role: process of reading). The hearer uses these extended lexical functions to fill in the understood meaning in examples like (1). The qualia structure is claimed to be part of our linguistic knowledge, so Pustejovsky maintains a strict distinction between lexical and world knowledge. See article 32 (Hobbs) Word meaning and world knowledge for more on the relation between word meaning and world knowledge. Type coercion with verbs like begin, start, finish is quite productive in everyday language use. Google provided many examples involving home construction as in (3a,b), building or producing things more generally (3c-f), and, by extension, organizing (3g,h):

(3) a. When I became convalescent, father urged Mr. X to begin the house.
    b. We will begin the roof on the pavilion on December 1.
    c. Today was another day full of hard work as our team stays late hours to finish the car.
    d. (about a bird) She uses moss, bits of vegetation, spider webs, bark flakes, and pine needles to finish the cup-shaped nest.
    e. Susan's dress is not at any picture, but the lady in the shop gave very good instructions how to finish the dress.
    f. Once I finish an album, I'll hit the road and tour. I got a good band together and I really feel pleased.
    g. The second step was to begin the House Church. (context: a ministry page)
    h. This is not the most helpful way to begin a student-centered classroom.

Examples with food preparation (4a) and consumption (4b) are also easy to find, even though we sometimes need some context to arrive at the desired interpretation (4c).

(4) a. An insomniac, she used to get up at four am and start the soup for dinner, while rising dough for breakfast, and steaming rice for lunch.
    b. The portions were absolutely rediculusly huge, even my boyfriend the human garbage disposal couldn't finish his plate.
    c. A pirate Jeremy is able to drink up a rum barrel in 10 days. A pirate Amelie needs 2 weeks for that. How much time do they need together to finish the barrel?

Finally, we find a wide range of other constructions, involving exercises, tables, exams, etc. to complete (5a), technical equipment or computer programs to run (5b), lectures to teach (by lecturers) (5c), school programs to attend (by students) (5d), etc.
(5) a. Read the instructions carefully before you begin the exam.
    b. Press play to begin the video.
    c. Begin the lesson by asking what students have seen in the sky.
    d. These students should begin the ancient Greek or Latin sequence now if they have not already done so.
These reinterpretations are clearly driven by Pustejovsky’s qualia structure, and are thus firmly embedded in a generative lexicon. But type coercion can also apply in creative
language use, where the context rather than the lexicon drives the reinterpretation process. If the context indicates that Mary is a goat that eats anything she is fed, we could end up with an interpretation of (1a) in which Mary finished eating the novel, and started eating a new book a week later (cf. Lascarides & Copestake 1998). One might be skeptical about the force of contextual pressure. Indeed, a quick Google search revealed no naturally occurring examples of 'begin/finish the book' (as in: a goat beginning/finishing to eat the book). However, there are other examples that suggest a strongly context-dependent application of type coercion:

(6) a. Is it too late to start geranium plants from cuttings? (context: a Q&A page on geraniums; meaning: begin growing geranium plants)
    b. (After briefly looking at spin as an angular momentum property, we will begin discussing rotational spectroscopy.) We finish the particle on a sphere, and begin to apply those ideas to rotational spectography. (context: chemistry class schedule; meaning: finish discussing)

The number of such examples of creative language use is fairly low, but in the right context, their meaning is easy to grasp. All in all, type coercion is a quite productive process, and the limits on this process reside in a combination of lexical and contextual information.
2.3. Putting type coercion to work

The contexts in (7) offer another prominent example of type coercion discussed by Pustejovsky (1995):

(7) a. We will need a fast boat to get back in time.
    b. John put on a long album during dinner.

Although fast is an adjective modifying the noun boat in (7a), it semantically operates on the action involving the boat, rather than the boat itself. Similarly, the adjective long in (7b) has an interpretation as an event modifier, so it selects the telic role of playing the record. Pustejovsky's notion of type coercion has been put to good use in other contexts as well. One application involves the interpretation of associative ('bridging') definite descriptions (Clark 1975):

(8) Sara brought an interesting book home from the library. A photo of the author was on the cover.

The author and the cover are new definites, for their referents have not been introduced in previous discourse. The process of accommodation is facilitated by an extended argument structure, which is based on the qualia structure of the noun (Bos, Buitelaar & Mineur 1995), and the rhetorical structure of the discourse (Asher & Lascarides 1998). Accordingly, the most coherent interpretation of (8) is to interpret the author and the cover as referring to the author and the cover of the book introduced in the first sentence.
Kluck (2007) applies type coercion to metonymic reinterpretations as in (9), merging Pustejovsky's generative lexicon with insights concerning conceptual blending:

(9) a toy truck, a stone lion

The metonymic interpretation is triggered by a conflict between the semantics of the noun and that of the modifying adjective. In all these examples, we find a shift to an 'image' of the object, rather than an instance of the object itself. The metonymic type coercion in (9) comes close to instances of polysemy like those in (10) that Nunberg (1995) has labelled 'predicate transfer'.

(10) The ham sandwich from table three wants to pay.

This sentence is naturally used in a restaurant setting, where one waiter informs another that the client seated at table three, who had a ham sandwich for lunch, is ready to pay. Nunberg analyzes this as a mapping from one property onto a new one that is functionally related to it; in this case, the mapping from the ham sandwich to the person who had the ham sandwich for lunch. Note that the process of predicate transfer in (10) is not driven by a type mismatch, but by other conflicts in meaning (often driven by selection restrictions on the predicate). This supports the view that reinterpretation must be contextually triggered in general. Asher & Pustejovsky (2005) use Rhetorical Discourse Representation Theory to describe the influence of context more precisely (see article 37 (Kamp & Reyle) Discourse Representation Theory for more on discourse theories). They point out that in the context of a fairy tale, we have no difficulty ascribing human-like properties to goats, which allows us to interpret examples like (11):

(11) The goat hated the film but enjoyed the book.

In a more realistic setting, we might interpret (11) as the goat eating the book and rejecting the film (cf. section 2.2 above), but in a fictional interpretation, goats can become talking, thinking and reading agents, thereby assuming characteristics that their sortal typing would not normally allow. Thus discourse may override lexical preferences. Type coercion, metonymic type coercion and predicate transfer can be included in the set of mechanisms that Nunberg (1995) subsumes under the label 'transfer of meaning', and that come into play in creative language use as well as systematic polysemy. Function application with coercion is a versatile notion, which is applied in a wide range of cases where mismatches between a functor and its argument are repaired by reinterpretation of the argument. Coercion restores compositionality in the sense that reinterpretation involves inserting hidden meaningful material that enriches the semantics of the argument, and enables function application. Lexical theories such as those developed by Pustejovsky (1995), Nunberg (1995) and Asher & Pustejovsky (2005), highlighting not only denotational and selective properties, but also functional connections and discourse characteristics, are best suited to capture the systematic relations of polysemy we find in these cases.
2.4. The status of type coercion in comprehension

Given the complexity of the enriched interpretation process underlying coercion, we might expect it to incur a processing cost. The status of type coercion in comprehension is subject to debate in the literature. McElree et al. (2001) and Traxler et al. (2002) found evidence from self-paced reading and eye-tracking experiments that sentences involving coercion, such as the secretary began the memo, cause processing difficulties, evidenced by longer reading times for relevant parts of the sentence. De Almeida (2004) reports two self-paced reading experiments that failed to replicate crucial aspects of previous studies. He uses these results to argue that type-shifting operations are pragmatic inferences computed over underspecified semantic representations. Pickering et al. (2005) report a new eye-tracking experiment based on de Almeida's stimuli, and claim evidence of coercion cost with these items, which also involve begin/finish verbs. According to Pickering et al. (2005) and Traxler et al. (2005), the evidence from eye movements strongly suggests that interpretation is costly when composition requires the on-line construction of a sense not lexically stored or available in the immediate discourse. The debate between de Almeida on the one hand and McElree et al., Pickering et al. and Traxler et al. on the other suggests that there are two ways we can look at the meaning effects in contexts like begin the book. De Almeida does not find online comprehension effects of coercion, which leads him to propose an underspecified semantic representation and treat reinterpretation in the pragmatics. McElree et al. and Traxler et al. do find online comprehension effects of coercion, so they include reinterpretation in the compositional semantics. In the remainder of this article, we will adopt a semantic approach to coercion, while leaving open the possibility of rephrasing the analysis in underspecification terms.
3. Aspect shift and coercion

We find a special application of the notion of coercion in the domain of tense and aspect, which is discussed in this section. Aspectual coercion relies on a conflict between predicative and grammatical aspect, which is resolved by adapting the characteristics of the eventuality description to the requirements of the aspectual operator. The analysis is spelled out in Discourse Representation Theory (DRT), building on insights from de Swart (1998). We will give an informal, more empirical description in sections 3.1–3.5, and specify the construction rules in section 3.6. Working knowledge of DRT is presupposed in this presentation.
3.1. Predicative and grammatical aspect

The appeal to aspectual coercion dates back to work by Moens (1987), Moens & Steedman (1988), and Parsons (1991). Pulman (1997), Jackendoff (1996), de Swart (1998), Zucchi (1998) and subsequent literature work out these ideas along the lines of Pustejovsky (1995), and embed aspectual coercion in the semantics of tense and aspect. Aspectual coercion affects predicative aspect (also called Aktionsart, aspectual class, situation aspect) as well as grammatical aspect. In languages like English, predicative aspect is determined at the predicate-argument level, following insights by Verkuyl (1972, 1993) and Krifka (1989, 1992). The predicate-argument structure provides a description
of a situation or an action, referred to as an eventuality description. Eventuality descriptions come in various subtypes (Vendler 1957). For instance, love Susan describes a state, walk in the woods a process, eat a fish an event, and reach the summit a punctual event or achievement. States are characterized by a lack of anything going on, whereas processes and events are both dynamic eventualities. Non-stative verbs that affect their arguments in a gradual and incremental way refer to a process when they combine with a bare plural or a mass noun (drink milk), and to an event when they combine with a quantized NP (drink a glass of milk). Events have an inherent endpoint, and have quantized reference (Bach 1986, Krifka 1989). States and processes lack an inherent endpoint; they are unbounded, and have homogeneous reference. Processes and events require an evaluation over time intervals, whereas states and achievements involve instants (Vendler 1957). In Discourse Representation Theory (DRT), the ontological nature of the discourse referents is reflected in the use of designated variables s for states, p for processes and e for events; h is used for homogeneous eventualities (states or processes), and d for dynamic eventualities (processes or events). According to Comrie (1976: 1–3), "aspects are different ways of viewing the internal temporal constituency of a situation." Grammatical aspect bears on the eventuality description introduced by the predicate-argument structure. In English, the Progressive and the Perfect are examples of grammatical aspect. In the literature, there is a debate about the status of the Perfect as a temporal operator (building on Reichenbach's 1947 ideas), or as an aspectual operator (Comrie 1976 and others). In de Swart (2007), the temporal and aspectual contribution of the perfect is integrated with its discourse properties. Here we adopt an interpretation of aspectual operators in terms of a mapping relation from one domain of eventualities to another (Bach 1986). Predicative and grammatical aspect are defined in terms of the same kinds of ontological entities: states, processes and events. The mapping approach underlies the DRT approach developed in Kamp & Reyle (1993), de Swart (1998), Schmitt (2001), and others. In she is eating the fish, the Progressive takes an event predicate as input, and produces the state of being in the process of eating the fish as its output. In she has written the letter, the Perfect maps the event onto its consequent state. A tense operator introduces existential closure of the set of eventualities, and locates the eventuality on the time axis. Tense operators thus take scope over predicative and grammatical aspect. Under these assumptions, the syntactic structure of the sentence is as follows:

(12) [Tense [ Aspect* [ eventuality description ]]]

The Kleene star indicates that we can have 0, 1, 2, ... instances of grammatical aspect in a sentence. Tense is obligatory in languages like English, and the eventuality description is fully specified by predicate-argument structure. Grammatical aspect is optional, and aspectual operators apply recursively.

(13) a. Julie was eating a fish.
     [Past [ Prog [ Julie eat a fish ]]]
     b. Julie has been writing a book.
     [Pres [ Perfect [ Prog [ Julie write a book ]]]]
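As a data-structure analogy (ours, not part of the DRT formalism), the hierarchy in (12) and its instantiations in (13) can be rendered as follows, with the list position encoding the Kleene star over grammatical aspect:

```haskell
-- A sketch of the layered structure in (12): Tense is obligatory, the list
-- models the Kleene star over grammatical aspect operators, and the
-- eventuality description is supplied by predicate-argument structure.
data Tense  = Past | Pres                      deriving Show
data Aspect = Prog | Perfect                   deriving Show
newtype EvDesc = EvDesc String                 deriving Show

data Clause = Clause Tense [Aspect] EvDesc     deriving Show

ex13a, ex13b :: Clause
ex13a = Clause Past [Prog]          (EvDesc "Julie eat a fish")    -- (13a)
ex13b = Clause Pres [Perfect, Prog] (EvDesc "Julie write a book")  -- (13b)
```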
The aspectual operator applies (recursively if necessary) to the eventuality description provided by the predicate-argument structure, so in (13b), the Perfect maps the ongoing process of writing a book onto its consequent state. Aspectual restrictions and implications for a cross-linguistic semantics of the perfect are discussed in Schmitt (2001) and de Swart (2003, 2007). Article 49 (Portner) Perfect and progressive offers a general treatment of the semantics and pragmatics of the perfect. Tense operates on the output of the aspectual operator (if present), so in (13a), the state of the ongoing process of eating is located in the past, and no assertion is made about the event reaching its inherent endpoint in the real world. The structure in (13a) gives rise to the semantic representation in DRT format in Fig. 25.1.
Fig. 25.1: Julie was eating a fish
According to Fig. 25.1, there is a state of Julie eating a fish in progress, located at some time t preceding the speech time n ('now'). The intensional semantics of the Progressive (Dowty 1979, Landman 1992, and others) is hidden in the truth conditions of the Prog operator, and is not our concern in this article (see article 49 (Portner) Perfect and progressive for more discussion).
3.2. Aspectual mismatches and coercion

It is well known that the English Progressive does not take just any eventuality description as its argument. The Progressive describes processes and events in their development, and the combination with a state verb frequently leads to ungrammaticalities (14a,b).
(14) a. *?Bill is being sick/in the garden.
     b. *?Julie is having blue eyes.
     c. I'm feeding him a line, and he is believing every word.
     d. Bill is being obnoxious.
However, the Progressive felicitously combines with a state verb in (14c) (from Michaelis 2003) and (14d). A process of aspectual reinterpretation takes place in such examples, and the verb is conceived as dynamic, describing an active involvement of the agent. The dynamic reinterpretation arises out of the conflict between the aspectual requirements of the Progressive and the aspectual features of the state description. De Swart (1998) and Zucchi (1998) use the notion of aspectual coercion to explain the meaning effects arising from such mismatches. In an extended use of the structure in (12), aspectual reinterpretation triggers the insertion of a hidden coercion operator Csd mapping a stative description onto a dynamic one:

(15) a. He is believing every word.
     [Pres [ Prog [ Csd [ he believe every word ]]]]
     b. Bill is being obnoxious.
     [Pres [ Prog [ Csd [ Bill be obnoxious ]]]]

The hidden coercion operator is inserted in the slot for grammatical aspect. In (15), Csd adapts the properties of the state description to the needs of the Progressive. More precisely, the operator Csd reinterprets the state description as a dynamic description, which has the aspectual features that allow it to be an argument of the Progressive operator. In this way, the insertion of a hidden coercion operator resolves the aspectual mismatch between the eventuality description and the aspectual operator in contexts like (15). (15b) gives rise to the semantic representation in DRT in Fig. 25.2:
Fig. 25.2: Bill is being obnoxious
The dynamic variable d, obtained by coercion of the state s', is the input for the Progressive operator. The output of the Progressive is again a state, but the state of an event or process being in progress is more dynamic than the underlying lexical state. The DRT representation contains the coercion operator Csd, with the subscripts indicating the aspectual characterization of the input and output description. The actual interpretation of the coercion operator is pushed into the truth conditions of C, because it is partly dependent on lexical and contextual information (compare also Section 2). The relevant interpretation of Csd in Fig. 25.2 is dynamic:

(16) Dynamic is a function from sets of state eventualities onto sets of dynamic eventualities in such a way that the state is presented as a process or event that the agent is actively involved in.

Note that (14a) and (14b) cannot easily be reinterpreted along the lines of (16), because of the lack of agentivity in the predicate. Following Moens & Steedman (1988), we assume that all mappings between sets of eventualities that are not labelled by overt aspectual operators are free. This implies that all semantically possible mappings that the language allows between sets of eventualities, but that are not the denotation of an overt grammatical aspect operator like the Progressive, the Perfect, etc., are possible values of the hidden coercion operator. Of course, the value of the coercion operator is constrained by the aspectual characteristics of the input eventuality description and the aspectual requirements of the aspectual operator. Thus, the coercion operator in Fig. 25.2 is restricted to being a mapping from sets of states to sets of dynamic eventualities.
3.3. Extension to aspectual adverbials

Grammatical operators like the Progressive or the Perfect are not the only expressions overtly residing in the Aspect slot in the structure proposed in (12). Consider the
sentences in (17)–(19), which illustrate that in adverbials combine with event descriptions, and for and until adverbials with processes or states:

(17) in adverbials
     a. Jennifer drew a spider in five minutes. (event)
     b. #Jennifer ran in five minutes. (process)
     c. #Jennifer was sick in five minutes. (state)

(18) for adverbials
     a. Jennifer was sick for two weeks. (state)
     b. Jennifer ran for five minutes. (process)
     c. #Jennifer drew a spider for five minutes. (event)

(19) until adverbials
     a. Jennifer was sick until today. (state)
     b. Jennifer slept until midnight. (process)
     c. #Jennifer drew a spider until midnight. (event)
The aspectual restrictions on in and for adverbials have been noted since Vendler (1957); see article 48 (Filip) Aspectual class and Aktionsart and references therein. For until adverbials, see de Swart (1996) and references therein. A for adverbial measures the temporal duration of a state or a process, whereas an in adverbial measures the time it takes for an event to culminate. Until imposes a right boundary upon a state or a process. Their different semantics is responsible for the aspectual requirements that in, for and until adverbials impose upon the eventuality description with which they combine, as illustrated in (17)–(19). We focus here on for and in adverbials. Both have been treated as measurement expressions that yield a quantized eventuality (Krifka 1989). Thus to run denotes a set of processes, but to run for five minutes denotes a set of quantized eventualities. Aspectual adverbials can then be treated as operators mapping one set of eventualities onto another (cf. Moens & Steedman 1988). For adverbials map a set of states or processes onto a set of events; in adverbials map a set of events onto a set of events. The mapping approach provides a natural semantics for sentences (17a) and (18a,b). However, the mismatches in (17b,c) and (18c) are marked as pragmatically infelicitous (#), rather than ungrammatical (*). This raises the question whether the aspectual restrictions on in and for adverbials are strictly semantic, or rather loosely pragmatic, in which case an underspecification account might be equally successful (cf. Dölling 2003 for a proposal, and article 24 (Egg) Semantic underspecification for a general discussion of the topic). We can maintain the semantic treatment of aspectual adverbials as two different kinds of measurement expressions if we treat the well-formed examples that combine in with states/processes or for with event descriptions as instances of aspectual coercion, as illustrated in (20) and (21):

(20) a. Sally read a book for two hours. (process reading of event)
     [Past [ for two hours [ Ceh [ Sally read a book ]]]]
     b. Jim hit a golf ball into the lake for an hour. (frequentative reading of event)
     c. The train arrived late for several months. (habitual reading of event)
     [Past [ for several months [ Ceh [ the train arrived late ]]]]
(21) a. We took the train under the English Channel and were in Paris in 3 hours. (inchoative reading of state)
     [Past [ in three hours [ Che [ we were in Paris ]]]]
     b. Jim broke his leg in a car accident last year. Fortunately, it healed well, and in six months he was walking again. (inchoative reading of progressive process)
     [Past [ in six months [ Che [ Prog [ Jim walk ]]]]]

The process reading of the event in (20a) reflects that Sally did not necessarily read the whole book, but was involved in reading for two hours. The event description with a for adverbial in (20b) implies that there is a golf ball that Jim hit into the lake repeatedly for an hour (Van Geenhoven 2005: 119). In (20a-c), the hidden coercion operator Ceh represents the reinterpretation of the event description provided by the predicate-argument structure as an eventuality with homogeneous reference. Depending on the context, a process (20a), a frequentative (20b) or a habitual reading (20c) is the value of Ceh. Examples (21a) and (21b) get an inchoative reading: (21a) picks up on the transition from not being in Paris to being in Paris. The inchoative reading of (21a) and (21b) arises as the reinterpretation of the homogeneous input as a non-homogeneous event by the hidden coercion operator Che. Fig. 25.3 spells out the semantics of (21a) in DRT:
Fig. 25.3: We were in Paris in three hours
The value of Che in Fig. 25.3 is inchoativity, defined as in (22a). The values of the operator Ceh relevant for the examples in (20) are the process reading and the frequentative/habitual reading, defined in (22b,c).

(22) a. Incho is a function from sets of homogeneous eventualities onto sets of event descriptions in such a way that the event describes the onset of the state or process. This interpretation generates the entailment that the state/process holds after the inchoative event.
     b. Proc is a function from sets of eventualities to sets of processes, in such a way that we obtain the process underlying the event predicate without reference to an inherent culmination point.
     c. Freq/Hab is a function from sets of eventualities of any aspectual class onto a set of states. Its interpretation involves a generic or quantificational operator
over eventualities such that the frequentative/habitual state describes a stable property over time of an agent.

Note that habitual readings are constrained to iterable events, which imposes constraints upon the predicate-argument structure, as analyzed in de Swart (2006). The advantage of a coercion analysis over an underspecification approach in the treatment of the aspectual sensitivity of in and for adverbials is that we get a systematic explanation of the meaning effects that arise in the unusual combinations exemplified in (17b,c), (18c), (20) and (21), but not in (17a) and (18a,b). Effects of aspectual coercion have been observed in semantic processing (Piñango et al. 2006 and references therein). Brennan & Pylkkänen (2008) hypothesize that the online effects of aspectual coercion are milder than those of type coercion, because it is easier to shift into another kind of event than to shift an object into an event. Nevertheless, they find an effect with magnetoencephalography, which they take to support the view that aspectual coercion is part of compositional semantics, rather than a context-driven pragmatic enrichment of an underspecified semantic representation, as proposed by Dölling (2003) (cf. also the discussion in section 2.4 above).
3.4. Aspectually sensitive tenses

Aspectual mismatches can arise at all levels in the structure proposed in (12). De Swart (1998) takes the English simple past tense to be aspectually transparent, in the sense that it lets the aspectual characteristics of the eventuality description shine through. Thus, both he was sick and he ate an apple get an episodic reading, and the sentences locate a state and an event in the past, respectively. Not all English tenses are aspectually transparent. The habitual interpretation of he drinks or she washes the car (Schmitt 2001, Michaelis 2003) arises out of the interaction of the process/event description and the present tense, but cannot be taken to be the meaning of either, for he drank and she washed the car have an episodic besides a habitual interpretation, and he is sick has just an episodic interpretation. We can explain the special meaning effects if we treat the English simple present as an aspectually restricted tense. It is a present tense in the sense that it describes the eventuality as overlapping in time with the speech time. It is aspectually restricted in the sense that it exclusively operates on states (Schmitt 2001, Michaelis 2003). This restriction is the result of two features of the simple present, namely its characterization as an imperfective tense, which requires a homogeneous eventuality description as its input (cf. also the French Imparfait in section 4.2 below), and the fact that it posits an overlap of the eventuality with the speech time, which requires an evaluation in terms of instants, rather than intervals. States are the only eventualities that combine these two properties. As a result, the interpretation of (23a) is standard, and the sentence gets an episodic interpretation, but the aspectual mismatch in (23b) and (23c) triggers the insertion of a coercion operator Cds, reinterpreting the dynamic description as a stative one (see Fig. 25.4).

(23) a. Bill is sick.
     [ Present [ Bill be sick ]]
     b. Bill drinks.
     [ Present [ Cds [ Bill drink ]]]
     c. Julie washes the car.
     [ Present [ Cds [ Julie wash the car ]]]
Fig. 25.4: Julie washes the car
In (23b) and (23c), the value of Cds is a habitual interpretation. A habitual reading is stative, because it describes a regularly recurring action as a stable property over time (Smith 2005). The contrast between (23a) and (23b,c) confirms the general feature of coercion, namely that reinterpretation only arises if coercion is forced by an aspectual mismatch. If there is no conflict between the aspectual requirements of the functor and the input, no reinterpretation takes place, and we get a straightforward episodic interpretation (23a). Examples (23b) and (23c) illustrate that the reinterpretations that are available through coercion are limited to those that are not grammaticalized by the language (Moens 1987). The overt Progressive blocks the progressive interpretation of sentences like (23b,c) as the result of a diachronic development in English (Bybee, Perkins & Pagliuca 1994: 144).
3.5. Aspectual coercion and rhetorical structure

At first sight, the fact that an episodic as well as a habitual interpretation is available for he drank and she washed the car might be a counterexample to the general claim that aspectual coercion is not freely available:

(24) Bill drank.
     a. [Past [ Bill drink ]] (episodic interpretation)
     b. [Past [ Cds [ Bill drink ]]] (habitual interpretation)

(25) Julie washed the car.
     a. [Past [ Julie washed the car ]] (episodic interpretation)
     b. [Past [ Cds [ Julie washed the car ]]] (habitual interpretation)
Given that the episodic interpretation spelled out in (24a) and (25a) does not show an aspectual mismatch, we might not expect to find habitual interpretations involving a stative reinterpretation of the dynamic description, as represented in (24b) and (25b), but we do. Such aspectual reinterpretations are not triggered by a sentence-internal aspectual conflict, but by rhetorical requirements of the discourse. Habitual interpretations can be triggered if the sentence is presented as part of a list of stable properties of the subject over time, as in (26):

(26) a. In high school, Julie made a little pocket money by helping out the neighbours. She washed the car, and mowed the lawn.
     b. In college, Bill led a wild life. He drank, smoked, and played in a rock band, rather than going to class.

Aspectual coercion may be governed by discourse requirements as in (26), but even then, the shifted interpretation is not free, but must be triggered by the larger context, and more specifically the rhetorical structure of the discourse. An intermediate case is provided by examples in which an adverb in the sentence indicates the rhetorical function of the eventuality description, and triggers aspectual coercion, as in (27):

(27) Suddenly, Jennifer knew the answer. (inchoative reading of state)
     [Past [ Csd [ Jennifer know the answer ]]]
Suddenly only applies to happenings, but know the answer is a state description, so (27) illustrates an aspectual mismatch. The insertion of a hidden coercion operator solves the conflict, so we read the sentence in such a way that the adverb suddenly brings out the inception of the state as the relevant happening in the discourse.
3.6. Key features of aspectual coercion

In this section, we defined aspectual mapping operations with coercion in DRT. Formally, the proposal can be summed up as follows:
(28) Aspectual mappings with coercion in DRT
     (i) When a predicate-argument structure S1 is modified by an aspectual operator Asp denoting a mapping from a set of eventualities of aspectual class a to a set of eventualities of aspectual class a', introduce into the universe of discourse of the DRS a new discourse referent a' and introduce into the set of conditions a new condition a': Asp K1.
     (ii) If S1 denotes an eventuality description of aspectual class a, introduce into the universe of discourse of K1 a new discourse referent a', and introduce into the set of conditions of K1 a new condition a': γ, where γ is the denotation of S1.
     (iii) If S1 denotes an eventuality description of aspectual class a'' (a'' ≠ a), and there is a coercion operator Ca''a such that Ca''a(γ) denotes a set of eventualities of aspectual class a, introduce into the universe of K1 a new discourse referent a', and introduce into the set of conditions of K1 a new condition a': Ca''a(γ), where γ is the denotation of S1.
     (iv) If S1 denotes an eventuality description of aspectual class a'' (a'' ≠ a), and there is no coercion operator Ca''a such that Ca''a(γ) denotes a set of eventualities of aspectual class a, modification of S1 by the aspectual operator Asp is semantically ill-formed.
According to this definition, aspectual coercion only arises when aspectual features of an argument do not meet the requirements of the functor (typically an adverbial expression, an aspectual operator, or a tense operator) (compare clauses ii and iii). As a result, we never find coercion operators that map a set of states onto a set of states, or a set of events onto a set of events, but we always find mappings from one
aspectual domain onto another one. Reinterpretation requires an enriched interpretation of the argument that meets the aspectual requirements of the functor (clause iii). The functor itself does not shift, but maintains its normal interpretation. The range of possible interpretations of a hidden coercion operator excludes the ones that are the denotation of an overt aspectual operator. Coercion operators generally have a range of possible meanings, which are hidden in the truth conditions, so that the actual interpretation of the coercion operator in a sentence depends on lexical and contextual information. Several possible values for aspectual transitions have been listed in the course of this section.
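Read procedurally, (28) is a small decision procedure: match the aspectual classes, otherwise look up a hidden coercion operator, otherwise fail. The following Haskell sketch is our own simplified illustration; the inventory collapses the h/d superclasses into three basic classes, and the operator labels are informal:

```haskell
-- A simplified decision procedure for (28), clauses (ii)-(iv).
data AspClass = State | Process | Event deriving (Eq, Show)

-- Hidden coercion operators C, indexed by input and output class; free
-- mappings hold between distinct classes only (cf. section 3.6):
coerceOp :: AspClass -> AspClass -> Maybe String
coerceOp State   Event   = Just "C_he (inchoative)"
coerceOp State   Process = Just "C_sd (dynamic)"
coerceOp Event   Process = Just "C_eh (process reading)"
coerceOp Event   State   = Just "C_ds (habitual)"
coerceOp Process State   = Just "C_ds (habitual)"
coerceOp _       _       = Nothing

-- combine: class required by the aspectual operator vs. class of S1.
combine :: AspClass -> AspClass -> Either String String
combine required actual
  | required == actual = Right "direct application (clause ii)"
  | otherwise = case coerceOp actual required of
      Just op -> Right ("insert " ++ op ++ " (clause iii)")
      Nothing -> Left "semantically ill-formed (clause iv)"
```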
4. Cross-linguistic implications

Whether we find instances of aspectual coercion in a language, and which meaning effects are created, depends on the aspectuo-temporal system of the language. Therefore, it is useful to extend the discussion to languages other than English. We restrict ourselves to some observations concerning Romance languages, the aspectual systems of which are well studied.
4.1. The Romance simple present

The semantic value of coercion operators is limited to those aspect shifts that are not the denotation of overt grammatical operators. In section 3.4 above, we observed that the English Simple Present tense is an aspectually sensitive tense operator, which exclusively locates a state as overlapping with the speech time. The Simple Present gets a habitual reading when it combines with an event predicate (23b,c). There is a clear contrast between English and Romance languages like French, where the simple present tense is also an aspectually sensitive tense operator, but allows both habitual and progressive interpretations, as illustrated in (29) (from Michaelis 2003):

(29) a. Faites pas attention, Mademoiselle. Il vous taquine!
     'Don't pay any attention to him, miss. He's teasing you.'
     b. La pratique régulière du jogging prolonge la vie de deux à huit ans.
     'Regular jogging prolongs life from two to eight years.'

French lacks a grammaticalized progressive. As a result, a progressive or a habitual interpretation is possible as the value of the coercion operator Ceh, which is inserted to repair the mismatch between the event description and the aspectually sensitive tense operator. The habitual reading of the coercion operator has been defined in (22c) above; its progressive interpretation is defined in (30):

(30) Prog is a function from sets of eventualities to sets of processes, in such a way that we obtain the process underlying the event predicate without reference to an inherent culmination point.

The Italian and Spanish simple presents are similar to French, but the Portuguese simple present tense behaves like its English counterpart, and a periphrastic progressive is required to refer to an ongoing event (Schmitt 2001).
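The blocking logic behind this cross-linguistic contrast can be made explicit in a toy model; the per-language operator inventories below are our own, deliberately simplified illustration:

```haskell
-- A toy model of the blocking generalization: the readings open to the
-- hidden coercion operator are those not claimed by an overt grammatical
-- operator of the language.
data Lang = English | French | Portuguese deriving (Eq, Show)

overtOperators :: Lang -> [String]
overtOperators English    = ["progressive", "perfect"]
overtOperators French     = ["perfect"]       -- no grammaticalized progressive
overtOperators Portuguese = ["progressive", "perfect"]

-- Candidate values for the coercion operator repairing an event description
-- under an aspectually sensitive present tense:
presentCoercionReadings :: Lang -> [String]
presentCoercionReadings l =
  filter (`notElem` overtOperators l) ["habitual", "progressive"]

-- presentCoercionReadings English == ["habitual"]
-- presentCoercionReadings French  == ["habitual","progressive"]
```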
4.2. The perfective/imperfective contrast in Romance

De Swart (1998) focuses on the contrast between the perfective and imperfective past tenses in French, called the Passé Simple (PS) and the Imparfait (Imp). She adopts Kamp & Rohrer's (1983) insight that Passé Simple sentences introduce events into the discourse representation, and Imparfait sentences introduce states. Given that events typically happen in sequence, and move the narrative time forward, whereas states introduce background information valid at the current reference time, Kamp & Rohrer's characterization of the Passé Simple and the Imparfait provides an insightful account of (31) and (32):

(31) Le lendemain, elle se leva, but un café, et partit tôt.
     The next-day, she refl got-up.ps, drank.ps a coffee, and left.ps early
     'The next day, she got up, drank a cup of coffee, and left early.'

(32) Il alla ouvrir les volets. Il faisait un grand soleil.
     He went.ps open.inf the blinds. It made.imp a large sun
     'He went to open the blinds. The sun was shining.'

In order to describe the role of the Passé Simple and the Imparfait in narrative discourse, a characterization in terms of events and states/processes is quite useful, so we would like to maintain this insight. However, Kamp & Rohrer (1983) do not work out the relation between the discourse functioning of the sentence and its internal structure. What is of particular interest to us is the combination of predicative aspect with the different tense forms. (31) and (32) exemplify state descriptions in the Imparfait, and event descriptions in the Passé Simple. Although this seems to be the default, a wider range of possibilities is available in the language. (33) combines a state description with the Passé Simple, and (34) and (35) an event description with the Imparfait:

(33) Il désira voir l'Imprimerie, il la vit et fut content.
     He desired.ps see.inf the Press, he her saw.ps and was.ps content
     'He desired to see the Press, saw it, and was satisfied.'
(34) Chaque jour, Henri écrivait une lettre qu'il envoyait à Florence.
     Every day, Henri wrote.imp a letter that he sent.imp to Florence
     'Every day, Henri wrote a letter that he sent to Florence.'

(35) Il faisait ses devoirs quand on sonna à la porte.
     He made.imp his homework when one rang.ps at the door
     'He was doing his homework when the doorbell rang.'

The combination of a state description with an Imparfait describes the state in its duration (32), but the combination with the Passé Simple triggers an inchoative reading (33). An event description in the Passé Simple introduces a culminated event (31), but the Imparfait gets a habitual (34) or a progressive interpretation (35). These meaning effects are well described in French grammars, and a compositional theory of aspect should capture them in a systematic way. In one line of work, the Passé Simple and the Imparfait are described as fused tense forms that combine a past tense with a perfective
or an imperfective operator (Smith 1991/1997, Vet 1994, Verkuyl 1993). Given that the contribution of tense and aspect cannot be morphologically separated in the French past tenses, and the semantic contribution of the perfective/imperfective contrast correlates in a systematic way with predicative aspect, de Swart (1998) proposes to treat the Passé Simple and the Imparfait as aspectually sensitive tense operators. The Passé Simple and the Imparfait are both past tense operators, but differ in that the Passé Simple locates quantized events in the past, whereas the Imparfait locates homogeneous states/processes in the past. The eventuality descriptions in (31) and (32) satisfy the aspectual requirements of the tense operator, so their semantic representation does not involve a grammatical aspect operator:

(36) a. Elle but (ps) un café.
     [ Past [ she drink a coffee ]]
     b. Il faisait (imp) un grand soleil.
     [ Past [ the sun shine ]]

(33)–(35) exemplify a mismatch between the aspectual requirements of the tense form and the properties of the eventuality description. The insertion of a hidden coercion operator restores compositionality, but requires a reinterpretation of the argument:

(37) a. Il fut (ps) content.
     [ Past [ Che [ he be content ]]]
     b. Il faisait (imp) ses devoirs.
     [ Past [ Ceh [ he do his homework ]]]

The inchoative reading of (37a) is the result of the reinterpretation of the state description as an event by the coercion operator Che that also played a role in example (21) (definition of Incho in (22a)) above. The process reading of (37b) arises out of the reinterpretation of the event description as an eventuality with homogeneous reference by the coercion operator Ceh that also played a role in example (20a) (definition of Proc in (22b)) above. The insertion of the coercion operators Che and Ceh enriches the semantics of the eventuality description in such a way that it meets the aspectual requirements of the tense operator. The structures in (37) give rise to the following semantic representations in DRT:
Fig. 25.5: Il fut content
The value of the coercion operator depends on lexical and contextual information, so the range of meanings in (33)–(35) is accounted for by the various possible reinterpretations introduced in section 3 above. The coercion operators emphasize that the meaning effects are not part of the semantics of the Passé Simple and Imparfait per se, but arise out of the conflict between the aspectual requirements of the tense operator and the input eventuality description.
Fig. 25.6: Il faisait ses devoirs
The outcome of the coercion process is that state/process descriptions reported in the Passé Simple are event denoting, and fit into a sequence of ordered events, whereas event descriptions reported in the Imparfait are state denoting, and do not move the reference time forward.

(38) a. Ils s'aperçurent de l'ennemi. Jacques eut grand peur.
     They refl noticed.ps of the enemy. Jacques had.ps great fear
     'They noticed the enemy. Jacques became very fearful.'
     b. Il se noyait quand l'agent le sauva en le retirant de l'eau.
     He refl drowned.imp when the officer him saved.ps by him withdraw.part from the water
     'He was drowning when the officer saved him by dragging him out of the water.'

The examples in (38) are to be compared to those in (31) and (32), where no coercion takes place, but the discourse structures induced by the Passé Simple and the Imparfait are similar. We conclude that the coercion approach maintains Kamp & Rohrer's (1983) insights about the temporal structure of narrative discourse in French, while offering a compositional analysis of the internal structure of the sentence. Versions of a treatment of the perfective/imperfective contrast in terms of aspectually sensitive tense operators have been proposed for other Romance languages, cf. Schmitt (2001). If we embed the analysis of the perfective and imperfective past tenses in the hierarchical structure proposed in (12), we predict that they always take wide scope over overt aspectual operators, including aspectual adverbials, negation, and quantificational adverbs (iterative adverbs like twice and frequentative adverbs like often). If these expressions are interpreted as mappings between sets of eventualities, we predict that the distribution of perfective/imperfective tenses is sensitive to the output of the aspectual operator. De Swart (1998) argues that eventuality descriptions modified by in and for adverbials are quantized (cf. section 3.3 above), and are thus of the right aspectual class to serve as the input to the Passé Simple in French. She shows that coercion effects arise in the combination of these adverbials with the Imparfait. De Swart & Molendijk (1999) extend the analysis to negation, iteration and frequency in French. Lenci & Bertinetto (2000) treat the perfective/imperfective past tense distribution in sentences expressing iteration and frequency in Italian, and Pérez-Leroux et al. (2007) show that similar meaning effects arise in Spanish. Coercion effects also play a role in first and second language acquisition; cf. Montrul & Slabakova (2002), Pérez-Leroux et al. (2007), and references therein for aspectuality.
5. Conclusion

This article outlined the role that type coercion and aspectual reinterpretation play in the discussion of type mismatches in the semantic literature. This is not to say that the coercion approach has gone unchallenged, though. Three issues stand out. The question whether coercion is a semantic enrichment mechanism or a pragmatic inference based on an underspecified semantics has led to an interesting debate in the semantic processing literature. The question is not entirely resolved at this point, but there are strong indications that type coercion and aspectual coercion lead to delays in online processing, supporting a semantic treatment. As far as aspectual coercion is concerned, the question whether predicative aspect and grammatical aspect are to be analyzed with the same ontological notions (as in the mapping approach discussed here), or require two different sets of tools (as in Smith 1991/1997), is part of an ongoing discussion in the literature. It has been argued that the lexical component of the mapping approach is not fine-grained enough to account for more subtle differences in predicative aspect (Caudal 2005). In English, there is a class of verbs (including read, wash, comb, ...) that easily allow the process reading, whereas others (eat, build, ...) do not (Kratzer 2004). Event decomposition approaches (as in Kempchinsky & Slabakova 2005) support the view that some aspectual operators apply at a lower level than the full predicate-argument structure, which affects the view on predicative aspect developed by Verkuyl (1972/1993). The contrast between English-type languages, where aspectual operators apply at the VP level, and Slavic languages, where aspectual prefixes are interpreted at the V level, is highly relevant here (Filip & Rothstein 2006). The sensitivity of coercion operators to lexical and contextual information implies that the semantic representations developed in this article are not complete without a mechanism that resolves the interpretation of the coercion operator in the context of the sentence and the discourse. So far, we only have a sketch of how this would have to run (Pulman 1997, de Swart 1998, Hamm & van Lambalgen 2005, Asher & Pustejovsky 2005). There is also a more fundamental problem. Most operators we discussed in the aspectual section are sensitive to the stative/dynamic distinction, or the homogeneous/quantized distinction. This implies that the relation established by coercion is not strictly functional, for it leaves us with several possible outputs (Verkuyl 1999). The same issue arises with type coercion. However, spelling out the different readings semantically would bring us back to a treatment in terms of ambiguities or underspecification, and we would lose our grip on the systematicity of the operator-argument relation. Notwithstanding the differences in framework, and various problems with the implementation of the notion of coercion, the idea that reinterpretation effects are real, and require a compositional analysis, seems to have found its way into the literature.
6. References
de Almeida, Roberto 2004. The effect of context on the processing of type-shifting verbs. Brain & Language 90, 249–261.
Asher, Nicholas & Alex Lascarides 1998. Bridging. Journal of Semantics 15, 83–113.
Asher, Nicholas & James Pustejovsky 2005. Word Meaning and Commonsense Metaphysics. Ms. Austin, TX: University of Texas/Waltham, MA: Brandeis University.
Bach, Emmon 1986. The algebra of events. Linguistics & Philosophy 9, 5–16.
Bos, Johan, Paul Buitelaar & Anne-Marie Mineur 1995. Bridging as coercive accommodation. In: S. Manandhar (ed.). Proceedings of the Workshop on Computational Logic for Natural Language Processing. Edinburgh.
Brennan, Jonathan & Liina Pylkkänen 2008. Processing events. Behavioural and neuromagnetic correlates of aspectual coercion. Brain & Language 106, 132–143.
Bybee, Joan, Revere Perkins & William Pagliuca 1994. The Evolution of Grammar. Tense, Aspect and Modality in the Languages of the World. Chicago, IL: The University of Chicago Press.
Caudal, Patrick 2005. Stage structure and stage salience for event semantics. In: P. Kempchinsky & R. Slabakova (eds.). Aspectual Inquiries. Dordrecht: Springer, 239–264.
Clark, Herbert H. 1975. Bridging. In: R. C. Schank & B. L. Nash-Webber (eds.). Theoretical Issues in Natural Language Processing. New York: Association for Computing Machinery, 169–174.
Comrie, Bernard 1976. Aspect. An Introduction to the Study of Verbal Aspect and Related Problems. Cambridge: Cambridge University Press.
Dölling, Johannes 2003. Aspectual (re)interpretation. Structural representation and processing. In: H. Härtl & H. Tappe (eds.). Mediating between Concepts and Grammar. Berlin: Mouton de Gruyter, 303–322.
Dowty, David R. 1979. Word Meaning and Montague Grammar. The Semantics of Verbs and Times in Generative Semantics and Montague's PTQ. Dordrecht: Reidel.
Filip, Hana & Susan Rothstein 2006. Telicity as a semantic parameter. In: J. Lavine et al. (eds.). Formal Approaches to Slavic Linguistics (= FASL) 14. Ann Arbor, MI: University of Michigan Slavic Publications, 139–156.
Van Geenhoven, Veerle 1998. Semantic Incorporation and Indefinite Descriptions. Semantic and Syntactic Aspects of Noun Incorporation in West Greenlandic. Stanford, CA: CSLI Publications.
Van Geenhoven, Veerle 2005. Atelicity, pluractionality and adverbial quantification. In: H. J. Verkuyl, H. de Swart & A. van Hout (eds.). Perspectives on Aspect. Dordrecht: Springer, 107–124.
Hamm, Fritz & Michiel van Lambalgen 2005. The Proper Treatment of Events. Oxford: Blackwell.
Heim, Irene 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation. University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms.
Hendriks, Herman 1993. Studied Flexibility. Categories and Types in Syntax and Semantics. Doctoral dissertation. University of Amsterdam.
Jackendoff, Ray 1996. The proper treatment of measuring out, telicity and perhaps even quantification in English. Natural Language and Linguistic Theory 14, 305–354.
Kamp, Hans 1981. A theory of truth and semantic representation. In: J. Groenendijk, T. Janssen & M. Stokhof (eds.). Formal Methods in the Study of Language. Amsterdam: Mathematical Centre, 277–321.
Kamp, Hans & Christian Rohrer 1983. Tense in texts. In: R. Bäuerle, Ch. Schwarze & A. von Stechow (eds.). Meaning, Use, and Interpretation of Language. Berlin: Mouton de Gruyter, 250–269.
Kamp, Hans & Uwe Reyle 1993. From Discourse to Logic. Dordrecht: Kluwer.
Kempchinsky, Paula & Roumyana Slabakova (eds.) 2005. Aspectual Inquiries. Dordrecht: Springer.
Kluck, Marlies 2007. Optimizing interpretation from a generative lexicon. A case study of metonymic type coercion in modified nouns. In: K. Kanzaki, P. Bouillon & L. Danlos (eds.). Proceedings of the 4th International Workshop on Generative Approaches to the Lexicon. Paris.
Kratzer, Angelika 2004. Telicity and the meaning of objective case. In: J. Guéron & J. Lecarme (eds.). The Syntax of Time. Cambridge, MA: The MIT Press, 389–423.
Krifka, Manfred 1989. Nominal reference, temporal constitution and quantification in event semantics. In: R. Bartsch, J. van Benthem & P. van Emde Boas (eds.). Semantics and Contextual Expression. Dordrecht: Foris, 75–115.
Krifka, Manfred 1992. Thematic relations as links between nominal reference and temporal constitution. In: I. Sag & A. Szabolcsi (eds.). Lexical Matters. Stanford, CA: CSLI Publications, 29–53.
Landman, Fred 1992. The progressive. Natural Language Semantics 1, 1–32.
Lascarides, Alex & Ann Copestake 1998. Pragmatics and word meaning. Journal of Linguistics 34, 387–414.
Lenci, Alessandro & Pier M. Bertinetto 2000. Aspect, adverbs and events. Habituality vs. perfectivity. In: J. Higginbotham, F. Pianesi & A. C. Varzi (eds.). Speaking of Events. Oxford: Oxford University Press, 245–287.
McElree, Brian, Matthew Traxler, Martin Pickering, Rachel E. Seely & Ray Jackendoff 2001. Reading time evidence for enriched semantic composition. Cognition 78, B17–B25.
Michaelis, Laura A. 2003. Headless constructions and coercion by construction. In: E. J. Francis & L. A. Michaelis (eds.). Mismatch. Form-Function Incongruity and the Architecture of Grammar. Stanford, CA: CSLI Publications, 259–310.
Moens, Marc 1987. Tense, Aspect and Temporal Reference. Ph.D. dissertation. University of Edinburgh.
Moens, Marc & Mark Steedman 1988. Temporal ontology and temporal reference. Computational Linguistics 14, 15–28.
Montrul, Silvina A. & Roumyana Slabakova 2002. The L2 acquisition of morphosyntactic and semantic properties of the aspectual tenses preterite and imperfect. In: A. T. Pérez-Leroux & J. M. Liceras (eds.). The Acquisition of Spanish Morphosyntax. The L1/L2 Connection. Dordrecht: Kluwer, 113–149.
Nunberg, Geoffrey 1995. Transfers of meaning. Journal of Semantics 12, 109–132.
Parsons, Terence 1991. Events in the Semantics of English. A Study in Subatomic Semantics. Cambridge, MA: The MIT Press.
Partee, Barbara H. 1987. Noun phrase interpretation and type-shifting principles. In: J. Groenendijk, D. de Jongh & M. Stokhof (eds.). Studies in Discourse Representation Theory and the Theory of Generalized Quantifiers. Dordrecht: Foris, 115–143.
Pérez-Leroux, Ana T., Alejandro Cuza, Monica Majzlanova & Jeanette Sánchez-Naranjo 2007. Non-native recognition of the iterative and habitual meanings of Spanish preterite and imperfect tenses. In: J. M. Liceras, H. Zobl & H. Goodluck (eds.). The Role of Formal Features in Second Language Acquisition. Mahwah, NJ: Lawrence Erlbaum, 432–451.
Pickering, Martin, Brian McElree & Matthew Traxler 2005. The difficulty of coercion. A response to de Almeida. Brain & Language 93, 1–9.
Piñango, Maria M., Aaron Winnick, Rashad Ullah & Edgar Zurif 2006. Time-course of semantic composition. The case of aspectual coercion. Journal of Psycholinguistic Research 35, 233–244.
Pulman, Stephen 1997. Aspectual shift as type coercion. Transactions of the Philological Society 95, 279–317.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press.
Reichenbach, Hans 1947. Elements of Symbolic Logic. New York: Free Press, paperback edn. 1966.
Schmitt, Cristina 2001. Cross-linguistic variation and the present perfect. The case of Portuguese. Natural Language and Linguistic Theory 19, 403–453.
Smith, Carlota S. 1991. The Parameter of Aspect. Dordrecht: Kluwer, revised 2nd version 1997.
Smith, Carlota S. 2005. Aspectual entities and tense in discourse. In: P. Kempchinsky & R. Slabakova (eds.). Aspectual Inquiries. Dordrecht: Springer, 223–237.
de Swart, Henriëtte 1996. Meaning and use of 'not ... until'. Journal of Semantics 13, 221–263.
de Swart, Henriëtte 1998. Aspect shift and coercion. Natural Language and Linguistic Theory 16, 347–385.
de Swart, Henriëtte 2001. Weak readings of indefinites. Type-shifting and closure. The Linguistic Review 18, 69–96.
de Swart, Henriëtte 2003. Coercion in a cross-linguistic theory of aspect. In: E. J. Francis & L. A. Michaelis (eds.). Mismatch. Form-Function Incongruity and the Architecture of Grammar. Stanford, CA: CSLI Publications, 231–258.
de Swart, Henriëtte 2006. Aspectual implications of the semantics of plural indefinites. In: S. Vogeleer & L. Tasmowski (eds.). Non-definiteness and Plurality. Amsterdam: Benjamins, 169–189.
de Swart, Henriëtte 2007. A cross-linguistic discourse analysis of the perfect. Journal of Pragmatics 39, 2273–2307.
de Swart, Henriëtte & Arie Molendijk 1999. Negation and the temporal structure of narrative discourse. Journal of Semantics 16, 1–42.
Traxler, Matthew, Martin Pickering & Brian McElree 2002. Coercion in sentence processing. Evidence from eye-movements and self-paced reading. Journal of Memory and Language 47, 530–547.
Traxler, Matthew, Brian McElree, Rihana S. Williams & Martin Pickering 2005. Context effects in coercion. Evidence from eye movements. Journal of Memory and Language 53, 1–25.
Vendler, Zeno 1957. Verbs and times. Philosophical Review 66, 143–160. Reprinted in: Z. Vendler, Linguistics in Philosophy. Ithaca, NY: Cornell University Press, 1967, 97–121.
Verkuyl, Henk J. 1972. On the Compositional Nature of the Aspects. Dordrecht: Reidel.
Verkuyl, Henk J. 1993. A Theory of Aspectuality. The Interaction between Temporal and Atemporal Structure. Cambridge: Cambridge University Press.
Verkuyl, Henk J. 1999. Aspectual Issues. Studies on Time and Quantity. Stanford, CA: CSLI Publications.
Vet, Co 1994. Petite grammaire de l'Aktionsart et de l'aspect. Cahiers de Grammaire 19, 1–17.
Zucchi, Sandro 1998. Aspect shift. In: S. Rothstein (ed.). Events and Grammar. Dordrecht: Kluwer, 349–370.
Henriëtte de Swart, Utrecht (The Netherlands)
26. Metaphors and metonymies
1. Introduction
2. Traditional approaches to metaphor
3. Pragmatic accounts of metaphor
4. Psycholinguistic approaches to figurative language
5. Cognitive and conceptual accounts
6. Metonymy
7. References
Abstract
For centuries, the study of metaphor and metonymy was primarily the province of rhetoricians. Although scholars developed a number of variations on the theme, the prevailing perspectives, i.e. that these tropes are stylistic devices used primarily for literary purposes and result from some kind of transfer of meaning between similar entities, remained largely unchanged from Aristotle until the mid-twentieth century. Aspects of this long tradition still exist in current accounts that continue to argue that metaphoric meaning is in some way deviant, involves transference, and should be analyzed on a word-by-word basis. In the last 50 years, linguists, such as Lakoff and Sperber & Wilson, have joined philosophers, such as Black and Grice, in the debate. The result has been a sharp increase in interest in non-literal language and a number of major innovations in metaphor theory. Metaphor and metonymy are now understood to be ubiquitous aspects of language, not simply fringe elements. The study of metaphor and metonymy has provided a major source of evidence for Cognitive Linguists to argue against the position that a sharp divide exists between semantics and pragmatics. Mounting empirical data from psychology, especially work by Gibbs, has led many to question the sharp boundary between literal and metaphorical meaning. While distinct perspectives remain among contemporary metaphor theorists, very recent developments in relevance theory and cognitive semantics also show intriguing areas of convergence.
1. Introduction
It is an exaggeration to say that metaphor and metonymy are the tropes that launched 1,000 theories. Still, they are long-recognized uses of language that have stimulated considerable debate among language scholars and most recently served as a major impetus for Cognitive Linguistics, a theory of language which challenges the modularity hypothesis and postulates such as a strict divide between semantics and pragmatics (cf. article 88 (Jaszczolt) Semantics and pragmatics). From Aristotle to the 1930's, the study of metaphor and metonymy was primarily seen as part of the tradition of rhetoric. Metaphor in particular was seen as a highly specialized, non-literal, 'deviant' use of language, used primarily for literary or poetic effect. Although scholars developed a number of variations on the theme, the basic analysis – that metaphor involved using one term to stand for another – remained largely unchanged. Beginning with Richards (1936) and his insight that the two elements in a metaphor (the tenor and vehicle) are both active contributors to the interpretation, the debate has shifted away from understanding metaphor as simply substituting one term for another or setting up a simple comparison (although this definition is still found in reference books such as the Encyclopedia Britannica). In the main, contemporary metaphor scholars now believe that the interpretations and inferences generated by metaphors are too complex to be explained in terms of simple comparisons. (However, see discussion of resemblance metaphors, below.) Richards' insights stirred debate among philosophers such as Black and Grice. Subsequently, linguists (for example, Grady, Johnson, Lakoff, Reddy, Sperber & Wilson, and Giora) and psychologists (for example, Coulson, Gibbs, and Keysar & Bly) have joined in the discussion. The result has been a sharp increase in interest in metaphor and metonymy. Debate has revolved around issues such as the challenges such non-literal language poses for truth-conditional semantics and pragmatics, the problem of determining the division between literal and non-literal language, and the insights metaphor and metonymy potentially offer to our understanding of the connections between language and cognition. The upshot has been a number of major innovations which include, but are not limited to, a deeper understanding of the relations between the terms in linguistic metaphor, that is, tenor (or target) and vehicle (or source); exploration of the type of knowledge, 'dictionary' versus 'encyclopedic', necessary to account for the interpretation of metaphor; and development of a typology of metaphors, that is, primary, complex, and resemblance metaphor. Given the recent diversity of scholarly activity, universally accepted definitions of either metaphor or metonymy are difficult to come by. All metaphor scholars recognize certain examples of metaphor, such as 'Ted is the lion of the Senate', and agree that such
examples of metaphor represent, in some sense, non-literal use of language. Scholars holding a more traditional, truth-conditional perspective might offer a definition along the lines of 'use of a word to create a novel class inclusion relationship'. In contrast, conceptual metaphor theorists hold that metaphor is "a pattern of conceptual association" (Grady 2007) resulting in "understanding and experiencing one kind of thing in terms of another" (Lakoff & Johnson 1980: 5). A key difference between perspectives is that the truth-conditional semanticists represent metaphor as a largely word-level phenomenon, while cognitive semanticists view metaphor as reflective of cognitive patterns of organization and processing more generally. Importantly, even though exemplars of metaphor such as 'Ted is a lion' are readily agreed upon, the prototypicality of such a metaphor is controversial. More traditional approaches tend to treat 'Ted is a lion' as a central example of metaphor; in contrast, conceptual metaphor theory sees it as one of three types of metaphor (a resemblance metaphor) and not necessarily the most prototypical. According to Evans & Green (2006) and Grady (2008), resemblance metaphors represent the less prototypical type of metaphors and reflect a comparison involving cultural norms and stereotypes. A subset of resemblance metaphors, termed image metaphors, are discussed by Lakoff & Turner (1989) as more clearly involving physical comparisons, as in 'My wife's waist is an hourglass'. For cognitive semanticists, uses such as 'heavy' in Stewart's brother is heavy, meaning Stewart's brother is a serious, deep thinker, represent the most prevalent, basic type of metaphor, termed primary metaphor. In contrast, a truth-conditional semanticist (cf. articles 23 (Kennedy) Ambiguity and vagueness and 24 (Egg) Semantic underspecification) would likely classify this use of 'heavy' as a case of literal language use (sometimes referred to as a dead metaphor) involving polysemy and ambiguity. Not surprisingly, analogous divisions are found in the analyses of metonymy. All semanticists recognize a statement such as 'Brad is just another pretty face' or 'The ham sandwich is ready for the check' as central examples of metonymy. Moreover, there is general agreement that a metonymic relationship involves referring to one entity in terms of a closely associated or conceptually contiguous entity. Nevertheless, as with analyses of metaphor, differences exist in terms of whether metonymy is considered solely a linguistic phenomenon in which one linguistic entity refers to another or whether it is reflective of more general cognition. Cognitive semanticists argue "metonymy is a cognitive process in which one conceptual entity […] provides mental access to another conceptual entity within the same [cognitive] domain […]" (Kövecses & Radden 1998: 39). The article is organized as follows: Section 2 will present an overview of the traditional approaches to metaphor which include Aristotle's classic representation as well as Richards', Black's and Davidson's contributions. Section 3 addresses pragmatic accounts of metaphor with particular attention to recent discussions within Relevance theory. Section 4 presents psycholinguistic approaches, particularly Gentner's 'career of metaphor' account and Glucksberg's 'metaphor-as-categorization' account and the more recent "dual reference" hypothesis. Section 5 focuses on cognitive and conceptual accounts.
It provides an overview of Lakoff & Johnson’s (1980) original work on conceptual metaphor, later refinements of that work, especially development of the theory of experiential correlation and Grady’s typology of metaphors, and conceptual blending theory. Section 6 addresses metonymy.
2. Traditional approaches to metaphor
Metaphor and metonymy have traditionally been considered stylistic devices employed to embellish literal interpretations of words and sentences for literary and poetic effects, with their meaning involving special deviations from the encoded content of language. Of the two tropes, metaphor has been given far more attention and our discussion here will primarily focus on it. The metaphor/metonymy-as-deviance view goes back at least as far as Aristotle, who defined metaphor (and metonymy) as "giving the thing a name that belongs to something else" (Poetics 1457, cited in Gibbs 1994: 210). Aristotle speculated that metaphor involved "transference" of the meaning of one word to another. Thus, his analysis was confined to the word level with a focus on similarities between two things as the basis for metaphors. Aristotle's views engendered two major modes of thought in metaphor research that have continued to the present, one based on the "comparison" of two apparently dissimilar items, and the other based on the "substitution" of one item for another. These traditional conceptions of metaphor were disputed by I. A. Richards (1936), who claimed that in metaphor, two entities or thoughts are equally "active", copresent, and interactive with each other (Taverniers 2002: 21). In John is a rat, for instance, we know that a person ("tenor") and a rat ("vehicle") are obviously distinct but they still can share certain traits, such as being furtive and filthy, and such recognition gives rise to the conceptualization of John as a vile person ("ground" or new meaning). Richards' work represents an important innovation which metaphor scholars continue to draw on; most current analyses recognize the two terms in a metaphor as playing the roles of 'source' and 'target'. Black (1962, 1979) expanded on Richards' theory and propounded his own version of the interaction model of metaphor in an explicit attempt to overcome the oversimplified view of metaphor as lexical comparison or substitution. For Black, metaphor comprehension in fact "creates" new similarities, rather than simply highlighting certain unnoticed similarities. In Man is a wolf, for example, the term wolf calls up a system of "associated commonplaces" of the word (such as "fierce", "predatory", "hunting prey in packs", etc.), and that system serves as a "filter" through which man is understood, highlighting only those traits that fit the notion of man. This leads to a new conceptualization of man as a "wolf-like" creature, which constitutes the "cognitive content" of the metaphor that cannot be captured by literal paraphrasing. Interesting echoes of this analysis are found in Sperber & Wilson's recent work involving accessing encyclopedic knowledge when interpreting metaphor and cognitive semanticists' work on blending theory. Davidson (1978) critiqued Black's interaction model from the perspective of truth-conditional semantics. Sharply distinguishing between what words mean (semantics) and what words are used to do (pragmatics), Davidson argued that metaphor squarely belongs to the realm of language use and itself has no "meaning" or truth-conditional propositional content. To him, "metaphor means what the words mean and nothing more" (Davidson 1978: 30). Davidson thus claims that metaphor is a special use of this literal meaning to evoke some new, unnoticed insight, and that the content of this insight is external to the meaning of the metaphor, thus denying any "cognitive content" in metaphor.
Davidson (1981) further argued that, when juxtaposed in an X is Y frame, any two things can be understood as bearing a metaphorical relation, claiming, “There are no unsuccessful metaphors” (Davidson 1981: 201). In other words, there are no constraints
on metaphor, and hence no systematic patterns of metaphor. Like Black, and in keeping with the long-standing tradition, Davidson believed that metaphor is a lexical phenomenon. Gibbs (1994: 220–221) points out that Davidson's "metaphors-without-meaning" view reflects the centuries-old belief in the importance of an ideal scientific-type language and that Davidson's denial of meaning or cognitive content in metaphor is motivated by a theory-internal need to restrict the study of meaning to truth-conditional sentences.
3. Pragmatic accounts of metaphor
Since their inception in the 1960's, most pragmatic theories have preserved or reinforced this sharp line of demarcation between semantic meanings and pragmatic interpretations in their models of metaphor understanding. In his theory of conversational implicature, Grice (1969/1989, 1975) argued that the non-literal meaning of utterances is derived inferentially from a set of conversational maxims (e.g. be truthful and be relevant). Given the underlying assumption that talk participants are rational and cooperative, an apparent violation of any of those maxims triggers a search for an appropriate conversational implicature that the speaker intended to convey in context (cf. articles 92 (Simons) Implicature and 94 (Potts) Conventional implicature). In this conception of non-literal language understanding, known as the "standard pragmatic model", metaphor comprehension involves a series of inferential steps – analyze the literal meaning of an utterance first, then reject it due to its literal falsity, and then derive an alternative interpretation that fits the context. In Robert is a bulldozer, for instance, the literal interpretation is patently false (i.e. violates the maxim of quality) and this recognition urges the listener to infer that the speaker must have meant something non-literal (e.g., "persistent and does not let obstacles stand in his way in getting things done"). In his speech act theory, Searle (1979) also espoused this standard model and posited an even more elaborate set of principles and steps to be followed in deriving non-literal meanings. One significant implication of the standard pragmatic view is that metaphors, whose interpretation is indirect and requires conversational implicatures, should take additional time to comprehend over that needed to interpret equivalent literal speech, which is interpreted directly (Gibbs 1994, 2006b). The results of numerous psycholinguistic experiments, however, have shown this result to be highly questionable. In the majority of experiments, listeners/readers were found to take no longer to understand the figurative interpretations of metaphor, metonymy, sarcasm, idioms, proverbs, and indirect speech acts than to understand equivalent literal expressions, particularly if these are seen in realistic linguistic and social contexts (see Gibbs 1994, 2002 for reviews; see also the discussion below of psycholinguistic analyses of metaphor understanding for findings on processing time for conventional versus novel figurative expressions). Within their increasingly influential framework of relevance theory, Sperber & Wilson (1995; Wilson & Sperber 2002a, 2002b) present an alternative model of figurative language understanding that does not presuppose an initial literal interpretation and its rejection. Instead, metaphors are processed similarly to literal speech in that interpretive hypotheses are considered in their order of accessibility with the process stopping once the expectation of "optimal relevance" has been fulfilled. A communicative input is optimally relevant when it connects with available contextual assumptions to yield maximal cognitive effects (such as strengthening, revising, and negating such contextual
assumptions) without requiring any gratuitous processing effort. At the heart of this "fully inferential" model is the idea that "human cognition tends to be geared to the maximization of relevance" (i.e. the Cognitive Principle of Relevance) and that "every ostensive stimulus conveys a presumption of its own optimal relevance" (i.e. the Communicative Principle of Relevance) (Wilson & Sperber 2002a: 254–256). With respect to non-literal meanings, relevance theorists embrace a "continuity view, on which there is no clear cut-off point between 'literal' utterances, approximations, hyperboles and metaphors, and they are all considered to be interpreted in the same way" (Wilson & Carston 2006: 406), a position very similar to that of Conceptual Metaphor theorists and Cognitive Linguistics (see below). In their latest articulation of relevance theory, Sperber & Wilson (2008: 84) call their approach to figurative language a "deflationary" account in which they posit no mechanism specific to metaphor, making abundantly clear their position that there is "no interesting generalization" applicable only to metaphors, distinct from literal and other non-literal types of language use. On this view, metaphor comprehension simply calls for finding an interpretation that satisfies the presumption of optimal relevance, as with any other ostensive stimuli (i.e. stimuli, verbal or nonverbal, that are designed to attract an audience's attention to home in on the meaning intended by the communicator, who conveys this very intention to the audience in what Sperber & Wilson 2004 call "ostensive-inferential communication"). To achieve this, the listener needs to enrich the decoded sentence meaning at the explicit level to derive its "explicature" (through disambiguation, reference assignment, and other enrichment processes) and complement it at the implicit level by supplying contextual assumptions which combine with the adjusted explicit meaning to yield enough contextual implications (which vary in degrees of strength) or other cognitive effects to make the utterance relevant in the expected way (Sperber & Wilson 2004: 408). In Caroline is a princess, for instance, the adjusted explicit content may contain not just the encoded concept princess (daughter of royalty) but a related "ad hoc" concept princess* in the context in which the assigned referent of Caroline is manifestly not royal. This ad hoc concept matches particular contextual assumptions or "implicated premises" (such as "spoiled" and "indulged") made accessible through the encyclopedic knowledge about the concept princess. The combination of the explicit content (Caroline is a princess*) and the encyclopedic assumptions (A princess is spoiled, indulged, etc.) yields some strong contextual implications or "implicated conclusions" (Caroline is spoiled, indulged, etc.). Notice that this online process of constructing an ad hoc concept at the explicit level involves what Carston (2002a, 2002b) and Wilson & Carston (2006) call "broadening", or category extension where the ad hoc concept (in this case princess*) comes to cover a broader scope of meaning than the linguistically encoded concept (princess).
Lexical broadening is held to be pervasive in everyday language use, particularly in the form of approximation (using flat to mean “flattish” in a sentence like Holland is flat), and it does not preserve literalness, unlike “narrowing”, the other variety of loose use of language in which the constructed meaning is narrower or more specific than the linguistically encoded meaning, as in the use of temperature in a sentence like I have a temperature (Sperber & Wilson 2008: 91), in which temperature is taken to mean an internal body temperature above 98.7 degrees. While some cases of non-literal use of language, such as hyperbole, involve only broadening of the encoded concept, most metaphors involve both broadening and narrowing and thus cannot be treated as cases of simple category extension (Sperber & Wilson 2008: 95; see also the discussion of the “class inclusion” view of metaphor below).
In the case of Caroline is a princess, the contextually derived ad hoc concept princess* is narrowed, in a particular context, from the full extension of the encoded concept princess to cover only a certain type of princess ("spoiled" and "indulged"), excluding other kinds (e.g. "well-bred", "well-educated", "altruistic", etc.). It is also at the same time a case of broadening in that the ad hoc concept covers a wider category of people who behave in self-centered, spoiled manners, but are non-royalty, and whose prototypical members are coddled girls and young women. The relevance-theoretic account of figurative language understanding thus pivots on the "literal-loose-metaphorical continuum", whereby no criterion can be found for distinguishing literal, loose, and metaphorical utterances, at least in terms of inferential steps taken to arrive at an interpretation that satisfies the expectation of optimal relevance. What is crucial here is the claim that the fully inferential process of language understanding unfolds equally for all types of utterances and that the linguistically encoded content of an utterance (hence, the "decoded" content of the utterance for the addressee) merely serves as a starting point for inferring the communicator's intended meaning, even when it is a "literal" interpretation. In other words, strictly literal interpretations that involve neither broadening nor narrowing of lexical concepts are derived through the same process of mutually adjusting explicit content (i.e. explicature, as distinct from the decoded content) with implicit content (i.e. implicature), as in loose and metaphorical interpretations of utterances (Sperber & Wilson 2008: 93). As seen above, this constitutes a radical departure from the standard pragmatic model of language understanding, which gives precedence to the literal over the non-literal in the online construction of the speaker's intended meaning. Relevance theory assumes, instead, that literal interpretations are neither default interpretations to be considered first, nor are they equivalent to the linguistically encoded (and decoded) content of given utterances. The linguistically denoted meaning recovered by decoding is therefore "just one of the inputs to an inferential process which yields an interpretation of the speaker's meaning" (Wilson & Sperber 2002b: 600) and has "no useful theoretical role" in the study of verbal communication (Wilson & Sperber 2002b: 583). Under this genuinely inferential model, "interpretive hypotheses about explicit content and implicatures are developed partly in parallel rather than in sequence and stabilize when they are mutually adjusted so as to jointly confirm the hearer's expectations of optimal relevance" (Sperber & Wilson 2008: 96; see also Wilson & Sperber 2002b). As an illustration of how the inferential process may proceed in literal and figurative use of language, let us look at two examples from Sperber & Wilson (2008).

(1) Peter: For Billy's birthday, it would be nice to have some kind of show.
Mary: Archie is a magician. Let's ask him. (Sperber & Wilson 2008: 90)

In this verbal exchange, the word magician can be assumed to be ambiguous in its linguistic denotation, between (a) someone with supernatural powers who performs magic (magician1), and (b) someone who does magic tricks to amuse an audience (magician2).
In the minimal context given, where two people are talking about a show for a child's birthday party, the second encoded sense is likely to be activated first, under the assumption that Mary's utterance will achieve optimal relevance by addressing Peter's suggestion that they have a show for Billy's birthday party. Since magicians in the sense of magician2 put on magic shows that children enjoy, this assumption, combined with
Peter's wish to have a show for Billy's birthday, yields an implicit conclusion that Archie could put on a magic show for Billy's birthday party, which in turn helps to stabilize the explicit content of Mary's utterance as denoting that Archie is a magician2. The overall interpretation that emerges in the end to satisfy the expectation of optimal relevance would thus be something along the lines of "Archie is a magician2 who could put on a magic show for Billy's birthday party that the children would enjoy". This inferential process for the literal interpretation of the word magician, which involves neither broadening nor narrowing, also holds for its metaphorical interpretation, such as the one in (2) below.

(2) Peter: I've had this bad back for a while now, but nobody has been able to help.
Mary: My chiropractor is a magician. You should go and see her. (Sperber & Wilson 2008: 96)

If we assume that Mary's utterance will achieve optimal relevance by addressing Peter's expressed concern about his back pain, the use of the word magician, combined with Peter's worry that no ordinary treatments work for him, is likely to activate the first sense magician1 and invite the assumption that magicians can achieve extraordinary things. This assumption, together with the information that chiropractors are those who specialize in healing back pain, then yields an implicit conclusion that Mary's chiropractor is capable of achieving extraordinary things and would thus be able to help Peter better than others. This conclusion in turn helps to stabilize the explicit content of Mary's utterance as involving a metaphorical magician*, an ad hoc concept contextually constructed on the fly, through broadening, to mean someone who is capable of achieving extraordinary things in certain situations. The overall interpretation that congeals in the end would thus be something along the lines of "Mary's chiropractor is a magician*, who would be able to help Peter better than others by achieving extraordinary things". These two examples indicate that literal and metaphorical interpretations are derived through essentially equivalent inferential steps (see Sperber & Wilson 2008: 95–96 for a more detailed explication of the inferential steps putatively involved in the interpretation of these two examples). Given these observations, relevance theorists argue that their account of metaphor is psychologically and cognitively plausible and escapes the psycholinguistic criticism leveled against the standard pragmatic model. Since the relevance-theoretic heuristic for utterance understanding, as was seen above, is to take a path of least effort and stop once the expectation of relevance is fulfilled, understanding metaphorical expressions does not presuppose the primacy of literal interpretations and their rejection, but can go straight to non-literal interpretations from the minimal decoding of linguistically coded stimuli. Furthermore, hypotheses about explicatures, implicated premises, and implicated conclusions should not be taken to be sequentially ordered; rather, they are developed online in parallel against a background of expectations (Wilson & Sperber 2002a: 261–262).
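The derivation of (2) can be condensed into the following schematic steps (a restatement of the reasoning just described, not Sperber & Wilson's own tabular presentation):

Decoded content: Mary's chiropractor is a magician.
Lexical adjustment (broadening): magician* = someone who achieves extraordinary things.
Explicature: Mary's chiropractor is a magician*.
Implicated premise: Someone who achieves extraordinary things may help where ordinary treatments have failed.
Implicated conclusion: Mary's chiropractor may be able to help Peter's back; Peter should go and see her.

On the relevance-theoretic view, these hypotheses are mutually adjusted in parallel until they jointly satisfy the expectation of optimal relevance.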
Nevertheless, it still remains unclear how the notions of narrowing and broadening can account for "category-crossing" cases like Robert is a bulldozer, where its supposed ad hoc concept bulldozer* (with psychological properties such as "aggressive" and "obstinate") cannot be drawn from the lexical entry of bulldozer or its encyclopedic entry because such properties simply do not exist in these entries stored in long-term memory – a gap in the relevance-theoretic explanation of metaphor as a
kind of loose talk, as acknowledged by Carston (2002a). This problem pertains most centrally to the issue of "emergent meaning" to be discussed below in the review of cognitive-linguistic approaches to metaphor, particularly in connection with the theory of conceptual blending. Another theoretical issue recently raised by some philosophers of language (Stern 2006; Camp 2006a) with regard to relevance-theoretic and other "contextualist" accounts of metaphor (Sperber & Wilson 1995; Wilson & Sperber 2002a, 2002b; Wilson & Carston 2006; Carston 2002a, 2002b; Recanati 2004) is whether or how much contextual factors "intrude" into the realm of semantics to affect "what is said", as opposed to "what is merely communicated", in figurative language understanding (see the June 2006 special issue of Mind and Language for a recent overview of this issue). In the traditional or Gricean view, "what is said" is equated with what is linguistically encoded, or a minimal propositional form supplied by the semantics of an utterance, plus the "saturation" of variables (such as indexicals) present in linguistic structure. This minimal literal proposition (i.e. "sentence meaning") in turn serves as the basis for further pragmatic inferencing to yield "what is implicated" or meant by the speaker (i.e. "speaker's meaning"). On this view, metaphor does not impinge upon "what is said", but simply remains a matter of communicating something by saying something else (hence, a "violation" of maxims). Those in the contextualist camp challenge this conception as an oversimplification, and argue that metaphorical language prompts the addressee to create ad hoc concepts through the kind of narrowing and broadening discussed above, which contributes to the explicit content of the utterance (Carston 2002a: 85). From the contextualist viewpoint, therefore, metaphor does affect what is overtly communicated (or "explicature" in relevance-theoretic terms) in determining the full explicit proposition of an utterance, as opposed to its minimal literal proposition, which is criticized as being too fragmentary to provide the basis for further pragmatic inferencing. Contra Davidson (1978) and Rorty (1989), few present-day "literalists", in fact, dispute that metaphors "express truths" and concern truth-evaluable semantic content (Stern 2006: 245). Neither do they accept the traditional view, which stipulates that "what is said" is simply conventionally encoded semantic meaning (Camp 2006a: 300). What current literalists are objecting to, instead, is the way contextualists are treating "what is said" as a notion "radically disconnected from conventional meaning" (Camp 2006a: 300). Literalists contend that the contextualist accounts of metaphor fail to explain how the metaphorical depends on the literal, or how literal meanings are somehow "active" in metaphorical interpretations of utterances (Stern 2006: 250; also see Giora's discussion of the role of salient meaning in metaphor understanding). It is worth noting that this contextualist-literalist controversy on how metaphor affects "what is said" arises only when one maintains a highly modular view of linguistic meaning and communication, in which a rather strict division of labor is presupposed between semantics and pragmatics, as well as a divide between encoded word meaning and encyclopedic knowledge.
One final sticky point of contention surrounding the relevance-theoretic account of metaphor concerns the role of context in the time course of figurative language processing, an issue raised by Rachel Giora, who puts forth an alternative model of language understanding dubbed the graded salience hypothesis (Giora 1997, 1998, 1999, 2002, 2008, inter alia). Citing a large body of empirical research attesting to the primacy of context-independent "salient" lexical meanings, including her own experimental work,
Giora argues that relevance theory fails to explain cases in which contextually incompatible lexical meanings are in fact activated (or fail to be "suppressed") in the initial stage of metaphor understanding. Under her hypothesis, the meaning that is obligatorily accessed in the initial phase of comprehension is the most salient meaning coded in the mental lexicon, rather than the contextually sanctioned meaning, regardless of its literal or non-literal status, a position that runs counter to the "direct access" model assumed in relevance theory, where only contextually compatible meanings are activated to satisfy the expectation of optimal relevance. According to Giora, the degree of salience of a word or an expression is modulated by its conventionality, familiarity, frequency, or prototypicality (e.g. the meaning that one has just learned, the meaning activated by previous context, the meaning frequently used in a particular discourse community, etc.), rather than its contextual compatibility. In this theory, textual context is thus assigned a limited role in the initial stage of comprehension and, most importantly, it is ineffective in blocking the activation of highly salient meanings even when they are contextually incompatible since "it does not interact with lexical processes but runs in parallel" (Giora 2002: 490). Context does come into play in the later stage to prompt sequential processing whereby the salient meaning, when contextually inappropriate, is rejected in favor of a meaning that matches the context. Therefore, measures that tap online processing show that processing novel metaphors, for instance, activates the salient literal meaning initially, long before the metaphoric meaning becomes available (Giora 1999: 923). Giora argues that this calls for a revision of the relevance-theoretic account of metaphor, incorporating the graded salience hypothesis to capture the psychologically real online process of metaphor understanding (Giora 1998).
4. Psycholinguistic approaches to figurative language
In metaphor research, there is a long genealogy of psycholinguistic studies conducted to examine how people produce and understand figurative language, and a number of theories have been proposed in this tradition to explain possible psychological processes behind metaphor production and comprehension. Among those psycholinguistic approaches to metaphor, two influential perspectives, among others, have emerged in recent years, largely independently from the traditional and pragmatic accounts of figurative language discussed above, to capture aspects of online metaphor processing: the "career of metaphor" theory developed by Gentner and colleagues (e.g. Bowdle & Gentner 2005; Gentner & Bowdle 2001; Gentner & Wolff 1997), and the "metaphor-as-categorization" theory proposed by Glucksberg and associates (Glucksberg 2001; Glucksberg & Keysar 1990). Focusing mostly on nominal metaphors of the "X is Y" type such as My lawyer is a shark, these two approaches differ most saliently in their accounts of how such nominal metaphorical statements are processed online. In the career-of-metaphor theory, metaphorical expressions are understood in one of two distinct processing modes: either as a simile (i.e. a comparison assertion) or as a categorization (i.e. a class-inclusion assertion), and the processing mode of each metaphorical expression hinges on its degree of familiarity. Under this model, therefore, novel metaphors are invariably processed as comparisons at first but after repeated use become gradually conventionalized and then eventually become familiar enough to be processed as categorization assertions. In other words, a metaphor "undergoes a process of gradual abstraction and conventionalization as it evolves from its novel use
to becoming a conventional 'stock' metaphor" (Gentner & Bowdle 2008: 116). What underlies this theory is the assumption that metaphors are essentially comparisons in their origin. By contrast, metaphor-as-categorization theorists argue that metaphors are fundamentally different from comparisons and are rather intended and understood as categorical, class-inclusion statements. On this view, the sentence My lawyer is a shark is thus understood by seeking the closest superordinate category that encompasses the two concepts, lawyer and shark (in this case, the category of predators, or any creatures that are vicious, aggressive and merciless), rather than by comparing the properties of lawyer and shark to find common features. Despite these differences, on the other hand, the metaphor-as-categorization theory shares with the career-of-metaphor theory the basic assumption that metaphors and similes are fundamentally equivalent and are thus essentially interchangeable: while the latter argues that metaphors are implicit similes, in which features are extracted from the two given concepts and matched with one another, the former contends that metaphors can be expressed as similes to make implicit categorization assertions. More recent evidence, however, paints a different picture, according to Glucksberg (2008), suggesting that both the comparison view and the categorization view are mistaken about the relation between metaphors and similes. In order to test the finding by Bowdle & Gentner (2005) that novel metaphors were preferred in simile/comparison form (X is like Y), while conventional ones were preferred in metaphor/categorization form (X is Y), Haught & Glucksberg (2004) selected a set of apt and comprehensible conventional metaphors (e.g. My lawyer was (like) a shark and Some ideas are (like) diamonds) and then made them novel by using adjectives that are applicable to the metaphor topic, but not to the literal metaphor term (e.g. My lawyer was (like) a well-paid shark). Participants in this study rated these modified novel metaphors as apt as their original conventional counterparts when they were presented in metaphor/categorization form, but much less apt in simile/comparison form, offering a counterexample to the career-of-metaphor hypothesis. Based on the finding of this study, Glucksberg (2008: 77) proposes the "dual reference" hypothesis, that "the metaphor vehicle in similes refers at the literal level, but in metaphors at the superordinate metaphorical level", saying that a metaphorical shark can plausibly be well paid but the literal marine creature is not readily characterizable in terms of monetary income. This new hypothesis does seem to reconcile the discrepancy between the two positions (career-of-metaphor vs. metaphor-as-categorization) in their attempts to unravel the online mechanism of metaphor understanding, especially for noun-based metaphors intensely studied in this strand of research on figurative language processing. Yet it still fails to offer much in the way of illuminating insights into what actually motivates and constrains particular uses of metaphor in the first place. This brings us to more cognitively and conceptually oriented approaches to metaphor.
5. Cognitive and conceptual accounts Conceptual metaphor (and metonymy) theory, as first articulated by George Lakoff and Mark Johnson (1980), represents a radical departure from theories discussed thus far and is clearly embedded in a larger theory of language (Cognitive Linguistics) which holds that language is a reflection of human cognition and general cognitive processes (cf. articles 27 (Talmy) Cognitive Semantics, 28 (Taylor) Prototype theory, 29 (Gawron)
Frame Semantics, and 86 (Kay & Michaelis) Constructional meaning for broader perspectives on language and cognition within the framework of Cognitive Linguistics). Moving away from the deviance tradition, Lakoff & Johnson focused on the ubiquity of, often highly conventionalized, metaphor in everyday speech and what this reveals about general human cognition. Rather than placing metaphor on the periphery of language, Lakoff & Johnson argue that it is central to human thought processes and hence central to language. They defined metaphor as thinking about a concept (the target) from one knowledge domain in terms of another domain (the source). In particular, they articulated the notion of embodied meaning, i.e., that human conceptualization is crucially structured by bodily experience with the physical-spatial world; much of metaphor is a reflection of this structuring by which concepts from abstract, internal or less familiar domains are structured by concepts from the physical/spatial domains. This structuring is held to be asymmetrical. The argument is that many of these abstract or internal domains have not developed language in their own terms; for instance, English has few domain-specific temporal terms and generally represents time in terms of the spatial domain. Thus language from the familiar, accessible, intersubjective domains of the social-spatial-physical is recruited to communicate about the abstract and internal. This overarching vision is illustrated in Lakoff & Johnson's well known analysis of the many non-spatial, metaphoric meanings of up and down. They note that up and down demonstrate a wide range of meanings in everyday English, a few of which are illustrated below:

(3) a. The price of gas is up/down. (amount)
b. Jane is further up/down in the corporation than most men. (power, status)
c. Scooter is feeling up/down these days. (mood)

Their argument is that all these uses of up and down reflect a coherent pattern of human bodily experience with verticality. The importance of up and down to human thinking is centrally tied to the particularities of the human body; humans have a unique physical asymmetry stemming from the fact that we walk on our hind legs and that our most important organs of perception are located in our heads, as opposed to our feet. Moreover, the up/down pattern cannot be attributed simply to these two words undergoing semantic extension, as the conceptual metaphor is found with most words expressing a vertical orientation:

(4) a. The price of gas is rising/declining.
b. She's in high/low spirits today.
c. She's over/under most men in the corporation.

This analysis explains a consistent asymmetry found in such metaphors; language does not express concepts from social/physical/spatial domains in terms of concepts from abstract or internal domains. For example, concepts such as physical verticality are not expressed in terms of emotions or amount; a sentence such as He seems particularly confident cannot be interpreted to mean something like He seems particularly tall, whereas He's standing tall now does have the interpretation of a feeling of confidence and well-being. For Conceptual Metaphor Theorists, the asymmetry of mappings represents a key, universal constraint on the vast majority of metaphors.
In contrast to other approaches, conceptual metaphor theory tends not to consider metaphoric use of words and phrases on a case-by-case, word-by-word basis, but rather points out broader patterns of use. Thus, a distinction is made between conceptual metaphors, which represent underlying conceptual structure, and metaphoric expressions, which are understood as linguistic reflections of the underlying conceptual structure organized by systematic cross-domain mappings. The correspondences are understood to be mappings across domains of conceptual structure. Cognitive semanticists hold that words do not refer directly to entities or events in the world, but rather they act as access points to conceptual structure. They also advocate an encyclopedic view of word meaning and argue that a lexical item serves as an access point to richly patterned memory (Langacker 1987, 1991, 1999, 2008; also see the discussion of Conceptual Blending below). Lakoff & Johnson argue that such coherent patterns of bodily experience are represented in memory in what they term 'image schemas', which are multi-modal, multifaceted, systematically patterned conceptualizations. Some of the best explored image schemas include BALANCE, RESISTANCE, BEGINNING-PATH-END POINT, and MOMENTUM (Gibbs 2006a). Conceptual metaphors draw on these image schemas and can form entrenched, multiple mappings from a source domain to a target domain. Her problems are weighing her down is a metaphorical expression of the underlying conceptual metaphor difficulties are physical burdens, which draws on the image-schematic structure representing physical resistance to gravity and momentum. The metaphor maps concepts from the external domain of carrying physical burdens onto the internal domain of mental states. Such underlying conceptual metaphors are believed to be directly accessible without necessarily involving any separate processing of a literal meaning and subsequent inferential processes to determine the metaphorical meaning (Gibbs 1994, 2002). Hence, Conceptual Metaphor Theory questions a sharp division between semantics and pragmatics, rejecting the traditional stance that metaphoric expressions are first processed as literal language and, only when no appropriate interpretation can be found, are then processed using pragmatic principles. The same underlying conceptual metaphor is found in numerous metaphorical expressions, such as He was crushed by his wife's illness and Her heart felt heavy with sorrow. Recognition of the centrality of mappings is a hallmark of Conceptual Metaphor Theory; Grady (2007: 190), in fact, argues that the construct of cross-domain mapping is "the most fundamental notion of CMT". Additionally, since what is being mapped across domains is cognitive models, the theory holds that the mappings allow systematic access to inferential structure of the source domain, as well as images and lexicon (Coulson 2006). So, the mapping from the domain of physical weight to the domain of mental states includes the information that the effect of carrying the burden increases and decreases with its size or weight, as expressed in the domain of emotional state in phrases such as Facing sure financial ruin, George felt the weight of the world settling on him, Teaching one class is a light work load, and Everyone's spirits lightened as more volunteers joined the organization and the number of all-night work sessions was reduced.
Conceptual metaphor theorists argue that while the conceptual metaphor is entrenched, the linguistic expression is often novel. Coulson (2006) further notes that viewing metaphorical language as a manifestation of the underlying conceptual system offers an explanation for why we consistently find limited but systematic mappings between the source domain and target domain (often termed 'partial projection'). The systematicity stems from the access to higher-order inferential structures within the conceptual domains, which allows mappings across
abstract similarities, rather than just at the level of objective features of the two domains, which may in fact be quite different. Mapping from the domain of physical weight to the domain of mental states would include the following correspondences (Fig. 26.1):
Fig. 26.1: Cross-domain mapping from physical weight to mental states
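One way to make such correspondences concrete is to list them as explicit source-to-target pairings. The following Python sketch is purely illustrative: the pairings are a reconstruction from the discussion above (the burden, its carrier, its weight, and changes in weight), not the contents of Fig. 26.1 or of any published analysis.

```python
# A minimal sketch of the cross-domain mapping behind
# DIFFICULTIES ARE PHYSICAL BURDENS; the pairings are illustrative
# assumptions drawn from the surrounding discussion.
WEIGHT_TO_MENTAL_STATES = {
    "physical burden":   "difficulty or problem",
    "carrier of burden": "person experiencing the difficulty",
    "weight of burden":  "severity of the difficulty",
    "added weight":      "worsened emotional state ('weighed down')",
    "removed weight":    "improved emotional state ('spirits lightened')",
}

def interpret(source_concept: str) -> str:
    """Map a source-domain concept onto its target-domain counterpart."""
    return WEIGHT_TO_MENTAL_STATES.get(source_concept, "no conventional mapping")

print(interpret("removed weight"))  # -> improved emotional state ('spirits lightened')
```

Crucially, on the conceptual metaphor view such a table is only the static residue of the mapping; what is mapped also includes the source domain's inferential structure, for instance that effects scale with the burden's weight.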
In their original work, Lakoff & Johnson also investigated metaphors that were less clearly tied to essential human bodily experiences, such as argument is war. Critiques of a Lakoffian approach tend to focus on these metaphors and idioms (e.g., Camp 2006b; Keysar & Bly 1995; Wilson & Carston 2006; cf. article 20 (Fellbaum) Idioms and collocations). They point out that such metaphors can contradict the claim that familiar, bodily experiences are used to understand more abstract, less familiar concepts (in their daily lives, many more people are likely to experience verbal disagreements than open warfare). Moreover, these critics argue that if metaphors are anchored in basic bodily experience, we should expect them to be found universally. In addition, they question why only some of the information from the domain of war is projected onto argument. Although Lakoff & Johnson clearly state that projections from source to target domains are partial, their early work offers no systematic principles constraining such projections (see below for further discussion; also see Gibbs & Perlman 2006 and Gibbs 2006c for detailed discussion of some of the potential methodological problems with the cognitive linguistic approach to metaphor and their suggestions about how best to address such concerns).

However, even the early conceptual metaphor work contains the beginnings of a differentiation between metaphor closely related to embodied meaning (image schemas and primary metaphors) and other types of conceptual metaphor. Joseph Grady and his colleagues (e.g., Grady 1997, 2008; Grady, Oakley & Coulson 1999) have deepened the original insights concerning embodied experience by analyzing metaphors such as theories are buildings as complex metaphors made up of several 'primary metaphors'. Grady (1999a) posits a typology of metaphors at the center of which is primary metaphor (or experiential correlation). He argues that theories are buildings is a complex metaphor composed of two primary metaphors, organization is complex physical structure and functioning/persisting is remaining erect/whole. Thus, any entity that we understand as involving organization, such as foreign policy or social customs, as well as theories, can exploit conceptual structure from the domain of complex physical structure, buildings being a prime, humanly salient example. Together with the general commitment to mapping between conceptual domains, the model of experiential correlation goes a long way towards accounting for partial projection, since the domain of buildings is not being mapped, but rather the more generic domain of complex physical structure (see more below). It also accounts for the multiple mappings between sources and targets, since things understood as organization can be mapped to multiple examples of complex
physical structure. Moreover, the analysis suggests that while primary metaphors, such as organization is complex physical structure and functioning/persisting is remaining erect/whole, will be found in most languages, the precise cross-domain mappings or complex metaphors which occur will differ from language to language.

Grady points out that humans regularly observe or experience the co-occurrence of two separable phenomena, such that the two phenomena become closely associated in memory. Grady identifies scores of experiential correlations, of which more is up is probably the most frequently cited. The argument is that humans frequently observe liquids being poured into containers or objects added to piles of one sort or another. These seemingly simple acts actually involve two discrete aspects which tend to go unrecognized: the addition of some substance and an increase in vertical elevation. Notice that it is possible to differentiate these two phenomena; for instance, one could pour the liquid on the floor, thus adding more liquid to a puddle, without causing an increase in elevation. However, humans are likely to use containers and piles because they offer important affordances involving control and organization which humans find particularly useful. The correlation between increases in amount and increases in elevation is ubiquitous and important for humans and thus forms a well entrenched experiential correlation, or mapping between two domains.

Another important experiential correlation Grady identifies is affection is warmth. He argues that many of the most fundamental experiences during the first few months of life involve the infant experiencing the warmth of a caretaker's body when being held, nursed and comforted. Thus, from the first moments of life, the infant begins to form an experiential correlation between the domain of temperature and the domain of affection. This experiential correlation (or primary metaphor) is reflected in expressions such as warm smile and warm welcome. Conversely, lack of warmth is associated with discomfort and physical distance from the caretaker, as reflected in the expressions cold stare or frigid reception.

The theory of experiential correlation has begun to gain support from a number of related fields. Psycholinguistic evidence comes from work by Zhong & Leonardelli (2008), who investigated the affection is warmth metaphor and found that subjects report experiencing a sense of physical coldness when interpreting phrases such as 'icy glare'. From the area of child language learning, Chris Johnson (1999) has argued that caretaker speech often reflects experiential correlations such as knowing is seeing and that children may have no basis for distinguishing between literal and metaphorical uses of a word like see. If caretakers regularly use see in contexts where it can mean either the literal 'perceive visually', as in "Let's see what is in the bag", or the metaphorical 'learn; find out', as in "Let's see what this tastes like", children may develop a sense of the word which conflates literal and metaphorical meanings. Only at some later stage of development do they come to understand that there are two separate uses, going through a process Johnson terms deconflation. Thus, Johnson argues that the conceptual mappings between experiential correlates are reinforced in the child's earliest exposure to language.
Experiential correlation and the theory of primary metaphors were also a major impetus for the "Neural Theory of Language" (Lakoff & Johnson 1999; Narayanan 1999). This is a computational theory which has successfully modeled inferences arising from metaphoric interpretation of language. Experiential correlations are represented in terms of computational 'neural nets'; correlation mappings are treated as neural circuits linking representations of source and target concepts. Consistent with a usage-based, frequency-driven model of language, the neural net model assumes that neural circuits are automatically established when a perceptual and a nonperceptual concept are repeatedly co-activated.
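The mechanism just described can be conveyed with a toy model: a link between a perceptual and a nonperceptual concept is strengthened each time the two are active together, so repeated co-activation entrenches the mapping while rare pairings stay weak. The Python sketch below is a minimal Hebbian-style illustration under simplifying assumptions of our own; it is not Narayanan's actual model, whose circuitry is far richer.

```python
from collections import defaultdict
from itertools import combinations

class CoactivationNet:
    """Toy Hebbian network: co-activation of two concepts strengthens
    the connection between them (cf. the MORE IS UP correlation)."""

    def __init__(self, learning_rate: float = 0.1):
        self.rate = learning_rate
        self.weights = defaultdict(float)  # (concept, concept) -> strength

    def experience(self, active_concepts) -> None:
        """One experiential episode: strengthen links among co-active concepts."""
        for a, b in combinations(sorted(active_concepts), 2):
            self.weights[(a, b)] += self.rate

    def strength(self, a: str, b: str) -> float:
        return self.weights[tuple(sorted((a, b)))]

net = CoactivationNet()
# Pouring liquid into a container co-activates MORE-QUANTITY and MORE-HEIGHT;
# the correlation is ubiquitous, so it recurs across many episodes.
for _ in range(50):
    net.experience({"MORE-QUANTITY", "MORE-HEIGHT"})
# Pouring liquid onto the floor (more quantity, no added height) is rare.
net.experience({"MORE-QUANTITY", "WIDER-PUDDLE"})

print(net.strength("MORE-QUANTITY", "MORE-HEIGHT"))   # entrenched (about 5.0)
print(net.strength("MORE-QUANTITY", "WIDER-PUDDLE"))  # weak (0.1)
```

On this picture, the entrenched MORE-QUANTITY/MORE-HEIGHT circuit is what would license the automatic metaphoric inference from elevation to amount in everyday expressions of more is up.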
The theory of experiential correlation moves our understanding of metaphor past explanations based only on comparison, similarity, or ad hoc categories. It addresses the conundrum of how it is that much metaphoric expression, such as 'warm smile', involves two domains that have no immediately recognizable similarities.

Grady does not claim that all metaphor is based on experiential correlation. In addition to primary metaphors and their related complex metaphors, he posits another type, resemblance metaphor, to account for expressions such as Achilles is a lion. These are considered one-shot metaphors that tend to have limited extensions. They involve mappings based on conventionalized stereotypes, here the western European stereotype that lions are fierce, physically powerful and fearless. Key to this account is Grady's adherence to the tenets of lexical meaning being encyclopedic in nature (thus lion acts as an access point to the conventional stereotypes) and of mappings across conceptual domains. Clearly, resemblance metaphors, with their direct link to conventional stereotypes, are expected to be culturally specific, in contrast to primary metaphors, which are predicted to be more universal. Grady (1999b) has analyzed over 15 languages, representing a wide range of historically unrelated languages, for the occurrence of several primary metaphors such as big is important. He found near universal expression of these metaphors in the languages examined.

The work by Grady and his colleagues addresses many of the criticisms of Lakoff & Johnson (1980). By setting up a typology of metaphors, conceptual metaphor theory can account for the occurrence of both universal and language-specific metaphors. Within a specific language, the notion of complex metaphors, based on multiple primary metaphors, offers an explanation for systematic, recurring metaphorical patterns, including a compelling account of partial projection from source to target. The construct of experiential correlation also provides a methodology for distinguishing between the three proposed types of metaphor. Primary metaphors and their related complex metaphors are strictly asymmetrical in their mappings and ultimately grounded in ubiquitous, fundamental human experiences with the world. In contrast, resemblance metaphors are not strictly asymmetrical. So we find not only Achilles is a lion but also The lion is the king of beasts. Moreover, metaphor theorists have not been able to find a basic bodily experience that would link lions (or any fierce animals) with people. Finally, identifying the three categories allows a coherent account of embodied metaphor without having to entirely abandon the insight that certain metaphors, such as 'My wife's waist is an hourglass' or 'Achilles is a lion', are based on some sort of comparison.

Lakoff & Johnson (1980) also hypothesized that certain idioms, such as 'spill the beans', were transparent (or made sense) to native speakers because they are linked to underlying conceptual structures, such as the CONTAINER image schema. Keysar & Bly (1995, 1999) question the claim that such transparent idioms can offer insights into cognition.
They argue that native speakers' perception of the relative transparency of an idiom's meaning is primarily a function of what native speakers believe the meaning of the idiom to be. This is essentially a claim for a language-based, rather than conceptually based, meaning of idioms and metaphors. Keysar & Bly carried out a set of experiments in which they presented subjects with attested English idioms which could be analyzed as transparent, but which had fallen out of use. Examples included 'the goose
hangs high' and 'to play the bird with the long neck'. They argued that an idiom such as 'the goose hangs high' can be considered transparent because 'high' potentially links to physical verticality and the conceptual metaphor good is up. Subjects were taught either the actual meaning of the idiom or its opposite. For 'the goose hangs high' the actual meaning was 'things look good' and the opposite meaning was 'things look bad'. For subjects learning the 'opposite or non-transparent' meaning, the instruction included the information that the meaning of the idiom was not analyzable, on a par with 'kick the bucket'. Essentially they found that, under these circumstances, subjects were able to learn the nontransparent meanings and that learning the nontransparent meaning suppressed inferences which might arise from the more transparent meaning. Conversely, for subjects learning the transparent meanings, inferences which might arise from the opposite interpretation were suppressed. They concluded that once a particular meaning is assigned to an idiom, native speakers do their best to mine possible additional meaning from it; simultaneously, learned meanings suppress possible alternative interpretations and inferences. They take this as evidence that transparent idioms, which Conceptual Metaphor Theory holds are related to established conceptual structure, at best provide potential insight into processing strategies.

However, a careful look at the target idioms suggests that they are neither particularly transparent nor based in everyday experience. For instance, although 'the goose hangs high' contains a reference to vertical elevation, the rest of the content must also be taken into consideration. College students in late 20th century USA are hardly likely to have had much experience with hanging geese and what they might signify. Others, such as 'to play the bird with the long neck', whose actual meaning is 'be on the lookout for', hardly appear to be based in everyday bodily experience. Such accounts overlook the fact that "certain conceptual pairings tend to recur and to motivate a great percentage of the actual metaphors that continue to exist in the language" (Grady 2007: 197). Moreover, transparent idioms such as 'spill the beans' often do not occur as isolated set phrases. Related idioms include 'spill one's guts' and 'spill it!' While Keysar & Bly may be right in claiming there is no such thing as an 'impossible metaphor' and that humans tend to draw inferences appropriate to the meaning their discourse community assigns to an idiom regardless of its potential opposite interpretations, conceptual metaphor scholars have been able to identify numerous sets of patterned pairings, along with persuasive accounts of their embodied motivations. Moreover, these pairings persist in the language, rather than going out of fashion. To date, theories that hold that metaphors are only linguistic in nature and have no ties to conceptual structure have yet to offer an explanation for the occurrence of these enduring, frequently encountered mappings.

Conceptual Blending and Integration Theory (Fauconnier & Turner 1998, 2002) developed out of conceptual metaphor theory and Mental Space Theory (Fauconnier 1994, 1997) to account for certain on-line, novel conceptualizations that involve emergent meaning, or additional, non-literal meaning, that cannot be accounted for solely from the information contained in the target and source spaces.
The basic notion is that mappings occur across two or more temporary mental input spaces, which draw select, structured information from distinct domains. Simultaneously, select information from each of the input spaces is projected into a third, blended space; it is the integration of these select projections in blended space that gives rise to emergent meaning.
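Although blending theory is a cognitive account rather than a computational formalism, the network architecture just sketched can be rendered concrete as a data structure. The Python fragment below is a loose illustration under simplifying assumptions of our own (it anticipates the That surgeon is a butcher example discussed just below); it is not Fauconnier & Turner's own notation.

```python
from dataclasses import dataclass, field

@dataclass
class MentalSpace:
    """A temporary, partial packet of conceptual structure (role -> filler)."""
    name: str
    elements: dict = field(default_factory=dict)

# Two input spaces drawing structured information from distinct domains.
surgeon = MentalSpace("surgeon", {
    "agent": "surgeon", "undergoer": "human patient",
    "goal": "healing", "means": "precise incisions and repair",
})
butcher = MentalSpace("butcher", {
    "agent": "butcher", "undergoer": "animal carcass",
    "goal": "severing flesh", "means": "slashing and hacking",
})

# Generic space: skeletal role structure common to both inputs.
generic = MentalSpace("generic", {role: role for role in surgeon.elements})

# Selective projection: each role in the blend is filled from one input only.
blend = MentalSpace("blend", {
    "agent": surgeon.elements["agent"],
    "undergoer": surgeon.elements["undergoer"],
    "goal": surgeon.elements["goal"],        # from the surgeon space
    "means": butcher.elements["means"],      # from the butcher space
})

# Emergent meaning: structure present in neither input space on its own.
if blend.elements["goal"] == "healing" and "hacking" in blend.elements["means"]:
    print("emergent inference: the surgeon is incompetent and careless")
```

The printed inference illustrates the point made below: the notion of incompetence is contained in neither the surgeon space nor the butcher space, but arises only from integrating their selective projections in the blended space.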
This framework holds that words do not refer directly to entities in the world, but rather act as prompts to access conceptual structure and to set up short-term, local conceptual structures called mental spaces. Mental spaces can be understood as "temporary containers for relevant, partial information about a particular domain" (Coulson 2006: 35). The key architecture of the theory is the conceptual integration network, which is "an array of mental spaces in which the processes of conceptual blending unfold" (Coulson 2006: 35). A conceptual integration network is made up of two or more input spaces structured by information from distinct domains, a generic space that contains 'skeletal' conceptual structure common to both input spaces, and a blended space. Mappings occur across the input spaces, which also project selectively to the blended space.

One of the most often cited examples of a conceptual blend is That surgeon is a butcher, which is interpreted to mean that the surgeon is incompetent and careless. Neither the surgeon space nor the butcher space contains the notion of incompetence. The key issue is how to account for this emergent meaning. Both spaces contain common structure such as an agent, an entity acted upon (human versus animal), goals (healing the patient versus severing flesh), means for achieving the goals (precise incisions followed by repair of the wound versus slashing flesh, hacking bones), etc. Analogy mappings project across the two input spaces linking the shared structure. The notion of incompetence arises in the blended space from the conflict between the goal of healing the patient, which projects from the surgeon space, and the means for achieving the goal, slashing flesh and hacking, which projects from the butcher space. Sweetser (1999) has also analyzed noun-noun compounds such as land yacht (large, showy, luxury car) and couch potato (person who spends a great deal of time sitting and watching TV) as blends. Such noun-noun compounds are particularly interesting as the entities being referred to are not yachts or potatoes of any kind. Conceptual blending theory subsumes conceptual metaphor theory, with its commitment to embodied meaning. Its adherents also argue that the processes occurring in non-literal language and a variety of other phenomena, such as conditionals and diachronic meaning extension, are essentially the same as those needed to explain the interpretation of literal language.

Over the past decade, there has been a growing convergence between cognitive semanticists and relevance theorists in the area of metaphor analysis (see Gibbs & Tendahl 2006). For instance, as we saw above, the latest versions of relevance theory (e.g., Wilson & Sperber 2002a, 2002b; Wilson & Carston 2006; Sperber & Wilson 2008) now posit entries for lexical items which include direct access to 'encyclopedic assumptions' which are exploited in the formation and processing of metaphors and metonymies, a position which is quite similar to Langacker's (1987) analysis of words as prompts to encyclopedic knowledge. Wilson & Carston (2006: 425) also accept the analysis that physical descriptions such as 'hard, rigid, inflexible, cold', etc. apply to human psychological properties through inferential routes such as 'broadening' to create superordinate concepts (hard*, rigid*, cold*, etc.), which have both physical and psychological instances.
They further argue that ad hoc, on-the-fly categories can be constructed through the same process of broadening. For instance, they posit that broadening provides an inferential route to the interpretation of Robert is a bulldozer, where the encoded concept bulldozer has the logical feature (i.e. encoded meaning) machine of a certain kind and the following encyclopedic assumptions:
(5) a. large, powerful, crushing, dangerous to bystanders, etc.
    b. looks like this (xxx); moves like this (yyy), etc.
    c. goes straight ahead regardless of obstacles
    d. pushes aside obstructions; destroys everything in its path
    e. hard to stop or resist from outside; drowns out human voices, etc.
They continue, "Some of these encyclopedic features also apply straightforwardly to humans. Others, (powerful, goes straight ahead …) have both a basic, physical sense and a further psychological sense, which is frequently encountered and therefore often lexicalized" (Wilson & Carston 2006: 425).

However, the "inferential" theory of metaphor skirts the issue of how the very process of creating "ad hoc" concepts (such as 'hard to stop') and extended lexicalized concepts (such as cold* or rigid*) may be motivated in the first place. The qualities and consequences of physical coldness manifested by ice are not the same as the lack of caring or standoffishness evoked in the interpretation of an expression such as Sally is a block of ice. The physical actions and entities involved in a scenario of a bulldozer moving straight ahead crushing physical obstacles such as mounds of dirt and buildings are manifestly different from the actions of a person and the entities involved in a situation in which a person ignores the wishes of others and acts in an aggressive and obstinate manner. As noted above, it remains unclear how the notions of narrowing and broadening are 'straightforward inferences' in cases like Robert is a bulldozer, where the key psychological properties such as "aggressive" and "obstinate" cannot be drawn from the lexical entry of "bulldozer" or its encyclopedic entry, because such properties simply are not part of our understanding of earth-moving machines.

Another gap in the relevance theory approach, and indeed all approaches that treat metaphor on a word-by-word basis, is the failure to recognize that most metaphors are part of very productive, systematic patterns of conceptualization based on embodied experience. For instance, Wilson & Carston argue that block of ice has a particular set of encyclopedic assumptions associated with it, such as solid, hard, cold, rigid, inflexible, and unpleasant to touch or interact with. This treatment fails to systematically account for the fact that the words icy, frigid, frosty, chilly, even cool, are all used to express the same emotional state. In contrast, conceptual metaphor theory accounts for these pervasive patterns through the construct of experiential correlation and embodied meaning. Similarly, the account of Robert is a bulldozer outlined above fails to recognize the underlying image schemas relating to force dynamics, such as momentum and movement along a path, that give rise to any number of conceptually related metaphoric expressions describing a person as obstinate, powerful and aggressive. Robert could also have been described as behaving like a runaway freight train or as bowling over or mowing down his opponents to the same effect, even though the dictionary entries (and their related encyclopedic entries) for freight trains, bowling balls, and scythes or lawn mowers would have very little overlap with that of a bulldozer.
6. Metonymy

Over the centuries, metonymy has received less attention than metaphor. Indeed, Aristotle made no clear distinction between the two. Traditionally, metonymy was defined
as a trope in which the name of one entity is used to refer to another entity. Thus, it was seen as a referring process involving meaning transfer. In particular, it was represented as a word that takes its expression from things that are 'near and close' (Nerlich 2006). Nunberg (1978) characterized metonymic 'transfer' as having a 'referring function', so that, for instance, the producer can be used to refer to the produced, as in There are several copies of Stoppard on the shelf, or a part can be used to refer to the whole (synecdoche), as in Many hands make light work. Some of the most common patterns in English include container for contents, cause for effect, possessor for possessed, and type for token (This wine is one of our best sellers).

Not surprisingly, cognitive semanticists have provided a strikingly different perspective on metonymy. Consistent with their analysis of metaphor, cognitive semanticists represent metonymy as a fundamental cognitive process which is reflected in linguistic expressions. As with metaphor, the linguistic expression used in a metonymy is understood as an access point to some other, larger conceptual structure. The key aspect which distinguishes metonymy from metaphor is that metonymy establishes connections between conceptual entities which co-occur within a single conceptual domain, whereas metaphor establishes connections across two different conceptual domains. In contrast to traditional views that explain the connections between the two entities involved in a metonymy in terms of spatial or physical contiguity or closeness, cognitive semantics understands the entities to be conceptually 'close'.

Cognitive semanticists have pointed out that metonymy is not limited to the type of referring functions illustrated above. Langacker (1984) argues that a ubiquitous aspect of talk involves highlighting certain aspects of a scene or entities within a scene. He refers to the highlighted facets as 'the active zone', that which is more conceptually 'active' or salient in the particular conceptualization. He notes that in talking about our interaction with objects an active zone is usually invoked. So in the sentence Lucy used a hammer to pry the nail out of the wall, the standard interpretation is that the claw of the hammer was applied to the head of the nail; this is in contrast to Lucy used her father's new hammer to pound the steak, in which the active zones are the face of the hammer and the entire surface of the steak to which it is applied. While such use of language certainly involves a part-whole relation, it is so ubiquitous within the contextualized interpretation of language that it has generally gone unrecognized and so been treated as literal interpretation. Relevance theorists have recently addressed it as a straightforward contextual inference (Wilson & Carston 2006). Cognitive linguists point to such metonymic uses as support for the assertion of no strict divide between semantics and pragmatics, a point with which the current version of Relevance Theory concurs (Wilson & Carston 2006). Within the cognitive linguistic framework, this omnipresence of metonymic language use in everyday talk is ascribed to what is known as the "reference point" mechanism, or the fundamental human cognitive ability to "invoke the conception of one entity for purposes of establishing mental contact with another, i.e. to single it out for individual conscious awareness" (Langacker 1999: 173).
Metonymy thus essentially reflects a pervasive reference point organization wherein the entity linguistically designated by a metonymic expression serves as a salient point of reference that affords mental access to a conceptually close but distinct entity actually intended by the speaker. This reference point ability has manifold linguistic ramifications and is reflected not only in metonymic expressions but also in various grammatical and discourse phenomena, most notably possessives, anaphoric relationships and topicality.
In his detailed exposition of metonymy from the cognitive linguistic perspective, Barcelona (e.g., 2002, 2003, 2005a, 2005b) defines metonymy as a mapping of a cognitive domain (the source) onto another domain (the target), wherein the source and target are in the same functional domain and are linked by a pragmatic function that provides mental access to the target. In this definition, metonymy (which is fundamentally "conceptual" in nature, as seen above) is thus a mapping between two (sub)domains that are interrelated within a broader, functionally motivated domain, essentially equivalent to what Lakoff (1987) calls an "idealized cognitive model (ICM)" or what Fillmore (1982) calls "frames". The source maps onto (i.e. imposes a particular perspective on) and activates the target by virtue of a pragmatic (i.e. experientially based) link.

Under this model, any semantic shift that satisfies these requirements is at least a "schematic" metonymy, which is one of the three types of metonymy posited by Barcelona on the basis of their different degrees of "metonymicity", as measured by the relative conceptual distinction between the source domain (i.e. the reference point) and the target domain. The other two types of metonymy are "typical" metonymies (i.e. schematic metonymies whose target is clearly distinct from the source, either because it is a relatively secondary or peripheral subdomain of the source, or because it is a functionally distinct subdomain within a larger overall domain) and "prototypical" metonymies (i.e. typical "referential" metonymies with individual entities as targets). These three classes of metonymy thus constitute a continuum of metonymicity, with each exemplified in the following instances (Barcelona 2005b: 314):

(6) a. Belgrade did not sign the Paris agreement.
    b. She's just a pretty face.
    c. He walked with drooping shoulders. He had lost his wife.
    d. This book weighs two kilograms.
    e. This book is highly instructive.
According to Barcelona, (6a) is an instance of prototypical metonymy as it is referential and has an individual (the Yugoslavian government) as the target, while (6b) and (6c) are examples of typical metonymy as they are not referential in nature but the targets are clearly distinct from the sources (BODY FOR PERSON and BODY POSTURE FOR EMOTION, respectively). On the other hand, (6d) and (6e) are instances of “purely schematic” metonymies as the whole domain BOOK is mapped onto its subdomains PHYSICAL OBJECT and SEMANTIC CONTENT, respectively, thereby activating those aspects of the overall domain BOOK. Notice here that these examples would not qualify as instances of metonymy under more restrictive models, such as the one presented by Croft (2002), who defines metonymy as domain “highlighting”, a cognitive operation which highlights a secondary or noncentral subdomain within the overall domain matrix constituted by the speaker’s encyclopedic knowledge of the source concept. Since PHYSICAL OBJECT and SEMANTIC CONTENT would be highly intrinsic subdomains of the BOOK domain matrix (hence primary rather than secondary), (6d) and (6e) would not represent any salient highlighting in Croft’s sense. This noncentrality requirement, also shared by Ruiz de Mendoza (2000), is not a necessary condition in Barcelona’s definition because he fully embraces the notion of prototype effects in the category of metonymy and believes that his model has the advantage of presenting a more unified analysis where the fundamental similarity can be captured between undisputed
examples of metonymy like (6a) and more controversial cases like (6d) and (6e) (Barcelona 2002: 226–229). While most semanticists would argue that this approach is too inclusive, as most linguistic expressions used in context would be considered metonymic in one way or another, it may simply reflect the fact that metonymic relationships in language use are "the rule rather than the exception", as attested by the ubiquity of the active zone phenomena pointed out by Langacker (Barcelona 2002: 229).
7. References

Barcelona, Antonio 2002. Clarifying and applying the notions of metaphor and metonymy within cognitive linguistics: An update. In: R. Dirven & R. Pörings (eds.). Metaphor and Metonymy in Comparison and Contrast. Berlin: Mouton de Gruyter, 207–277.
Barcelona, Antonio 2003. Metonymy in cognitive linguistics: An analysis and a few modest proposals. In: H. Cuyckens et al. (eds.). Motivation in Language: Studies in Honor of Günter Radden. Amsterdam: Benjamins, 223–255.
Barcelona, Antonio 2005a. The fundamental role of metonymy in cognition, meaning, communication and form. In: A. Baicchi, C. Broccias & A. Sansò (eds.). Modeling Thought and Constructing Meaning: Cognitive Models in Interaction. Milano: Franco Angeli, 109–124.
Barcelona, Antonio 2005b. The multilevel operation of metonymy in grammar and discourse, with particular attention to metonymic chains. In: F.J.I. Ruiz de Mendoza & M. Sandra Peña (eds.). Cognitive Linguistics: Internal Dynamics and Interdisciplinary Interaction. Berlin: Mouton de Gruyter, 313–352.
Black, Max 1962. Metaphor. In: M. Black (ed.). Models and Metaphors. Ithaca, NY: Cornell University Press, 25–47.
Black, Max 1979. More on metaphor. In: A. Ortony (ed.). Metaphor and Thought. Cambridge: Cambridge University Press, 19–45.
Bowdle, Brian & Dedre Gentner 2005. The career of metaphor. Psychological Review 112, 193–216.
Camp, Elisabeth 2006a. Contextualism, metaphor, and what is said. Mind & Language 21, 280–309.
Camp, Elisabeth 2006b. Metaphor in the mind: The cognition of metaphor. Philosophy Compass 1, 154–170.
Carston, Robyn 2002a. Metaphor, ad hoc concepts and word meaning – more questions than answers. UCL Working Papers in Linguistics 14, 83–105.
Carston, Robyn 2002b. Thoughts and Utterances: The Pragmatics of Explicit Communication. Oxford: Blackwell.
Coulson, Seana 2006. Metaphor and conceptual blending. In: K. Brown (ed.). The Encyclopedia of Language and Linguistics. 2nd edn. Amsterdam: Elsevier, 32–39.
Croft, William 2002. The role of domains in the interpretation of metaphors and metonymies. In: R. Dirven & R. Pörings (eds.). Metaphor and Metonymy in Comparison and Contrast. Berlin: Mouton de Gruyter, 161–205.
Davidson, Donald 1978. What metaphors mean. Critical Inquiry 5, 31–47.
Davidson, Donald 1981. What metaphors mean. In: M. Johnson (ed.). Philosophical Perspectives on Metaphor. Minneapolis, MN: University of Minnesota Press, 200–219.
Evans, Vyvyan & Melanie Green 2006. Cognitive Linguistics: An Introduction. Mahwah, NJ: Lawrence Erlbaum Associates.
Fauconnier, Gilles 1994. Mental Spaces: Aspects of Meaning Construction in Natural Language. Cambridge: Cambridge University Press.
Fauconnier, Gilles 1997. Mappings in Thought and Language. Cambridge: Cambridge University Press.
Fauconnier, Gilles & Mark Turner 1998. Conceptual integration networks. Cognitive Science 22, 133–187.
Fauconnier, Gilles & Mark Turner 2002. The Way We Think: Conceptual Blending and the Mind's Hidden Complexities. New York: Basic Books.
Fillmore, Charles 1982. Frame semantics. In: Linguistic Society of Korea (ed.). Linguistics in the Morning Calm. Seoul: Hanshin, 111–138.
Gentner, Dedre & Brian Bowdle 2001. Convention, form, and figurative language processing. Metaphor and Symbol 16, 223–247.
Gentner, Dedre & Brian Bowdle 2008. Metaphor as structure mapping. In: R.W. Gibbs (ed.). The Cambridge Handbook of Metaphor and Thought. Cambridge: Cambridge University Press, 109–128.
Gentner, Dedre & Phillip Wolff 1997. Alignment in the processing of metaphor. Journal of Memory and Language 37, 331–355.
Gibbs, Raymond W. 1994. The Poetics of Mind: Figurative Thought, Language, and Understanding. Cambridge: Cambridge University Press.
Gibbs, Raymond W. 2002. A new look at literal meaning in understanding what is said and implicated. Journal of Pragmatics 34, 457–486.
Gibbs, Raymond W. 2006a. Embodiment and Cognitive Science. Cambridge: Cambridge University Press.
Gibbs, Raymond W. 2006b. Metaphor: Psychological aspects. In: K. Brown (ed.). The Encyclopedia of Language and Linguistics. 2nd edn. Amsterdam: Elsevier, 43–50.
Gibbs, Raymond W. 2006c. Introspection and cognitive linguistics: Should we trust our own intuitions? Annual Review of Cognitive Linguistics 4, 135–151.
Gibbs, Raymond W. (ed.) 2008. The Cambridge Handbook of Metaphor and Thought. Cambridge: Cambridge University Press.
Gibbs, Raymond W. & Marcus Perlman 2006. The contested impact of cognitive linguistic research on the psycholinguistics of metaphor understanding. In: G. Kristiansen et al. (eds.). Cognitive Linguistics: Current Applications and Future Perspectives. Berlin: Mouton de Gruyter, 211–228.
Gibbs, Raymond W. & Markus Tendahl 2006. Cognitive effort and effects in metaphor comprehension: Relevance theory and psycholinguistics. Mind & Language 21, 379–403.
Giora, Rachel 1997. Understanding figurative and literal language: The graded salience hypothesis. Cognitive Linguistics 8, 183–206.
Giora, Rachel 1998. When is relevance? On the role of salience in utterance interpretation. Revista Alicantina de Estudios Ingleses 11, 85–94.
Giora, Rachel 1999. On the priority of salient meanings: Studies of literal and figurative language. Journal of Pragmatics 31, 919–929.
Giora, Rachel 2002. Literal vs. figurative language: Different or equal? Journal of Pragmatics 34, 487–506.
Giora, Rachel 2008. Is metaphor unique? In: R.W. Gibbs (ed.). The Cambridge Handbook of Metaphor and Thought. Cambridge: Cambridge University Press, 143–160.
Glucksberg, Sam 2001. Understanding Figurative Language: From Metaphors to Idioms. Oxford: Oxford University Press.
Glucksberg, Sam 2008. How metaphors create categories – quickly. In: R.W. Gibbs (ed.). The Cambridge Handbook of Metaphor and Thought. Cambridge: Cambridge University Press, 67–83.
Glucksberg, Sam & Boaz Keysar 1990. Understanding metaphorical comparisons: Beyond similarity. Psychological Review 97, 3–18.
Grady, Joseph 1997. Theories are buildings revisited. Cognitive Linguistics 8, 267–290.
Grady, Joseph 1999a. A typology of motivation for conceptual metaphor: Correlation vs. resemblance. In: R.W. Gibbs & G.J. Steen (eds.). Metaphor in Cognitive Linguistics. Amsterdam: Benjamins, 79–100.
Grady, Joseph 1999b. Crosslinguistic regularities in metaphorical extension. Paper presented at The Annual Meeting of the Linguistic Society of America (= LSA), Los Angeles, CA, January 6–9.
Grady, Joseph 2007. Metaphor. In: D. Geeraerts & H. Cuyckens (eds.). The Oxford Handbook of Cognitive Linguistics. Oxford: Oxford University Press, 188–213.
Grady, Joseph 2008. 'Superschemas' and the grammar of metaphorical mappings. In: A. Tyler, Y. Kim & M. Takada (eds.). Language in the Context of Use: Discourse and Cognitive Approaches to Language. Berlin: Mouton de Gruyter, 339–360.
Grady, Joseph, Todd Oakley & Seana Coulson 1999. Blending and metaphor. In: R.W. Gibbs & G.J. Steen (eds.). Metaphor in Cognitive Linguistics. Amsterdam: Benjamins, 101–124.
Grice, H. Paul 1969. Utterer's meaning and intentions. Philosophical Review 78, 147–177. Reprinted in: H.P. Grice. Studies in the Way of Words. Cambridge, MA: Harvard University Press, 1989, 86–116.
Grice, H. Paul 1975. Logic and conversation. In: P. Cole (ed.). Syntax and Semantics 3: Speech Acts. New York: Academic Press, 41–58.
Haught, Catrinel & Sam Glucksberg 2004. When old sharks are not old pros: Metaphors are not similes. Paper presented at The Annual Meeting of the Psychonomic Society (= PS), Minneapolis, MN.
Johnson, Christopher 1999. Metaphor vs. conflation in the acquisition of polysemy: The case of see. In: M.K. Hiraga, C. Sinha & S. Wilcox (eds.). Cultural, Typological, and Psychological Perspectives in Cognitive Linguistics. Amsterdam: Benjamins, 155–169.
Keysar, Boaz & Bridget M. Bly 1995. Intuitions of the transparency of idioms: Can one keep a secret by spilling the beans? Journal of Memory and Language 34, 89–109.
Keysar, Boaz & Bridget M. Bly 1999. Swimming against the current: Do idioms reflect conceptual structure? Journal of Pragmatics 31, 1559–1578.
Kövecses, Zoltán & Günter Radden 1998. Metonymy: Developing a cognitive linguistic view. Cognitive Linguistics 9, 37–77.
Lakoff, George 1987. Women, Fire, and Dangerous Things: What Categories Reveal about the Mind. Chicago, IL: The University of Chicago Press.
Lakoff, George & Mark Johnson 1980. Metaphors We Live by. Chicago, IL: The University of Chicago Press.
Lakoff, George & Mark Johnson 1999. Philosophy in the Flesh: The Embodied Mind and its Challenge to Western Thought. New York: Basic Books.
Lakoff, George & Mark Turner 1989. More Than Cool Reason: A Field Guide to Poetic Metaphor. Chicago, IL: The University of Chicago Press.
Langacker, Ronald W. 1984. Active zones. In: C. Brugman et al. (eds.). Proceedings of the Annual Meeting of the Berkeley Linguistics Society (= BLS) 10. Berkeley, CA: Berkeley Linguistics Society, 172–188.
Langacker, Ronald W. 1987. Foundations of Cognitive Grammar, vol. 1: Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Langacker, Ronald W. 1991. Concept, Image, and Symbol: The Cognitive Basis of Grammar. Berlin: Mouton de Gruyter.
Langacker, Ronald W. 1999. Grammar and Conceptualization. Berlin: Mouton de Gruyter.
Langacker, Ronald W. 2008. Cognitive Grammar: A Basic Introduction. Oxford: Oxford University Press.
Narayanan, Srini 1999. Moving right along: A computational model of metaphoric reasoning about events. In: Proceedings of the National Conference on Artificial Intelligence (= AAAI-99). Menlo Park, CA: AAAI Press, 121–128.
Nerlich, Brigitte 2006. Metonymy. In: K. Brown (ed.). The Encyclopedia of Language and Linguistics. 2nd edn. Amsterdam: Elsevier, 109–113.
Nunberg, Geoffrey 1978. The Pragmatics of Reference. Bloomington, IN: Indiana University Linguistics Club.
Recanati, François 2004. Literal Meaning. Cambridge: Cambridge University Press.
Richards, Ivar A. 1936. The Philosophy of Rhetoric. New York: Oxford University Press.
Rorty, Richard 1989. Contingency, Irony, and Solidarity. Cambridge: Cambridge University Press.
Ruiz de Mendoza, Francisco J. 2000. The role of mappings and domains in understanding metonymy. In: A. Barcelona (ed.). Metaphor and Metonymy at the Crossroads: A Cognitive Perspective. Berlin, New York: Mouton de Gruyter, 109–132.
Searle, John R. 1979. Metaphor. In: A. Ortony (ed.). Metaphor and Thought. Cambridge: Cambridge University Press, 92–123.
Sperber, Dan & Deirdre Wilson 1995. Relevance: Communication and Cognition. 2nd edn. Oxford: Blackwell.
Sperber, Dan & Deirdre Wilson 2004. Relevance theory. In: L.R. Horn & G.L. Ward (eds.). The Handbook of Pragmatics. Oxford: Blackwell, 607–632.
Sperber, Dan & Deirdre Wilson 2008. A deflationary account of metaphors. In: R.W. Gibbs (ed.). The Cambridge Handbook of Metaphor and Thought. Cambridge: Cambridge University Press, 84–105.
Stern, Josef 2006. Metaphor, literal, literalism. Mind & Language 21, 243–279.
Sweetser, Eve 1999. Compositionality and blending: Semantic composition in a cognitively realistic framework. In: T. Janssen & G. Redeker (eds.). Cognitive Linguistics: Foundations, Scope, and Methodology. Berlin: Mouton de Gruyter, 129–162.
Taverniers, Miriam 2002. Metaphor and Metaphorology: A Selective Genealogy of Philosophical and Linguistic Conceptions of Metaphor from Aristotle to the 1990s. Ghent: Academic Press.
Wilson, Deirdre & Robyn Carston 2006. Metaphor, relevance and the 'emergent property' issue. Mind & Language 21, 404–433.
Wilson, Deirdre & Dan Sperber 2002a. Relevance theory. UCL Working Papers in Linguistics 14, 249–287.
Wilson, Deirdre & Dan Sperber 2002b. Truthfulness and relevance. Mind 111, 583–632.
Zhong, Chen-Bo & Geoffrey J. Leonardelli 2008. Cold and lonely: Does social exclusion literally feel cold? Psychological Science 19, 838–842.
Andrea Tyler and Hiroshi Takahashi, Washington, DC (USA)
VI. Cognitively oriented approaches to semantics

27. Cognitive Semantics: An overview

1. Introduction
2. The semantics of grammar
3. Schematic structure
4. Conceptual organization
5. Interactions among semantic structures
6. Conclusion
7. References
Abstract

The linguistic representation of conceptual structure is the central concern of the two-to-three-decades-old field that has come to be known as "cognitive linguistics". Its approach is concerned with the patterns in which and processes by which conceptual content is organized in language. It addresses the linguistic structuring of such basic conceptual categories as space and time, scenes and events, entities and processes, motion and location, and force and causation. To these it adds the basic ideational and affective categories attributed to cognitive agents, such as attention and perspective, volition and intention, and expectation and affect. It addresses the semantic structure of morphological and lexical forms, as well as of syntactic patterns. And it addresses the interrelationships of conceptual structures, such as those in metaphoric mapping, those within a semantic frame, those between text and context, and those in the grouping of conceptual categories into large structuring systems. Overall, its aim is to ascertain the global integrated system of conceptual structuring in language.
1. Introduction

The linguistic representation of conceptual structure is the central concern of the two-to-three-decades-old field that has come to be known generally as "cognitive linguistics" through such defining works as Fauconnier (1985), Fauconnier & Turner (2002), Fillmore (1975, 1976), Lakoff (1987, 1992), Langacker (1987, 1991), and Talmy (2000a, 2000b), as well as through edited collections like Geeraerts & Cuyckens (2007). This field can first be characterized by contrasting its "conceptual" approach with two other approaches, the "formal" and the "psychological". Particular research traditions have largely based themselves within one of these approaches, while aiming – with greater or lesser success – to address the concerns of the other two approaches.

The formal approach focuses on the overt structural patterns exhibited by linguistic forms, largely abstracted away from or regarded as autonomous from any associated meaning. This approach thus includes the study of syntactic, morphological, and morphemic structure. The tradition of generative grammar has been centered in the formal
approach. But its relations to the other two approaches have remained limited. It has all along referred to the importance of relating its grammatical component to a semantic component, and there has indeed been much good work on aspects of meaning, but this enterprise has generally not addressed the overall conceptual organization of language. The formal semantics that has been adopted within the generative tradition (e.g., Lappin 1997) has largely included only enough about meaning to correlate with the formal categories and operations that the main body of the tradition has focused on. And the reach of generative linguistics to psychology has largely considered only the kinds of cognitive structure and processing that might be needed to account for its formal categories and operations.

The psychological approach regards language from the perspective of general cognitive systems such as perception, memory, attention, and reasoning. Centered in this approach, the field of psychology has also addressed the other two approaches. Its conceptual concerns (see e.g., Neely 1991) have in particular included semantic memory, the associativity of concepts, the structure of categories, inference generation, and contextual knowledge. But it has insufficiently considered systematic conceptual structuring – the global integrated system of schematic structures with which language organizes conceptual content.

By contrast, the conceptual approach of cognitive linguistics is concerned with the patterns in which and processes by which conceptual content is organized in language. It has thus addressed the linguistic structuring of such basic conceptual categories as space and time, scenes and events, entities and processes, motion and location, and force and causation. To these it adds the basic ideational and affective categories attributed to cognitive agents, such as attention and perspective, volition and intention, and expectation and affect. It addresses the semantic structure of morphological and lexical forms, as well as of syntactic patterns. And it addresses the interrelationships of conceptual structures, such as those in metaphoric mapping, those within a semantic frame, those between text and context, and those in the grouping of conceptual categories into large structuring systems. Overall, the aim of cognitive linguistics is to ascertain the global integrated system of conceptual structuring in language.

Cognitive linguistics, further, addresses the concerns of the other two approaches to language. First, it examines the formal properties of language from its conceptual perspective. Thus, it aims to account for grammatical structure in terms of the functions this serves in the representation of conceptual structure. Second, as one of its most distinguishing characteristics, cognitive linguistics aims to relate its findings to the cognitive structures that concern the psychological approach. It aims both to help account for the behavior of conceptual phenomena within language in terms of those psychological structures, and at the same time, to help work out some of the properties of those structures themselves on the basis of its detailed understanding of how language realizes them. It is this trajectory toward unification with the psychological that motivates the term "cognitive" within the name of this linguistic tradition.
In the long term, its aim is to integrate the linguistic and the psychological perspectives on cognitive organization in a unified understanding of human conceptual structure.

With its focus on the conceptual, cognitive linguistics regards "meaning" or "semantics" simply as conceptual content as it is organized by language. Thus, general conception as experienced by individuals – i.e., thought – includes linguistic meaning within its greater compass. And while linguistic meaning – whether that expressible by an individual language or by language in general – apparently involves a selection from or constraints upon general conception, it is nevertheless qualitatively of a piece with it.
Cognitive linguistics is as ready as other linguistic approaches to represent an aspect of language abstractively with a symbolic formula or schematic diagram, provided that that aspect is judged both to consist of discrete components in crisp relationships and to be clearly understood. But most cognitive linguists share the sensibility that such formal representations poorly accord with the gradients, partial overlaps, interactions that lead to mutual modification, processes of fleshing out, inbuilt forms of vagueness, and the like that they observe in semantics. They instead aim to set forth such phenomena through descriptive means that provide precision and rigor without formalisms. They further find that formal accounts present their representations of language organization with premature exhaustiveness and mistakenly uniform certainty.

We might propose developing a field of "theoryology" that taxonomizes types of theories according to their defining properties. In such a field, a formal theory of language, at any given phase of its grasp of language phenomena, would be of the type that requires encompassive and perfected mechanisms to account for those phenomena. But cognitive linguistics rests on a type of theory that, at its foundation, builds in gradients for the stage of development to which any given aspect of language under analysis has been brought, and for the certainty with which the analysis is held.

While cognitive linguists largely share the approach to language outlined here, they may differ in more limited respects. For example, Ronald Langacker generally stresses the contribution of every morpheme and construction in a sentence to the unified meaning of the sentence, while George Lakoff and I see certain interactions among the elements as overriding such contributions. And Lakoff stresses a theory of embodiment that Langacker and I break into subtheories and in part challenge (see section 5.1). Terminologically, "cognitive linguistics" refers to the field as a whole. Within that field, "cognitive grammar" is largely associated with Langacker's work, while "cognitive semantics" is largely associated with my own work (the main focus below), though it is sometimes used more extendedly.

Externally, cognitive linguistics is perhaps closest to functional linguistics (e.g., Givón 1989). Discourse is central to the latter while more peripheral to the former, but both approach language with similar sensibilities. Jackendoff's approach is comparable in spirit to cognitive linguistics, as seen in article 30 (Jackendoff) Conceptual Semantics. Thus, he assumes a mentalist, rather than a cognition-avoiding logical, basis for meaning. And he critiques the approaches of Fodor, Wierzbicka, and Levin and Rappaport (see article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure) much as does cognitive linguistics. But his reliance on an algebraic features-based formalism to represent meaning differs from the cognitive-linguistic view of its inadequacy in handling the semantic gradience and modulation cited above. And his privileging of spatial structure in semantics is at variance with the significance that cognitive linguistics sees in such further domains as temporal structure, force-dynamic/causal structure, cognitive state (including purpose, expectation, affect, familiarity), and reality status (including factual, counterfactual, conditional, potential), as well as domains that he himself cites, like social relations.
2. The semantics of grammar

To turn to the specific contents of Cognitive Semantics, then, this outline opens with the semantics of grammar because it is the key to conceptual structuring in language. A
universal design feature of languages is that their meaning-bearing forms are divided into two different subsystems, the open-class, or lexical, and the closed-class, or grammatical (see Talmy 2000a: ch. 1). Open classes have many members and can readily add many more. They commonly include (the roots of) nouns, verbs, and adjectives. Closed classes have relatively few members and are difficult to augment. They include bound forms – inflections, derivations, and clitics – and such free forms as prepositions, conjunctions, and determiners. In addition to such overt closed classes, a language can have certain implicit closed classes such as word order patterns, a set of lexical categories (e.g., nounhood, verbhood, etc. per se), a set of grammatical relations (e.g., subject status, direct object status, etc.), and grammatical constructions.
2.1. Semantic constraint on grammar

Within this formal distinction, the crucial semantic finding is that the meanings that open-class forms can express are virtually unrestricted, whereas those of closed-class forms are highly constrained. This constraint applies both to the conceptual categories they can refer to and to the particular member notions within any such category. For example, many languages around the world have closed-class forms in construction with a noun that indicate the number of the noun's referent, but no languages have closed-class forms indicating its color. And even closed-class forms referring to number can indicate such notions as singular, dual, plural, paucal, and the like, but never such notions as even, odd, a dozen, or countable. By contrast, open-class forms can refer to all such notions, as the very words just used demonstrate.

The total set of conceptual categories with their member notions that closed-class forms can ever refer to thus constitutes an approximately closed inventory. Individual languages draw in different patterns from this universally available inventory for their particular set of grammatically expressed meanings. The inventory is graduated, progressing from categories and notions that may well appear universally in all languages (a candidate is "polarity" with its members 'positive' and 'negative'), through ones appearing in many but not all languages (a candidate is "number"), down to ones appearing in just a few languages (an example is "rate" with its members 'fast' and 'slow').
2.2. Topological principle for grammar

The next issue is what determines the conceptual categories and member notions included in the inventory as against those excluded from it. No single global principle is evident, but several semantic constraints with broad scope have been found. One of these, the "topology principle", applies to the meanings – or "schemas" – of closed-class forms referring to space, time or certain other domains. This principle largely excludes Euclidean properties such as absolutes of distance, size, shape, or angle from such schemas. Instead, these schemas exhibit such topological properties as "magnitude neutrality" and "shape neutrality".

To illustrate magnitude neutrality, the spatial schema of the English preposition across prototypically represents motion along a path from one edge of a bounded plane perpendicularly to its opposite. But this schema is abstracted away from magnitude, so the preposition can be used equally well in The ant crawled across my palm and in The bus drove across the country. Likewise in time, the temporal schema of the past tense
morpheme -ed represents occurrence at a point on the time line before that of the current speech event, but the magnitude of the interval between the two points is irrelevant. Thus, Alexander died young can refer to an acquaintance a year ago or to Alexander the Great over two millennia ago. The topological property of shape neutrality is seen in the preposition through. In one usage, its schema represents motion along a linear path located within a medium. But this path can be of any shape, as seen in I made a bee-line / circled / zigzagged through the woods.
2.3. Concept-structuring function of grammar

Based on their formal and semantic differences, a further major finding is that the two types of form classes exhibit a functional difference. In the conceptual complex evoked by any portion of discourse, the open-class forms contribute most of the content, while the closed-class forms determine most of the structure.

For illustration, consider the sentence A rustler lassoed the steers (a "rustler" being a cowboy who steals another's livestock). Its three open-class morphemes – rustle, lasso, steer – are conceptually rich. Thus, rustle includes concepts of property ownership, illegality, theft, and livestock. Lasso includes the concepts of twirling a looped rope, casting the loop over an animal's head, and tautening and drawing the rope's end. Steer includes the concepts of breeding for human consumption, a certain type of animal, and castration. These morphemes seem to provide most of the conceptual content.

By contrast, the more numerous closed-class forms are conceptually spare. They include: -ed 'occurring before the present moment'; -s 'multiple instantiation'; -ø 'unitary instantiation'; the 'speaker infers that the addressee can identify the referent'; a 'speaker infers that the addressee cannot identify the referent'; -er 'performer of the represented action'; noun status for rustler and steer 'thing'; verb status for lasso 'process'; subject status for rustler 'Agent'; and direct object status for steer 'affected Patient'. These seem to set most of the conceptual structure.

Shifting one class of forms while keeping the other class intact highlights their content/structure division of labor. A shift in all the closed-class forms – as in Will the lassoers rustle a steer? – restructures the conception, but leaves the cowboy-landscape content largely intact. By contrast, a shift in the open-class forms – as in A machine stamped the envelopes – changes content while leaving the structure intact. The crucial conclusion is that the closed-class subsystem is perhaps the most fundamental conceptual structuring system of language. The fact that language may thus have a formally distinct subsystem dedicated to representing conceptual structure may give it a central role in the larger aim of examining conceptual structure across human cognition overall.
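The division of labor just illustrated can also be restated as a simple table of the sentence's meaning-bearing forms. The Python sketch below is merely a tabulation of the analysis given in the text, not a formalism from the cognitive semantics literature; the paraphrases of each form's contribution are abbreviated.

```python
# Open- vs. closed-class forms in "A rustler lassoed the steers",
# restating the analysis above: open-class forms carry content,
# closed-class forms set structure.
FORMS = [
    # (form,           subsystem, contribution)
    ("rustle",         "open",    "theft of another's livestock"),
    ("lasso",          "open",    "twirling and casting a looped rope"),
    ("steer",          "open",    "animal bred for human consumption"),
    ("-ed",            "closed",  "occurring before the present moment"),
    ("-s",             "closed",  "multiple instantiation"),
    ("the",            "closed",  "addressee can identify the referent"),
    ("a",              "closed",  "addressee cannot identify the referent"),
    ("-er",            "closed",  "performer of the represented action"),
    ("subject status", "closed",  "Agent"),
    ("object status",  "closed",  "affected Patient"),
]

def forms_of(subsystem):
    """Return the forms belonging to one subsystem."""
    return [form for form, kind, _ in FORMS if kind == subsystem]

# Swapping one class while holding the other constant shows the split:
# new closed-class forms restructure the same cowboy-landscape content,
# while new open-class forms change content within the same structure.
print("content (open-class):", forms_of("open"))
print("structure (closed-class):", forms_of("closed"))
```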
3. Schematic structure

The structuring of conception just outlined for language is also termed “schematic” in cognitive linguistics. When schematic structure pertains to the meaning of a single morpheme – as above for across – it is termed a “schema” in my own work and an “image-schema” in Lakoff’s (e.g., 1987) work. Schematic structure extends further, though. At a first level, closed-class notions group into conceptual categories (“number” was an initial
example), each with its own distinctive schematic structure. Such categories also largely share certain structural properties, such as the capacity for converting from one category member to another, the multiple nesting of such conversions, and a structural parallelism between objects in space and events in time. At a second level, these categories join in extensive “schematic systems” that structure major sectors of conception. Four of these schematic systems are outlined next.
3.1. Configurational structure

One schematic system, “configurational structure”, comprehends all the respects in which closed-class schemas represent structure for space or time or other conceptual domains, often in virtually geometric patterns (see Talmy 2000a: ch. 3, 2003, 2006; Herskovits 1986; article 98 (Pederson) The expression of space; article 107 (Landau) Space in semantics and cognition). It thus includes much that is within the schemas represented by spatial prepositions, by temporal conjunctions, and by tense and aspect markers, as well as by markers that otherwise interact with open-class forms with respect to object and event structure. This last type of schema is seen in the categories of “plexity” and “state of boundedness”, treated next.
3.1.1. Plexity

The conceptual category of plexity pertains to a quantity’s state of articulation into equivalent elements. Its two main member notions are “uniplexity” and “multiplexity”. The novel term “plexity” was chosen to capture an underappreciated generalization present across the traditional categories of “number” for objects in space and “aspect” for events in time. In turn, uniplexity thus covers both the singular and the semelfactive, while multiplexity covers both plural and iterative. If an open-class form is intrinsically lexicalized for a uniplex referent, a closed-class form in construction with it can trigger a cognitive operation of “multiplexing” that copies its original solo referent onto various points of space or time. Thus, in English, the noun bird and the verb (to) sigh intrinsically have a uniplex referent. But this can be multiplexed by adding -s to the noun, as in birds, or by adding keep -ing to the verb, as in keep sighing. (True, English keep is open-class, but parallel forms in other languages are closed-class iterative forms.) An operation of “unit excerpting” can perform the reverse conversion from multiplexity to uniplexity on an intrinsically multiplex open-class form. In English, this operation is performed only by grammatical complexes, as in going from furniture to (a) piece of furniture or from breathe to take a breath. But other languages have simplex forms for this. Thus, Yiddish goes from groz ‘grass’ to (a) grezl ‘(a) blade of grass’. And Russian goes from čixat’ ‘sneeze a multiplex number of times’ to čixnut’ ‘sneeze once’.
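As a purely illustrative aid (the function names and the feature encoding are hypothetical, my own), the two conversions can be schematized as inverse operations on a plexity feature:

```python
# Minimal sketch: plexity as a feature on a lexical referent, with
# "multiplexing" and "unit excerpting" as inverse operations triggered
# by grammatical forms such as -s, keep -ing, or 'piece of'.

def multiplex(referent: dict) -> dict:
    # e.g. bird -> birds; (to) sigh -> keep sighing
    assert referent["plexity"] == "uniplex"
    return {**referent, "plexity": "multiplex"}

def unit_excerpt(referent: dict) -> dict:
    # e.g. furniture -> a piece of furniture; breathe -> take a breath
    assert referent["plexity"] == "multiplex"
    return {**referent, "plexity": "uniplex"}

bird = {"lexeme": "bird", "domain": "space", "plexity": "uniplex"}
sigh = {"lexeme": "sigh", "domain": "time", "plexity": "uniplex"}
furniture = {"lexeme": "furniture", "domain": "space", "plexity": "multiplex"}

print(multiplex(bird))          # 'birds'
print(multiplex(sigh))          # 'keep sighing'
print(unit_excerpt(furniture))  # '(a) piece of furniture'
```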
3.1.2. State of boundedness

A second conceptual category is “state of boundedness”, with two main member notions, “unboundedness” and “boundedness”. An unbounded quantity is conceptualized as able to continue on indefinitely without intrinsic finiteness. A bounded quantity is conceptualized as an individuated unit entity with a boundary around it. As with plexity, these new terms are intended to capture the commonality across the space and time domains, and
to generalize over such usually separate distinctions as mass and imperfective on the one hand, and count and perfective on the other. An English noun and verb lexicalized for a bounded referent are lake and (to) dress, as seen by their compatibility with in, as in: We flew over a lake in 1 hour and I dressed in 8 minutes. But water and (to) sleep express unbounded referents, as seen by their incompatibility with in, as in: *We flew over water in 1 hour. / *I slept in 8 hours. But a closed-class form can trigger a cognitive operation of “bounding” or “portion excerpting” on these morphemes, as seen in: We flew over some water in 1 hour. / I slept some. The reverse operation of “debounding” to convert a bounded referent into an unbounded one is also represented, at least for objects in space. Thus, the English count nouns (a) shrub / panel can take closed-class suffixes to yield the mass nominals shrubbery / paneling.
3.1.3. Configurational nesting

Schemas from all the schematic systems and the cognitive operations they trigger can be nested to form intricate structural patterns. Specifically, schemas from the plexity and boundedness categories of the configurational schematic system can nest in this way. Nesting can be illustrated first for events in time with the verb (to) flash. The basic uniplex status of this verb is seen in The beacon flashed (once). This uniplex event can be multiplexed as in The beacon kept flashing. This can be bounded as in The beacon flashed 5 times in a row. This can then be treated as a new uniplexity and remultiplexed as in The beacon kept flashing 5 times at a stretch. And this can in turn be rebounded, as in The beacon flashed 5 times at a stretch for 3 hours. A homologous set of structures can be represented for objects in space. This is seen in the following sequence of sentences: I saw a duck. / I saw ducks. / I saw a group of 5 ducks. / I saw groups of 5 ducks each. / I saw 3 acres of groups of 5 ducks each. The progressively greater structural nesting common across these sentence-sets can be represented as follows:
(1) a. !
    b. …!!!!!!…
    c. [!!!!!]
    d. … [!!!!!] – [!!!!!] …
    e. [ [!!!!!] – [!!!!!] … [!!!!!] – [!!!!!] ]
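The progression in (1a–e) can also be read as the iterated application of just two operations, multiplexing and bounding, to a uniplex starting point. The following sketch merely illustrates that combinatorics; the representation choices are mine: a list stands for a bounded unit, and an Ellipsis-flanked tuple stands for an unbounded series.

```python
# Illustrative sketch of the nesting in (1a-e): iterate multiplexing
# (copying a unit into an open-ended series) and bounding (closing a
# series into a unit of n elements).

def multiplex(unit):
    return (..., unit, unit, unit, ...)  # open-ended series of copies

def bound(series, n=5):
    return [series[1]] * n               # close off a group of n units

a = "!"            # The beacon flashed (once).             / I saw a duck.
b = multiplex(a)   # The beacon kept flashing.              / I saw ducks.
c = bound(b)       # ... flashed 5 times in a row.          / a group of 5 ducks
d = multiplex(c)   # ... kept flashing 5 times at a stretch / groups of 5 ducks
e = bound(d, n=3)  # ... at a stretch for 3 hours           / 3 acres of groups

for step in (a, b, c, d, e):
    print(step)
```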
3.2. Perspective point

While the first schematic system, configurational structure, establishes the basic delineations by which a scene or event being referred to is structured, a second schematic system, “perspective point”, directs one as to where to place one’s “mental eyes” to look out at the structured scene or event (see Talmy 2000a: ch. 1). This perspectival system includes a number of conceptual categories, three of which are outlined next.
3.2.1. Perspectival location

One conceptual category, “perspectival location”, is a perspective point’s spatial or temporal positioning within a larger frame. The following two sentences are a spatial
example: The lunchroom door slowly opened and two men walked in. / Two men slowly opened the lunchroom door and walked in. The first sentence induces the listener to locate her perspective point inside the room, whereas the second sentence is conducive to an external perspectival location (or perhaps to a non-specific one). How is this accomplished? The cognitive calculations at work appear to combine a rule of English with geometric knowledge. Though often breached, an apparent general rule in English is that if the initiator of an event is visible, it must be included in the clause expressing the event, but if not visible, it must be omitted. Thus, in the first sentence, no initiator of the door’s opening is mentioned, hence none must have been visible. But the second clause indicates that the apparent initiator, the two men, moved from outside to inside the lunchroom. Assuming opaque walls and door, the only way that an entering initiator could not be visible to an observer during the door’s opening is if that observer were located inside the lunchroom. In the second sentence, by contrast, the initiator is mentioned, hence must be visible. The only way a door-opening initiator who moves from the outside to the inside can be visible to an observational perspective point is if that perspective point is outside.
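The "cognitive calculation" just described amounts to a short chain of inference. The sketch below spells it out; the rule and the geometric step paraphrase the text, but the function and its parameters are entirely my own schematization, not a claimed algorithm:

```python
# Sketch of the inference described above (a schematization for
# illustration): from whether the initiator is mentioned, infer its
# visibility; from visibility plus opaque walls, infer where the
# observer's perspective point must be.

def perspective_location(initiator_mentioned: bool) -> str:
    # English rule (often breached): the initiator is mentioned
    # if and only if it is visible.
    initiator_visible = initiator_mentioned
    # Opaque walls and door: an initiator entering from outside is
    # visible during the door's opening only to an outside observer.
    return "outside (or unspecified)" if initiator_visible else "inside"

# 'The lunchroom door slowly opened and two men walked in.'
print(perspective_location(initiator_mentioned=False))  # inside
# 'Two men slowly opened the lunchroom door and walked in.'
print(perspective_location(initiator_mentioned=True))   # outside
```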
3.2.2. Perspectival distance and motive state

Two further conceptual categories here are “perspectival distance”, with three main member notions (a perspective point’s distal, medial, or proximal distance from a referent entity), and “perspectival motive state”, with two main member notions (a perspective point’s remaining stationary or moving along a path). Both can be illustrated at once by the following two sentences: There are some houses in the valley. / There is a house every now and then through the valley. Both sentences could refer to the exact same physical scene, and that circumstance will be assumed here. But the closed-class forms in the first sentence – the plural subject, the collective quantifier some, and the stationary preposition in – direct a listener to cognize the scene as if from a stationary distal perspective point with global scope of attention. By contrast, the closed-class forms of the second sentence – the singular subject, the distributive temporal phrase, and the motion preposition through – direct a listener to cognize the scene as if with a moving proximal perspective point with local scope of attention, that is, as if with a series of close-up views of successive houses.
3.2.3. Perspectival nesting

As with configurational nesting earlier, the perspectival schematic system also exhibits nesting. Its illustration here shows that perspective applies to time as well as to space as above, and introduces a further category, “direction of viewing”. Consider the sentence At the punchbowl, John was about to meet his first wife-to-be. The expression be about to establishes a perspective point for the speaker shortly before John’s encounter with a particular woman and a direction of viewing prospectively aimed toward that encounter. Next, the expression (wife-)to-be establishes a second prospective viewing that looks ahead to the time when the woman whom John encounters will be his wife. The originating point of this viewing can be taken either as the speaker’s from the same earlier perspective point or as John’s at the time of his encounter, nested within the speaker’s earlier perspective. Then, triggered by the word first, a further prospective
viewing, or family of viewings, points ahead to a subsequent wife or wives following John’s marriage with the woman at the punchbowl. Finally, a perspective point of the speaker at the present moment of speech is established by the past tense of the main verb was. It is this perspective point at which the speaker’s cumulative knowledge of the reported sequence of events is stored as memory and, in turn, which functions as the origin of a retrospective direction of viewing over the earlier sequence. The earlier perspective points are here nested within the scope of the viewing from the current perspective point.
3.3. Distribution of attention

A third schematic system, “distribution of attention”, directs a listener’s attention differentially over the structured scene from the established perspective point (see Talmy 2000a: ch. 4, 2007). Grammatical and other devices set up regions with different degrees of salience, arrange these regions into different patterns, and map these patterns in one or another way over the components of the structured scene. Several patterns are outlined here.
3.3.1. Focal attention

One attentional arrangement is a center-surround pattern with the center foregrounded as the focus and with the surround backgrounded. The grammatical relation of subject status can direct focal attention to the referent of the subject nominal, and alternative selections of subject can place the center of the pattern over different referents, even ones within the same event. Thus, focal attention can be mapped either onto the seller in a commercial transaction, with lesser attention on the remainder, as in The clerk sold the vase to the customer, or onto the buyer, with lesser attention on the new remainder, as in The customer bought the vase from the clerk (see Talmy 2000a: ch. 1). For another realization of this pattern, Fillmore’s (1976) term “frame” and Langacker’s (1987) term “base” refer to a structured set of coentailed concepts in the attentional background. Their respective terms “highlighting” and “profiling” then refer to the foregrounding of the portion of the set that a morpheme refers to directly. An example from Husserl (1970) can illustrate. The nouns husband and wife both presuppose the conception of a married couple in the background of attention, while each focuses attention on one or the other member of such a pair in the foreground.
3.3.2. Level of synthesis

In expressions referring to the same scene, different grammatical forms can direct greater attention to either of two main “levels of synthesis” or of granularity, the Gestalt level or the componential level. Thus, the head status of pyramid in the sentence The pyramid of bricks came crashing down raises its salience over that of bricks with its dependent status. More attention is at the Gestalt level of the whole pyramid, conceptually tracking its overall movement. But the dependency relations are reversed in the sentence The bricks in the pyramid came crashing down. Here, more attention is at the componential level of the constituent bricks, tracking their multiple movements (see Talmy 2000a: ch. 1).
3.3.3. Window of attention

A third pattern is the “window of attention”. Here, one or more (discontinuous) portions of a referent scene are foregrounded in attention (or “windowed”) by the basic device of their explicit mention, while the remainder of the scene is backgrounded in attention (or “gapped”) by its omission from mention. To illustrate, the sentence The pen kept rolling off the uneven table conveys the conception of an iterating cycle in which a pen progresses through the phases of lying on a table, falling down, lying on the ground, and being placed back on the table. But the overt linguistic material refers only to the departure phase of the pen’s cyclic path. Accordingly, only this portion of the total referent is foregrounded in attention, while the remainder of the cycle is relatively backgrounded. With enough context, the alternative sentence I kept placing the pen back on the uneven table could refer to the same cycle. But here, the presence of overt material referring to the return phase of that cycle foregrounds that phase in attention, while now the departure phase is backgrounded (see Talmy 2000a: ch. 4).
3.3.4. Attentional nesting

Nesting was shown for configuration and for perspective, and it can also be seen in the schematic system of attention. It appears in the second of the following two sentences: The customer bought a vase. / The customer was sold a vase. In the second sentence, focal attention is first directed to the seller by the lexical choice of sell but is then redirected to the buyer by the passive voice. If this redirection of attention were total, then the second sentence would be semantically indistinguishable from the first sentence, but in fact it is not. Rather, the redirection of attention is only partial: it leaves intact the foregrounding of the seller’s active intentional role, but it shifts the main focus onto the buyer as target. Altogether, then, it can be said that attention on the seller is hierarchically embedded within a more dominant attention on the buyer.
4. Conceptual organization

In addition to schematic systems, language has many other forms of extensive and integrated conceptual organization, such as the three presented next. Although Figure/Ground organization and factive/fictive organization could be respectively comprehended under the attentional and the configurational schematic systems, and force dynamic organization has elsewhere been treated as a fourth schematic system, these are all extensive enough and cut across enough distinctions to be presented here as separate bodies of conceptual organization.
4.1. Figure/ground organization

In representing many spatial, temporal, equational, and other situations, language is so organized as to single out two portions of the situation, the “Figure” and the “Ground”, and to relate the former to the latter (see Talmy 2000a: ch. 5). In particular, the Figure is a conceptually movable entity; its location or path is conceived as a variable whose particular value is at issue. The Ground is a reference entity with a stationary setting relative to a reference frame; the Figure’s variable is characterized with respect to it.
For a spatial example, consider the sentence The bike is near the house. The bike functions as Figure as a movable object whose location is characterized in terms of the house’s location. The stationary house, set within the implicit reference frame of the neighborhood, etc., correspondingly functions as Ground. The presence of these Figure / Ground functions is demonstrated by the fact that the sentence with the nominals reversed – The house is near the bike – in which the house is now the Figure and the bike is the Ground, clearly has a different meaning and is odd to boot. Since the ‘near’ concept is symmetrical, the meaning difference must be attributed to something like the reversed Figure / Ground roles. Since prototypically a house is not conceptually movable and a bike is not a fixed reference point, these new role assignments clash with our background knowledge and the sentence is flagged as different and odd. The temporal form of Figure / Ground roles can be seen in two events represented by the clauses of a complex sentence. Thus, in the sentence He exploded after he touched the button, the button-touching event, occurring earlier in time, functions as a Ground with its presumptively known location on the time line, while the explosion event functions as a Figure, getting localized on the time line with respect to the button-touching event. As before, these Figure / Ground roles are reversed in the otherwise synonymous sentence He touched the button before he exploded. And as before, these new role assignments clash with the prototypical bases for characterizing such temporal locations and so again flag the sentence as semantically different and unusual.
4.2. Factive/fictive organization

At least in language and visual perception (see Talmy 2000a: ch. 2), a pervasive cognitive pattern can be posited in which two different cognitive subsystems in an individual form discrepant representations of the same entity. Further, a third subsystem in the individual assesses one of those representations as more veridical, or “factive”, and the other as less veridical, or “fictive”. In particular, language abounds in “fictive motion”, in which a factively stationary situation is represented in terms of motion. Of the many categories of fictive motion, two are outlined next.
4.2.1. Coextension paths

The category of fictive motion previously most noticed, “coextension paths”, depicts the form, orientation, or location of a spatially extended object in terms of a path over the object’s extent. An example is the sentence The fence zigzags from the plateau down into the valley. Here, one cognitive subsystem in a listener has the world knowledge that the fence is stationary. But another subsystem responds to the literal wording – specifically, the motion words zigzag, from, down, and into – to evoke a sense of motion along the linear extent of the fence that serves to characterize the fence’s contour and positioning. A parallel sentence, The fence zigzags from the valley up onto the plateau, evokes a sense of motion in the opposite direction. These two sentences together show how a concept – here, that of a sense of directed motion – can be imposed on or imputed to concepts of phenomena in the world through linguistic devices (see 5.1). By contrast, the factive stationariness of the fence might be represented, if poorly, by a sentence like The fence stands in a zigzag pattern at an angle between the plateau and the valley.
4.2.2. Emanation paths

Another category of fictive motion, “emanation paths”, involves the fictive conceptualization of an intangible line emerging from a source object, passing in a straight line through space, and terminating on a target object, where factively nothing is in motion. In one subtype, “demonstrative paths”, a directed line emerges from the pointed front of a source object. This is seen in The arrow points toward / past / away from the town. In the “radiation paths” subtype, a beam of radiation emanates from a radiant object and terminates on an irradiated object. This is seen in Light shone from the sun into the cave. It might be claimed that photons do factively emanate from a radiant object, so that fictive motion need not be invoked. However, we do not see photons, so any representation of motion is cognitively imputed. In any case, in a related subtype, “shadow paths”, no one will claim the existence of “shadowons”, and yet once again fictive motion is seen in a sentence like The pole threw its shadow on the wall. Finally, a “sensory path” is represented as moving from the experiencer to the experienced object in a sentence like I looked into / past / away from the tunnel. Such an emanating “line of sight” can also be represented as moving laterally. Both these forms of fictivity – first lateral, then axial – are represented in I slowly looked down into the well. One question for this fictive category, though, is what determines the direction of the intangible emanation. Logically, since motion is imagined, it should be possible to conceptualize a reversed path. Attempts at representing such reversed paths appear in sentences like *Light shone from my hand onto the sun, or *The shadow jumped from the wall onto the pole, or *I looked from that distant mountain into my eyes. But such formulations do not exist in any language that represents such events fictively. Rather, an “active-determinative” principle appears to govern the direction of emanation. Of the two objects, the more active or determinative one is conceptualized as the source. Thus, relative to my hand, the sun is brighter, hence more active, and must be treated as the source of radiative emanation. My agency in looking is more active than the inanimate perceived object, so I am treated as the source of sensory emanation. And the pole is more determinative – I can move the pole and the shadow will also move, but I cannot perform the opposite operation of moving the shadow and getting the pole to move – so the pole is treated as the source of shadow emanation.
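The active-determinative principle is itself almost algorithmic: of the two objects, the one scoring higher on activity or determinativeness is construed as the source. In the toy sketch below, the numeric scores are invented placeholders, supplied only for illustration:

```python
# Toy sketch of the "active-determinative" principle: the more active
# or determinative of two objects is construed as the source of the
# fictive emanation. The scores are invented for illustration.

def emanation_source(obj_a: dict, obj_b: dict) -> dict:
    return max((obj_a, obj_b), key=lambda o: o["activity"])

sun = {"name": "sun", "activity": 10}       # brighter, hence more active
hand = {"name": "hand", "activity": 2}
pole = {"name": "pole", "activity": 5}      # determinative: move the pole
shadow = {"name": "shadow", "activity": 1}  # and the shadow moves too

print(emanation_source(sun, hand)["name"])     # sun: radiation path
print(emanation_source(pole, shadow)["name"])  # pole: shadow path
```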
4.3. Force dynamics

Language has an extensive conceptual system of “force dynamics” for representing the patterns in which one entity, the “Agonist”, has force exerted on it by another entity, the “Antagonist” (see Talmy 2000a: ch. 7). It covers such concepts as an Agonist’s natural tendency toward action or rest, an Antagonist’s opposition to such a tendency, the Agonist’s resistance to this opposition, and the Antagonist’s overcoming of such resistance. It includes the concepts of causing and letting, helping and hindering, and blockage and the removal of blockage. It generalizes over the causative concepts of traditional linguistics, placing them naturally within a matrix of finer distinctions. It also cuts across conceptual domains, from the physical, to the psychological, to the social, as illustrated next.
4.3.1. The physical domain

A contrast between two sentences can illustrate the physical domain. The sentence The ball rolled along the green represents motion in a force-dynamically neutral way. But The ball kept rolling along the green adds force dynamics to the otherwise same spatial movement. In fact, it has readings for two different force dynamic patterns. Interpreted under the “extended causing of motion” pattern, the ball as Agonist has a natural tendency toward rest but is being overcome by a stronger Antagonist such as the wind. Alternatively, interpreted under one of the “despite” patterns, the ball as Agonist has a natural tendency toward motion and is overcoming a weaker Antagonist such as stiff grass – that is, it moves along despite opposition from the grass.
4.3.2. The psychological domain

An individual’s psyche can be conceptualized and linguistically represented as a “divided self” in which two different components are in force dynamic opposition. To illustrate, the sentence I didn’t respond is force dynamically neutral. But the sentence I refrained from responding, though it still represents a lack of response, now adds in the force dynamic pattern “extended causing of rest”. Specifically, a more central part of me, the Agonist, has a tendency toward responding, while a more peripheral part of me, the Antagonist, opposes this tendency, is stronger, and so blocks a response. The two opposing parts are explicitly represented in the corresponding sentence I held myself back from responding.
4.3.3. The social domain

Much as the closed-class category of prepositions is largely associated with a specific semantic category, that of paths or sites in relation to a Ground object, so the closed-class category of modals is largely associated with the semantic category of force dynamics – in particular, with its social application. Here, certain interpersonal interactions, mediated solely through communication, can be metaphorically represented in terms of force or pressure exerted by one individual or group on another. For example, must, as in You must go to school, represents one of the “causing” force dynamic patterns between individuals. It sets the subject up as an Agonist whose desire – taken as a kind of tendency – is to do the opposite of the predicate’s referent. And it sets up an implicit Antagonist – for example, I, your mother, people at large – that exerts psychological pressure on the Agonist toward performance of the undesired action. The modal may, as in You may go to the playground, instead represents a “letting” force dynamic pattern. Here, the subject as Agonist has a desire or tendency toward the stated action that could have been blocked by an implicit stronger Antagonist, but this potential blockage is withheld.
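Across these three domains, a force-dynamic pattern is fixed by a small set of parameters: the Agonist's intrinsic tendency, the relative strength of the Antagonist, and the resulting state. The encoding below is a hypothetical illustration of that parameter space; only the pattern names and examples come from the text:

```python
# Illustrative encoding of force-dynamic patterns as parameter settings.
from dataclasses import dataclass

@dataclass(frozen=True)
class ForceDynamicPattern:
    agonist_tendency: str     # 'motion' or 'rest'
    antagonist_stronger: bool
    result: str               # 'motion' or 'rest'

PATTERNS = {
    # The ball kept rolling (pushed on by the wind):
    "extended causing of motion": ForceDynamicPattern("rest", True, "motion"),
    # The ball kept rolling (despite the stiff grass):
    "despite (motion)": ForceDynamicPattern("motion", False, "motion"),
    # I refrained from responding:
    "extended causing of rest": ForceDynamicPattern("motion", True, "rest"),
}

for name, p in PATTERNS.items():
    print(f"{name}: tendency={p.agonist_tendency}, "
          f"Antagonist stronger={p.antagonist_stronger}, result={p.result}")
```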
5. Interactions among semantic structures

The preceding discussion has mostly dealt with conceptual structures each in its own terms. But a major aspect of language organization is that conceptual structures, from small to large, can also interact with each other in accordance with certain principles. Such interactions are grouped together below under four extensive categories.
5.1. Conceptual imposition

A widespread view about the contents and structures of cognition is that they ultimately derive from real properties of external phenomena, through processes of perception and abstraction, in what John Searle has called the “mind-to-world direction of fit”. While acknowledging such processes, cognitive linguistics calls attention instead to intrinsic content and structure in cognition – presumably mainly of innate origin – and to how extensive they are. Such native cognitive properties certainly apply to the general functioning of cognition. But they also apply in many forms of “conceptual imposition” – the imputation of certain contents and structures to our conceptions and perceptions of the world in a “world-to-mind direction of fit”. Several realizations of such conceptual imposition are outlined next.
5.1.1. The imputation of content or structure

An initial non-linguistic example of autochthonous cognition is “affect”. Emotions such as anger or affection are experienced either as such or as applied to outside entities. But it is difficult to see how such feelings could arise from a process of abstraction from the external world. Linguistic examples of course abound. Fictive motion offers some immediately striking ones, such as the “shadow path” in The pole threw its shadow onto the wall. As described in 4.2.2, the literal wording here depicts a movement from pole to wall that is not overtly perceived as occurring “out there”. That is, at least the language-related portion of our cognition imposes the conceptualization of motion onto what would be perceived as static. Actually, though, virtually all the semantic structures described so far are forms of conceptual imposition. Thus, in the scene represented by the sentence The post office is near the bank, based on the discussion in 4.1, it could hardly be claimed that the post office is inherently Figure-like and the bank Ground-like, beyond our cognitive imputation of those roles to those objects. And houses dispersed over a valley, as described in 3.2.2, could scarcely possess an associated moving or stationary perspective point from which they are viewed, apart from the linguistic forms that ascribe such perspective to the represented scene.
5.1.2. Alternatives of conceptualization

A consequence of the fact that a particular structural or contentful conception can be imputed to a phenomenon is that a range of alternative conceptions could also be imputed to it. These are alternatives of what my work has termed the “conceptualization” and Langacker’s has termed the “construal” of a phenomenon. For example, as seen in 4.2.1, the fictive motion that could be imputed to a fence along one coextension path, as in The fence goes from the plateau down into the valley, could also be imputed to it along the reverse path, as in The fence goes from the valley up onto the plateau. Or consider the deictics this and that, which establish a conceptual boundary in space and depict an indicated object as being respectively either on the speaker’s side of the boundary or on the side opposite the speaker. Then, referring to the exact same bicycle standing, say, some 8 feet away, a speaker could opt to say either This bike is in my way, or That bike is in my way. The speaker can thus impose alternatives of conceptualization
on the scene, imputing a conceptual boundary either between himself and the bike or on the other side of the bike.
5.1.3. Embodiment

The notion of “embodiment” extends the idea of conceptual imposition. It assumes that such imposed concepts are largely based on experiences humans have of their bodies interacting with environments or on psychological or neural structure. It proposes that such experiences are imputed to, or form the basis of, our understanding of most phenomena (Lakoff & Johnson 1999). In my view, though, the linguistic literature has largely applied the blanket term “embodiment” to a range of insufficiently distinguished ideas that differ in their validity. Four such distinct ideas of embodiment in current use are outlined here. First, in what might be called the “bulk encounter” idea of embodiment, phenomena are grouped and categorized in terms of the way in which our bodies – with their particular shape and mesoscopic size – can interact with them. But this idea is either incorrect or limited. For example, many languages have closed-class representation for a linear configuration, as English does with the preposition along. This preposition applies to a Ground object schematizable as linear. But due to magnitude neutrality (see 2.2), this schema can be applied to objects of quite different sizes, as in The ant climbed up along the matchstick, and The squirrel climbed up along the tree trunk. Yet, although the along schema can group a matchstick and a tree trunk together, we bodily interact with those objects in quite different ways. Accordingly, the bulk encounter idea of embodiment does not account for this and perhaps much else in the structure of linguistically represented conception. Second, in what could be called the “neural infrastructure” idea of embodiment, it is the organization and operation of our neural structure that determines how we conceptualize phenomena. Thus, the linear schematization just cited might arise from neurally based processes of visual perception that function to abstract out just such one-dimensional contours. Comparably, the concept evoked on hearing a word such as bicycle or coffee might arise from the reactivation of the visual, motor, and olfactory areas that were previously active during interaction with those objects, in the manner of Damasio’s “convergence zones”. The problem with this idea of embodiment is that, although generally correct, it is simply subsumed by psychology and needs no separate statement of its own. Third, in what could be called the “concreteness as basic” idea of embodiment, the view is that experience with the tangible world is developmentally earlier and provides the basis for later conceptions of intangible phenomena, much as in Piagetian theory. A commonly cited example is concepts of time based on those of space. Another is the conception of purpose based on that of destination, that is, one’s destination in a physical journey, in what Lakoff (1992) terms the “event structure metaphor”. While something of this directional bias is evident in metaphoric mapping (see below), it is not clear that it correctly characterizes cognitive organization. On the contrary, we may well have an innate cognitive system dedicated to temporal processing – perhaps already evident very early – that includes perception of and control over duration; starting, continuing, and stopping; interrupting and resuming; repeating; waiting; and speeding up and slowing down. We may likewise have an innate cognitive system for intention or purpose. In any
case, “purpose” cannot be derived from “destination”. After all, the concept that a person moving from point X to point Y has Y as a “destination” already includes a component of purpose. When such a component is lacking, we do not say that a person has point Y as her destination but rather that her motion simply “stops” at that point. Accordingly, the notion of purpose present in the concept of “destination” could not derive from perceptions of concrete motion patterns, but might originate in an innate cognitive system for the enactment and conception of intention or purpose. In a fourth and final type here, what can be called the “anti-objectivism” idea of embodiment faults the view that there exists an autonomous truth, uniform and pervasive, in such realms as logic and mathematics that the human mind taps into for its understandings and activities in those realms. Rather, we deal with such realms by imputing or mapping onto them various of our conceptual schemas, motor programs, or other cognitive structures. On this view, we do much of our thinking and reasoning in terms of such experientially derived structures. For example, our sense of the meaning of the word angle is not derived from some independent ideal mathematical realm, but is rather built up from our experience, e.g., from perceptions of a static forking branch, from moving two sticks axially until their ends touch, or from rotating one stick while its end touches that of another. This view of how we think may be largely correct. But if applied too broadly, it might obscure the possible existence of an actual cognitive system for objectivity and reason. Such a system might have the capacity to check for coherence across concepts, for global consistency across conceptual and cognitive domains, and for consistency across inferences and reasoning – whether or not the assessed components themselves arose through otherwise embodied processes – and it might be the source of the very conception of an objective domain.
5.2. Cognitive recruitment

I propose the term “recruitment” for a pervasive cognitive process in which a cognitive configuration with a certain original function or conceptual content gets used to perform another function or to represent some other concept. That is, the basic function or concept is appropriated or co-opted in the service of manifesting another one. Such recruitment would certainly cover all tropes, including fictivity and metaphor. Thus, in a coextension path example of fictive motion like The fence goes from the plateau to the valley (see 4.2.1), the morphemes go, from, and to originally and basically refer to motion, but this reference is conscripted in the service of representing a stationary configuration. And metaphor can also be understood in terms of recruitment. In cognitive linguistics, metaphor has been mainly studied not for its salient poetic form familiar from literature but – under the term “conceptual metaphor” – for its largely unconscious pervasive structuring of everyday expression (see e.g., Lakoff 1992; article 26 (Tyler & Takahashi) Metaphors and metonymies). In the basic analysis, certain structural elements of a conceptual “source domain” are mapped onto the content of a conceptual “target domain”. But in our present terms, it can also be said that the conceptual structures and morphemic meanings original to the source domain are recruited for use as structures and meanings within the target domain. The directionality of the mapping – based on the “concreteness as basic” view of embodiment (see 5.1.3) – is typically from a more concrete domain grounded in bodily experience to a more abstract domain. Thus, the more palpable
domain of space is systematically mapped onto the more abstract domain of time in such everyday expressions as Christmas is ahead / near / almost here / upon us / past. Recruitment can be seen as well in the appropriation of one type of construction to serve as another type. For example, the English question construction with certain modals can serve as a request, as in Could you pass me the salt? Fictivity terminology could be extended to label the host construction here as a fictive question, and the parasitic construction as a factive request. Or it could be said that a request construction has recruited a question construction. Finally, to illustrate functional recruitment, repair mechanisms in discourse, in their basic function, comprise a variety of devices that a speaker uses to remedy hitches that arise in the production of an utterance. Talmy (2000b: ch. 6) cites a recorded example of a young woman rejecting a suitor in which she uses an inordinate density of repair mechanisms, including false starts, interruptions, corrections, and repetitions. But it is evident that these originally corrective devices have been co-opted to perform a different function: to manifest embarrassed concern for the addressee’s sensitive feelings. And, built in turn upon that function is the further function of the speaker’s signaling to the addressee that she did have his feelings in mind.
5.3. Semantic conflict resolution

A conflict or incompatibility often exists between the references of two constituents in a sentence, or between the reference of a constituent and the context or one’s general knowledge (see Talmy 2000b: ch. 5). The treatment of such semantic conflict thus complements treatments of semantic “unification”, in which the referents of constituents integrate unproblematically. A hearer of a conflict generally applies one out of a set of resolutions to it. These include shifts, blends, juxtapositions, and juggling. Of these, the first two are characterized next.
5.3.1. Shifts

In the type of resolution Talmy (1977) termed a “shift” – now largely called “coercion” after Pustejovsky (1995) – the reference of one of the two conflicting forms changes so as to accord with the reference of the other form. A shift can involve the cancellation, stretching, or replacement of a semantic feature. Each of these three types of shifts is illustrated next. The across schema cited in 2.2 can illustrate component cancellation. This schema prototypically involves a horizontal path on a bounded plane from one edge perpendicularly to its opposite. But the path’s termination on the distal edge can be canceled, as in a sentence like The shopping cart rolled across the boulevard and was hit by an oncoming car. Here, the English preposition is not blocked from usage, or replaced by some preposition referring to partial planar traversal, but continues on with one of its semantic components missing. In fact, the preposition can continue in usage even with both of the path’s edge contacts canceled, as seen in The tumbleweed rolled across the desert for an hour. The across schema can also illustrate component stretching. A prototypical constraint on this schema, not mentioned earlier, is that the main axis of the plane, which is perpendicular to the path, may be longer than the path or of the same length, but cannot be
shorter. Accordingly, I can swim “across” a square swimming pool from one edge to the other, or “across” a canal from one bank to the other, but if my path parallels a canal’s banks, I am not swimming “across” the canal but “along” it. But what if I am at an oblong pool and swim from one of the narrow edges to its opposite? In referring to this situation, the acceptability of the sentence I swam across the pool is great where the pool is only slightly longer than a square shape, and decreases as its relative length increases. The across schema thus permits the path length within the relative-axis constraint to be stretched moderately but not too far. Finally, component replacement can be seen in a sentence like She is somewhat pregnant. Here, the gradient specification of somewhat conflicts with the basic all-or-none specification of pregnant. A hearer might resolve this conflict through the mechanism of juxtaposition, to yield the “incongruity effect” of humor. If not, though, the hearer can shift pregnant into accord with somewhat by replacing its ‘all-or-none’ component with that of ‘gradience’. Then the overall meaning of pregnant shifts as well from involving the presence or absence of a fetus to involving the length of gestation.
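The three shift types can be pictured as operations on a schema's component set: cancellation drops a component, stretching relaxes a graded constraint, and replacement swaps one value for another. In the sketch below, the dictionary encoding of the across schema is mine and purely illustrative:

```python
# Sketch of the three shift ("coercion") types as operations on a schema.
across = {
    "ground": "bounded plane",
    "path": "from one edge to its opposite",
    "reaches_far_edge": True,
    "path_to_axis_ratio_max": 1.0,  # path may not exceed the main axis
}

def cancel(schema: dict, component: str) -> dict:
    # The shopping cart rolled across the boulevard (no far-edge arrival)
    return {k: v for k, v in schema.items() if k != component}

def stretch(schema: dict, component: str, factor: float) -> dict:
    # Swimming 'across' a moderately oblong pool
    return {**schema, component: schema[component] * factor}

def replace(schema: dict, component: str, value) -> dict:
    # 'somewhat pregnant': all-or-none replaced by gradience
    return {**schema, component: value}

print(cancel(across, "reaches_far_edge"))
print(stretch(across, "path_to_axis_ratio_max", 1.5))
print(replace({"pregnant": "all-or-none"}, "pregnant", "gradient"))
```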
5.3.2. Blends

An incompatibility between two sets of specifications in a sentence can also be resolved as a “blend”, in which a hearer generates an often imaginative conceptual hybrid that accommodates both of the original conceptual inputs in some novel relation to each other. Talmy (1977) distinguished two types of blends, superimposition and introjection, and illustrated the former with the sentence My sister wafted through the party. The conflict here is between waft suggesting something like a leaf moving gently in an irregular pattern through the air, and the rest of the sentence suggesting a person (moving) through a group of other people. In myself, this sentence evokes the blended conceptualization of my sister wandering aimlessly through the party, somewhat unconscious of the events around her, and of the party somehow suffused with a slight rushing sound of air. Fauconnier & Turner (2002) have greatly elaborated on this process, also terming it a “blend” or a “conceptual integration”. In their terms, two separate mental spaces (see below) can map elements of their content and structure into a third mental space that constitutes a blend of the two inputs, with potentially novel structure. Thus, in referring to a modern catamaran reenacting a century-old voyage by an early clipper, a speaker can say At this point, the catamaran is barely maintaining a 4-day lead over the clipper. The speaker here conceptually superimposes the two treks and generates the apparency of a race.
5.4. Semantic interrelations

In the preceding three subsections, semantic structures have in effect “acted on” each other to yield a novel conceptual derivative. But semantic elements and structures can also simply relate to each other in particular patterns. Four such patterns are outlined next.
5.4.1. Within one sense of a morpheme

Several bodies of research within cognitive linguistics address the structured relations among the semantic components of the meaning of a morpheme in one of its polysemous senses. Two of these are “frame semantics” and “prototype theory”, outlined next.
Fillmore’s (e.g., 1976) Frame Semantics (see article 29 (Gawron) Frame Semantics) shows that the meaning of a morpheme does not simply consist of a central concept – the main concern of a speaker in using the morpheme – but extends out indefinitely with ever further conceptual associations that bear particular relations to each other and to the central concept. In fact, several different morphemes can share roughly the same extended frame while foregrounding different portions in the center. Thus, such “commercial frame” verbs as sell, buy, spend, charge, and cost all share in their frames a seller, a buyer, money, and goods, as well as the transfer of money from the buyer to the seller and, in return for that, the transfer of the goods from the seller to the buyer. Each of these concepts in turn rests on a further conceptual infrastructure. For example, the ‘money’ concept rests on notions of governmental minting and socially agreed value, while the ‘in return for’ concept rests on notions of reciprocity and equity. In Lakoff’s (1987) prototype theory (see article 28 (Taylor) Prototype theory), a morpheme’s meaning can generally be viewed as a category whose members differ in privilege, whose properties can vary in number and strength, and whose boundary can vary in scope. In its most prototypical usage, then, the morpheme refers to the most privileged category member, assigns the fullest set of its properties at their greatest strength to that member, and tightens its boundary to enclose the smallest scope. For an example from Fillmore (1976), the meaning of breakfast – in its most prototypical usage – consists of eating certain foods, namely, eggs, bacon, toast, coffee, orange juice, and the like, at a certain time of day, namely, in the morning. But the meaning can be extended to less prototypical values, for example, either to different foods – The Joneses eat liver and onions for breakfast – or to different times of the day – Breakfast is served all day.
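The commercial frame lends itself to a compact illustration: the frame's role inventory stays constant while each verb foregrounds a different subset of roles. The role list follows the text; which roles each verb profiles is my own plausible assignment, and the encoding is a hypothetical sketch:

```python
# Sketch of a shared frame with verb-specific profiling.
COMMERCIAL_FRAME = {"seller", "buyer", "money", "goods"}

PROFILE = {  # roles each verb plausibly foregrounds
    "sell": {"seller", "goods"},
    "buy": {"buyer", "goods"},
    "spend": {"buyer", "money"},
    "charge": {"seller", "money"},
    "cost": {"goods", "money"},
}

for verb, profiled in PROFILE.items():
    background = COMMERCIAL_FRAME - profiled
    print(f"{verb}: profiles {sorted(profiled)}, "
          f"backgrounds {sorted(background)}")
```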
5.4.2. Across different senses of a morpheme

Brugman (1981) was the first to show that for a polysemous morpheme, one sense can function as the prototype to which the other senses are progressively linked by conceptual increments within a “radial category”. Thus, for the preposition over, the prototype sense may be ‘horizontal motion above an object’ as in The bird flew over the hill. But linked to this by “endpoint focus” is the sense in Sam lives over the hill.
5.4.3. Relations from within a morpheme to across a sentence

The “Motion typology” of Talmy (2000b: ch. 1) proposes a universal semantic framework for an event of motion or location. This consists of four components in the main event proper – the moving or stationary “Figure”, its state of “Motion” (moving or being located), its “Path” (path or site), and the “Ground” that serves as its reference point – plus an outside “Co-event”, typically of Manner or of Cause. Languages differ typologically as to which of these components they characteristically include within the verb of a sentence, and which they locate elsewhere in the sentence. And these two sets of allocations are correlated. Thus, a “verb-framed” language like Spanish characteristically places the components of Motion and Path together in the verb, and so has an extensive series of “path verbs” with meanings like ‘enter’, ‘exit’, ‘ascend’, ‘descend’, ‘cross’, ‘pass’, and ‘return’. In correlation with this lexicalization pattern for the verb, the language has a ready colloquial construction for representing the Co-event – typically a gerund form that can appear
right after the path verb. For example, ‘I ran into the cave’ might be expressed as Entré corriendo a la cueva – literally, “I entered running to the cave”. By contrast, a “satellite-framed” language like English characteristically places the components of Motion and Co-event together in the verb, and so has a series of “Manner verbs” like run, limp, scuttle and speed. In correlation with this lexicalization pattern for the verb, the language also has an extensive series of “path satellites” – e.g., in, out, up, down, past, across and back – as well as a partially overlapping set of path prepositions, together with the syntactic construction for their inclusion after the verb. The English sentence corresponding to the preceding Spanish one is thus: I ran into the cave. This correlation within a language between a verb’s lexicalization pattern and the rest of the syntax in a motion sentence can be put into relief by noting minimally occurring patterns (see Slobin 1996). Thus, Spanish does not have a path satellite category or an extensive set of path prepositions, and in fact can largely not use the prepositions it does have to represent a path. For instance, it could not do so for the cave example. For its part, English does not have a colloquial gerund construction for use with its few path verbs (which in any case are mostly borrowed from Romance languages, where they are native). Thus, a sentence like I entered the cave running is fully awkward.
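The typological correlation can be summed up in a small sketch in which the same semantic components are allocated to different surface slots. The slot templates paraphrase the Spanish and English examples just given; everything else is a hypothetical illustration:

```python
# Sketch of the two lexicalization patterns for a motion event.
components = {"Figure": "I", "Path": "into", "Ground": "the cave",
              "Co-event": "running"}

def verb_framed(c: dict) -> str:
    # Spanish type: Motion+Path in the verb, Co-event as a gerund.
    # Cf. 'Entre corriendo a la cueva' ("I entered running to the cave").
    return f"{c['Figure']} PATH-VERB(enter) GERUND({c['Co-event']}) {c['Ground']}"

def satellite_framed(c: dict) -> str:
    # English type: Motion+Co-event in the verb, Path as a satellite
    # or preposition. Cf. 'I ran into the cave'.
    return f"{c['Figure']} MANNER-VERB(run) {c['Path']} {c['Ground']}"

print(verb_framed(components))
print(satellite_framed(components))
```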
5.4.4. Across a sentence

Fauconnier (1985) shows how different portions of a sentence can set up distinct “mental spaces” with particular relations to each other. Each such space is a relatively self-contained conceptual domain with its component elements in a particular arrangement; two spaces can share many of the same elements; and a mapping can be established between corresponding elements. The mapping is directional, going from a “base” space – a conceptual domain generally factual for the speaker – to a “subordinate” space that can be counterfactual, representational, at a different time, etc. Thus, in Max thinks Harry’s name is Joe, the speaker’s base space includes ‘Max’ and ‘Harry’ as elements; the word thinks sets up a subordinate space for a portion of Max’s belief system; and this contains an element ‘Joe’ that corresponds to ‘Harry’.
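A minimal data sketch of this example may help (the layout is hypothetical and far simpler than Fauconnier's actual apparatus): a base space, a subordinate belief space set up by thinks, and a directional counterpart mapping between their elements:

```python
# Sketch of mental spaces for 'Max thinks Harry's name is Joe'.
base = {  # the speaker's base space, generally factual
    "Max": {"type": "person"},
    "Harry": {"type": "person", "name": "Harry"},
}

belief_space = {  # subordinate space: part of Max's belief system
    "owner": "Max",
    "elements": {"Joe": {"type": "person", "name": "Joe"}},
}

# Directional mapping from base elements to their counterparts
counterpart = {"Harry": "Joe"}

c = counterpart["Harry"]
print("Base space name:", base["Harry"]["name"])         # Harry
print("Counterpart's name in Max's belief space:",
      belief_space["elements"][c]["name"])                # Joe
```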
6. Conclusion

In this survey, the field of cognitive linguistics in general and of Cognitive Semantics in particular is seen to have as its central concern the representation of conceptual structure in language. The field addresses properties of conceptual structure both local and global, both autonomous and interactive, and both typological and universal. And it relates these linguistic properties to more general properties of cognition. While much has already been done in this relatively young linguistic tradition, it remains quite dynamic and is extending its explorations in a number of new directions.
7. References

Brugman, Claudia 1981. The Story of ‘Over’. MA thesis. University of California, Berkeley, CA.
Fauconnier, Gilles 1985. Mental Spaces. Aspects of Meaning Construction in Natural Language. Cambridge, MA: The MIT Press.
Fauconnier, Gilles & Mark Turner 2002. The Way We Think. Conceptual Blending and the Mind’s Hidden Complexities. New York: Basic Books.
Fillmore, Charles 1975. An alternative to checklist theories of meaning. In: C. Cogen et al. (eds.). Proceedings of the First Annual Meeting of the Berkeley Linguistics Society (= BLS). Berkeley, CA: Berkeley Linguistics Society, 155–159.
Fillmore, Charles 1976. Frame semantics and the nature of language. Annals of the New York Academy of Sciences 280, 20–32.
Geeraerts, Dirk & Hubert Cuyckens (eds.) 2007. The Oxford Handbook of Cognitive Linguistics. Oxford: Oxford University Press.
Givón, Talmy 1989. Mind, Code, and Context. Essays in Pragmatics. Hillsdale, NJ: Erlbaum.
Herskovits, Annette 1986. Language and Spatial Cognition. An Interdisciplinary Study of the Prepositions in English. Cambridge: Cambridge University Press.
Husserl, Edmund 1900–01/1970. Logische Untersuchungen. 2nd edn. Halle/Saale: Niemeyer. English translation in: J. N. Findlay. Logical Investigations, vol. 2. London: Routledge & Kegan Paul, 1970.
Lakoff, George 1987. Women, Fire, and Dangerous Things. What Categories Reveal about the Mind. Chicago, IL: The University of Chicago Press.
Lakoff, George 1992. The contemporary theory of metaphor. In: A. Ortony (ed.). Metaphor and Thought. 2nd edn. Cambridge: Cambridge University Press. 1st edn. 1979.
Lakoff, George & Mark Johnson 1999. Philosophy in the Flesh. The Embodied Mind and its Challenge to Western Thought. New York: Basic Books.
Langacker, Ronald 1987. Foundations of Cognitive Grammar, vol. 1. Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Langacker, Ronald 1991. Foundations of Cognitive Grammar, vol. 2. Descriptive Application. Stanford, CA: Stanford University Press.
Lappin, Shalom (ed.) 1997. The Handbook of Contemporary Semantic Theory. Oxford: Blackwell.
Neely, James H. 1991. Semantic priming effects in visual word recognition. A selective review of current findings and theories. In: D. Besner & G. W. Humphreys (eds.). Basic Processes in Reading. Visual Word Recognition. Hillsdale, NJ: Erlbaum, 264–336.
Pustejovsky, James 1995. Linguistic constraints on type coercion. In: P. Saint-Dizier & E. Viegas (eds.). Computational Lexical Semantics. Cambridge: Cambridge University Press, 71–97.
Slobin, Dan 1996. Two ways to travel. Verbs of motion in English and Spanish. In: M. Shibatani & S. Thompson (eds.). Grammatical Constructions. Their Form and Meaning. Oxford: Clarendon Press, 195–219.
Talmy, Leonard 1977. Rubber-sheet cognition in language. In: W. Beach, S. Fox & S. Philosoph (eds.). Papers from the Thirteenth Regional Meeting of the Chicago Linguistic Society (= CLS). Chicago, IL: Chicago Linguistic Society, 612–628.
Talmy, Leonard 2000a. Toward a Cognitive Semantics, vol. 1. Concept Structuring Systems. Cambridge, MA: The MIT Press.
Talmy, Leonard 2000b. Toward a Cognitive Semantics, vol. 2. Typology and Process in Concept Structuring. Cambridge, MA: The MIT Press.
Talmy, Leonard 2003. The representation of spatial structure in spoken and signed language. In: K. Emmorey (ed.). Perspectives on Classifier Constructions in Sign Language. Mahwah, NJ: Erlbaum, 169–195.
Talmy, Leonard 2006. The fundamental system of spatial schemas in language. In: B. Hampe (ed.). From Perception to Meaning. Image Schemas in Cognitive Linguistics. Berlin: Mouton de Gruyter, 199–234.
Talmy, Leonard 2007. Attention phenomena. In: D. Geeraerts & H. Cuyckens (eds.). The Oxford Handbook of Cognitive Linguistics. Oxford: Oxford University Press, 264–293.
Leonard Talmy, Buffalo, NY (USA)
28. Prototype theory

1. Introduction
2. Prototype effects
3. Prototypes and the basic level
4. The cultural context of categories
5. Prototypes and categories
6. Objections to prototypes
7. Words and the world
8. Prototypes and polysemy
9. References
Abstract

According to a long-established theory, categories are defined in terms of a set of features. Entities belong in the category if, and only if, they exhibit each of the defining features. The theory is problematic for a number of reasons. Many of the categories which are lexicalized in language are incompatible with this kind of definition, in that category members do not necessarily share the set of defining features. Moreover, the theory is unable to account for prototype effects, that is, speakers’ judgements that some entities are ‘better’ examples of a category than others. These findings led to the development of prototype theory, whereby a category is structured around its good examples. This article reviews the relevant empirical findings and discusses a number of different ways in which prototype categories can be theorized, with particular reference to the functional basis of categories and their role in broader conceptual structures. The article concludes with a discussion of how the notion of prototype category has been extended to handle polysemy, where the various senses of a word can be structured around, and can be derived from, a more central, prototypical sense.
1. Introduction

In everyday discourse, the term ‘prototype’ refers to an engineer’s model which, after testing and possible improvement, may then go into mass production. In linguistics and in cognitive science more generally, the term has acquired a specialized sense, although the idea of a basic unit, from which other examples can be derived, may still be discerned. The term, namely, refers to the best, most typical, or most central member of a category. Things belong in the category in virtue of their sharing of commonalities with the prototype. Prototype theory refers to this view on the nature of categories. This article examines the role of prototypes in semantics, especially in lexical semantics. To the extent that words can be said to be names of categories, prototype theory becomes a theory of word meaning. Prototype theory contrasts with the so-called classical, or Aristotelian theory of categorization (Lakoff 1982, 1987; Taylor 2003a). According to the classical theory, a category is defined in terms of a set of properties, or features, and an entity is a member of the category if it exhibits each of the features. Each of the features is necessary, jointly they are sufficient. The classical theory captures the ‘essence’ of a category in contrast to
the 'accidental' properties of category members. The theory entails that categories have clear-cut boundaries and that all members, in their status as category members, have equal status within the category. An example which is often cited to illustrate the classical theory is the category 'bachelor', defined in terms of the features [+human], [+adult], [+male], and [–married]. Any entity which exhibits each of the four defining features is, by definition, a member of the category and thus can bear the designation bachelor. Something which exhibits only three or fewer of the features is not a member. In their status as bachelors, all members of the category are equal (though they may, of course, differ with respect to non-essential, accidental features, such as their height, wealth, and such like).

The classical theory is attractive for a number of reasons. First, the theory neatly accounts for the relation of entailment. If X is a bachelor, then necessarily X is unmarried, because 'unmarried' is a property already contained in the definition of bachelor. Second, the theory explains why some expressions are contradictions. In married bachelor, or This bachelor is married, the property designated by married conflicts with a definitional feature of bachelor. Third, the theory accounts for the distinction between analytic and synthetic statements. This bachelor is a man is analytic; it is necessarily true in virtue of the definitions of the words. This man is a bachelor is synthetic; its truth is contingent on the facts of the matter. The theory also goes some way towards explaining concept combination. A rich bachelor refers to an entity which exhibits the features of bachelor plus the features of rich.

In spite of these obvious attractions, there are many problems associated with the classical theory. First, the theory is unable to account for the well-documented prototype effects, to be discussed in section 2. Another problem is that the theory says nothing about the function of categories. Organisms categorize in order to manage the myriad impressions of the environment. If something looks like an X, then it may well be an X, and we should behave towards it as we would towards other Xs. The classical theory is unable to accommodate this everyday kind of inferencing. According to the classical theory, the only way to ascertain whether something is an X is to check it out for each of the defining features of X. Having done that, there is nothing further to say about the matter. Categorization of the entity serves no useful purpose to the organism. A related matter is that the theory makes no predictions as to which categories are likely to be lexicalized in human languages. In principle, any random set of features can define a category. But many of these possible categories – for example, a category defined by the features [is red], [was manufactured before 1980], [weighs 8 kg] – although perfectly well-formed in terms of the theory, are not likely to be named in the lexicon of any human language.

A further set of problems arises in regard to the features. In terms of the theory, the features are more basic than the categories which they define. In many cases, however, the priority of the features might be questioned. [Having feathers] may well be a necessary feature of 'bird'.
But do we comprehend the category ‘bird’ on the basis of a prior understanding of what it means for a creature to have feathers, in contrast, say, to having fur, hair, scales, or spines? Probably not. Rather, we understand ‘feathers’ in consequence of our prior acquaintance with birds. Another point to bear in mind is that each feature will itself define a category, namely, the category of entities exhibiting that feature. The feature [+adult] – one component of the definition of bachelor – singles out the category of adults. We will need to provide a classical definition of the category ‘adult’, whose
features will in turn need to be given classical definitions. Unless we are prepared to postulate a set of universal, primitive features, out of which all possible categories are constructed – as in the controversial programme pursued by Wierzbicka and her associates (see Wierzbicka 1996; also article 21 (Cann) Sense relations) – we are faced with an infinite regress.

Not the least of the problems associated with the classical theory is that it is difficult to find convincing examples of words which designate classical categories. The word bachelor is cited with such depressing regularity in expositions of the classical theory – the present article is no exception – because it is one of the few words which might be amenable to this kind of analysis. (However, as we shall see, even the case of bachelor is not so straightforward.) Of course, scientists, bureaucrats, and various other kinds of experts may attempt to give rigorous definitions of categories relevant to their activities. But even technical and scientific terms may turn out to be problematic for the classical theory. Consider the recent discussions as to whether Pluto is a planet or some other kind of solar object, such as a comet or an asteroid. That such an issue could arise amongst the experts (in this case, the astronomers) demonstrates that 'planet' may not be susceptible to a classical definition, and, even if it were, there may still be uncertainty over whether a given entity exhibits each of the defining features.

As we shall see, prototype theory is able to accommodate, with varying degrees of success, the objections raised above, and therefore offers itself as an alternative to the classical theory. The starting point is the observation that for many categories, certain members seem to be more central, more basic, more typical than others; categories, therefore, have an internal structure, in that their members are not all of equal status. Given this approach, we need not stipulate that members of a category have to share a set of category-defining features. Moreover, the boundary of the category (the distinction between what is in and what is outside the category) may not be clear-cut. The possibility also arises that categories are learned and represented, not as combinations of features, but, in the first instance, on the basis of good examples.

Before proceeding, a word of caution is called for. In spite of the title of this article and the introductory remarks above, it may be inappropriate to speak of 'prototype theory' tout court. What we have, in the first instance, are prototype effects – very robust and undisputed empirical findings concerning goodness-of-example ratings of members of a category. These are discussed below. The interpretation of prototype effects and their theoretical significance, however, are far from uncontroversial, in both psychology and linguistic semantics; for overviews, see Geeraerts (1989), Kleiber (1990), Lewandowska-Tomaszczyk (2007), MacLaury (1991), Murphy (2002), and Violi (1997). The common ground is that prototype effects impose a condition on a theory of categorization; the theory, namely, must be able to accommodate, and even predict, these effects. But instead of there being a single 'theory of prototypes', there are, as we shall see, a number of distinct theoretical approaches to the issue.
2. Prototype effects

Prototype effects have been documented by many researchers, for many different kinds of categories. An early and well-known study is Labov (1973) on the names of household receptacles like cup, mug, bowl, vase, pitcher. Labov showed subjects line drawings of receptacles which varied in a number of ways, such as the ratio of height to depth, the
shape of the cross-section (circular, square, or triangular), whether tapering towards the bottom or not, whether with or without a handle. The finding was that certain receptacles were unanimously called cups, others bowls or vases, whereas others elicited variable judgements. Judgements could also be shifted by asking subjects to imagine the receptacles holding coffee, mashed potato, or flowers. The upshot for Labov was that words such as cup, bowl, and vase could not be defined by clear-cut criteria, but only in terms of what typical exemplars might look like and what they might typically be used for. Cups usually have a handle, though some cups do not; they usually taper towards the bottom, though need not; they are typically used for drinking hot tea or coffee, but could also be used for drinking soup or cold milk; they usually come with a saucer, though not necessarily. None of these properties, though typical and expected, is strictly speaking a necessary feature of 'cup'. There is, as Labov asserts, no 'essence' (in the classical theoretical sense) of 'cup'.

The idea that category membership may not be dependent on the sharing of features had also been argued by Wittgenstein (1978: 33–34) in his remarks on game (more precisely, since Wittgenstein was writing in German, on Spiel). Features such as 'for amusement', 'requires two or more players', 'involves competition between players', 'requires skill', 'depends on chance' are distributed over members of the category rather like the characteristics of a family (the family chin, the family nose, and the like) are distributed over the family members. Each game does not have to exhibit the full set of game-like features, just as a family member does not have to exhibit each of the family attributes in order to be recognized as such. Wittgenstein's analogy has given rise to the concept of the 'family resemblance category'. It is worth noting, however, that Wittgenstein did not propose that certain games might be 'better', or more prototypical, examples than others.

The researcher who is perhaps best known to linguists for work on prototype effects is the cognitive psychologist Eleanor Rosch. Rosch's earliest work (published under the name of Heider) addressed colour categories and their encoding in language. The fact that different languages carve up the colour spectrum in different ways has long fascinated scholars, and was cited by Gleason (1955: 4) as an illustration of the essentially arbitrary way in which semantic domains are structured by language. This view was questioned by Berlin & Kay's (1969) cross-linguistic survey of colour terminology. They confirmed that languages do indeed differ with respect to the number of their colour categories and the extensional range of their colour terms. However, when speakers of different languages are asked to identify 'good examples' of their colour words, the amount of cross-linguistic diversity is greatly reduced. Speakers of different languages, when asked to pick out a 'good red', tend to select colours from a very limited area of the colour spectrum. Rather than corroborating the cross-linguistic diversity of colour terminology, Berlin & Kay's research strongly suggested the existence of a universal set of focal colours (eleven, to be precise), from which all languages make their selection.
It also suggested that the sequence in which languages elaborate their colour terminology tends to follow a universal path, with black, white, and red lexicalized first, and pink, brown, purple, and grey appearing last. Heider (1971, 1972) found other experimental correlates of what she referred to as focal colours. For example, subjects were faster in naming focal than non-focal colours, and focal colours were better remembered on a short-term memory task. Speakers of a language with a very restricted colour vocabulary (the Dani, of Irian Jaya) were able to learn terms for focal colours faster than terms for non-focal colours. The evidence led
Rosch to suppose that colour categories were learned and structured around their focal, or prototypical exemplars:

[c]olor categories are processed by the human mind (learned, remembered, denoted, and evolved in languages) in terms of their internal structure; color categories appear to be represented in cognition not as a set of criterial features with clear-cut boundaries but rather in terms of a prototype (the clearest cases, best examples) of the category, surrounded by other colors of decreasing similarity to the prototype and of decreasing degree of membership. (Rosch 1975: 193)
Subsequently, Rosch extended her research to the categories encoded by other linguistic items, specifically, natural kind terms such as fruit and bird, and nominal kind terms such as furniture and vehicle (Rosch 1975; Rosch et al. 1976). The basic experimental paradigm was very simple. A group of subjects is presented with a category name, e.g. furniture. They are then given a list of possible members of the category, and asked to rate on a 7-point scale the extent to which each item "represented their idea or image of the meaning of the category name" (Rosch 1975: 198). A principal finding was that subjects tended to rank the category members rather similarly, with chair and sofa being judged good examples of 'furniture', chest and bookcase less good, and clock and vase very poor examples. These are the goodness-of-example ratings referred to above.

Prototype effects – the finding that members of a category can be rated in terms of how good they are – are now very well documented. They pertain to natural kind terms (bird, tree, etc.), names of artifacts (furniture, vehicle), emotion concepts (Fehr & Russell 1984), as well as artificial categories (such as displays of dots, or sequences of letters and numbers). They show up with ad hoc categories (such as 'things that can fall on your head': Barsalou 1983) and goal-derived categories ('things to pack in a suitcase': Barsalou 1991). While most research has focused on categories designated by nominals, prototype effects have also been reported for verbal (Pulman 1983) and adjectival (Dirven & Taylor 1988) categories. Most spectacularly, they show up even with categories which arguably do have a classical definition, such as 'odd number' (Armstrong, Gleitman & Gleitman 1983).

One might, of course, counter that the goodness-of-example ratings reported by Rosch and others are simply artifacts of the experimental situation and of the specific instructions that the subjects received. This view must be tempered by the fact that the goodness-of-example ratings turn out to be relevant on a number of other tasks. These are reported in Rosch et al. (1976) and include list effects, verification times, and priming effects. Thus, when asked to list members of a category, subjects tend to name good examples first. When asked to evaluate as true or false a sentence of the form X is a Y, subjects respond faster if X is a good example of Y (or not at all an example of Y) than when it is a not-so-good or marginal example. In addition, performance on a lexical decision task (in which subjects are required to decide, as quickly as possible, whether a string of letters constitutes a word or not) is enhanced if the target word is a good example of a category for which subjects have been primed by prior exposure to the category name. Thus, exposure to the word fruit facilitates recognition of apple as a word, as compared to recognition of olive as a word.

The converging evidence from these different experimental paradigms – some, it will be noted, like priming, involving on-line tasks – strongly suggests that the goodness-of-example ratings cannot be dismissed as artifacts of the rating technique. On the
contrary, there is reason to suppose that the various paradigms are tapping into a common representational format for the categories in question.

Mention should also be made of a specifically linguistic manifestation of goodness-of-example effects, namely, the use of hedges (Lakoff 1972). The hedges in question are adverb-like expressions which speakers can use in order to comment on the appropriateness of an entity's categorization. While penguins are undoubtedly birds, it would be odd (perhaps, even false) to say that penguins are birds par excellence. Par excellence picks out prototypical members of a category. And while bats are not birds – at least, strictly speaking they are not birds – it may nevertheless be true (or not obviously false) to claim that loosely speaking they are birds. Certain syntactic constructions may also have a hedging effect. I am not much of a cook conveys that the speaker regards herself as only a very marginal member of the category 'cook'.
3. Prototypes and the basic level

An important topic in Rosch's work was the relation between prototype effects and levels of categorization (Rosch 1978). As an illustration of what is meant by 'levels of categorization', consider an example from Brown (1958). The thing on the lawn may be named in various ways; it could be called a dog, a boxer, a quadruped, or an animate being. These categories stand in a taxonomic relation: a boxer is a kind of dog, a dog is a kind of quadruped, and a quadruped is a kind of animate creature. Each of the designations for the thing on the lawn may be equally correct. Yet they are not equally likely to be used. If a foreigner were to ask what the thing is called in English, you would probably say that it was a dog, possibly that it was a boxer, but hardly that it was a quadruped. The basic level is the level in a taxonomy at which things are normally named, in the absence of reasons to the contrary. 'Dog' is a basic level category, 'boxer' a subordinate category, 'quadruped' a superordinate category.

To investigate the taxonomic relation between categories, Rosch & Mervis (1975) asked subjects to list the properties of basic, superordinate, and subordinate level terms. The general finding was that superordinate terms, such as vehicle, clothing, and fruit, elicited relatively few and rather general properties. Names at an intermediate level of categorization, e.g. apple, were associated with a much richer set of properties; moreover, the properties tended to be distinctive to that term and not to apply to other members of the superordinate category. Importantly, these features often had to do with the overall appearance of the entity, its constitution, its parts and their arrangement, as well as interactional properties, that is, how the thing is handled and how one would behave with respect to it. Subordinate terms (e.g. Granny Smith) also elicited a rich set of properties. These, however, tended to overlap with those of the basic level, and also with those of neighbouring terms (that is, names for other kinds of apple).

These findings shed light on a number of issues. First, they are able to provide a functional explanation for the salience of the basic level. Superordinate categories tend to be rather uninformative in comparison with basic and subordinate terms. To learn that something is a 'piece of fruit' does not tell you very much about it. Basic level and subordinate terms are much richer in information. However, in comparison to subordinate terms, basic level terms tend to be contrastive. Apples, oranges, and bananas contrast on many dimensions, particularly their appearance and how we go about eating them, whereas different kinds of apple contrast only minimally in these respects. The basic
level thus turns out to be the most informative and efficient of the taxonomic levels. It is the level which packs the most information, both in terms of what something is and with respect to what it is not. It is not surprising, therefore, that basic level terms tend to be of high frequency, short, and learned early in first language acquisition.

The findings also make possible a more sophisticated understanding of prototypes. As noted, basic level categories tend to be contrastive. It is possible, then, to view the prototype as a category member which exhibits the maximum number of features which are typical of the category and which are not shared by members of neighbouring categories. Rosch et al. (1976) in this connection speak of the cue validity of features. A feature has high cue validity to the extent that presence of the feature is a (fairly) reliable predictor of category membership. For example, [having a liver] has almost zero cue validity with respect to the category 'bird'. Although all birds do have a liver, so also do countless other kinds of creature. On the other hand, [having feathers] has very high cue validity, since being feathered is a distinctive characteristic of most, if not all, birds, and birds alone. [Being able to fly] has a somewhat lower, but still quite high cue validity, since there are some other creatures (butterflies, wasps, bats) which also fly. [Not being able to fly] would have very low cue validity, not only because it applies to only a few kinds of birds, but because there are countless other kinds of things which do not fly. It is for these reasons that being able to fly, and having feathers, feature in the bird prototype.

Other researchers (e.g. Murphy 2002: 215) have drawn attention to the converse of cue validity, namely, category validity. If cue validity can be defined as the probability of membership in category C, given feature f, i.e. P(C | f), category validity can be defined as the probability that an entity will exhibit feature f, given its membership in C, i.e. P(f | C). The interaction of cue and category validity offers an interesting perspective on inferencing strategies, mentioned in section 1 in connection with the classical theory. (See also article 106 (Kelter & Kaup) Conceptual knowledge, categorization and meaning.) A person observes that entity e exhibits feature f. If the feature has high cue validity with respect to category C, the person may infer, with some degree of confidence, that e is a member of C. One may then make inferences about e, based on category validity. For example, an entity with feathers is highly likely to be a bird. If it is a bird, it is likely to be able to fly, and much more besides.

The hypothetical category mentioned in section 1 – the category defined by the features [is red], [was manufactured before 1980], and [weighs 8 kg] – exhibits extremely low cue and category validity, and this could be one reason why such a category would never be lexicalized in a human language. The fact that something is red scarcely predicts membership in the category, since there are countless other red things in the universe; likewise for the two other features. The only way to assign membership in the category would be to check off each of the three features. Having done this, one can make no predictions about further properties of the entity. To all intents and purposes, the hypothetical category would be quite useless.
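The two probabilities can be made concrete with a small worked computation. The following sketch is illustrative only: the co-occurrence counts, feature names, and function names are all invented for the purpose. It estimates cue validity P(C | f) and category validity P(f | C) from a table of counts:

# Estimating cue validity P(C|f) and category validity P(f|C) from
# invented co-occurrence counts. counts[c][f] = number of observed
# entities of category c exhibiting feature f; totals[c] = category size.
counts = {
    "bird": {"has_feathers": 98, "can_fly": 90, "has_liver": 100},
    "bat":  {"has_feathers": 0,  "can_fly": 100, "has_liver": 100},
    "fish": {"has_feathers": 0,  "can_fly": 0,   "has_liver": 100},
}
totals = {"bird": 100, "bat": 100, "fish": 100}

def cue_validity(category, feature):
    # P(C | f): how reliably the feature predicts the category.
    bearers = sum(c.get(feature, 0) for c in counts.values())
    return counts[category].get(feature, 0) / bearers if bearers else 0.0

def category_validity(category, feature):
    # P(f | C): how likely a category member is to exhibit the feature.
    return counts[category].get(feature, 0) / totals[category]

print(cue_validity("bird", "has_feathers"))   # high: feathers diagnose birdhood
print(cue_validity("bird", "has_liver"))      # low: livers are found everywhere
print(category_validity("bird", "can_fly"))   # high: most birds fly

On these toy figures, [has_feathers] picks out 'bird' with certainty, while [has_liver] does so only at chance level among the three categories, mirroring the contrast drawn in the text.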
A functional account of prototypes, outlined above, may be contrasted with an account in terms of frequency of occurrence. In response to the question 'Where does prototypicality come from?' (Geeraerts 1988), many people are inclined to say that prototypes (or prototypical instances) are encountered more frequently than more marginal examples and that that is what makes them prototypical. Although frequency of occurrence certainly may be a factor (our prototypical vehicles are now somewhat different from those of 100 years ago, in consequence of changing methods of transportation), it cannot be the
whole story. Sofas and chairs are prototypical pieces of furniture; clocks and bookcases are not. But this is not due to the fact (if it is a fact) that we encounter sofas and chairs more frequently than clocks and bookcases. The intuition that prototypes occur more frequently could well be a consequence of prototype structure, not its cause.
4. The cultural context of categories

Rosch's work on categorization appealed extensively to features, attributes, and properties. This manner of speaking is liable to suggest that the attributes have some kind of priority vis-à-vis the categories. Rosch (1978: 42) came to question this assumption. She noted that a typical attribute of chairs is that they 'have a seat'. However, the very notion of something 'having a seat' is based on prior knowledge of how one interacts with chairs and chair-like objects. It is as if the attribute derives from knowledge of the category, rather than the category being a function of its attributes.

This observation has led to several interesting developments. The first is that objects – especially, basic level objects – may be apprehended holistically and experientially, in terms of what they look like, how they are put together, how we behave with respect to them, and the roles they play in our interaction with the environment. Features, in turn, come to be seen, not as pre-existing building blocks out of which categories are constructed, but as commonalities which speakers perceive in an array of category instances (Langacker 1987: 22).

The second issue concerns the cultural embeddedness of categories. The categories that we recognize in the world are not objectively 'there', but are mediated by human concerns, interests, and values. As Rosch came to recognize, it would be an error to suppose that the categories that we identify in the world merely reflect "the natural correlation of attributes" (Rosch 1975: 197). This kind of objectivist view would predict (incorrectly) that all languages would identify the same categories in the world, and that categories would change only if the environment changed.

The theme of the cultural embeddedness of categories was pursued by Murphy & Medin (1985), who argued that a category is coherent and useful to its users to the extent that it plays a role in wider scenarios, in causal relations, or in deeply held beliefs. The theme was also addressed by Lakoff (1987) in terms of his notion of the Idealized Cognitive Model (ICM). The notion can be illustrated on the example of 'bachelor', introduced at the beginning of this article. Intuitively, the definition of bachelor in terms of the four features [+human], [+adult], [+male], and [–married] seems reasonable enough. However, even this parade example of a classical category raises a number of issues. Consider, for example, the feature [+adult]. This feature itself defines a category, namely the category of adults. But how do we define this category? Bureaucrats may, of course, give the category a precise classical definition, namely in terms of a person's age (18 years or older, or whatever). But in everyday usage, the word surely appeals to a number of aspects in addition to age, such as emotional and physical maturity, independence from parents, assumption of responsibilities, and so on. The category will inevitably have fuzzy boundaries and this fuzziness will be inherited by bachelor (would one confidently apply the term to an immature 18-year-old?). Consider, also, the feature [–married]. Marriage is a cultural institution par excellence, and such a feature can in no way be regarded as an 'objective' feature of the environment.
Although often cited as an example of a classical category, 'bachelor' is arguably subject to prototype effects. There are good examples of the category, less good, and marginal examples. Do Catholic priests count as bachelors? Is the Pope a bachelor? Tarzan? Men in long-term unmarried relationships? Gay men? Men in polygamous societies, who have only one wife but who are eligible to have another? Is it totally excluded to apply the word to women? Is bachelor girl a contradiction, and therefore meaningless?

One approach to this issue was suggested by Fillmore (1982) and developed by Lakoff (1987). The proposal is that the concept 'bachelor' needs to be understood against an Idealized Cognitive Model of society. According to the ICM, everyone is heterosexual and there is a certain age range at which everyone is expected to marry. Men who pass this age do so out of choice; they do not want the 'commitments' of marriage. Women who pass the age do so out of necessity; they cannot find a willing mate. (From these aspects of the ICM follow the generally positive connotations of bachelor and the negative associations of spinster.) In terms of the ICM, bachelor can indeed be defined, quite simply, as an (as yet) unmarried man, as per the classical theory. Prototype effects arise because the model does not always fit the social reality. The ICM makes no allowance for Catholic priests, gay people, or people in unmarried relationships.

Another example is provided by the notion of telling a lie. In a well-known article, Coleman & Kay (1981) promoted the notion of prototype category on the example of 'lie'. They surmised that there might be three features relevant to the categorization of a statement as a lie: its factual incorrectness, the speaker's belief in its factual incorrectness, and the speaker's intention to deceive the hearer. Coleman and Kay constructed eight little stories, one exhibiting all three of the features, the others exemplifying either two or only one of the features. Subjects were asked to evaluate the stories in terms of how good an example they were of lying. Predictably, the story with all three features was considered the best example, those with only one feature the poorest examples. Coleman and Kay were also able to show that the three features were differentially weighted with respect to category membership. The speaker's belief that the statement is factually incorrect was the most important; factual incorrectness was the least important.

Sweetser (1987) returned to Coleman and Kay's data and argued that lying should be understood against an ICM of verbal communication. According to the ICM, people communicate in good faith, they state only that for which they have evidence, and, if they have evidence, they are justified in believing that their statements are true. In addition, the imparting of true information is deemed to be beneficial to hearers, and speakers strive to benefit hearers by providing true information. In terms of the ICM, lying can be defined, quite simply, as the making of a statement which is not true. Moreover, making a statement which is not true can only be done with the intention of harming the hearer. Once again, however, there are many circumstances in which the ICM does not apply, and in these cases we may be less confident to speak of lying. The ICM does not apply when language is being used to entertain, as when telling stories or making jokes.
(No one, presumably, would accuse a joke-teller of lying on the grounds that the events described never happened.) It does not apply when the main purpose of linguistic activity is to establish and maintain social relations. Telling your host that you have had a delightful evening, when you haven't, would not normally be considered a lie, certainly not a prototypical one. The ICM also ignores cases where speakers might be genuinely ignorant of the facts, where they are simplifying information for pedagogical reasons, where 'the truth' might be distressing to the hearer, or where information is considered to be
confidential and not at all public property. The status of information as public or private property can be expected to vary according to circumstances and cultural conventions.
5. Prototypes and categories

There are several ways of understanding the notion of prototype and of the relation between a prototype and a category (Taylor 2008). Some of these have been hinted at in the preceding discussion. In this section we examine them in more detail.
5.1. Categories are defined with respect to a 'best example'

On this approach, a category is understood and mentally represented simply in terms of a good example. One understands 'red' in terms of a mental image of a good red, other hues being assimilated to the category in virtue of their similarity to the prototype. Some of Rosch's statements may be taken in support of this view. For example, Heider (1971: 455) surmises that "much actual learning of semantic reference, particularly in perceptual domains, may occur through generalization from focal exemplars". Elsewhere she writes of "conceiving of each category in terms of its clear cases rather than its boundaries" (Rosch 1978: 35–36).

An immediate problem arises with this approach. Any colour can be said to be similar to red in some respect (if only in virtue of its being a colour) and is therefore eligible to be described as red 'to some degree'. In order to avoid this manifestly false prediction we might suppose that the outer limits of category membership will be set by the existence of neighbouring, contrasting categories. As a colour becomes more distant from focal red and approaches focal orange, there comes a point at which it will no longer be possible to categorize it as red, not even to a small degree. The colour is, quite simply, not red. Observe that this account presupposes a structuralist view of lexical semantics, whereby word meanings divide up conceptual space in a mosaic-like manner, such that the denotational range of one term is restricted by the presence of neighbouring terms (Lyons 1977: 260). It predicts (correctly in the case of colours, or at least, basic level colours) that membership will be graded, in that an entity may be judged to be a member of a category only to a certain degree, depending on its distance from the prototype. The category, as a consequence, will have fuzzy boundaries, and degree of membership in one category will inversely correlate with degree of membership in a neighbouring category. The 'redder' a shade of orange, the less it is orange and the more it is red.

There are a small number of categories for which the above account may well be valid, including the household receptacles studied by Labov: as a vessel morphs from a prototypical cup into a prototypical bowl, categorization as cup gradually decreases, offset by increased categorization as bowl. The account may also be valid for scalar concepts such as hot, warm, cool, and cold, where the four terms exhaustively divide up the temperature dimension.

But for a good many categories it will not be possible to maintain that they are understood simply in terms of a prototype, with their boundaries set by neighbouring terms. In the first place, the mosaic metaphor of word meanings may not apply. This is the case with near synonyms, that is, words which arguably have distinct prototypes, but whose usage ranges overlap and which are not obviously contrastive. Take the pair high and tall (Taylor 2003b). Tall applies prototypically to humans (tall man), high to inanimates (high
mountain). Yet the words do not mutually circumscribe each other at their boundaries. Many entities can be described equally well as tall or high; use of one term does not exclude use of the other. It would be bizarre to say of a mountain that it is high but not tall, or vice versa.

The approach is also problematic in the case of categories which cannot be reduced to values on a continuously varying dimension or on a set of such dimensions. Consider natural kind terms such as bird, mammal, and reptile, or gold, silver, and platinum. Natural kinds are presumed to have a characteristic 'essence', be it genetic, molecular, or whatever. (This said, the category of natural kind terms may not be clear-cut; see Keil 1989. As a rule of thumb, we can say that natural kinds are the kinds of things which scientists study. We can imagine scientists studying the nature of platinum, but not the nature of furniture.) While natural kind categories may well show goodness-of-example effects, they tend to have very precise boundaries. Birds, as we know them, do not morph gradually into mammals (egg-laying monotremes like the platypus notwithstanding), nor can we conceive of a metal which is half-way between gold and silver. And, indeed, it would be absurd to claim that knowledge of the bird prototype (e.g. a small songbird, such as a robin) is all there is to the bird concept, to claim, in other words, that the meaning of bird is 'robin', and that creatures are called birds simply on the basis of their similarity to the prototype. While a duck may be similar to a robin in many respects, we cannot appeal to the similarity as evidence that ducks should be called robins.

In the case of categories like 'bird', the prototype is clearly insufficient as a category representation. We need to know what kinds of things are likely to be members of the category, how far we can generalize from the prototype, and where (if only approximately) the boundaries lie. We need an understanding of the category which somehow encompasses all its members.
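The structuralist picture sketched for colours can be given a minimal formalization. In the sketch below, a one-dimensional hue scale is assumed and the focal values are invented for illustration: degree of membership falls off with distance from the prototype and is bounded by the neighbouring category.

# Graded membership on a single hue dimension, bounded by a neighbouring
# category: degree of 'redness' decreases with distance from focal red
# and is offset by closeness to focal orange. Focal values are invented.
FOCAL = {"red": 0.0, "orange": 30.0}  # positions on a hypothetical hue scale

def membership(hue):
    # For a hue between the two foci, the two degrees of membership are
    # inversely correlated and sum to 1, as the mosaic view predicts.
    d_red = abs(hue - FOCAL["red"])
    d_orange = abs(hue - FOCAL["orange"])
    return {"red": d_orange / (d_red + d_orange),
            "orange": d_red / (d_red + d_orange)}

print(membership(5.0))   # near focal red: red to a high degree
print(membership(15.0))  # midway: a genuinely borderline colour

The 'redder' a hue, the less orange it is, exactly as in the text; and the sketch breaks down for categories like 'bird', which is the point made in the preceding paragraph.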
5.2. The prototype as a set of weighted attributes

In subsequent work Rosch came to a more sophisticated understanding of prototype, proposing that

categories tend to become defined in terms of prototypes or prototypical instances that contain the attributes most representative of items inside and least representative of items outside the category. (Rosch 1978: 30; italics added)
A category now comes to be understood as a set of attributes which are differentially weighted according to their cue validity, that is, their importance in diagnosing category membership, and an entity belongs in the category if the cumulative weightings of its attributes achieve a certain threshold level. On this approach, category members need not share the same attributes, nor is an attribute necessarily shared by all category members. Rather, the category hangs together in virtue of a 'family resemblance' (Rosch & Mervis 1975), in which attributes 'criss-cross', like the threads of a rope (Wittgenstein 1978: 32). The more similar an instance is to all other category members (this being a measure of its family resemblance), the more prototypical it is of the category.

A major advantage of the weighted attribute view is that it makes possible a "summary representation" of a category, which, like the classical theory, "somehow encompass[es] an entire concept" (Murphy 2002: 49). As a matter of fact, a classical category would
turn out to be a limiting case, where each of the features has an equal and maximal weighting, and without the presence of each of the features the threshold value would not be attained.

The weighted attribute view raises the interesting possibility that the prototype may not correspond to any actual category member; it is more in the nature of an idealized abstraction. Confirmation comes from work with artificial categories (patterns of dots which deviate to varying degrees from a pre-established prototype), where subjects have been able to identify the prototype of a category they have learned, even though they had not been previously exposed to it (Posner & Keele 1968).
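The weighted attribute view is easily made precise. In the following sketch the attributes, weights, and threshold are all invented for illustration: membership is a matter of cumulative weighting, so that no single attribute is necessary, and the classical category falls out as the limiting case in which the threshold equals the sum of all the weights.

# Weighted-attribute membership: an entity is a category member if the
# summed weights of its attributes reach a threshold; its total score can
# be read as its degree of prototypicality. All figures are invented.
BIRD_WEIGHTS = {"has_feathers": 50, "can_fly": 30,
                "lays_eggs": 15, "sings": 5}
THRESHOLD = 60

def score(attributes):
    return sum(w for a, w in BIRD_WEIGHTS.items() if a in attributes)

def is_member(attributes):
    return score(attributes) >= THRESHOLD

robin = {"has_feathers", "can_fly", "lays_eggs", "sings"}
penguin = {"has_feathers", "lays_eggs"}  # flightless and songless

print(score(robin), is_member(robin))      # 100 True: prototypical
print(score(penguin), is_member(penguin))  # 65 True: a marginal member
print(score({"can_fly"}), is_member({"can_fly"}))  # 30 False: a bat, perhaps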
5.3. Categories as exemplars

A radical alternative to feature-based approaches construes a category simply as a collection of instances. Knowledge of a category consists in a memory store of encountered exemplars (Smith & Medin 1981). Categorization of a new instance occurs in virtue of similarities to one or more of the stored exemplars, a prototypical example being one which exhibits the highest degree of similarity with the greatest number of instances.

There are several variants of the exemplar view of categories. The exemplars might be individual instances encountered on specific occasions; especially for superordinate categories, on the other hand, the exemplars might be the basic categories which instantiate them (Storms, de Boeck & Ruts 2000). In its purest form, the exemplar theory denies that people make generalizations over category exemplars. Mixed representations might also be envisaged, however, whereby instances which closely resemble each other might coalesce into a generic image which preserves what is common to the instances and filters out the idiosyncratic details (Ross & Makin 1999).

On the face of it, the exemplar view, even in its mixed form, looks rather implausible. The idea that we retain specific memories of previously encountered instances would surely make intolerable demands on human memory. Several factors, however, suggest that we should not dismiss the exemplar theory out of hand, and indeed Storms, de Boeck & Ruts (2000) report that the exemplar theory outperforms the summary representation theory, at least with respect to membership in superordinate categories. First, computer simulations have shown that exemplar models are able to account for a surprising range of experimental findings on human categorization, including, importantly, prototype effects (Hintzman 1986). Second, there is evidence that human memory is indeed rich in episodic detail (Schacter 1987). Even such apparently irrelevant aspects of encountered language as the position of a piece of text on a page (Rothkopf 1971), or the voice with which a word is spoken (Goldinger 1996), may be retained over substantial periods of time. Moreover, humans are exquisitely sensitive to the frequency with which events, including linguistic events, have occurred (Ellis 2002). Bybee (2001) has argued that frequency should be recognized as a major determinant of linguistic performance, acceptability judgements, and language change.

A focus on exemplars would tie in with the trend towards usage-based models of grammar (Langacker 2000, Tomasello 2003). It is axiomatic, in a usage-based model, that linguistic knowledge is acquired on the basis of encounters with actual usage events. While generalizations may be made over encountered events, the particularities of the events need not thereby be erased from memory (Langacker 1987: 29). Indeed, it is now widely recognized that a great deal of linguistic knowledge must reside in rather particular facts
about a language, such as its phraseologies, idioms, and collocations (Moon 1998). Moreover, the frequency with which linguistic phenomena have been encountered would itself form part of linguistic knowledge and be a crucial factor in future performance (Bybee 2001, Hoey 2005).
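A toy exemplar model makes the idea concrete. In the sketch below, the stored feature sets and the choice of a Jaccard similarity measure are invented for illustration, and the model is vastly simpler than, say, Hintzman's (1986) simulations: a category is nothing but its stored instances, and a new item is assigned to whichever category it resembles most in sum.

# Minimal exemplar model: no summary representation is computed. A new
# item is categorized by its summed similarity to stored instances.
# Exemplars and the similarity measure are illustrative choices.
EXEMPLARS = {
    "cup":  [{"handle", "tapering", "saucer"},
             {"handle", "tapering"},
             {"handle"}],
    "bowl": [{"wide", "no_handle"},
             {"wide", "no_handle", "shallow"}],
}

def similarity(a, b):
    # Jaccard overlap between two feature sets.
    return len(a & b) / len(a | b)

def categorize(item):
    scores = {cat: sum(similarity(item, ex) for ex in exs)
              for cat, exs in EXEMPLARS.items()}
    return max(scores, key=scores.get)

print(categorize({"handle", "tapering"}))  # 'cup': resembles the stored cups
print(categorize({"wide", "shallow"}))     # 'bowl': resembles the stored bowls

On this picture, the prototype is simply the stored instance with the highest summed similarity to the rest, and frequency effects come for free: the more often an exemplar type has been stored, the more it sways categorization.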
5.4. Prototypes as category defaults

Another approach to prototypes and categorization is the view that prototypes constitute the default value of a category, activated in the absence of more specific information (cf. the notion of 'default inheritance' in Word Grammar: Hudson 1990). Thus, on hearing mention of birds, one would assume that the creatures in question possess the typical attributes of the category, for example, that they fly, perch on trees, and so on. Rosch (1977) showed that a statement involving birds tends to make sense if it is changed to one referring to a prototypical member of the category, such as robins, but becomes ludicrous if reference is changed to a non-prototypical member, such as turkeys. Imagine a person who muses I wish I were a bird. They would probably feel somewhat cheated if their wish were granted and they were miraculously transformed into a turkey (especially before Christmas or Thanksgiving!).

The prototypes as defaults approach would be compatible with each of the above-mentioned approaches. The default could be the best example, an instance which maximizes attribute weighting, or one which maximizes similarity to stored instances.

If prototypes are defaults, we should expect that attributes of the prototype will be overridden as more specific information becomes available. The notion of 'wooden spoon' evokes its own prototype, whose properties (for example, its size) override the specifications of the spoon prototype (Hampton 1987). Moreover, the default might vary according to context, background expectations, and the specific task in hand. If asked to take a Chinese perspective, American subjects select swan and peacock as typical exemplars of the bird category, whereas robin and eagle are taken as typical from an American perspective (Barsalou 1987: 106–107). This does not, of course, mean that Chinese subjects would rate swans and peacocks over robins and eagles, only that American subjects are able to construct a Chinese perspective, based on their stereotypical views of Chinese culture.
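The default-and-override mechanism can be pictured in a few lines. The sketch below is far cruder than default inheritance in Word Grammar, and its attribute names and values are invented for illustration: prototype attributes are filled in unless more specific information displaces them.

# Prototypes as defaults: prototype attributes are assumed in the absence
# of more specific information, and overridden when it arrives (compare
# 'wooden spoon' overriding the size of the spoon prototype). Values are
# illustrative.
BIRD_DEFAULTS = {"flies": True, "perches_in_trees": True, "size": "small"}

def construe(specific_info=None):
    # Start from the category defaults, then let specific knowledge win.
    instance = dict(BIRD_DEFAULTS)
    instance.update(specific_info or {})
    return instance

print(construe())  # bare mention of 'bird': the robin-like default
print(construe({"flies": False, "size": "large"}))  # a turkey overrides it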
6. Objections to prototypes

Although prototype effects are very well documented, their relevance to linguistic semantics is by no means without controversy. Some skeptical views are reviewed below.
6.1. Combining concepts: the problem of the pet fish

Osherson & Smith (1981) observed that complex expressions typically fail to inherit the prototypes of their constituents, a point taken up by Fodor in his sustained criticism of the role of prototypes in linguistic semantics (Fodor 1980, 1998; Fodor & Lepore 1996). We might consider a prototypical fish to be a herring and a prototypical pet to be a poodle. However, we do not arrive at an understanding of 'pet fish' by combining the prototypes of the constituents and imagining some sort of hybrid between a herring and a poodle. On the contrary, a pet fish is a fish (any kind of fish) which happens also to be a pet, a prototypical example being, perhaps, a goldfish. The prototypical fish and the prototypical
pet play no role in our understanding of 'pet fish'. Similarly, we may well have an image of a prototypical grandmother (say, as a kindly, frail old lady with grey hair), but the prototype plays no role in our understanding of the expressions my grandmother and grandmothers most of whose grandchildren are married to dentists (Fodor 1980: 197).

Fodor's criticism is based on the assumption that a category is to be represented solely by its prototype. As we have seen, there are other ways to understand categories and their prototypes. The cases mentioned above clearly need to make reference to 'summary representations' (Murphy 2002: 49) of the respective categories, e.g. in terms of a set of weighted features, not simply to a prototypical exemplar. And, as already noted, concept combination can result in the overriding of certain features and in the setting of particular values and weightings to the features (Hampton 1987, 1991), as in the example wooden spoon.
6.2. Core definitions and recognition procedures: the problem of odd numbers

Armstrong, Gleitman & Gleitman (1983) queried the linguistic significance of goodness-of-example ratings, not by challenging the empirical evidence for prototype effects, but by demonstrating the very ubiquity of these effects. Thus, they reported goodness-of-example ratings even for odd numbers, with subjects judging 3 to be a 'better' odd number than 91. 'Odd number' is a category which uncontroversially requires a classical definition, a definition, moreover, which the subjects in Armstrong et al.'s experiments were familiar with and fully endorsed. The existence of prototype effects cannot therefore be taken as evidence against the classical view of categories.

A first point to note in connection with Armstrong et al.'s seemingly very strange findings is that the presence of goodness-of-example ratings does not entail that a category will have fuzzy boundaries. The bird category is not fuzzy, even though some birds are more birdy than others. Even so, Armstrong et al.'s findings can be interpreted to mean that prototype effects might have to do primarily with the process of assigning an instance to a category, not with the mental representation of the category as such. We might therefore wish to distinguish between the 'core', or strictly linguistic, meaning of an expression, and the 'recognition procedures' on whose basis people make rapid decisions on category membership, as proposed by Osherson & Smith (1981). The recognition procedures would appeal to typical, easily observable properties, which may nevertheless not be defining of the category.

More generally, the distinction between core definitions and recognition procedures raises the possibility that prototype effects might simply be due to the imperfect fit between concepts and the things that we encounter in the world. Coseriu (2000) took this line in his spirited critique of prototype categories. Against this is the fact that in many cases it is the concept itself that is structured prototypically, a point argued by Taylor (1999) in his riposte to Coseriu.

The distinction between a core definition and recognition procedures may, however, have some force in the case of some natural kind categories. Natural kinds, such as water and gold, are presumed to have a defining essence. Most speakers act in ignorance of the defining essence and how they might access it; for this, they defer to the experts. In everyday usage they rely instead on what Putnam (1975) refers to as a stereotype – what the things look like, where they are found, and so on. However, the distinction between the real essence of a thing and its stereotype may not be applicable outside the domain
of natural kind terms. As mentioned earlier, Labov queried the idea that the set of things called cups might possess a defining essence, distinct from the recognition features which allow a person to categorize something as a cup. In this case, the stereotype turns out to be nothing other than the prototype.

Not to be forgotten also is the fact that speakers may operate with more than one understanding of a category. The case of 'adult' was already mentioned, where an 'expert' bureaucratic definition might co-exist with a looser, multi-dimensional, and inherently fuzzy understanding of what constitutes an adult.
6.3. Prototypes save: an excuse for lazy lexicographers?

Wierzbicka (1990) maintained that appeal to prototypes is simply an excuse for lazy semanticists to avoid having to formulate rigorous word definitions. Underlying Wierzbicka's position is the view that words are indeed amenable to definitions which are able to predict their full usage range. She offers sample definitions of the loci classici of the prototype literature, including 'game', 'lie', and 'bird'. However, as Geeraerts (1997: 13–16) has aptly remarked, Wierzbicka's definitions often sneak in prototype effects by the back door, as it were. For example, Wierzbicka (1990: 361–362) claims that ability to fly is part of the bird-concept, in spite of the fact that some birds are flightless. The discrepancy is expressed in terms of how a person would imagine a bird, namely, as a creature able to move in the air, with the proviso that 'some creatures of this kind cannot move in the air'. Far from discrediting prototype structure, Wierzbicka's definition simply incorporates it.
7. Words and the world

Rosch's work addressed the relation between words and the things in the world to which the words can refer. The relation can be studied from two perspectives (Taylor 2007). We can ask, for this word, what are the things which it can be used to refer to? This is the semasiological, or referring, perspective. Alternatively we can ask, for this thing, what are the words that we can use to refer to it? This is the onomasiological, or naming, perspective (Blank 2003). The two perspectives roughly correspond to the way in which dictionaries and thesauri are organized. A dictionary lists words and gives their meanings. A thesaurus lists concepts and gives words which can refer to them.

The two perspectives underlie much research in colour terminology. Consider, for example, the data elicitation techniques employed by MacLaury (1995). Three procedures were involved. The first requires subjects to name a series of colour chips presented in random sequence. This procedure elicits the basic colour terms of the language. Next, for each of the colour terms proffered on the naming task, subjects are asked to identify its focal reference on a colour chart. This procedure elicits the prototypes of the colour terms. Third, for each colour term, subjects map the term on the colour chart, indicating which colours could be named by the word. This procedure shows the referential range of the term.

MacLaury's research, therefore, combined an onomasiological perspective (from
word to world, i.e. "What can this word refer to?"). Importantly, the elicitation procedures make it possible to operationalize the notions of basic level term (the term preferentially used to describe a state of affairs), as well as the prototype (in the sense of focal reference). By including mapping data, it becomes possible also to identify various kinds of semantic relations between words, such as inclusion, synonymy, overlap (or partial synonymy), and contrast. The methodology also makes it possible to rigorously study between-language differences, as well as differences between speakers of the same language, and indeed, differences within a single speaker on different occasions.

The onomasiological and semasiological perspectives have been employed in several studies of semantic typology; these include Bowerman (1996) on spatial relations (see also article 107 (Landau) Space in semantics and cognition), Enfield, Majid & van Staden (2006) on body-part terms, and Majid et al. (2007) on verbs of cutting and breaking. Perhaps the most thorough application of the two perspectives outside the colour domain, however, is Geeraerts, Grondelaers & Bakema (1994), who studied terms for outer clothing garments as depicted, and named, in fashion magazines and mail-order catalogues. The data allowed the researchers to identify the features of the garments named by a particular clothing term. The prototype could then be characterized by a cluster of frequently co-occurring features. Conversely, the researchers were able to identify the terms which were most frequently used to refer to garments exhibiting a certain set of features. In this way, basic level terms could be identified.

One of the many findings of this study was that the basic level does not constitute a fixed and stable level in a taxonomy. For example, there are good reasons to regard 'trousers' as a basic level term, in contrast to 'skirt', 'shirt', 'jacket', and 'coat'. 'Jeans' would be a subcategory of trousers. Yet jeans, and jeans-like garments, are typically referred to as such, not as trousers. What is from one point of view a subordinate term has acquired something of basic level status.
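The two perspectives can be run as two queries over one and the same body of elicitation data. The sketch below is a toy illustration in which the garment labels, term names, and record format are all invented: the semasiological query returns a word's referential range, the onomasiological query returns a referent's preferred names, ranked by frequency in the manner of a basic level term.

# Semasiological vs. onomasiological queries over invented naming data.
# Each record pairs a depicted referent with the term a subject used.
from collections import Counter

NAMING_DATA = [
    ("denim_garment", "jeans"), ("denim_garment", "jeans"),
    ("denim_garment", "trousers"),
    ("wool_garment", "trousers"), ("wool_garment", "slacks"),
]

def semasiological(term):
    # From word to world: what can this word refer to?
    return {referent for referent, t in NAMING_DATA if t == term}

def onomasiological(referent):
    # From world to word: what is this thing preferentially called?
    return Counter(t for r, t in NAMING_DATA if r == referent)

print(semasiological("trousers"))        # both garment types fall in its range
print(onomasiological("denim_garment"))  # 'jeans' outranks 'trousers'

On these toy data, 'jeans' is the preferred name for the denim garment even though 'trousers' has the wider referential range, mirroring the finding of Geeraerts, Grondelaers & Bakema (1994) reported above.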
8. Prototypes and polysemy

The prototype concept was eagerly taken up by a number of linguists in the late 1980's and early 1990's (Lakoff 1982, 1987; Taylor 1989/2003a; Langacker 1987), especially for its relevance to lexical semantics and meaning change (Geeraerts 1997; see also article 100 (Geeraerts) Cognitive approaches to diachronic semantics). Since then, it has found applications in areas of linguistic description outside of semantics, including syntax, morphology, and phonology (Taylor 2002; 2008).

A particularly fruitful application has been in the study of lexical polysemy. The idea is that the different senses of a word are structured in much the same way as the different members of a category, namely in terms of a central, or prototypical, sense to which less central senses are related. The word over provides a parade example. Lakoff (1987), based on Brugman (1981), proposed that the basic sense of the preposition involves movement of a trajector (or figure) 'above and across' a landmark (or ground) entity, as in The plane flew over the city. Other senses introduce modifications of some feature or features of the prototypical sense. Thus, The plane flew over the hill requires a convex landmark. Sam walked over the hill is similar, except that the trajector (Sam) is in contact with the landmark. Sam climbed over the wall involves an up-down movement, from one side of the landmark to the other. Sam lives over the hill locates Sam at the end-point of a path which goes 'over the hill'. Other senses involve a covering relation. In I walked all
over the hill, the trajector traces a random path which 'covers' the hill. In The board is over the hole, the board completely obscures the hole. In this usage, the verticality of the trajector vis-à-vis the landmark is no longer obligatory: the board could be positioned vertically against the hole.

The examples give the flavour of what came to be known, for obvious reasons, as a radial category. The various senses radiate out from the central, prototypical sense, like spokes in a wheel. This approach to polysemy has proved extremely attractive to many researchers, not least because it lends itself to the visual display of central and derived senses. For a particularly well worked-out example, see Fillmore & Atkins' (2000) account of English crawl in comparison to French ramper. The approach has been seen as a convenient way to handle the fact that the various senses of a word may not share a common definitional core. Just as the various things we call 'furniture' may not exhibit a set of necessary and sufficient features, so also the various senses of a word may resist a definition in terms of an invariant semantic core.

The approach has also been taken up with respect to the semantics of constructions (Goldberg 1995, 2006). Take, for example, the ditransitive [V NP1 NP2] construction in English. Its presumed prototype, illustrated by give the dog a bone, involves the transfer of one entity, NP2, to another, NP1, such that NP1 ends up having NP2. But in throw the dog a bone there is only the intention that NP1 should have NP2; there is no entailment that NP1 does end up having NP2. More distant from the prototypical sense are examples such as deny someone access, where the intention is that NP2 should be withheld from NP1.

In applying the notion of a prototype category to cases of polysemy (whether lexical or constructional), we must be aware of the differences between the two phenomena. On the one hand, we can use the word fruit to refer, firstly, to apples and oranges, but also to olives. Although the word can refer to different kinds of things, the word presumably has a single sense and designates a single category of objects (albeit, a prototypically structured category). But when we use the word to refer to the outcome of a person's efforts, as in the fruit of my labours or The project bore fruit, we are using the word in a different sense. The outcome of a person's efforts cannot be regarded as just another marginal example of fruit, akin to an olive or a coconut. Rather, the metaphorical sense has to be regarded as an extension from the botanical sense. Even so, to speak of the two senses as forming a category, and to claim that one of the senses is the prototype, is to use the terms 'category' and 'prototype' also in an extended sense.

In the case of fruit, it is reasonably clear which of the senses is to be taken as basic and which are extensions therefrom. But in other cases a decision may not be so easy. As noted above, for Lakoff and Brugman the central sense of over was movement 'above and across' (The plane flew over the city). For Tyler & Evans (2001), on the other hand, the 'protoscene' of the preposition is exemplified by The bee is hovering over the flower, which lacks the notion of movement 'across'. The question now arises, on what basis is the central sense identified as such?
Whereas Rosch substantiated the prototype notion by a variety of experimental techniques, linguists applying the prototype model to polysemous items appeal (implicitly or explicitly) to a variety of principles, which may sometimes be in conflict. One is descriptive elegance, whereby the prototype is identified as that sense to which the others can most reasonably, or most economically, be related. However, as the example of over demonstrates,
different linguists are liable to come up with different proposals as to what the central sense is. Another principle appeals to the organic growth of the polysemous category, with a historically older sense being taken as more central than senses which have developed later. Relevant here are certain assumptions concerning metaphorical extension (see article 26 (Tyler & Takahashi) Metaphors and metonymies). Thus, Lakoff (1987: 416–417) claims that the spatial sense of long (as in a long stick) is 'more central' than the temporal sense (a long time), on the basis of what is supposed to be a very general conceptual metaphor which maps spatial notions onto non-spatial domains.
A controversial question concerns the psychological reality of radial categories. Experimental evidence, such as it is, suggests that radial categories might actually have very little psychological reality for speakers of the language (Sandra & Rice 1995). One might, for example, suppose that the radial structure represents the outcome of the acquisition process. Data from acquisition studies, however, do not always corroborate the radial analysis. Amongst the earliest uses of over which are acquired by children are uses such as fall over, over here, and all over (i.e. 'finished') (Hallan 2001). These would probably be regarded as marginal senses on just about any radial analysis. It is also legitimate to ask what it would mean, in terms of a speaker's linguistic knowledge, for a particular sense of a word to be 'marginal' or 'non-prototypical'. Both the temporal and the spatial uses of long are frequent, and both have to be mastered by any competent speaker of the language.
In the case of a prototype category (as studied by Rosch) we are dealing with a single sense (with its prototype structure) of a word. In the case of a radial category (as proposed by Lakoff) we are dealing with several senses (each of which will also no doubt have a prototype structure). The distinction is based on whether we are dealing with a single sense of a word or multiple (related) senses. However, the allocation of the various uses of a word to a single sense or to two different senses can be fraught with difficulty, and various tests for diagnosing the matter can give conflicting results (Geeraerts 1993, Tuggy 1993). For example, do paint a portrait, paint the kitchen, and paint white stripes on the road exemplify a single sense of paint, or two (or perhaps even three) closely related senses? There are arguments for each of these positions. Recently, a number of scholars have queried whether it is legitimate in principle to try to identify the senses of a word (Allwood 2003, Zlatev 2003).
Perhaps the most reasonable conclusion to draw from the above is that knowing a word involves learning a set (possibly, a very large and open-ended set) of established uses and usage patterns (Taylor 2006). Such an account would be reminiscent of the exemplar theory of categorization, in that a speaker retains memories, not of category members, but of word uses. Whether, or how, a speaker of the language perceives these uses to be related may not have all that much bearing on the speaker's proficiency in the language. The notion of prototype in the Roschean sense might not therefore be all that relevant. The notion of prototype, and extensions therefrom, might, however, be important in the case of novel, or creative uses. In this connection, Langacker (1987: 381) speaks of 'local' prototypes.
Langacker (1987: 57) characterizes a language as an inventory of conventionalized symbolic resources. Often, the conceptualization that a speaker wishes to symbolize on a particular occasion will not correspond exactly with any of the available resources. Inevitably, some extension of an existing resource will be indicated. The existing resource constitutes the local prototype and the actual usage is an extension from it. If the extension is used on future occasions, it may become entrenched and will
itself acquire the status of an established unit in the language and become available as a local prototype for further extensions.
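To make the shape of such an analysis concrete, the following minimal sketch (in Python) represents a radial category as a graph of senses, each non-central sense recording the sense it extends and the feature it modifies. The sense labels and modification glosses are illustrative assumptions, not the actual inventories of Lakoff (1987) or Brugman (1981).

```python
# A minimal sketch of a radial category: senses radiate out from a central,
# prototypical sense, each extension recording what it modifies. The sense
# labels below are illustrative, not an actual analysis of 'over'.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Sense:
    label: str                          # e.g. "above-across"
    example: str                        # an illustrative sentence
    extends: Optional["Sense"] = None   # None for the central sense
    modification: Optional[str] = None  # feature modified relative to parent

def chain_to_prototype(sense: Sense) -> list:
    """Follow extension links from a sense back to the central sense."""
    path = [sense]
    while path[-1].extends is not None:
        path.append(path[-1].extends)
    return path

central = Sense("above-across", "The plane flew over the city.")
contact = Sense("above-across + contact", "Sam walked over the hill.",
                extends=central, modification="trajector contacts landmark")
covering = Sense("covering", "I walked all over the hill.",
                 extends=contact, modification="path 'covers' the landmark")

for s in chain_to_prototype(covering):
    print(s.label)   # covering -> above-across + contact -> above-across
```

On such a representation, the disputes discussed above (which sense is central, and whether speakers actually store the links) are disputes about the root node and the edges, respectively.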
9. References
Allwood, Jens 2003. Meaning potential and context. Some consequences for the analysis of variation in meaning. In: H. Cuyckens, R. Dirven & J. Taylor (eds.). Cognitive Approaches to Lexical Semantics. Berlin: Mouton de Gruyter, 29–65.
Armstrong, Sharon L., Lila R. Gleitman & Henry Gleitman 1983. What some concepts might not be. Cognition 13, 263–308.
Barsalou, Lawrence 1983. Ad hoc categories. Memory & Cognition 11, 211–227.
Barsalou, Lawrence 1987. The instability of graded structure. Implications for the nature of concepts. In: U. Neisser (ed.). Concepts and Conceptual Development. Ecological and Intellectual Factors in Categorization. Cambridge: Cambridge University Press, 101–140.
Barsalou, Lawrence 1991. Deriving categories to achieve goals. In: G. H. Bower (ed.). The Psychology of Learning and Motivation, vol. 27. New York: Academic Press, 1–64.
Berlin, Brent & Paul Kay 1969. Basic Color Terms. Their Universality and Evolution. Berkeley, CA: University of California Press.
Blank, Andreas 2003. Words and concepts in time. Towards diachronic cognitive onomasiology. In: R. Eckardt, K. von Heusinger & Ch. Schwarze (eds.). Words in Time. Diachronic Semantics from Different Points of View. Berlin: Mouton de Gruyter, 37–65.
Bowerman, Melissa 1996. Learning how to structure space for language. A crosslinguistic perspective. In: P. Bloom et al. (eds.). Language and Space. Cambridge, MA: The MIT Press, 385–436.
Brown, Roger 1958. How shall a thing be called? Psychological Review 65, 14–21. Reprinted in: R. C. Oldfield & J. C. Marshall (eds.). Language. Selected Readings. Harmondsworth: Penguin, 1968, 81–91.
Brugman, Claudia 1981. The Story of 'Over'. MA thesis. University of California, Berkeley, CA.
Bybee, Joan L. 2001. Phonology and Language Use. Cambridge: Cambridge University Press.
Coleman, Linda & Paul Kay 1981. Prototype semantics. The English word 'lie'. Language 57, 26–44.
Coseriu, Eugenio 2002. Structural semantics and 'cognitive' semantics. Logos and Language 1, 19–42.
Dirven, René & John R. Taylor 1988. The conceptualization of vertical space in English. The case of tall. In: B. Rudzka-Ostyn (ed.). Topics in Cognitive Linguistics. Amsterdam: Benjamins, 379–402.
Ellis, Nick 2002. Frequency effects in language processing. A review with implications for theories of implicit and explicit language acquisition. Studies in Second Language Acquisition 24, 143–188.
Enfield, Nick, Asifa Majid & Miriam van Staden 2006. Parts of the body. Cross-linguistic categorisation. Language Sciences 28, special issue, 137–147.
Fehr, Beverly & James A. Russell 1984. Concept of emotion viewed from a prototype perspective. Journal of Experimental Psychology: General 113, 464–486.
Fillmore, Charles 1982. Towards a descriptive framework for spatial deixis. In: R. J. Jarvella & W. Klein (eds.). Speech, Place, and Action. Studies in Deixis and Related Topics. Chichester: Wiley, 31–59.
Fillmore, Charles & Beryl Atkins 2000. Describing polysemy. The case of 'crawl'. In: Y. Ravin & C. Leacock (eds.). Polysemy. Theoretical and Computational Approaches. Oxford: Oxford University Press, 91–110.
Fodor, Jerry 1980. The present status of the innateness controversy. In: J. Fodor. Representations. Philosophical Essays on the Foundations of Cognitive Science. Cambridge, MA: The MIT Press, 257–316.
Fodor, Jerry 1998. Concepts. Where Cognitive Science Went Wrong. Oxford: Oxford University Press.
Fodor, Jerry & Ernest Lepore 1996. The red herring and the pet fish. Why concepts still can't be prototypes. Cognition 58, 253–270.
Geeraerts, Dirk 1988. Where does prototypicality come from? In: B. Rudzka-Ostyn (ed.). Topics in Cognitive Linguistics. Amsterdam: Benjamins, 207–229.
Geeraerts, Dirk 1989. Prospects and problems of prototype theory. Linguistics 27, 587–612.
Geeraerts, Dirk 1993. Vagueness's puzzles, polysemy's vagaries. Cognitive Linguistics 4, 223–272.
Geeraerts, Dirk 1997. Diachronic Prototype Semantics. A Contribution to Historical Lexicology. Oxford: Oxford University Press.
Geeraerts, Dirk, Stefan Grondelaers & Peter Bakema 1994. The Structure of Lexical Variation. Meaning, Naming, and Context. Berlin: Mouton de Gruyter.
Gleason, Henry A. 1955. An Introduction to Descriptive Linguistics. New York: Holt, Rinehart & Winston.
Goldberg, Adele 1995. Constructions. A Construction Grammar Approach to Argument Structure. Chicago, IL: The University of Chicago Press.
Goldberg, Adele 2006. Constructions at Work. The Nature of Generalization in Language. Oxford: Oxford University Press.
Goldinger, Stephen D. 1996. Words and voices. Episodic traces in spoken word identification and recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition 22, 1166–1183.
Hallan, Naomi 2001. Paths to prepositions? A corpus-based study of the acquisition of a lexicogrammatical category. In: J. Bybee & P. Hopper (eds.). Frequency and the Emergence of Linguistic Structure. Amsterdam: Benjamins, 91–120.
Hampton, James 1987. Inheritance of attributes in natural concept conjunctions. Memory & Cognition 15, 55–71.
Hampton, James 1991. The combination of prototype concepts. In: P. Schwanenflugel (ed.). The Psychology of Word Meanings. Hillsdale, NJ: Erlbaum, 91–116.
Heider, Eleanor 1971. 'Focal' color areas and the development of color names. Developmental Psychology 4, 447–455.
Heider, Eleanor 1972. Universals in color naming and memory. Journal of Experimental Psychology 93, 10–20.
Hintzman, Douglas 1986. 'Schema abstraction' in a multiple-trace memory model. Psychological Review 93, 411–428.
Hoey, Michael 2005. Lexical Priming. London: Routledge.
Hudson, Richard 1990. English Word Grammar. Oxford: Blackwell.
Keil, Frank C. 1989. Concepts, Kinds, and Cognitive Development. Cambridge, MA: The MIT Press.
Kleiber, Georges 1990. La sémantique du prototype. Catégories et sens lexical. Paris: PUF.
Labov, William 1973. The boundaries of words and their meanings. In: C.-J. Bailey & R. W. Shuy (eds.). New Ways of Analyzing Variation in English. Washington, DC: Georgetown University Press, 340–373. Reprinted in: B. Aarts et al. (eds.). Fuzzy Grammar. A Reader. Oxford: Oxford University Press, 2004, 67–89.
Lakoff, George 1972. Hedges. A study in meaning criteria and the logic of fuzzy concepts. In: P. M. Peranteau, J. N. Levi & G. C. Phares (eds.). Papers from the Eighth Regional Meeting of the Chicago Linguistic Society (= CLS). Chicago, IL: Chicago Linguistic Society, 183–228.
Lakoff, George 1982. Categories. An essay in cognitive linguistics. In: The Linguistic Society of Korea (ed.). Linguistics in the Morning Calm. Selected Papers from SICOL-1981. Seoul: Hanshin, 139–193.
Lakoff, George 1987. Women, Fire, and Dangerous Things. What Categories Reveal about the Mind. Chicago, IL: The University of Chicago Press.
Langacker, Ronald 1987. Foundations of Cognitive Grammar, vol. 1. Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Langacker, Ronald 2000. A dynamic usage-based model. In: M. Barlow & S. Kemmer (eds.). Usage-Based Models of Language. Stanford, CA: CSLI Publications, 1–63.
Lewandowska-Tomaszczyk, Barbara 2007. Polysemy, prototypes, and radial categories. In: D. Geeraerts & H. Cuyckens (eds.). The Oxford Handbook of Cognitive Linguistics. Oxford: Oxford University Press, 139–169.
Lyons, John 1977. Semantics. Cambridge: Cambridge University Press.
MacLaury, Robert 1991. Prototypes revisited. Annual Review of Anthropology 20, 55–74.
MacLaury, Robert 1995. Vantage theory. In: J. Taylor & R. MacLaury (eds.). Language and the Cognitive Construal of the World. Berlin: Mouton de Gruyter, 231–276.
Majid, Asifa, Melissa Bowerman, Miriam van Staden & James S. Boster 2007. The semantic categories of cutting and breaking events. A crosslinguistic perspective. Cognitive Linguistics 18, special issue, 133–152.
Moon, Rosamund 1998. Fixed Expressions and Idioms in English. A Corpus-Based Approach. Oxford: Clarendon Press.
Murphy, Gregory 2002. The Big Book of Concepts. Cambridge, MA: The MIT Press.
Murphy, Gregory & Douglas Medin 1985. The role of theories in conceptual coherence. Psychological Review 92, 289–316.
Osherson, Daniel & Edward E. Smith 1981. On the adequacy of prototype theory as a theory of concepts. Cognition 9, 35–58.
Posner, Michael & Steven Keele 1968. On the genesis of abstract ideas. Journal of Experimental Psychology 77, 353–363.
Pulman, Stephen G. 1983. Word Meaning and Belief. London: Croom Helm.
Putnam, Hilary 1975. Philosophical Papers, vol. 2. Mind, Language and Reality. Cambridge: Cambridge University Press.
Rosch, Eleanor 1975. Cognitive representations of semantic categories. Journal of Experimental Psychology: General 104, 192–233.
Rosch, Eleanor 1977. Human categorization. In: N. Warren (ed.). Studies in Cross-Cultural Psychology, vol. 1. London: Academic Press, 3–49.
Rosch, Eleanor 1978. Principles of categorization. In: E. Rosch & B. Lloyd (eds.). Cognition and Categorization. Hillsdale, NJ: Erlbaum, 27–48. Reprinted in: B. Aarts et al. (eds.). Fuzzy Grammar. A Reader. Oxford: Oxford University Press, 2004, 91–108.
Rosch, Eleanor & Carolyn B. Mervis 1975. Family resemblances. Studies in the internal structure of categories. Cognitive Psychology 7, 573–605.
Rosch, Eleanor, Carolyn Mervis, Wayne Gray, David Johnson & Penny Boyes-Braem 1976. Basic objects in natural categories. Cognitive Psychology 8, 382–439.
Ross, Brian H. & Valerie S. Makin 1999. Prototype versus exemplar models. In: R. J. Sternberg (ed.). The Nature of Cognition. Cambridge, MA: The MIT Press, 205–241.
Rothkopf, Ernst Z. 1971. Incidental memory for location of information in text. Journal of Verbal Learning and Verbal Behavior 10, 608–613.
Sandra, Dominiek & Sally Rice 1995. Network analysis of prepositional meaning. Mirroring whose mind – the linguist's or the language user's? Cognitive Linguistics 6, 89–130.
Schacter, Daniel 1987. Implicit memory. History and current status. Journal of Experimental Psychology: Learning, Memory, and Cognition 13, 501–518.
Smith, Edward E. & Douglas L. Medin 1981. Categories and Concepts. Cambridge, MA: Harvard University Press.
Storms, Gert, Paul de Boeck & Wim Ruts 2000. Prototype and exemplar-based information in natural language categories. Journal of Memory and Language 42, 51–73.
Sweetser, Eve 1987. The definition of lie. An examination of the folk models underlying a semantic prototype. In: D. Holland & N. Quinn (eds.). Cultural Models in Language and Thought. Cambridge: Cambridge University Press, 43–66.
Taylor, John R. 1999. Cognitive semantics and structural semantics. In: A. Blank & P. Koch (eds.). Historical Semantics and Cognition. Berlin: Mouton de Gruyter, 17–48.
Taylor, John R. 2003a. Linguistic Categorization. 3rd edn. Oxford: Oxford University Press. 1st edn. 1989.
Taylor, John R. 2003b. Near synonyms as co-extensive categories. 'High' and 'tall' revisited. Language Sciences 25, 263–284.
Taylor, John R. 2006. Polysemy and the lexicon. In: G. Kristiansen et al. (eds.). Cognitive Linguistics. Current Applications and Future Perspectives. Berlin: Mouton de Gruyter, 51–80.
Taylor, John R. 2007. Semantic categories of cutting and breaking. Some final thoughts. Cognitive Linguistics 18, 331–337.
Taylor, John R. 2008. Prototypes in cognitive linguistics. In: P. Robinson & N. Ellis (eds.). Handbook of Cognitive Linguistics and Second Language Acquisition. New York: Routledge, 39–65.
Tomasello, Michael 2003. Constructing a Language. A Usage-Based Theory of Language Acquisition. Cambridge, MA: Harvard University Press.
Tuggy, David 1993. Ambiguity, polysemy, and vagueness. Cognitive Linguistics 4, 273–290.
Tyler, Andrea & Vyvyan Evans 2001. Reconsidering prepositional polysemy networks. The case of over. Language 77, 724–765.
Violi, Patrizia 1997. Significato ed esperienza. Milan: Bompiani. English translation: P. Violi. Meaning and Experience. Bloomington, IN: Indiana University Press, 2001.
Wierzbicka, Anna 1990. 'Prototypes save'. On the uses and abuses of the notion of 'prototype' in linguistics and related fields. In: S. Tsohatzidis (ed.). Meanings and Prototypes. Studies in Linguistic Categorization. London: Routledge, 347–367. Reprinted in: B. Aarts et al. (eds.). Fuzzy Grammar. A Reader. Oxford: Oxford University Press, 2004, 461–478.
Wierzbicka, Anna 1996. Semantics. Primes and Universals. Oxford: Oxford University Press.
Wittgenstein, Ludwig 1978. Philosophical Investigations. Translated by G. E. M. Anscombe. Oxford: Blackwell.
Zlatev, Jordan 2003. Polysemy or generality? Mu. In: H. Cuyckens, R. Dirven & J. Taylor (eds.). Cognitive Approaches to Lexical Semantics. Berlin: Mouton de Gruyter, 447–494.
John R. Taylor, Dunedin (New Zealand)
29. Frame Semantics
1. Introduction
2. Fillmorean frames
3. Related conceptions
4. Events, profiling, and perspectivalization
5. Lexicography
6. Discourse understanding
7. Conclusion
8. References
Abstract
Frames are conceptual structures that provide context for elements of interpretation; their primary role in an account of text understanding is to explain how our text interpretations can leap far beyond what the text literally says. The present article explores the role of
frames in providing a principled account of the openness and richness of word-meanings, distinguishing a frame-based account from classical approaches, such as accounts based on conceptual primitives, lexical fields, and connotation, and showing how they can play a role in the account of how word meaning interacts with syntactic valence.

For there exists a great chasm between those, on the one side, who relate everything to a single central vision, one system more or less coherent or articulate, in terms of which they understand, think and feel – a single, universal, organizing principle in terms of which alone all that they are and say has significance – and, on the other side, those who pursue many ends, often unrelated and even contradictory, connected, if at all, only in some de facto way, for some psychological or physiological cause, related by no moral or aesthetic principle.
Berlin (1957: 1), cited by Minsky (1975)
1. Introduction
Two properties of word meanings contribute mightily to the difficulty of providing a systematic account. One is the openness of word meanings. The variety of word meanings is the variety of human experience. Consider defining words such as Tuesday, barber, alimony, seminal, amputate, and brittle. One needs to make reference to diverse practices, processes, and objects in the social and physical world: repeatable calendar events, grooming and hair, marriage and divorce, discourse about concepts and theories, and events of breaking. Faced with this seemingly endless diversity, semanticists have in the past stopped short, excluding it from the semantic enterprise, and attempting to draw a line between a small linguistically significant set of primitive concepts and the openness of the lexicon.
The other problem is the closely related problem of the richness of word meanings. Words are hard to define, not so much because they invoke fine content-specific distinctions, but because they invoke vast amounts of background information. The concept of buying presupposes the complex social fact of a commercial transaction. The concept of alimony presupposes the complex social fact of divorce, which in turn presupposes the complex social fact of marriage. Richness, too, has inspired semanticists simply to stop, to draw a line, saying exact definitions of concepts do not matter for theoretical purposes.
This boundary-drawing strategy, providing a response if not an answer to the problems of richness and openness, deserves some comment. As linguistic semanticists, the story goes, our job is to account for systematic, structurally significant properties of meaning. This includes:
(1) a. the kinds of syntactic constructions lexical meanings are compatible with:
       i. the kinds of participants that become subjects and objects
       ii. regular semantic patterns of oblique markings and valence alternations
    b. regular patterns of inference licensed by category, syntactic construction, or closed-class lexical item
The idea is to carve off that part of semantics necessary for knowing and using the syntactic patterns of the language. To do this sort of work, we do not need to pay attention to every conceptually possible distinction. Instead we need a small set of semantic
primitives that make the distinctions that linguistically matter; what is left over can be dealt with using some open class of predicates or features whose internal details are not of concern. Jackendoff (1990) is a good example of this kind of approach. The generative semantics program, especially as outlined in Lakoff (1972), is another. Dowty (1979) has many of the same features, but in places expresses doubts that the program can be completely carried out. The kind of analysis I have in mind can be exemplified through Dowty's generative-semantics-like analysis of causatives like break.tr (transitive break):
(2) a. John broke the glass.
    b. DO(John, CAUSE(BECOME(broken(glass))))
Here the predicates in capitals (DO, CAUSE, BECOME) are from the inventory of linguistically significant primitives, and the lower-case predicates (broken, glass) are from the open class of predicates whose internal structure does not matter. At most we need to know that one expresses a state (broken) and the other a kind (glass). The details beyond that are linguistically insignificant. Of course there are differences in truth-conditions between states like broken and dead, but these have only minor selectional effects on the causative inchoatives created from them (break.tr = DO ... CAUSE BECOME broken' and kill = DO ... CAUSE BECOME dead'). I will refer to this view of lexical semantics as the classical view.
In this paper I wish to consider a view of semantics in general and lexical semantics in particular that is quite at odds with this classical picture: Frame Semantics (Fillmore 1975, 1977b, 1978, 1982, 1985). Someone wishing to contest the classical picture has two options: first, contend that the wrong kinds of questions are being asked; second, argue that the program as outlined is not very well-suited to attaining its goals. As we shall see, both kinds of objection motivate Frame Semantics.
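Before moving on, the classical format in (2) can be made concrete with a minimal sketch: a closed inventory of primitive operators wrapping open-class predicates whose internal structure the theory deliberately ignores. The encoding is an illustrative assumption, not Dowty's own formalism.

```python
# A sketch of classical decomposition: a closed inventory of primitives
# (DO, CAUSE, BECOME) wraps open-class predicates (broken, glass) whose
# internal content the theory leaves unanalyzed. Illustrative only.

from dataclasses import dataclass

@dataclass
class Term:
    op: str            # 'DO', 'CAUSE', 'BECOME', or an open-class predicate
    args: tuple = ()

    def __str__(self):
        if not self.args:
            return self.op
        return f"{self.op}({', '.join(str(a) for a in self.args)})"

PRIMITIVES = {"DO", "CAUSE", "BECOME"}   # the linguistically significant inventory

# (2b): John broke the glass.
broke = Term("DO", ("John",
             Term("CAUSE", (Term("BECOME", (Term("broken", (Term("glass"),)),)),))))

print(broke)   # DO(John, CAUSE(BECOME(broken(glass))))
```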
2. Fillmorean frames

2.1. Motivations
The version of Frame Semantics I will present here is largely the brainchild of Charles J. Fillmore. Although Frame Semantics has sprouted off in a number of directions and been applied to a number of problems, I will limit the present discussion in two ways: First I will confine myself largely to fleshing out the Fillmorean picture; second, I will confine myself mostly to questions of the lexicon, lexicography, and the lexicon-syntax interface, leaving for other work questions of discourse and text understanding to which frames are also relevant. I will briefly consider the different roles frames play in the account of sign meaning and discourse interpretation.
Although Fillmore has had many interesting things to say about the kinds of problems listed in (1) in early and late works on Case Grammar, the primary motivations given in Fillmore (1982, 1985) focus on Frame Semantics as a contribution to a theory of text understanding. Consider, for example, the very different scenes evoked by the following pair of sentences, discussed in Fillmore (1985):
(3) a. I can't wait to be on the ground again.
    b. I can't wait to be on land again.
Sentence (3a) evokes a speaker who is in the air (on a plane), sentence (3b) a speaker who is at sea (on a ship). This contrast is tied to some difference between the words land and ground, yet, on the face of it, land and ground denote very similar things. Fillmore would say land is understood within a conceptual frame of sea travel, and within that frame it is opposed to sea, while ground is understood within a conceptual frame of air travel, and within that frame, it is opposed to air. Thus we can explain something that is very difficult to explain in terms of what the words in the sentence denote by investigating the conceptual background against which the relevant word senses are defined. That conceptual background is what Fillmore calls a frame.
Frames are conceptual structures that provide context for elements of interpretation; their primary role in an account of text understanding is to explain how our text interpretations can (validly) leap far beyond what the text literally says. Frames can be introduced into interpretation in a variety of ways. They may be directly tied to word senses as in the example of land and ground, or they may be introduced by patterns among the facts the text establishes. To use another example of Fillmore's (1985: 232):
(4) We never open our presents until morning.
This sentence evokes the Christmas frame by describing a situation that matches salient facts of Christmas practice, even though no word in it is specific to Christmas. If in fact the Christmas frame is the right one, that evocation makes a significant contribution to the understanding of the surrounding text. Frames are motivated not just by words, then, but by stereotypes about customs, practices, institutions, and games.
Moreover, the kinds of cognitive structures Fillmore has in mind have been proposed by a variety of researchers for a variety of purposes. Fillmore has adopted the terminology of AI researcher Minsky (1975) in calling them frames, but schemata in psychology (Bartlett 1932, Rumelhart 1980) are getting at something very similar, as are scripts (Schank & Abelson 1977), cognitive models (Lakoff 1983), experiential gestalts (Lakoff & Johnson 1980), the base (as opposed to the profile) (Langacker 1984), and Fillmore's own notion of scene (Fillmore 1976, 1977a). More recently, in articulating a simulation view of conceptual processing, Barsalou (1992, 1999) has proposed that object conceptualization is processed through simulators of objects linked to components of a variety of situation memories; one consequence is that objects may activate components from different situations in different perceptual contexts. In this theory, too, then, conceptualization is framed against a background with components that help provide an interpretation for scenes or objects. For more discussion, see article 106 (Kelter & Kaup) Conceptual knowledge, categorization, and meaning.
As an approach to word meanings specifically, the starting point for Frame Semantics is that the lexical semantics "problems" of openness and richness are connected. Openness depends on richness. Openness does not mean lack of structure. In fact, it presupposes structure. Most concepts are interpretable or understandable or definable only against the background of other concepts. Many backgrounds are rich enough to define a cluster of concepts, in particular, a cluster of words. These backgrounds are the frames.
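As a first pass at making frame-relativized senses concrete, here is a minimal sketch loosely modeled on the land/ground contrast; the frame names, sense labels, and the opposed-to field are illustrative assumptions, not Fillmore's formalization.

```python
# A sketch of frame-relativized word senses: each sense records the frame
# against which it is defined and what it is opposed to within that frame.
# Frame names and fields are illustrative assumptions.

SENSES = {
    ("land", "dry-surface"):   {"frame": "sea_travel", "opposed_to": "sea"},
    ("ground", "dry-surface"): {"frame": "air_travel", "opposed_to": "air"},
}

def evoked_frame(word, sense):
    """Return the background frame a given word sense is defined against."""
    return SENSES[(word, sense)]["frame"]

# Near-synonymous denotations, but different evoked backgrounds:
print(evoked_frame("land", "dry-surface"))    # sea_travel -> speaker at sea
print(evoked_frame("ground", "dry-surface"))  # air_travel -> speaker in the air
```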
Thus, because words are networked together through their shared backgrounds, frames can provide an organizing principle for the openness of the lexicon. Consider one of the examples already mentioned, discussed in Fillmore (1982). The concept of alimony depends on the concept of divorce. The concept of divorce in turn
depends on the concept of marriage. The dependency is definitional. Unless you define what a marriage is, you can't define what a divorce is. Unless you define what a divorce is, you can't define what alimony is. Thus there is a very real sense in which the dependencies we are describing move us toward simpler concepts. Notice, however, that the dependency is leading in a different direction than an analysis that decomposes meanings into a small set of primitives like CAUSE and BECOME. Instead of leading to concepts of increasing generality and abstractness, we are being led to define the situations or circumstances which provide the necessary background for the concepts we are describing. The concepts of marriage and divorce are equally specific, but the institution of marriage provides the necessary background for the institution of divorce.
Or consider the complex subject of Tuesdays (Fillmore 1985). We live in a world of cyclic events. Seasons come and go and then return. This leads to a cyclic calendar which divides time up into repeating intervals, which are divided up further. Years are divided into months, which are divided into weeks, which are divided into days, which have cyclic names. Each week has a Sunday, a Monday, a Tuesday, and so on. Defining Tuesday entails defining the notion of a cyclic calendar. Knowing the word Tuesday may not entail knowing the word Sunday, but it does entail understanding at least the concept of a week and a day and their relation, and that each week has exactly one Tuesday.
We thus have words and background concepts. We will call the background concept the frame. Now the idea of a frame begins to have some lexical semantic bite with the observation that a single concept may provide the background for a set of words. Thus the concept of marriage provides the background for words/suffixes/phrases such as bride, groom, marriage, wedding, divorce, -in-law, elope, fiancee, best man, maid-of-honor, honeymoon, husband, and wife, as well as a variety of basic kinship terms omitted here for reasons of space. The concept of calendar cycle provides the frame for lexical items such as week, month, year, season, Sunday, ..., Saturday, January, ..., December, day, night, morning, and afternoon. Notice that a concept once defined may provide the background frame for further concepts. Thus, divorce itself provides the background frame for lexical items such as alimony, divorce, divorce court, divorce attorney, ex-husband, and ex-wife. In sum, a frame may organize a vocabulary domain:

Borrowing from the language of gestalt psychology we could say that the assumed background of knowledge and practices – the complex frame behind this vocabulary domain – stands as a common ground to the figure representable by any of the individual words. [Words belonging to a frame] are lexical representatives of some single coherent schematization of experience or knowledge. (Fillmore 1985: 223)
Now a premise of Frame Semantics is that the relation between lexical items and frames is open-ended. Thus one way in which the openness of the lexicon manifests itself is in building concepts in unpredictable ways against the backdrop of other concepts. The concept of marriage seems to be universal or near-universal in human culture. The concept of alimony is not. No doubt concepts sometimes pop into the lexicon along with their defining frames (perhaps satellite is an example), but the usual case is to build them up out of some existing frame (thus horseless carriage leading to car is the more usual model).
Summing up: openness does not mean structurelessness. Concepts and their related words have certain unidirectional backgrounding relations that frames capture. (5)
Words                                                                 Frames
bride, groom, marriage, wedding, divorce, -in-law, elope,             marriage
fiancee, best man, maid-of-honor, honeymoon, husband, wife
alimony, divorce court, divorce attorney, ex-husband, and ex-wife     divorce
week, month, year, Sunday, ..., Saturday, January, ..., December,    calendar cycle
morning, afternoon
freezing, cold, cool, tepid, lukewarm, warm, hot, temperature,        temperature
thermometer
All of this obviously points in exactly the opposite direction from the classical view, with its few salient primitives, its hard distinction between linguistic and encyclopedic knowledge, and its large uninvestigated class of open-class predicates. But from the other direction, support for the classical view has been eroding even among those whose concerns have primarily departed from the problems in (1), such as Levin (1993), or from classic lexical semantic problems like polysemy (Pustejovsky 1995). Consider the kind of problem Levin (1993) discusses in her seminal study of English verb classes. A theory that does not posit a systematic difference between the broken state of the verb break in (2) and the dead state in the decomposition of kill cannot account for the following contrast:
(6) a. John broke the glass against the wall.
    b. # John killed the cockroach against the wall.
Nor can it account for the fact that verbs in some sense close in meaning to break (shatter, smash, crack, flatten) will follow pattern (a), while verbs in some sense close to kill (strangle, murder, smother, and drown) will follow pattern (b). The generalization at issue is (roughly) that state-change or directed-action verbs whose effect is commonly achieved by moving one object against another will allow pattern (a) when the object whose state is changed or potentially changed is direct object. Other examples are hit, knock, rap, bang, and slam. None of the kill-type verbs fit the bill.
Thus if valence patterns are part of what is to be explained, then a language like English, with its rich inventory of prepositions and situationally specific constructions (see for example the pattern lists in Levin 1993), will require reference to a large inventory of concepts. It is difficult to see how a principled line between open-class and closed-class concepts can be drawn in carrying out this program. It is clear, for example, that Levin's verbs of contact, which include verbs like hit and slap discussed above, overlap significantly with the verbs listed for the impact frame in FrameNet, a large computational instantiation of the ideas of Frame Semantics (Fillmore & Atkins 1994; Baker, Fillmore & Lowe 1998; Fillmore & Atkins 1998; Baker & Fillmore 2001; Boas 2001, 2005; Chang, Narayanan & Petruck 2002a, 2002b). At last count, the NSF FrameNet project (Fillmore & Baker 2000), which is building a frame lexicon for English, had over 800 frames for about 4500 words. Thus the problems of openness and richness arise whether one starts from text understanding or from the syntax-semantics interface.
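The organizing role that frames play in such a lexicon can be pictured with a minimal sketch along the lines of the table in (5): frames index sets of lexical units, and one word may be listed under more than one frame. The tiny inventory below is an illustrative toy, not FrameNet's actual data or API.

```python
# A sketch of a frame-indexed lexicon in the spirit of (5) and FrameNet:
# each frame names a background concept and lists the lexical units
# defined against it. The inventory is a toy, not FrameNet data.

FRAME_LEXICON = {
    "marriage":       {"bride", "groom", "wedding", "divorce", "elope",
                       "honeymoon", "husband", "wife"},
    "divorce":        {"alimony", "divorce", "divorce court",
                       "divorce attorney", "ex-husband", "ex-wife"},
    "calendar_cycle": {"week", "month", "year", "Sunday", "Tuesday",
                       "morning", "afternoon"},
    "temperature":    {"freezing", "cold", "cool", "lukewarm", "warm",
                       "hot", "temperature"},
}

def frames_for(word):
    """Return every frame against which this word is defined."""
    return [f for f, units in FRAME_LEXICON.items() if word in units]

print(frames_for("divorce"))   # ['marriage', 'divorce'] -- one word, two frames
```

The last line illustrates the point made above: divorce is backgrounded by the marriage frame, and in turn serves as the background frame for a further cluster of words.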
2.2. Basic tools
We have thus far focused on the role of frames in a theory of word meanings. Note that nothing in particular hangs on the notion word. Frames may also have a conventional connection to a simple syntactic construction or idiom; give someone the slip probably belongs to the same frame as elude. Or they may be tied to more complex constructions such as the Comparative Correlative (cf. article 86 (Kay & Michaelis) Constructional meaning):
(7) The more I drink the better you look.
This construction has two "slots" requiring properties of quantity or degree. The same issues of background situation and profiled participants arise whether the linguistic exponent is a word or a construction. The term sign, used in exactly the same sense as it is used by construction grammarians, will serve here as well. As a theory of the conventional association of schematized situations and linguistic exponents, then, Frame Semantics makes the assumption that there is always some background knowledge relative to which linguistic elements do some profiling, and relative to which they are defined. Two ideas are central:
1. a background concept
2. a set of signs including all the words and constructions that utilize this conceptual background
Two other important frame-theoretic concepts are frame elements and profiling. Thus far in introducing frames I have emphasized what might be called the modularity of knowledge. Our knowledge of the world can usefully be divided up into concrete chunks. Equally important to the Fillmorean conception of frames is their integrating function. That is, frames provide us with the means to integrate with other frames in context to produce coherent wholes. For this function, the crucial concept is the notion of a frame element (Fillmore & Baker 2000). A frame element is simply a regular participant, feature, or attribute of the kind of situation described by a frame. Thus, frame elements of the wedding frame will include the husband, wife, wedding ceremony, wedding date, best man and maid of honor, for example. Frame elements need not be obligatory; one may have a wedding without a best man; but they need to be regular recurring features. Thus, frames have slots, replaceable elements. This means that frames can be linked to other frames by sharing participants or even by being participants in other frames. They can be components of an interpretation.
In Frame Semantics, all word meanings are relativized to frames. But different words select different aspects of the background to profile (we use the terminology in Langacker 1984). Sometimes the aspects profiled by different words are mutually exclusive parts of the circumstances, such as the husband and wife in the marriage frame, but sometimes word meanings differ not in what they profile, but in how they profile it. In such cases, I will say words differ in perspective (Fillmore 1977a). I will use Fillmore's much-discussed commercial event example (Fillmore 1976) to illustrate:
(8) a. John sold the book to Mary for $100.
    b. Mary bought the book from John for $100.
    c. Mary paid John $100 for the book.
Verbs like buy, sell, and pay have as background the concept of a commercial transaction, an event in which a buyer gives money to a seller in exchange for some goods. Now because the transaction is an exchange, it can be thought of as containing what Fillmore calls two subscenes: a goods_transfer, in which the goods are transferred from the seller to the buyer, and a money_transfer, in which the money is transferred from the buyer to the seller. Here it is natural to say that English has, as a valence realization option for transfers of possession, one in which the object being transferred from one possessor to another is realized as direct object. Thus verbs profiling the money transfer will make the money the direct object (pay and collect), and verbs profiling the goods transfer will make the goods the direct object (buy and sell). Then the difference between these verb pairs can be chalked up to what is profiled.
But what about the difference between buy and sell? By hypothesis, both verbs profile a goods transfer, but in one case the buyer is subject and in the other the seller is. Perhaps this is just an arbitrary choice. This is in some sense what the thematic role theory of Dowty (1991) says: since (8a) and (8b) are mutually entailing, there can be no semantic account of the choice of subject. In Frame Semantics, however, we may attempt to describe the facts as follows: in the case of buy the buyer is viewed as (perspectivalized as) agent, in the case of sell, the seller is. There are two advantages to this description. First, it allows us to preserve a principle assumed by a number of linguists, that cross-linguistically agents must be subjects. Second, it allows us to interpret certain adjuncts that enter into special relations with agents: instrumentals, benefactives, and purpose clauses.
(9) a. John bought the book from Mary with/for his last pay check. [Both with and for allow the reading on which the pay check provides the funds for the purchase.]
    b. Mary sold the book to John ?with/for his last paycheck. [Only for allows the reading on which the pay check provides the funds.]
    c. John bought the house from Sue for Mary. [Allows the reading on which Mary is ultimate owner; disallows the reading on which Mary is seller and Sue is seller's agent.]
    d. Sue sold the house to John for Mary. [Allows the reading on which Mary is seller and Sue is seller's agent; disallows the reading on which Mary is ultimate owner.]
    e. John bought the house from Sue to evade taxes/as a tax dodge. [Tax benefit is John's.]
    f. Sue sold the house to John to evade taxes/as a tax dodge. [Tax benefit is Sue's.]
But what does it mean to say that a verb takes a perspective which "views" a particular participant as an agent? The facts are, after all, that both the buyer and the seller are agents; they have all the entailment properties that characterize what we typically call agents; and this, Dowty's theory of thematic roles tells us, is why verbs like buy and sell can co-exist. I will have more to say on this point in section 4; for the moment I will confine myself to the following general observation on what Frame Semantics allows: What is profiled and what is left out is not determined by the entailment facts of its frame. Complex
variations are possible. For example, as Fillmore observes, the commercial transaction frame is associated with verbs that have no natural way of realizing the seller:
(10) John spent $100 on that book.
Nothing in the valence marking of the verb spend suggests that what is being profiled here is a possession transfer; neither the double object construction nor from nor to is possible for marking a core commercial transaction participant. Rather the pattern seems to be the one available for what one might call resource consumption verbs like waste, lose, use (up), and blow. In this profiling, there is no room for a seller. Given that such variation in what is profiled is allowed, the idea that the agenthood of a participant might be part of what's included or left out does not seem so far-fetched. As I will argue in section 4, the inclusion of events into the semantics can help us make semantic sense of what abstractions like this might mean.
These considerations argue that there can be more than one frame backgrounding a single word meaning; for example, the concepts of commercial event, possession transfer, and agentivity simultaneously define buy. A somewhat different but related issue is the issue of event structure. There is strong cross-linguistic evidence, at least in the form of productive word-formation processes, that some verbs – for example, causatives – represent complex events that can only be expressed through a combination of two frames with a very specific semantics. So it appears that a word meaning can simultaneously invoke a configuration of frames, with particulars of the configuration sometimes spelled out morphologically.
The idea that any word meaning exploits a background is of use in the account of polysemy. Different senses will in general involve relativization to different frames. As a very simple example, consider the use of spend in the following sentence:
(11) John spent 10 minutes fixing his watch.
How are we to describe the relationship of the use of spend in this example, which basically describes a watch-fixing event, to that in (10), which describes a commercial transaction? One way is to say that one sense involves the commercial transaction, and another involves a frame we might call action duration, which relates actions to their duration, a frame that would also be invoked by durative uses of for. A counterproposal is that there is one sense here, which involves an actor using up a resource. But such a proposal runs up against the problem that spend really has rather odd disjunctive selection restrictions:
(12) John spent 30 packs of cigarettes that afternoon.
Sentence (12) is odd except perhaps in a context (such as a prison or boarding school) where cigarette packs have become a fungible medium of exchange; what it cannot mean is that John simply used up the cigarettes (by smoking them, for example). The point is that a single general resource consumption meaning ought to freely allow resources other than time and money, so a single resource consumption sense does not correctly describe the readings available for (12); however, a sense invoking a commercial transaction frame constrained to very specific circumstances does. Note also that the fact
that 30 packs of cigarettes can be the money participant in the right context is naturally accommodated. The right constraint on the money participant is not that it be cash (for which Visa and Mastercard can be thankful), but that it be a fungible medium of exchange.
Summarizing:
1. Frames are motivated primarily by issues of understanding and converge with various schema-like conceptions advanced by cognitive psychologists, AI researchers, and cognitive linguists. They are experientially coherent backgrounds with variable components that allow us to organize families of concepts.
2. The concept of frames has far-reaching consequences when applied to lexical semantics, because a single frame can provide the organizing background for a set of words. Thus frames can provide an organizing principle for a rich open lexicon. FrameNet is an embodiment of these ideas.
3. In proposing an account of lexical semantics rich enough for a theory of understanding, Frame Semantics converges with other lexical semantic research which has been bringing to bear a richer set of concepts on problems of the syntax-semantics interface.
Having sketched the basic idea, I want in the next two sections to briefly contrast the notion frame with two other ideas that have played a major role in semantics: the idea of a relation, as incorporated via set theory and predicate logic into semantics, and the idea of a lexical field.
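Before turning to those comparisons, the division of labor just described, one circumstance frame with several profilings and perspectives, can be pictured in a minimal sketch. The role names and verb entries are illustrative assumptions in the spirit of the commercial transaction discussion, not an implementation of Fillmore's analysis or of FrameNet.

```python
# A sketch of the commercial transaction frame with per-verb profiling:
# each verb profiles a subscene and perspectivalizes one participant as
# agent (subject), with another realized as direct object. Illustrative only.

COMMERCIAL_TRANSACTION = {"buyer", "seller", "goods", "money"}  # frame elements

VERBS = {
    # verb:  (profiled subscene, subject role, object role)
    "buy":   ("goods_transfer",  "buyer",      "goods"),
    "sell":  ("goods_transfer",  "seller",     "goods"),
    "pay":   ("money_transfer",  "buyer",      "money"),
    "spend": ("resource_use",    "buyer",      "money"),  # no slot for the seller
}

# Sanity check: every verb links frame elements of the background frame.
for verb, (_, subj, obj) in VERBS.items():
    assert {subj, obj} <= COMMERCIAL_TRANSACTION

def realize(verb, **participants):
    """Map frame participants onto a crude subject-verb-object string."""
    subscene, subj_role, obj_role = VERBS[verb]
    return f"{participants[subj_role]} {verb}s {participants[obj_role]}"

roles = {"buyer": "Mary", "seller": "John", "goods": "the book", "money": "$100"}
print(realize("buy", **roles))    # Mary buys the book
print(realize("sell", **roles))   # John sells the book
print(realize("pay", **roles))    # Mary pays $100
```

Note how buy and sell share a profiled subscene and differ only in perspective (which participant is subject), while spend simply provides no realization for the seller, mirroring (10).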
3. Related conceptions
In this section I compare the idea of frames with two other concepts of major importance in theories of lexical semantics, relations and lexical fields. The comparison offers the opportunity to develop some other key ideas of Frame Semantics, including profiling and saliency.
3.1. Frames versus relations: profiling and saliency
Words (most verbs, some nouns, arguably all degreeable adjectives) describe relations in the world. Love and hate are relations between animate experiencers and objects. The verb believe describes a relation between an animate experiencer and a proposition. These are commonplace views among philosophers of language, semanticists, and syntacticians, and they have provided the basis for much fruitful work. Where do frames fit in?
For Fillmore, frames describe the factual basis for relations. In this sense they are "pre-"relational. To illustrate, Fillmore (1985) cites Mill's (1847) discussion of the words father and son. Although there is a single history of events which establishes both the father- and the son-relation, the words father and son pick out different entities in the world. In Mill's terminology, the words denote different things, but connote a single thing, the shared history. This history, which Mill calls the fundamentum relationis (the foundation of the relation), determines that the two relations bear a fixed structural relation to
each other. It is the idea of a determinate structure for a set of relations that Fillmore likens to the idea of a frame. Thus, a frame defines not a single relation but, minimally, a structured set of relations. This conception allows for a natural description not just of pairs of words like father and son, but also of single words which do not in fact settle on a particular relation. Consider the verb risk, discussed in Fillmore & Atkins (1998), which seems to allow a range of participants into a single grammatical "slot". For example,
(13) Joan risked
     a. censure.
     b. her car.
     c. a trip down the advanced ski slope.
The risk frame has at least three distinct participants: (a) the bad thing that may happen, (b) the valued thing that may be lost, and (c) the activity that may cause the bad thing to happen. All can be realized in the direct object position, as (13) shows. Since there are three distinct relations here, a theory that identifies lexical meanings with relations needs to say there are three meanings as well. Frame Semantics would describe this as one frame allowing three distinct profilings. It is the structure of the frame, together with the profiling options the language makes available, which makes the three alternatives possible. Other verbs with a similar indeterminacy of participant are copy, collide, and mix:
(14) a. Sue copied her costume (from a film poster).
     b. Sue copied the film poster.
     c. The truck and the car collided.
     d. The truck collided with the car.
     e. John mixed the soup.
     f. John mixed the paste into the soup.
     g. John mixed the paste and the flour.
In each of these cases the natural Frame Semantics account would be to say that the frame remains constant while the profiling or perspective changes. Thus, under a Frame Semantics approach, verbal valence alternations are to be expected, and the possibility of such alternations provides motivation for the idea of a background frame with a range of participants and a range of profiling options. Now on a theory in which senses are relations, all the verbs in (14) must have different senses. This is, for example, because the arguments in (14a) and (14b) fill different roles. Frame Semantics allows another option. We can say the same verb sense is used in both cases. The differences in interpretation arise because of differences in profiling and perspectivalization.
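The one-frame, several-profilings point can be glossed in the same sketch style; the frame-element labels and their assignment to the objects in (13) are illustrative assumptions, not Fillmore & Atkins' actual analysis.

```python
# One frame sense, three profilings: the same direct-object slot may be
# filled by different frame elements of risk. The labels and the glosses
# of (13) below are illustrative assumptions.

RISK_PROFILINGS = {
    "censure":                            "harm",            # (13a)
    "her car":                            "valued_object",   # (13b)
    "a trip down the advanced ski slope": "risky_activity",  # (13c)
}

for obj, frame_element in RISK_PROFILINGS.items():
    print(f"Joan risked {obj}  ->  object profiles the {frame_element}")
```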
3.2. Frames versus lexical fields
Because frames define lexical sets, it is useful to contrast the concept of frames with an earlier body of lexical semantic work which takes as central the identification of lexical sets. This work develops the idea of lexical fields (Weisgerber 1962; Coseriu 1967; Trier 1971; Geckeler 1971; Lehrer & Kittay 1992). Lexical fields define sets of lexical items in mutually defining relations, in other words, lexical semantic paradigms. The
classic example of a lexical field is the set of German labels used for evaluating student performance (Weisgerber 1962: 99):
(15) sehr gut, gut, genügend and mangelhaft
The terms are mutually defining because the significance of a single evaluation obviously depends on knowing the entire set and the relations of the terms in the set. Thus gut means one thing in a school system with the four possibilities in (15) and quite another if the possibilities are:
(16) sehr gut, gut, befriedigend, ausreichend, mangelhaft and ungenügend
Fillmore also cites the example of the tourist industry's use of the term first class in its categorization of hotels; to many travelers, first class sounds pretty good; in fact, the top-ranked class of hotels is luxury, and first class is fourth from the top. The misunderstanding here seems exactly like a case of applying the wrong frame in the process of understanding.
Domains in which lexical fields have provided fruitful analyses include color, temperature, furniture and artifacts, kinship relations, intelligence, livestock, and terrain features (Fillmore 1985: 227). The general hypothesis of lexical field theory is that the lexicon can be carved up into a number of (sometimes overlapping) lexical sets, each of which functions as a closed system. To this extent, there is agreement with the conception of frames, and in fact, the lexical sets associated with frames can include lexemes in paradigmatic, mutually defining relations. For example, we identified the temperature frame in section 2, and this includes the lexical field of temperature words like cold, cool, lukewarm, warm, and hot.
However, the idea of a frame is distinct from the idea of a lexical field. To start with, the idea of a one-word lexical field is incoherent: how can a word have a function in a field in which there is nothing for it to be opposed to? However, there is no inherent difficulty with the idea of a one-word frame. Fillmore (1985) cites the example of hypotenuse, which requires for its background the concept of a right triangle. There appear to be no other English lexical items specific to right triangles (the term leg in the relevant sense seems to apply to triangle sides in general); and that is neither surprising nor problematic. The notion of mutual definition is not necessary for lexical frame sets because words in frames are defined in terms of, or in contrast with, the frame alone. The frame, not its lexical instantiations, provides the background necessary to identify a semantic function. The primitive notion is not 'defined in opposition to' but 'profiled from the background of'.
A second way in which frames differ from lexical fields is that, even when there is more than one word, there is no requirement that words in the set function in paradigmatic opposition to one another. Thus the temperature frame cited above also contains the noun temperature, just as the height frame containing polar adjectives like tall and short will contain the noun height.
Thirdly, because of the notion of mutual definition, lexical fields come with strict criteria of individuation. In contrast, as we saw in section 2, frames of arbitrary specificity make sense. Thus, we have very general frames of temperature and height. But we also have a set of specific frames that recover the traditional mutually defining sets
that preoccupied lexical field theorists, a specialization of height that includes just the polar adjectives, a specialization of temperature that includes just the set cold, cool, warm, hot, and so on. This level of specificity in fact roughly describes the granularity of FrameNet.
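The mutual-definition point in (15) and (16) can likewise be put in code: the significance of a grade is its position in the whole paradigm, so the same word carries a different value in different fields. The encoding is a toy illustration.

```python
# A sketch of the lexical-field point in (15)/(16): the value of a grade
# depends on the whole paradigm it belongs to. The scales are from the text.

SCALES = {
    "four_grade": ["sehr gut", "gut", "genügend", "mangelhaft"],
    "six_grade":  ["sehr gut", "gut", "befriedigend", "ausreichend",
                   "mangelhaft", "ungenügend"],
}

def relative_value(term, scale):
    """Position of a term within its field, as (rank, field size)."""
    field = SCALES[scale]
    return (field.index(term) + 1, len(field))

print(relative_value("gut", "four_grade"))  # (2, 4): second of four
print(relative_value("gut", "six_grade"))   # (2, 6): same word, different field
```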
3.3. Minskian frames
As described in Fillmore (1982), the term frame is borrowed from Marvin Minsky. It will be useful, before tackling the question of how profiling and perspectivalization work, to take a closer look at this precursor. In Minsky's original frames paper (Minsky 1975), frames were put forth as a solution to the problem of scene interpretation in vision. Minsky's proposal was in reaction to those who, like the Gestalt theorists (Koffka 1963), viewed scene perception as a single holistic process governed by principles similar to those at work in electric fields. Minsky thought scenes were assembled in independent chunks, constituent by constituent, in a series of steps involving interpretation and integration. Describing this process required a model factoring the visual field into a number of discrete chunks, each with its own model of change with its own discrete phases.
A frame was thus a dynamic model of some specific kind of object with specific participants and parameters. The model had built-in expectations about ways in which the object could change, either in time or as a viewer's perspective on it changed, formalized as operations mapping old frame states to new frame states. A frame also included a set of participants whose status changed under these operations; those moving into certain distinguished slots were foregrounded. Thus, for example, in the simplified version of Minsky's cube frame, shown before and after a rotation in Figs. 29.1 and 29.2, a frame state encodes a particular view of a cube and the participants are cube faces. One possible operation is a rotation of the cube, defined to place new faces in certain view-slots and to move old faces out of those slots, possibly out of view. The faces that end up in view are the foregrounded participants of the resulting frame state. Thus the cube frame offers the tools for representing particular views or perspectives on a cube, together with the operations that may connect them in time.
Fig. 29.1: View of cube together with simplified cube frame representing that view. Links marked “fg” lead to foregrounded slots; slots marked “invis” are backgrounded. Faces D and C are out of view.
Fig. 29.2: Cube frame after counterclockwise rotation. Faces D and A are now foregrounded, B has moved out of view.
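The mechanics depicted in Figs. 29.1 and 29.2 can be sketched as a tiny state machine: a frame state assigns faces to view-slots, and a rotation operation maps one state to the next, changing which faces are foregrounded. The slot names and the rotation rule below are a simplification for illustration, not Minsky's own formalization.

```python
# A sketch of Minsky's cube frame: a frame state maps view-slots to faces;
# rotation re-maps them, foregrounding new faces and backgrounding others.
# Slot names and the rotation rule are simplified for illustration.

CUBE_FACES = ["A", "B", "C", "D"]   # the four side faces, in clockwise order

def make_state(left_face):
    """A state foregrounds two adjacent faces; the other two are invisible."""
    i = CUBE_FACES.index(left_face)
    return {"fg-left": CUBE_FACES[i],
            "fg-right": CUBE_FACES[(i + 1) % 4],
            "invis": [CUBE_FACES[(i + 2) % 4], CUBE_FACES[(i + 3) % 4]]}

def rotate_ccw(state):
    """Counterclockwise rotation: the face left of the current left face comes into view."""
    i = CUBE_FACES.index(state["fg-left"])
    return make_state(CUBE_FACES[(i - 1) % 4])

s0 = make_state("A")   # foregrounds A and B; C and D out of view (Fig. 29.1)
s1 = rotate_ccw(s0)    # foregrounds D and A; B moves out of view (Fig. 29.2)
print(s0)
print(s1)
```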
Fillmore’s innovation, then, was to apply this Minskian idea in the domain of word meaning, importing not only the idea of chunked modular knowledge units, but also the idea of operations that take perspectives on such chunks. I used the terms profiling and perspectivalization to describe such operations in section 2. Although Fillmore himself does not attempt a formalization of these operations, I believe it is possible to clearly describe what is at issue using some ideas from event semantics (Davidson 1967, 1980, Parsons 1990), building on the event-based approach to frames in Gawron (1983).
4. Events, profiling, and perspectivalization
To spell out a bit better how word senses might invoke multiple frames, let us return to the case of the commercial transaction frame discussed in section 2. The following development takes up and extends the ideas of Gawron (1983). A rather natural account of the interface between frames and compositional semantics becomes available if we make use of neo-Davidsonian event semantics (Davidson 1967, 1980; Parsons 1990). On a neo-Davidsonian account, we have, as the schematic semantics for John bought the book on sale:

∃e[buy′(e) ∧ agent(e) = j ∧ patient(e) = b ∧ on-sale(e, b)]

We call e in the above representation the lexical event. I assume that Fillmorean frames classify events. That is, there is such a thing as a commercial transaction event. Further, I assume that lexical predicates like give and buy are predicates true of events. These lexical events cannot be directly identified with Fillmorean frame events. Rather, the lexical events are perspectivalizations of Fillmorean frame events. Thus, for example, buying will be associated with three events: a perspectivalizing event that is directly related to syntactic realization, a profiling event, which is a profiling of a third event, the commercial transaction (the Fillmorean frame event).
I will call this last event the circumstance event. Perspectivalizing, profiling, and circumstance events will be related by functions. Borrowing the machinery of sorted logic (Carpenter 1992; Smolka 1992; Rounds 1997), I will assume that all predicates are sorted; that is, it is a property of predicates and relations that in all models, for any given argument position, there is a sort of individuals for which that argument position is defined. I will write sorts in boldface and predicates in roman.

(17) agent patient : agent_patient → truth-values
     agent : agent_patient → animate
     patient : agent_patient → entity
     source : agent_patient → (entity)
     goal : agent_patient → (entity)

These declarations just say, in roughly standard mathematical notation, that agent and patient are functions from one set to another. For example, the first declaration says that agent patient is a function from the set (sort) agent_patient to truth-values; the second says agent is a function from the set (sort) of agent patient events to animates; patient is a function from the set of agent patient events to the set of things (the domain of entities). The parentheses in the source and goal role declarations may be taken to mean that the role is optional (or the function is partial): not every agent patient event has a source or a goal, but some do. I assume the declarations (or axioms) in (17) are sufficient to define a very simple kind of frame. The first axiom defines a predicate agent patient that is true of events of that sort; the rest define a set of roles for that sort of event. Thus a minimal frame is just an event sort defined for a set of roles. I will call agent patient an argument frame because syntactic arguments of a verb will need to link directly to the roles of argument frames (such as agent and patient). We can represent this set of axioms as an attribute-value matrix (AVM):

(18)
⎡AGENT PATIENT      ⎤
⎢agent      animate ⎥
⎢source     entity  ⎥
⎢goal       entity  ⎥
⎣patient    entity  ⎦
Henceforth I use AVM notation for readability, but the reader should bear in mind that it is merely a shorthand for a set of axioms like those in (17), constraining partial functions and relations on sorts. The agent patient frame is very general – too general to be of much semantic use. In order to use it, a lexical item must specify some circumstance frame in which the participant roles are further constrained.
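As a rough illustration of what such axioms amount to, here is a sketch in Python; the dictionary encoding and the checking procedure are my own illustrative assumptions, not part of the theory, which states (17) as axioms on partial functions over sorts.

```python
# A toy rendering of an argument frame as an event sort with typed roles.
# Each role is declared with the sort its filler must have and a flag for
# whether the role is optional (a partial function in the text's terms).

SUBSORTS = {"animate": {"entity"}}  # every animate is also an entity

def is_of_sort(value_sort, required_sort):
    """True if value_sort counts as required_sort in the toy sort hierarchy."""
    return value_sort == required_sort or required_sort in SUBSORTS.get(value_sort, set())

# The agent_patient frame of (17)/(18): role -> (required sort, optional?)
AGENT_PATIENT = {
    "agent":   ("animate", False),
    "patient": ("entity",  False),
    "source":  ("entity",  True),
    "goal":    ("entity",  True),
}

def satisfies(event, frame):
    """Check an event (a dict from roles to (filler, sort) pairs) against
    a frame declaration: obligatory roles filled, fillers of the right sort."""
    for role, (req_sort, optional) in frame.items():
        if role not in event:
            if not optional:
                return False
            continue
        _, sort = event[role]
        if not is_of_sort(sort, req_sort):
            return False
    return True

e = {"agent": ("john", "animate"), "patient": ("book", "entity")}
print(satisfies(e, AGENT_PATIENT))  # True: source and goal may be left out
```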
The connection between an argument frame like agent patient and simple circumstance frames can be illustrated through the example of the possession transfer frame (related to verbs like give, get, take, receive, acquire, bequeath, loan, and so on). Represented as an AVM, this is: (19)
⎡POSSESSION TRANSFER   ⎤
⎢donor        animate  ⎥
⎢possession   entity   ⎥
⎣recipient    animate  ⎦
Now both give and acquire will be defined in terms of the possession transfer frame, but give and acquire differ in that with give the donor becomes subject and with acquire the recipient does. (Compare the difference between buy and sell discussed in section 2.2.) We will account for this difference by saying that give and acquire have different mappings from the agent patient frame to their shared circumstance frame (possession transfer). This works as follows. We define the relation between a circumstance and argument frame via a perspectivalizing function. Here are the axioms for what we will call the acquisition function, on which the recipient is agent:

(20) a. acquisition : possession_transfer → agent_patient
     b. agent ∘ acquisition = recipient
     c. patient ∘ acquisition = possession
     d. source ∘ acquisition = donor
The first line defines acquisition as a mapping from the sort possession_transfer to the sort agent_patient, that is, as a mapping from possession transfer eventualities to agent patient eventualities. The mapping is total; that is, each possession transfer is guaranteed to have an agent patient eventuality associated with it. In the second line, the symbol ∘ stands for function composition; the composition of the agent function with the acquisition function (written agent ∘ acquisition) is the same function (extensionally) as the recipient function. Thus the filler of the recipient role in a possession transfer must be the same as the filler of the agent role in the associated agent patient eventuality. And so on, for the other axioms. Summing up in AVM style:

(21) ⎡POSSESSION TRANSFER ⎤                    ⎡AGENT PATIENT ⎤
     ⎢donor        [1]    ⎥  ──acquisition──→  ⎢agent     [2] ⎥
     ⎢recipient    [2]    ⎥                    ⎢source    [1] ⎥
     ⎣possession   [3]    ⎦                    ⎣patient   [3] ⎦
I will call the mapping that makes the donor the agent donation:
(22) ⎡POSSESSION TRANSFER ⎤                  ⎡AGENT PATIENT ⎤
     ⎢donor        [1]    ⎥  ──donation──→   ⎢agent     [1] ⎥
     ⎢recipient    [2]    ⎥                  ⎢goal      [2] ⎥
     ⎣possession   [3]    ⎦                  ⎣patient   [3] ⎦
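Before putting these mappings to work, here is a hedged sketch of how (20)–(22) might be operationalized in Python. Each perspectivalizing function is represented by the role renaming it induces, so that the composition axioms (agent ∘ acquisition = recipient, and so on) become simple re-keying of role assignments; the dictionary encoding is my own illustrative assumption, not the chapter's formalism.

```python
# Perspectivalizing functions as role renamings: for each agent_patient
# role, the possession_transfer role that supplies its filler. ACQUISITION
# encodes (20b-d); DONATION encodes the corresponding axioms of (22).

ACQUISITION = {"agent": "recipient", "patient": "possession", "source": "donor"}
DONATION    = {"agent": "donor",     "patient": "possession", "goal": "recipient"}

def perspectivalize(transfer_event, mapping):
    """Map a possession_transfer event (roles -> fillers) to the
    agent_patient event it induces under a perspectivalizing function."""
    return {arg_role: transfer_event[circ_role]
            for arg_role, circ_role in mapping.items()}

transfer = {"donor": "mary", "recipient": "john", "possession": "book"}
print(perspectivalize(transfer, ACQUISITION))
# {'agent': 'john', 'patient': 'book', 'source': 'mary'}  -- acquire/get
print(perspectivalize(transfer, DONATION))
# {'agent': 'mary', 'patient': 'book', 'goal': 'john'}    -- give
```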
With the acquisition and donation mappings defined, the predicates give and acquire can be defined as compositions with donation and acquisition:

give = possession transfer ∘ donation⁻¹
acquire = possession transfer ∘ acquisition⁻¹

donation⁻¹ is an inverse of donation, a function from agent patient eventualities to possession transfers defined only for those agent patient events related to possession transfers. Composing this with the possession transfer predicate makes give a predicate true of those agent patient events related to possession transfers whose agents are donors and whose patients are possessions. The treatment of acquire is parallel but uses the acquisition mapping. For more extensive discussion, see Gawron (2008). Summarizing, we have:
a. an argument frame agent patient, with direct consequences for syntactic valence (agents become subject, patients direct object, and so on);
b. a circumstance frame possession transfer, which captures the circumstances of possession transfer;
c. perspectivalizing functions acquisition and donation, which map participants in the circumstances to argument structure.
This is the basic picture of perspectivalization. The picture becomes more interesting with a richer example. In the discussion that follows, I assume a commercial transaction frame with at least the following frame elements:

(23) ⎡COMMERCIAL TRANSACTION ⎤
     ⎢buyer    animate       ⎥
     ⎢seller   animate       ⎥
     ⎢money    fungible      ⎥
     ⎣goods    entity        ⎦

This is a declaration that various functions from event sorts to truth values and entity sorts exist – a rather austere model for the sort of rich backgrounding function we have assumed for frames. We will see how this model is enriched below. Our picture of profiling and perspectivalization can be extended to the more complex cases of commercial transaction predicates with one more composition. For example, we may define buy′ as follows:
(24) buy = commercial transaction ∘ (acquisition ∘ goods-transfer)⁻¹

What this says is that the relation buy′ is built up in a series of steps out of three functions and a sortal predicate:
1. acquisition: the function from possession transfer events to agent_patient events already introduced.
2. goods-transfer: a new function from commercial events to possession transfers in which the goods are transferred:

⎡COMMERCIAL TRANSACTION ⎤                      ⎡POSSESSION TRANSFER ⎤
⎢buyer    [1]           ⎥  ──goods-transfer──→ ⎢recipient    [1]    ⎥
⎢seller   [2]           ⎥                      ⎢donor        [2]    ⎥
⎢money    [3]           ⎥                      ⎣possession   [4]    ⎦
⎣goods    [4]           ⎦
3. The inverse of the composition of goods-transfer with acquisition, (acquisition ∘ goods-transfer)⁻¹, is a function from agent patient events to commercial transactions.
4. commercial transaction: a sortal predicate true of commercial transactions.
5. The predicate buy is therefore true of agent patient events that are related in certain fixed ways to a possession transfer and a commercial transaction.

The novelty in the definition above is the goods-transfer function. We will call this the profiling function, because it selects the parts of the commercial transaction event which the verb highlights. We will call acquisition – the function which determines subject and object – the perspectivalizing function. The role of the perspectivalizing function is to select a syntactic realization. A profiling function like goods-transfer has two independent motivations:
a. It enriches our rather impoverished model of commercial transactions. We started out in (23) with little more than the assumption that there were four sorted participants, which we were calling buyer, seller, money, and goods. Now, with the assumption of the goods-transfer function, a possession transfer p is entailed (because the function is total) in which the possession is the goods. Thus goods-transfer can be viewed as part of an enriched definition of the commercial transaction frame. There will be other total functions enriching the definition further, for example a money-transfer function, of use in defining verbs like pay and collect, in which the money is transferred.
b. Both money-transfer and goods-transfer are projections from commercial events to possession transfers; and possession transfer is a frame for which we have a predefined perspectivalization, independently motivated for other verbs like acquire and get. By composing a commercial event subscene projection with a possession transfer argument projection, we derive an argument projection for commercial transactions.
Thus the goods-transfer function simultaneously serves knowledge representation needs (a) and valence theory needs (b). There is an analogy between how profiling and perspectivalization work and the way the original Minskian frames work. A Minskian frame enables the integration of scene components in view with underlying objects by specifying, for example, how the faces of the cube in view relate to the cube as a whole. A Fillmorean perspective enables the integration of the realized elements of a text with an underlying text interpretation by specifying how syntactically realized frame components relate to frames as a whole. In both cases there are operations that mediate between rich representations and a constrained (perspectivalized) representation that belongs to an external representational system. Minskian rotation operations mediate between 3D representations and the 2D representations of a scene, ultimately necessary because the human retina is a screen. Fillmorean profilings and perspectivalizations mediate between unlinearized representations, in which there is no fixed individuation of participants, and linearizable argument structure, ultimately necessary because the syntax of human language forces us to linearize participants. Now consider a profiling which leaves things out. This is the case of spend.

(25)
⎡COMMERCIAL TRANSACTION ⎤                   ⎡RESOURCE CONSUMPTION     ⎤
⎢money    [1]           ⎥  ──consumption──→ ⎢resource            [1]  ⎥
⎢buyer    [2]           ⎥                   ⎢consumer            [2]  ⎥
⎢goods    [3]           ⎥                   ⎣resource-requirer   [3]  ⎦
⎣seller   [4]           ⎦
As discussed in section 2, the verb spend views a commercial transaction as a resource consumption, where resource consumption is the frame used by verbs like waste, lose, use (up), and blow. The profiling of the verb spend includes the money, the buyer, and the goods but leaves the seller out. The profiling of the verb sell includes the buyer and the goods, as well as the seller. The two subscenes overlap in participants but choose distinct, incompatible event types, which lead to distinct realization possibilities in the syntactic frame. The frame-based picture of commercial transactions is schematized in Fig. 29.3. The picture on the left shows what we might call the commercial transaction neighborhood as discussed here. The picture on the right shows that portion of the neighborhood that is activated by buy; the functions used in its definition are linked by solid lines; the functions left out are in dashes; the boxed region contains those frames that are used in the definition. If, as is suggested in article 106 (Kelter & Kaup) Conceptual knowledge, categorization and meaning, concepts and word meanings need to be different knowledge structures, the picture in Fig. 29.3 may provide one way of thinking about how they might be related, with the frame nodes playing the role of concepts and a configuration of links between them the role of a word meaning. We have called goods-transfer and consumption profiling functions. We might equally well have called them subscene roles, because they are functions from events to entities.
Fig. 29.3: Left: Lexical network for commercial transaction. Right: Same network with the perspectivalization chosen by buy in the boxed area.
Note that subscene roles don't attribute a fixed hierarchical structure to a frame the way do . . . cause . . . become in Dowty's system attributes a fixed structure to causatives of inchoatives. As these examples show, a frame may have subscene roles which carve up its constituents in incompatible ways. Now this may seem peculiar. Shouldn't the roles of a frame define a fixed relation between disjoint entities? I submit that the answer is no. The roles associated with each sort of event are regularities that help us classify an event as of that sort. But such functions are not guaranteed to carve up each event into nonoverlapping, hierarchically structured parts. Sometimes distinct roles may select overlapping constituents of events, particularly when independent individuation criteria are not decisive, as when the constituents are collectives, or shapeless globs of stuff, or abstract things such as events or event types. Thus we get the cases discussed above like collide, mix, and risk, where different ways of profiling the frames give us distinct, incompatible sets of roles. We may choose to view the colliders as a single collective entity (X and Y collided), or as two (X collided with Y). We may choose to separate a figure from a ground in the mixing event (14f), or lump them together (mix X and Y), or just view the mixed substance as one (14f). Finally, risks involve an action (13c) and a potential bad consequence (13a), and, for a restricted set of cases in which that bad consequence is a loss, a lost thing (13b). What of relations? Formally, in this frame-based picture, we have replaced relations with event predicates, each of which is defined through some composed set of mappings to a set of events that will be defined only for some fixed set of roles. Clearly, for every lexical predicate there is a corresponding relation, namely one defined for exactly the same set of roles as the predicate. Thus in the end the description of the kind of lexical semantic entity which interfaces with the combinatorial semantics is not very different.
VI. Cognitively oriented approaches to semantics However the problem has, I believe, been redefined in an interesting way. Traditionally, discussion of the lexical-semantic/syntax interface starts with a relation with a predefined set of roles. This is the picture for example, that motivates the formulation of Chomsky’s (1981) Θ-Criterion. However, a major point of Frame Semantics is that, for many purposes, it is useful to look at a set of relations structured in a particular way. This is the domain of frames.
5. Lexicography
A word about the application of frames to lexicography is in order. Any set of frames imposes a certain classificational scheme on the lexicon. Other examples of such a classificational scheme are Roget's Thesaurus, Longman's valence classes, and WordNet (Fellbaum 1998). Frames differ from all three in that they are not primarily oriented either to synonym classes or to syntactic frame classes. One expects to find synonyms and antonyms in the same frame, of course, and many examples of valence similarity, but neither trend will be a rule. As we saw in section 2, near synonyms like land and ground may belong to different frames, and understanding those frames is critical to proper usage. As we saw in our investigations of profiling and perspective, differences of both kinds may result in very different valence options for verbs from the same frame. The value of the frame idea for lexicography is that it seems the most promising way to organize words according to usage. This of course is a hypothesis. FrameNet (Fillmore & Baker 2000) is a test of that hypothesis. Accordingly, frame entries are connected with rich sets of examples gleaned from the British National Corpus illustrating frame element realizations in a variety of syntactic contexts. Interested readers will find a tour of the web site far more persuasive than any discussion here.
6. Discourse understanding
In this section I propose to raise the issue of frames in discourse understanding – not to try to give the subject an adequate treatment, for which there is no space, but to talk a bit about how the role of frames in discourse understanding is related to their role in interpreting signs. Let us return to the example of verbs conventionally connected with effects caused by movement:

(26) a. John broke the glass against the wall.
     b. # John killed the cockroach against the wall.

It is at least arguably the case that this contrast can be made without the help of a lexical stipulation. If movement can be a default or at least a highly prototypical way of breaking something, and not a highly prototypical way of killing something, then something like the default logic of Asher & Lascarides (1995) or abduction as in Hobbs et al. (1993), both of which have been applied successfully to a number of problems of discourse interpretation, could infer causality in (a) and not in (b). However, this still falls somewhat short of predicting the genuine oddity of (b). Notice, too, that when discourse coherence alone is at issue, both causality inferences go through:
(27) a. The glass was hurled against the wall and broke.
     b. The cockroach was hurled against the wall and died.

Thus the defaults at play in determining matters of "valence" differ from those at play in discourse. We can at least describe the contrast in (26) – not explain it – by saying that movement is an optional component of the breaking frame through which the denotation of the verb break is defined, and not a component of the killing frame; or, in terms of the formal picture of section 4: within the conventional lexical network linking frames in English there is a partial function from breaking events to movement subscenes; there is no such function for killing events. Contrast this with Fillmore's (1985: 232) example discussed in section 2.1:

(28) We never open our presents until morning.

The point of this example was that it evoked Christmas without containing a single word specific to Christmas. How might an automatic interpretation system simulate what is going on for human understanders? Presumably by a kind of application of Occam's razor: there is one and only one frame that explains both the presence of presents and the custom of waiting until morning, and that is the Christmas frame. Thus the assumption that gets us the most narrative bang for the buck is Christmas. In this case the frame has to be evoked by dynamically assembling pieces of information activated in this piece of discourse. These two examples show that frames will function differently in a theory of discourse understanding than they do in a theory of sign meanings in at least two ways: they will require a different notion of default, and they will need to resort to different inferencing strategies, such as inference to the most economical explanation.
7. Conclusion
The logical notion of a relation, which preserves certain aspects of the linearization syntax forces on us, has at times appeared to offer an attractive account of what we grasp when we grasp sign meanings. But the data we have been looking at in this brief excursion into Frame Semantics point another way. Lexical senses seem to be tied to the same kind of schemata that organize our perceptions and interpretations of the social and physical world. In these schemata participants are neither linearized nor uniquely individuated, and the mapping into the linearized regime of syntax is constrained but underdetermined. We see words with options in what their exact participants are and how they are realized. Frames offer a model that is both specific enough and flexible enough to accommodate these facts, while offering the promise of a firm grounding for lexicographic description and an account of text understanding.
8. References
Asher, Nicholas & Alex Lascarides 1995. Lexical disambiguation in a discourse context. Journal of Semantics 12, 69–108.
Baker, Collin & Charles J. Fillmore 2001. Frame Semantics for text understanding. In: Proceedings of WordNet and Other Lexical Resources Workshop. Pittsburgh, PA: North American Association of Computational Linguistics, 3–4.
Baker, Collin, Charles J. Fillmore & John B. Lowe 1998. The Berkeley FrameNet project. In: B.T.S. Atkins & A. Zampolli (eds.). Proceedings of the Joint Conference of the International Committee on Computational Linguistics and the Association for Computational Linguistics, vol. 1. Montreal: Association for Computational Linguistics, 86–90.
Barsalou, Lawrence 1992. Frames, concepts, and conceptual fields. In: A. Lehrer & E. Kittay (eds.). Frames, Fields, and Contrasts. New Essays in Semantic and Lexical Organization. Hillsdale, NJ: Lawrence Erlbaum, 21–74.
Barsalou, Lawrence 1999. Perceptual symbol systems. Behavioral and Brain Sciences 22, 577–660.
Bartlett, Frederick C. 1932. Remembering. A Study in Experimental and Social Psychology. Cambridge: Cambridge University Press.
Berlin, Isaiah 1957. The Hedgehog and the Fox. New York: New American Library. Reprinted in: I. Berlin. The Proper Study of Mankind. New York: Farrar, Straus, and Giroux, 1997.
Boas, Hans C. 2001. Frame Semantics as a framework for describing polysemy and syntactic structures of English and German motion verbs in contrastive computational lexicography. In: P. Rayson et al. (eds.). Proceedings of the Corpus Linguistics 2001 Conference. Lancaster: University Centre for Computer Corpus Research on Language, 64–73.
Boas, Hans C. 2005. Semantic frames as interlingual representations for multilingual databases. International Journal of Lexicography 18, 445–478.
Carpenter, Bob 1992. The Logic of Typed Feature Structures. With Applications to Unification Grammars, Logic Programs and Constraint Resolution. Cambridge: Cambridge University Press.
Chang, Nancy, Srini Narayanan & Miriam R.L. Petruck 2002a. From frames to inference. In: Proceedings of the First International Workshop on Scalable Natural Language Understanding. Heidelberg: International Committee on Computational Linguistics, 478–484.
Chang, Nancy, Srini Narayanan & Miriam R.L. Petruck 2002b. Putting frames in perspective. In: Proceedings of the Nineteenth International Conference on Computational Linguistics. Taipei, Taiwan: International Committee on Computational Linguistics, 231–237.
Chomsky, Noam 1981. Lectures on Government and Binding. Dordrecht: Foris.
Coseriu, Eugenio 1967. Lexikalische Solidaritäten. Poetica 1, 293–303.
Davidson, Donald 1967. The logical form of action sentences. In: N. Rescher (ed.). The Logic of Decision and Action. Pittsburgh, PA: University of Pittsburgh Press, 81–94. Reprinted in: D. Davidson. Essays on Actions and Events. Oxford: Clarendon Press, 1980, 105–148.
Dowty, David R. 1979. Word Meaning and Montague Grammar. Dordrecht: Reidel.
Dowty, David R. 1991. Thematic proto-roles and argument selection. Language 67, 547–619.
Fellbaum, Christiane (ed.) 1998. WordNet. An Electronic Lexical Database. Cambridge, MA: The MIT Press.
Fillmore, Charles J. 1975. An alternative to checklist theories of meaning. In: C. Cogen et al. (eds.). Proceedings of the First Annual Meeting of the Berkeley Linguistics Society. Berkeley, CA: Berkeley Linguistics Society, 123–131.
Fillmore, Charles J. 1976. Topics in lexical semantics. In: R. Cole (ed.). Current Issues in Linguistic Theory. Bloomington, IN: Indiana University Press, 76–138.
Fillmore, Charles J. 1977a. The case for case reopened. In: P. Cole & J. Sadock (eds.). Syntax and Semantics 8: Grammatical Relations. New York: Academic Press, 59–81.
Fillmore, Charles J. 1977b. Scenes-and-frames semantics. In: A. Zampolli (ed.). Linguistic Structures Processing. Amsterdam: North-Holland, 55–81.
Fillmore, Charles J. 1978. On the organization of semantic information in the lexicon. In: D. Farkas et al. (eds.). Papers from the Parasession on the Lexicon. Chicago, IL: Chicago Linguistics Society, 148–173.
Fillmore, Charles J. 1982. Frame Semantics. In: Linguistic Society of Korea (ed.). Linguistics in the Morning Calm. Seoul: Hanshin, 111–137.
Fillmore, Charles J. 1985. Frames and the semantics of understanding. Quaderni di Semantica 6, 222–254.
Fillmore, Charles J. & B.T.S. Atkins 1994. Starting where the dictionaries stop. The challenge for computational lexicography. In: B.T.S. Atkins & A. Zampolli (eds.). Computational Approaches to the Lexicon. Oxford: Clarendon Press.
Fillmore, Charles J. & B.T.S. Atkins 1998. FrameNet and lexicographic relevance. In: Proceedings of the First International Conference on Language Resources and Evaluation. Granada: International Committee on Computational Linguistics, 28–30.
Fillmore, Charles J. & Collin Baker 2000. FrameNet. http://www.icsi.berkeley.edu/~framenet, October 13, 2008.
Gawron, Jean Mark 1983. Lexical Representations and the Semantics of Complementation. New York: Garland.
Gawron, Jean Mark 2008. Circumstances and Perspective. The Logic of Argument Structure. http://repositories.cdlib.org/ucsdling/sdlp3/l, October 13, 2008.
Geckeler, Horst 1971. Strukturelle Semantik und Wortfeldtheorie. München: Fink.
Hobbs, Jerry R., Mark Stickel, Douglas Appelt & Paul Martin 1993. Interpretation as abduction. Artificial Intelligence 63, 69–142.
Jackendoff, Ray S. 1990. Semantic Structures. Cambridge, MA: The MIT Press.
Koffka, Kurt 1963. Principles of Gestalt Psychology. New York: Harcourt, Brace, and World.
Lakoff, George 1972. Linguistics and natural logic. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 545–665.
Lakoff, George 1983. Category. An essay in cognitive linguistics. In: Linguistic Society of Korea (ed.). Linguistics in the Morning Calm. Seoul: Hanshin, 139–194.
Lakoff, George & Mark Johnson 1980. Metaphors We Live By. Chicago, IL: The University of Chicago Press.
Langacker, Ronald 1984. Active zones. In: C. Brugman et al. (eds.). Proceedings of the Tenth Annual Meeting of the Berkeley Linguistics Society. Berkeley, CA: Berkeley Linguistics Society, 172–188.
Lehrer, Adrienne & Eva Kittay (eds.) 1992. Frames, Fields, and Contrasts. New Essays in Semantic and Lexical Organization. Hillsdale, NJ: Lawrence Erlbaum.
Levin, Beth 1993. English Verb Classes and Alternations. Chicago, IL: The University of Chicago Press.
Mill, John Stuart 1847. A System of Logic. New York: Harper and Brothers.
Minsky, Marvin 1975. A framework for representing knowledge. In: P. Winston (ed.). The Psychology of Computer Vision. New York: McGraw-Hill, 211–277. http://web.media.mit.edu/~minsky/papers/Frames/frames.html, October 13, 2008.
Parsons, Terence 1990. Events in the Semantics of English. Cambridge, MA: The MIT Press.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press.
Rounds, William C. 1997. Feature logics. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 475–533.
Rumelhart, David E. 1980. Schemata. The building blocks of cognition. In: R.J. Spiro, B.C. Bruce & W.F. Brewer (eds.). Theoretical Issues in Reading Comprehension. Hillsdale, NJ: Lawrence Erlbaum, 33–58.
Schank, Roger C. & R.P. Abelson 1977. Scripts, Plans, Goals, and Understanding. Hillsdale, NJ: Lawrence Erlbaum.
Smolka, Gert 1992. Feature constraint logics for unification grammars. Journal of Logic Programming 12, 51–87.
Trier, Jost 1971. Aufsätze und Vorträge zur Wortfeldtheorie. The Hague: Mouton.
Weisgerber, J. Leo 1962. Grundzüge der inhaltsbezogenen Grammatik. Düsseldorf: Schwann.
Jean-Mark Gawron, San Diego, CA (USA)
30. Conceptual Semantics
1. Overall framework
2. Major features of Conceptual Structure
3. Compositionality
4. References
Abstract
Conceptual Semantics takes the meanings of words and sentences to be structures in the minds of language users, and it takes phrases to refer not to the world per se, but rather to the world as conceptualized by language users. It therefore takes seriously constraints on a theory of meaning coming from the cognitive structure of human concepts, from the need to learn words, and from the connection between meaning, perception, action, and nonlinguistic thought. The theory treats meanings, like phonological structures, as articulated into substructures or tiers: a division into an algebraic Conceptual Structure and a geometric/topological Spatial Structure; a division of the former into Propositional Structure and Information Structure; and possibly a division of Propositional Structure into a descriptive tier and a referential tier. All of these structures contribute to word, phrase, and sentence meanings. The ontology of Conceptual Semantics is richer than in most approaches, including not only individuals and events but also locations, trajectories, manners, distances, and other basic categories. Word meanings are decomposed into functions and features, but some of the features and connectives among them do not lend themselves to standard definitions in terms of necessary and sufficient conditions. Phrase and sentence meanings are compositional, but not in the strict Fregean sense: many aspects of meaning are conveyed through coercion, ellipsis, and constructional meaning.
1. Overall framework
Conceptual Semantics is a formal approach to natural language meaning developed by Jackendoff (1983, 1987, 1990, 2002, 2007) and Pinker (1989, 2007); Pustejovsky (1995) has also been influential in its development. The approach can be characterized at two somewhat independent levels. The first is the overall framework for the theory of meaning, and how this framework is integrated into linguistics, philosophy of language, and cognitive science (section 1). The second is the formal machinery that has been developed to achieve the goals of this framework (sections 2 and 3). These two are somewhat independent: the general framework might be realized in terms of other formal approaches, and many aspects of the formal machinery can be deployed within other frameworks for studying meaning. The fundamental goal of Conceptual Semantics is to describe how humans express their understanding of the world by means of linguistic utterances. From this goal flow two theoretical commitments. First, linguistic meaning is to be described in mentalistic/psychological terms – and eventually in neuroscientific terms. The theory of meaning, like the theories of generative syntax and phonology, is taken to be about what is going
on in people's heads when they use language. Second, the theory aspires to describe the messages that speakers intend their utterances to convey. Thus it potentially includes everything that traditionally falls under the labels of 'pragmatics' and 'world knowledge' as well as 'semantics.' It does not specifically seek a level of representation that might be characterized as 'pure/literal linguistic meaning' or 'meaning that is relevant to grammar.' If there is such a level, it will emerge in the course of empirical investigation (see remarks in section 1.2). We take these two commitments up in turn.
1.1. Mentalism: reference and truth
The mentalist commitment of the theory sets it apart from traditions of formal semantics growing out of logic (e.g. Frege 1892, Russell 1905, Carnap 1939, Tarski 1956, Montague 1973, Lewis 1972; see article 3 (Textor) Sense and reference; article 4 (Abbott) Reference), which aspire to study the relation of sentences to "the world" or to "possible worlds" (where a world is often specified in set-theoretic terms; see article 33 (Zimmermann) Model-theoretic semantics). In generative grammar, a sentence is not regarded as a free-standing object that can be related to the world: it is a combinatorial structure in a speaker's mind that can be shared with other speakers via acoustic or visual signals. Similarly, an entire language is not a free-standing object (or set of sentences) in the world. Rather, a speaker's "knowledge of a language" is instantiated as a set of stored mental structures and stored relations among structures, plus the ability to combine these stored structures and relations into an unlimited number of meaningful expressions. The notion of "the English language" is thus regarded as an idealization over the systems of linguistic knowledge in the minds of a community of mutually intelligible speakers. We typically presume that these systems are homogeneous, but we readily drop this assumption as soon as we need to take into account dialect differences, vocabulary differences, and stages in children's acquisition of language. This treatment of linguistic expressions extends to the meanings they convey. The meaning of a word or a sentence is not a free-standing object in the world either. Rather, the meaning of a word is to be regarded as a mental structure stored in a speaker's mind, linked in long-term memory to the structures that encode the word's pronunciation and its syntactic properties. The meaning of a sentence is likewise to be regarded as a mental structure, constructed in a speaker's mind in some systematic way from the meanings of its components. Under this conception, then, meaning must always be relativized to the language user. It makes no sense to say, with Putnam (1975), that speakers don't really know the meanings of words, or that the "true meaning" of, say, natural kind terms awaits a more mature science. There is no place other than in speakers' heads to localize meaning. Even if no speakers in 1500 knew the molecular structure of water or the DNA profile of tigers, it seems quixotic to maintain that no one was in possession of "the" meaning of water and tiger. People were managing to communicate with each other quite adequately, in terms of their understanding of these concepts at the time. Similarly, if speakers have different meanings for words (such as when experts have more highly articulated meanings for words in their area of expertise), mutual intelligibility is accomplished through tolerance of differences or, more rarely, through negotiation (as when people appeal to an expert for adjudication). And this seems a realistic assessment of how people use language.
The mentalist approach also leads to theoretical notions of reference and truth different from canonical formal semantics and philosophy of language. Reference is standardly regarded as a relation between linguistic expressions (typically noun phrases) and things in the world. For convenience, let us call this realist reference (or r-reference). However, the goal of Conceptual Semantics is not an account of free-standing sentences, but rather an account of human understanding. Thus the relation that plays the role of reference in the theory is between the mental structure encoding the linguistic expression and the language user's conceptualization of the world – all inside the mind. Let us call this relation mentalist reference (or m-reference). For example, in sincerely uttering The cat is on the mat, a speaker is committed to there being a situation in the world in which an entity identifiable as a cat is in contact with the upper surface of another entity identifiable as a mat. A theory of meaning must account for these m-referential commitments. Now note that the speaker has not arrived at these commitments by somehow being in direct contact with reality. Rather, the speaker has arrived at these commitments through either hearsay, memory, inference, or perception. The first three of these require no direct contact with the cat or the mat. This leaves only perception as a potential means of direct contact with the world. However, if we are to take the mentalist approach seriously, we must recognize that perception is far from direct. Visual perception, for example, is a hugely complex computation based on fragmentary information detected by the retina. It is far from well understood how the brain comes up with a unified perception of stable objects situated in a spatial environment, such as a cat on a mat (Neisser 1967, Marr 1982, Koch 2004). Nevertheless, it is this unified perception, computationally far removed from the objects in the world per se, that leads the speaker to making referential commitments about cats and mats. This treatment of reference is an important respect in which Conceptual Semantics differs from the mentalistic theory of Fodor (1975, 1998). Although Fodor wishes to situate meaning in the mind, encoded in a combinatorial "language of thought," he insists that linguistic expressions are connected to the world by the relation of intentionality or aboutness (see article 2 (Jacob) Meaning, intentionality and communication). For Fodor, the expression the cat is about some cat in the world, and a semantic theory must explicate this relation. In Conceptual Semantics, there is no such direct relation: the speaker's intention to refer to something in the world is mediated by conceptualization, which may or may not be related to the world through perception. For cases in which conceptualization is not based on perception, consider mortgages and dollars. We speak of them as though they exist in the world, but, unlike cats, these are entities that exist only by virtue of social convention, i.e. shared conceptualization. Nevertheless, for us they are just as real as cats. (And for cats, they are not!) Similar remarks pertain to the notion of truth. For the purposes of a mentalist theory, what is of interest is not the conditions in the world that must be satisfied in order for a sentence to be true, but rather the conditions in speakers' conceptualizations of the world under which they judge a sentence to be true.
That is, the theory is concerned with m-truth rather than r-truth. On a tolerant construal of Conceptual Semantics, the investigation of m-reference and m-truth might be taken to be complementary to a classical approach in terms of r-reference and r-truth. To explain how speakers grasp r-truth, a theory of m-truth will play a necessary part. On a more confrontational construal, Conceptual Semantics might
be taken to claim that only the mentalistic approach leads to a theory of meaning that integrates gracefully with a mentalistic theory of language and with cognitive psychology. Either construal is possible; both result in the same empirical questions for research. It might be added that Conceptual Semantics, as part of its theory of word meaning, must of course describe the ordinary language or "folk" meanings of the words refer and true. These appear to correspond closely to the notion of r-reference and r-truth, that is, they express what people conceptualize as objective relations between linguistic expressions and the world. But note that Conceptual Semantics is also responsible for the meanings of nonscientific words such as karma, ghost, tooth fairy, and phlogiston. These too express possible human concepts, widely subscribed to in various cultures at various times. So including the "folk" meanings of refer and true among human concepts doesn't seem like a terrible stretch. But again, this does not entail that r-reference and r-truth should be the overall objectives of the theory, as they are in classical semantics.
1.2. Boundary conditions and comparison to other frameworks
A theory that seeks to describe the range of human thoughts that can be conveyed in language must meet a large collection of boundary conditions. The first two are shared with classical formal semantics.

C1 (Compositionality): The meaning of an utterance must be composed systematically in a way that incorporates the meaning of its words and the contribution of its syntax. (However, this does not require that all parts of utterance meaning are expressed by particular words of the utterance, as in classical Fregean composition; see section 3.)
C2 (Inference): Utterance meanings must serve as a formal basis for inference.
However, there are also boundary conditions that derive from the mentalist basis of the theory. Classical semantics speaks to none of these concerns.

C3 (Categorization): The meanings of words must conform to what is known about human categorization (cf. Murphy 2002, Jackendoff 1983, Lakoff 1987; see article 106 (Kelter & Kaup) Conceptual knowledge, categorization and meaning).
C4 (Learnability): The meanings of words must be learnable on the basis of the acquirer's experience with language and the world, preferably in conformance with empirical evidence on word learning (e.g. Macnamara 1982, Pinker 1989, Bloom 2000).
C5 (Connection to perception and action): Phrase and utterance meanings that deal with physical objects and physical actions must be connected to mental representations appropriate to perception and action, so that one can, for instance, talk about what one sees and carry out actions based on imperative sentences (Landau & Jackendoff 1993, Bloom et al. 1996; see article 107 (Landau) Space in semantics and cognition).
A final tenet of Conceptual Semantics connects it with questions of the evolution of language, again an issue on which classical semantics is silent:

C6 (Nonlinguistic thought): The mental structures that serve as utterance meanings are present to some degree in nonlinguistic organisms such as babies and apes, and play a role in their understanding of the world. It is in service of expressing such prelinguistic thought that the language faculty evolved (Jackendoff 2002, chapter 8).
These conditions together serve to differentiate Conceptual Semantics from other major semantic frameworks. It differs from formal semantics not only in its commitment to mentalism, but in the corollary conditions C3–C6. A word meaning must be a mental structure, not a set of instances in possible worlds. Furthermore, human categorization does not operate strictly in terms of necessary and sufficient conditions, but rather in part in terms of default conditions, preference conditions, and distance from central exemplars (see section 2.5). Hence word meanings do not delimit classical categories and cannot be treated in terms of traditional definitions. The learnability of an unlimited variety of word meanings argues that word meanings are composite, built up in terms of a generative system from a finite stock of primitives and principles of combination. By contrast, in classical semantics, word meanings (except for words with logical properties) are typically taken to be atomic. Fodor (1975, 1998) argues that all word meanings are atomic, and seeks to account for learnability by claiming that they are all innate. Beyond this position's inherent implausibility, it calls for a commitment to (a) only a finite number of possible word meanings in all the languages of the world, since they must all be coded in a finite brain; (b) a reliable triggering mechanism that accounts for concept learning; and eventually (c) a source in evolution for such "innate" concepts as telephone. Fodor's arguments for this position are based on the assumption that word meanings, if composite, must be statable in terms of definitions, which is denied by Conceptual Semantics and other cognitively rooted theories of meaning (section 2.5; Jackendoff 1983: 122–127; 1990: 37–41; 2002: 334–337; Lakoff 1987; see article 28 (Taylor) Prototype theory and article 29 (Gawron) Frame Semantics). A different framework coming out of computational linguistics and cognitive psychology is Latent Semantic Analysis (Landauer et al. 2007). It characterizes the meanings of words in terms of their cooccurrence with other words in texts (i.e. linguistic use alone). Thus word meanings consist of a collection of linguistic contexts with associated probabilities. There is no account of compositionality or inference; word learning consists only of collating contexts and calculating their probabilities; and there is no relationship to nonlinguistic categorization and cognition. Similar considerations apply to WordNet (Fellbaum 1998). Another framework, Natural Semantic Metalanguage (Wierzbicka 1996), is concerned primarily with decomposition of word meanings. Decompositions are carried out in terms of a small vocabulary of English words that are taken to represent semantic primitives. This approach does not aspire to account for any of the boundary conditions on Conceptual Semantics above. Various other approaches to semantics are concerned primarily with the impact of semantics on grammatical form, for instance the Lexical Conceptual Structures of Levin and Rappaport Hovav (see article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure), Distributed Morphology (article 81 (Harley) Semantics in Distributed Morphology), and Lieber's approach to morphological meaning (article 79 (Lieber) Semantics of derivational morphology). Again there is little concern with a full account of meaning, nor with inference, word learning, or connection to perception, action, and nonlinguistic thought.
The major framework closest in spirit to Conceptual Semantics is Cognitive Grammar (Langacker 1987, Lakoff 1987, Talmy 2000, Fauconnier 1985; article 27 (Talmy) Cognitive Semantics). This tradition takes seriously the cognitively based rather than logically based nature of meaning, and it stresses the nonclassical character of word meanings and
30. Conceptual Semantics sentence meanings. Besides differences in the phenomena that it focuses on, Cognitive Grammar differs from Conceptual Semantics in four respects. First, the style of its formalization is different and arguably less rigorous. Second, it tends to connect to nonlinguistic phenomena through theories of embodied cognition rather than through more standard cognitive neuroscience. Third (and related), many practitioners of Cognitive Grammar take a rather empiricist (or Lockean) view of learning, whereas Conceptual Semantics admits a considerable structured innate (or Kantian) basis to concept formation. Fourth, Conceptual Semantics is committed to syntax having a certain degree of independence from semantics, whereas Cognitive Grammar seeks to explain all aspects of syntactic form in terms of the meaning(s) expressed (see article 86 (Kay & Michaelis) Constructional meaning).
1.3. Conceptual Structure and Spatial Structure; interfaces with syntax and phonology
The central hypothesis of Conceptual Semantics is that there is a level of mental representation, Conceptual Structure, which instantiates sentence meanings and serves as the formal basis for inference and for connection with world knowledge and perception. The overall architecture of the mind in which this is embedded is shown in Fig. 30.1.
Fig. 30.1: Architecture of the mind
Each of the levels in Fig. 30.1 is a generative system with its own primitives and principles of combination. The arrows indicate interfaces among representations: sets of principles that provide systematic mappings from one level to the other. On the left-hand side are the familiar linguistic levels and their interfaces to hearing and speaking. On the right-hand side are nonlinguistic connections to the world through visual, haptic, and proprioceptive perception, and through the formulation of action. (One could add general-purpose audition, smell, and taste as well.) In the middle lies cognition, here instantiated as the levels of Conceptual Structure and Spatial Structure. Spatial Structure is hypothesized (Jackendoff 1987, 1996a; Landau & Jackendoff 1993) as a geometric/topological encoding of 3-dimensional object shape, spatial layout, motion, and possibly force. It is not a strictly visual representation, because these features of the conceptualized physical world can also be derived by touch (the haptic sense), by proprioception (the spatial position of one’s own body), and to some degree by auditory localization. Spatial Structure is the medium in which these disparate perceptual modalities are integrated. It also serves as input for formulating one’s own physical actions. It is moreover the form in which memory for the shape of familiar objects and object categories is stored. This level must be generative, since one encounters, remembers, and acts in relation to an indefinitely large number of
objects and spatial configurations in the course of life (see article 107 (Landau) Space in semantics and cognition). However, it is impossible to encode all aspects of cognition in geometric/topological terms. A memory for a shape must also encode the standard type/token distinction, i.e. whether this is the shape of a particular object or of an object category. Moreover, the taxonomy of categories is not necessarily a taxonomy of shapes. For instance, forks and chairs are not at all similar in shape, but both are artifacts; there is no general characterization of artifacts in terms of shape or the spatial character of the actions one performs with them. Likewise, the distinction between familiar and unfamiliar objects and actions cannot be characterized in terms of shape. Finally, social relations such as kinship, alliance, enmity, dominance, possession, and reciprocation cannot be formulated in geometric terms. All of these aspects of cognition instead lend themselves to an algebraic encoding in terms of features (binary or multi-valued, e.g. TYPE vs. TOKEN) and functions of one or more arguments (e.g. x INSTANCE-OF y, x SUBCATEGORY-OF y, x KIN-OF y). This system of algebraic features and functions constitutes Conceptual Structure. Note that at least some of the distinctions just listed, in particular the social relations, are also made by nonhuman primates. Thus a description of primate nonlinguistic cognition requires some form of Conceptual Structure, though doubtless far less rich than in the human case. Conceptual Structure too has its limitations. Attempts to formalize meaning and reasoning have always had to confront the impossibility of coding perceptual characteristics such as color, shape, texture, and manner of motion in purely algebraic terms. The architecture in Fig. 30.1 proposes to overcome this difficulty by sharing the work of encoding meaning between the geometric format of Spatial Structure and the algebraic format of Conceptual Structure: spatial concepts have complementary representations in both domains. (In a sense, this is a more sophisticated version of Paivio's 1971 dual-coding hypothesis and the view of mental imagery espoused by Kosslyn 1980.) Turning now to the interfaces between meaning and language, a standard assumption in both standard logic and mainstream generative grammar is that semantics interfaces exclusively with syntax, and in fact that the function of the syntax-semantics interface is to determine meaning in one-to-one fashion from syntactic structure (this is made explicit in article 81 (Harley) Semantics in Distributed Morphology). The assumption, rarely made explicit (but going back at least to Descartes), is that combinatorial thought is possible only through the use of combinatorial language. This comports with the view, common well into the 20th century, that animals are incapable of thought. Modern cognitive ethology (Hauser 2000; Cheney & Seyfarth 2007) decisively refutes this view, and with it the assumption that syntax is the source of combinatorial thought. Conceptual Semantics is rooted instead in the intuition that language is combinatorial because it evolved to express a pre-existing combinatorial faculty of thought (condition C6). Under this approach, it is quite natural to expect Conceptual Structure to be far richer than syntactic structure – as indeed it is.
Culicover & Jackendoff (2005) argue that the increasing complexity and abstraction of the structures posited by mainstream generative syntax up to and including the Minimalist Program (Chomsky 1995) have been motivated above all by the desire to encode all semantic relations overtly or covertly in syntactic structure. Ultimately the attempt fails because semantic relations are too rich and multidimensional to be encoded in terms of purely syntactic mechanisms.
In Conceptual Semantics, a word is regarded as a part of the language/thought interface: it is a long-term memory association of a piece of phonological structure (e.g. /kæt/), some syntactic features (singular count noun), and a piece of Conceptual Structure (FELINE ANIMAL, PET, etc.). If it is a word for a concept involving physical space, it also includes a piece of Spatial Structure (what cats look like). Other interface principles establish correspondences between semantic argument structure (e.g. what characters an action involves) and syntactic argument structure (e.g. transitivity), between scope of quantification in semantics and position of quantifiers in syntax, and between topic and focus in semantics (information structure, see article 71 (Hinterwimmer) Information structure) and affixation and/or position in syntax. In order to deal with situations where topic and focus are coded only in terms of stress and intonation (e.g. The dog CHASED the mailman), the theory offers the possibility of a further interface that establishes a correspondence directly between semantics and phonology, bypassing syntax altogether. If it proves necessary to posit an additional level of "linguistic semantic structure" that is devoted specifically to features relevant for grammatical expression (as posited in article 31 (Lang & Maienborn) Two-level Semantics, article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure and article 81 (Harley) Semantics in Distributed Morphology), such a level would be inserted between Conceptual Structure and syntax, with interfaces to both. Of course, in order for language to be understood and to play a role in inference, it is still necessary for words to bridge all the way from phonology to Conceptual Structure and Spatial Structure – they cannot stop at the putative level of linguistic semantic structure. Hence the addition of such an extra component would not at all change the content of Conceptual Structure, which is necessary to drive inference and the connection to perception. (Jackendoff 2002, section 9.7, argues that in fact such an extra component is unnecessary.)
2. Major features of Conceptual Structure
This section sketches some of the important features of Conceptual Structure (henceforth CS). There is no space here to spell out formal details; the reader is referred to Jackendoff (1983; 2002, chapters 11–12).
2.1. Tiers in CS

A major advance in phonological theory was the realization that phonological structure is not a single formal object, but rather a collection of tiers, each with its own formal organization, which divide up the work of phonology into a number of independent but correlated domains. These include at least segmental and syllabic structure; the amalgamation of syllables into larger domains such as feet, phonological words, and intonational phrases; the metrical grid that assigns stress; and the structure of intonation contours correlated with prosodic domains.

A parallel innovation is proposed within Conceptual Semantics. The first division, already described, is into Spatial Structure and Conceptual Structure. Within CS, the clearest division is into Propositional Structure, a function-argument encoding of who did what to whom, how, where, and when (arguments and modifiers), versus Information Structure, the encoding of Topic, Focus, and Common Ground. These two aspects of meaning are orthogonal, in that virtually any constituent of a clause, with any thematic
or modifying role, can function as Topic or Focus or part of Common Ground. Languages typically use different grammatical machinery for expressing these two aspects of meaning. For instance, roles in Propositional Structure are typically expressed (morpho-)syntactically in terms of position and/or case with respect to a head. Roles in Information Structure are typically expressed by special focusing constructions, by special focusing affixes, by special topic and focus positions that override propositional roles, and above all by stress and intonation – which is never used to mark propositional roles. Both the syntactic and the semantic phenomena suggest that Propositional and Information Structure are orthogonal but linked organizations of the semantic material in a sentence.

More controversially, Jackendoff (2002) proposes segregating Propositional Structure into two tiers, a descriptive tier and a referential tier. The former expresses the hierarchical arrangement of functions, arguments, and modifiers. The latter expresses the sentence's referential commitments to each of the characters and events, and the binding relations among them; it is a dependency graph along the lines of Discourse Representation Theory (see article 37 (Kamp & Reyle) Discourse Representation Theory). The idea behind this extra tier is that such issues as anaphora, quantification, specificity, and referential opacity are in many respects orthogonal to who is performing the action and who the action is being performed on. The canonical grammatical structures of language typically mirror the latter rather closely: the relative embedding of syntactic constituents reflects the relative embedding of arguments and modifiers. On the other hand, scope of quantification, specificity, and opacity are not at all canonically expressed in the surface of natural languages; this is why theories of quantification typically invoke something like "quantifier raising" to relate surface position to scope. The result is a semantic structure in which the referential commitments are on the outside of the expression, and the thematic structure remains deeply embedded inside, its arguments bound to quantifiers outside. Dividing the expressive work into descriptive and referential tiers helps clarify the resulting notational logjam.

The division into descriptive and referential tiers also permits an insightful account of two kinds of anaphora. Standard definite anaphora, as in (1a), is anaphoric on the referential tier and indicates coreference. One-anaphora, as in (1b), however, is anaphoric on the descriptive tier and indicates a different individual with the same description.

(1) a. Bill saw a balloon and I saw it too.
    b. Bill saw a balloon and I saw one too.
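The two kinds of anaphora in (1) can be rendered schematically as follows. In this expository sketch (descriptions as strings, referents as integer indices), definite it copies the antecedent's referential index, whereas one copies its description under a fresh index.

```python
# Sketch: descriptive vs. referential tier for (1). Encoding is expository.
counter = 0
def new_referent():
    global counter
    counter += 1
    return counter

# 'Bill saw a balloon ...': the description 'balloon' bound to referent r1
r1 = new_referent()
antecedent = ("balloon", r1)

it_anaphor = ("balloon", r1)               # (1a): same referential index
one_anaphor = ("balloon", new_referent())  # (1b): same description, new index

assert it_anaphor[1] == antecedent[1]      # coreference
assert one_anaphor[1] != antecedent[1]     # a different individual
assert one_anaphor[0] == antecedent[0]     # with the same description
```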
2.2. Ontological categories and aspectual features

Reference is typically discussed in terms of NPs that refer to objects. Conceptual Semantics takes the position that there is a far wider range of ontological types to which reference can be made (m-reference, of course). The deictic that is used in (2a) to (m-)refer to an object that the hearer is invited to locate in (his or her conceptualization of) the visual environment. Similarly, the underlined deictics in (2b–g) are used to refer to other sorts of entities.

(2) a. Would you pick that [pointing] up, please? [reference to object]
    b. Would you put your hat there [pointing], please? [reference to location]
    c. They went thataway [pointing]! [reference to direction]
    d. Can you do this [demonstrating]? [reference to action]
    e. That [pointing] had better never happen in MY house! [reference to event]
    f. The fish that got away was this [demonstrating] long. [reference to distance]
    g. You may start … right … now [clapping]! [reference to time]

This enriched ontology leads to a proliferation of referential expressions in the semantic structure of sentences. For instance, John went to Boston refers not only to John and Boston, but also to the event of John going to Boston and to the trajectory 'to Boston', which terminates at Boston. The event corresponds to the Davidsonian event variable (see article 34 (Maienborn) Event semantics) and the events of Situation Semantics (see articles 35 (Ginzburg) Situation Semantics and NL ontology and 36 (Ginzburg) Situation Semantics). Event and Situation Semanticists have taken this innovation to be a major advance in the ontology over classical formal semantics. Yet the expressions in (2) clearly show a far more differentiated ontology; events are just a small part of the full story. At the same time, it should be recalled that the 'existence' of trajectories, distances, and so forth is a matter not of how the world is, but of how speakers conceptualize the world.

Incidentally, trajectories such as to Boston are often thought to intrinsically involve motion. A more careful analysis suggests that this is not the case. Motion along the trajectory is a product of composing the motion verb went with to Boston. But the very same trajectory is referred to in the road leads to Boston – a stative sentence expressing the extent of the road – and in the sign points to Boston – a stative sentence expressing the orientation of the sign. The difference among these examples comes from the semantics of the verb and the subject, not from the prepositional phrase. The semantics of expressions of location and trajectory – including their crosslinguistic differences and relationships to Spatial Structure – has become a major preoccupation in areas of semantics related to Conceptual Semantics (e.g. Bloom et al. 1996, Talmy 2000, Levinson 2003, van der Zee & Slack 2003; see article 98 (Pederson) The expression of space; article 107 (Landau) Space in semantics and cognition).

Orthogonal to the ontological category features are aspectual features. It has long been known (Declerck 1979, Hinrichs 1985, Bach 1986, and many others) that the distinction between objects and substances (expressed by count and mass NPs respectively) parallels the distinction between events and processes (telic and atelic sentences) (see article 46 (Lasersohn) Mass nouns and plurals; article 48 (Filip) Aspectual class and Aktionsart). Conceptual Semantics expresses this parallelism (Jackendoff 1991) through a feature [±bounded]. Another feature, [±internal structure], deals with aggregation, including plurality. (3) shows how these features apply to materials and situations. (The situations in (3) are expressed as NPs but could just as easily be sentences.)

(3)                                  Material           Situation
    [+bounded, –internal structure]  object (a dog)     single telic event (a sneeze)
    [–bounded, –internal structure]  substance (dirt)   atelic process (sleeping)
    [+bounded, +internal structure]  group (a herd)     multiple telic event (some sneezes)
    [–bounded, +internal structure]  aggregate (dogs)   iterated events (sneezing repeatedly)
Trajectories or paths also partake of this feature system: a path such as into the forest, with an inherent endpoint, is [+bounded]; along the road, with no inherent endpoint, is [–bounded]. Because these features cut across ontological categories, they can be used to calculate the telicity and iterativity of a sentence based on the contributions of all its parts (Jackendoff 1996b). For instance, (4a) is telic because its subject and path are bounded, and its verb is a motion verb. (4b) is atelic because its path is unbounded and its verb is a motion verb. (4c) is atelic and iterative because its subject is an unbounded aggregate of individuals and its verb is a motion verb. (4d) is stative, hence atelic, because the verb is stative – even though its path is bounded. (Jackendoff 1996b shows formally how these results follow.)

(4) a. John walked into the forest.
    b. John walked along the road.
    c. People walked into the forest.
    d. The road leads into the forest.
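The calculation underlying (4) can be caricatured as follows; this is a drastically simplified fragment, assuming only three inputs (verb class and the boundedness of subject and path), whereas the full feature calculus is given in Jackendoff (1996b).

```python
# Sketch of feature-based aspect calculation for (4); simplified fragment.
def aspect(verb_class, subject_bounded, path_bounded):
    if verb_class == "stative":
        return "atelic (state)"                    # (4d)
    if not subject_bounded:
        return "atelic, iterative"                 # (4c)
    return "telic" if path_bounded else "atelic"   # (4a) vs. (4b)

print(aspect("motion", True, True))    # (4a) ... into the forest -> telic
print(aspect("motion", True, False))   # (4b) ... along the road  -> atelic
print(aspect("motion", False, True))   # (4c) People walked ...   -> atelic, iterative
print(aspect("stative", True, True))   # (4d) The road leads ...  -> atelic (state)
```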
2.3. Feature analysis in word meanings

Within Conceptual Semantics, word meanings are regarded as composite, but not necessarily built up in a fashion that lends itself to definitions in terms of other words. This subsection and the next three lay out five sorts of evidence for this view, and five different innovations that therefore must be introduced into lexical decomposition.

The first case is when a particular semantic feature spans a number of semantic fields. Conceptual Semantics grew out of the fundamental observations of Gruber (1965), who showed that the notions of location, change, and causation extend over the semantic fields of space, possession, and predication. For example, the sentences in (5) express change in three different semantic fields, in each case using the verb go and expressing the endpoint of change as the object of to.

(5) a. John went to New York. [space]
    b. The inheritance went to John. [possession]
    c. The light went from green to red. [predication]

Depending on the language, sometimes these fields share vocabulary and sometimes they don't. Nevertheless, the semantic generalizations ring true crosslinguistically. The best way to capture this crosscutting is by analyzing motion, change of possession, and change of predication in terms of a common primitive function GO (alternating with BE and STAY) plus a "field feature" that localizes it to a particular semantic field (space vs. possession vs. predication). Neither the function nor the field feature is lexicalized by itself: GO is not on its own the meaning of go. Rather, these two elements are like features in phonology, where for example voiced is not on its own a phonological segment but when combined with other features serves to distinguish one segment from another. Thus these meaning components cannot be expressed as word-like primes.

An extension of this approach involves force-dynamic predicates (Talmy 1988, Jackendoff 1990: chapter 7), where for instance force, entail, be obligated, and the various senses of must share a feature, and permit, be consistent with, have a right, and the various
senses of may share another value of the same feature. At the same time, these predicates differ in whether they are in the semantic field of physical force, social constraint, logical relation, or prediction.

Another such case was mentioned in section 2.2: the strong semantic parallel between the mass-count distinction in material substances and the process-event distinction in situations. Despite the parallel, only a few words cut across these domains. One happens to be the word end, which can be applied to speeches, to periods of time, and to tables of certain shapes (e.g. long ones but not circular ones). On the Conceptual Semantics analysis (Jackendoff 1991), end encodes a boundary of an entity that can be idealized as one-dimensional, whatever its ontological type. And because only certain table shapes can be construed as elaborations of a one-dimensional skeleton, only such tables have ends. (Note that an approach to end in terms of metaphor only restates the problem. Why do these metaphors exist? Answer: Because conceptualization has this feature structure.)

The upshot of cases like these is that word meanings cannot be expressed in terms of word-like definitions, because the primitive features are not on their own expressible as words.
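By way of illustration, the GO-plus-field analysis of (5) might be transcribed as follows; the function signature and argument names are expository conveniences, not official notation.

```python
# Sketch: one primitive function GO plus a field feature, as in (5).
# Neither GO nor the field is a word by itself; 'go' lexicalizes GO
# with the field left to be fixed in composition. Names are illustrative.
def GO(theme, source, goal, field):
    return ("GO", field, theme, source, goal)

sf_5a = GO("John", None, "New York", field="space")              # (5a)
sf_5b = GO("the inheritance", None, "John", field="possession")  # (5b)
sf_5c = GO("the light", "green", "red", field="predication")     # (5c)
```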
2.4. Spatial structure in word meanings

One of the motivations for concluding that linguistic meaning must be segregated from "world knowledge" (as in Two-level Semantics and Lexical Conceptual Structure) is that there are many words with parallel grammatical behavior but clearly different semantics. For instance, verbs of manner of locomotion such as jog, sprint, amble, strut, and swagger have identical grammatical behavior but clearly differ in meaning. Yet there is no evident way to decompose them into believable algebraic features. These actions differ in how they look and how they feel. Similarly, a definition of chair in terms of "[+has-a-seat]" and "[+has-a-back]" is obviously artificial. Rather, our knowledge of the shape of chairs seems to have to do with what they look like and what it is like to sit in them – where sitting is ultimately understood in terms of performing the action. Likewise, our knowledge of dog at some level involves knowing that dogs bark. But to encode this purely in terms of a feature like "[+barks]" misses the point. It is what barking sounds like that is important – and of course this sound also must be involved in the meaning of the verb bark.

In each of these cases, what is needed to specify the word meaning is not an algebraic feature structure, but whatever cognitive structures encode categories of shapes, actions, and sounds. Among these structures are Spatial Structures of the sort discussed in section 1.3, which encode conceptualizations of shape, color, texture, decomposition into parts, and physical motion. As suggested there, it is not that these structures alone constitute the word meanings in question. Rather, it is the combination of Conceptual Structure with these structures that fills out the meanings.

Jackendoff (1996a) hypothesizes that these more perceptual elements of meaning do not interface directly with syntactic structure; that is, only Conceptual Structure makes a difference in syntactic behavior. For example, the differences in manner of motion among the verbs mentioned above are coded in Spatial Structure and therefore make no difference in their grammatical behavior. If correct, this would account for the fact that such factors are not usually considered part of "linguistic semantics", even though they play a crucial role in understanding. Furthermore, since these factors are not encoded in
a format amenable to linguistic expression, they cannot be decomposed into definitions composed of words. The best one can do by way of definition is ostension, relying on the hearer to pick out the relevant factors of the environment.
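The hypothesis can be caricatured in the following sketch: the manner-of-locomotion verbs share one algebraic core, which is all that syntax can see, and differ only in an opaque pointer into Spatial Structure (faked here as a string label).

```python
# Sketch: shared Conceptual Structure, differing Spatial Structure pointers.
locomotion_cs = ("GO", "space", "agent", "path")   # shared algebraic core

manner_verbs = {
    "jog":    {"cs": locomotion_cs, "sps": "motion-pattern:jog"},
    "sprint": {"cs": locomotion_cs, "sps": "motion-pattern:sprint"},
    "strut":  {"cs": locomotion_cs, "sps": "motion-pattern:strut"},
}

# Only the CS is visible to syntax, so all three verbs project alike:
assert all(v["cs"] == locomotion_cs for v in manner_verbs.values())
```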
2.5. Centrality conditions and preference rules

It is well known that many words do not have a sharply delimited denotation. An ancient case is bald: the central case is total absence of hair, but there is no particular amount of hair that serves as dividing point between bald and non-bald. Another case is color terms: for instance, there are focal values of red and orange and a smooth transition of hues between them; but there is no sharp dividing line, one side of which is definitely red and the other side is definitely orange. To reinforce a point made in section 1.1, it is not our ignorance of the true facts about baldness and redness that leads to this conclusion. Rather, there simply is no fact of the matter. When judgments of such categories are tested experimentally, the intermediate cases lead to slower, more variable, and more context-dependent judgments. The character of these judgments has to do more with the conceptualization of the categories in question than with the nature of the real world (see article 28 (Taylor) Prototype theory; article 106 (Kelter & Kaup) Conceptual knowledge, categorization and meaning).

In Conceptual Semantics, such words involve centrality conditions. They are coded in terms of a focal or central case (such as completely bald or focally red), which serves as prototype. Cases that deviate from the prototype (as in baldness) – or for which another candidate prototype competes (as in color words) – result in the observed slowness and variability of judgments. Such behavior is in fact what would be expected from a neural implementation – sharp categorical behavior is actually much harder to explain in neural terms.

A more complex case that results in noncategorical judgments involves so-called cluster concepts. The satisfaction conditions for such concepts are combined by a non-Boolean connective (let's call it "smor") for which there is no English word. If a concept C is characterized by [condition A smor condition B], then stereotypical instances of C satisfy both condition A and condition B, and more marginal cases satisfy either A or B. For instance, the verb climb stereotypically involves (A) moving upward by (B) clambering along a vertically aligned surface, as in (6a). However, (6b) violates condition A while observing condition B, and (6c,d) are the opposite. (6e,f), which violate both conditions, are unacceptable. This shows that neither condition is necessary, yet either is sufficient.

(6) a. The bear climbed the tree. [upward clambering]
    b. The bear climbed down the tree / across the cliff. [clambering only]
    c. The airplane climbed to 30,000 feet. [upward only]
    d. The snake climbed the tree. [upward only]
    e. *The airplane climbed down to 10,000 feet. [neither upward nor clambering]
    f. *The snake climbed down the tree. [neither upward nor clambering]
The connective between the conditions is not simple logical disjunction, because if we hear simply The bear climbed, we assume it was going upward by clambering. That is, both conditions are default conditions, and either is violable (Fillmore 1982). Jackendoff (1983) calls conditions linked by this connective preference rules. This connective is involved in the analysis of Wittgenstein's (1953) famous example game, in the verb see (Jackendoff 1983: chapter 8), in the preposition in (Jackendoff 2002: chapter 11), and countless other cases. It is also pervasive elsewhere in cognition, for example gestalt principles of perceptual grouping (Wertheimer 1923, Jackendoff 1983: chapter 8) and even music and phonetic perception (Lerdahl & Jackendoff 1983). Because this connective is not lexicalized, word meanings involving it cannot be expressed as standard definitions.
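A rough rendering of the preference-rule analysis of climb, under the simplifying assumption that each condition is simply true, false, or left unspecified in a given use:

```python
# Sketch of the preference-rule ("smor") connective for 'climb' in (6):
# both conditions are defaults, either alone is sufficient, neither is
# necessary, and violating both is out. The encoding is illustrative.
def climb_acceptable(upward, clambering):
    return upward or clambering            # either condition suffices

def climb_default(upward=None, clambering=None):
    # Unspecified conditions default to True: 'The bear climbed' is
    # understood as upward clambering unless context says otherwise.
    upward = True if upward is None else upward
    clambering = True if clambering is None else clambering
    return upward, clambering

print(climb_acceptable(True, True))    # (6a)   -> True (stereotypical)
print(climb_acceptable(False, True))   # (6b)   -> True
print(climb_acceptable(True, False))   # (6c,d) -> True
print(climb_acceptable(False, False))  # (6e,f) -> False
print(climb_default())                 # -> (True, True)
```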
2.6. Dot-objects

An important aspect of Conceptual Semantics stemming from the work of Pustejovsky (1995) is the notion of dot-objects – entities that subsist simultaneously in multiple semantic domains. A clear example is a book, a physical object that has a size and weight, but that also is a bearer of information. Like other cluster concepts, either aspect of this concept can be absent: a blank notebook bears no information, and the book whose plot I am currently developing in my head is not (yet) a physical object. But a stereotypical book partakes of both domains. The information component can be linked to other instantiations besides books, such as speech, thoughts in people's heads, computer chips, and so on. Pustejovsky notates the semantic category of objects like books with a dot between the two domains: [PHYSICAL OBJECT • INFORMATION], hence the nomenclature "dot-object." Note that this treatment of book is different from considering the word polysemous. It accounts for the fact that properties from both domains can be applied to the same object at once: The book that fell off the shelf [physical] discusses the war [information].

Corresponding to this sort of dot-object there are dot-actions. Reading is at once a physical activity – moving one's glance over a page – and an informational one – taking in the information encoded on the page. Writing is creating physical marks that instantiate information, as opposed to, say, scribbling, which need not instantiate information. Implied in this analysis is that spoken language also is conceptualized as a dot-object: sounds dotted with information (or meaning). The same information can be conveyed by different sounds (e.g. by speaking in a different language), and the same sounds can convey different information (e.g. different readings of an ambiguous sentence, or different pragmatic construals of the same sentence in different contexts). Then speaking involves emitting sounds dotted with information; by contrast, groaning is pure sound emission.

Another clear case of a dot-object is a university, which consists at once of a collection of buildings and an academic organization: Walden College covers 25 acres of hillside and specializes in teaching children of the rich. Still another such domain (pointed out by Searle 1995) is actions in a game. For example, hitting a ball to a certain location is a physical action whose significance in terms of the game may be a home run, which adds runs, or a foul ball, which adds strikes. For such a case, the physical domain is "dotted" with a special "game domain," in terms of which one carries out the calculation of points or the like to determine who wins in the end.
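A minimal sketch of a dot-object as linked planes, with predicates selecting the plane they apply to (the dictionary encoding is an expository convenience, not Pustejovsky's notation):

```python
# Sketch: 'book' as [PHYSICAL OBJECT . INFORMATION], two linked planes.
book = {
    "PHYSICAL":    {"weight_g": 640, "location": "shelf"},
    "INFORMATION": {"topic": "the war"},
}

def fell_off_shelf(x):       # a physical-plane predicate
    return x["PHYSICAL"]["location"] == "shelf"

def discusses(x, topic):     # an information-plane predicate
    return x["INFORMATION"]["topic"] == topic

# 'The book that fell off the shelf discusses the war.' - both planes at once
assert fell_off_shelf(book) and discusses(book, "the war")
```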
Symbolic uses of objects, say in religious or patriotic contexts, can also be analyzed in terms of dot-objects and dot-actions with significance in the symbolized domain. Similarly with money, where coins, bills, checks, and so on – and the exchange thereof – are both physical objects and monetary values (Searle calls the latter institutional facts, in contrast with physical brute facts).

Perhaps the most far-reaching application of dot-objects is to the domain of persons (Jackendoff 2007). On one hand, a person is a physical object that occupies a position in space, has weight, can fall, has blue eyes, and so forth. On the other hand, a person has a personal identity in terms of which social roles are understood: one's kinship or clan relations, one's social and contractual obligations, one's moral responsibility, and so forth. The distinction between these two domains is recognized crossculturally as the difference between body on one hand and soul or spirit on the other. Cultures are full of beliefs about spirits such as ghosts and gods, with personal identity and social significance but no bodies. We quite readily conceptualize attaching personal identity to different bodies, as in beliefs in life after death and reincarnation, films like Freaky Friday (in which mother and daughter involuntarily exchange bodies), and Gregor Samsa's metamorphosis into a giant cockroach. A different sort of such dissociation is Capgras Syndrome (McKay, Langdon & Coltheart 2005), in which a stroke victim claims his wife has been replaced by an impostor who looks exactly the same. Thus persons, like books and universities, are dot-objects.

Social actions partake of this duality between physical and social/personal as well. For example, shaking hands is a physical action whose social significance is to express mutual respect between persons. The same social significance can be attached to other actions, say to bowing, high-fiving, or a man kissing a lady's hand. And the same physical action can have different social significance; for instance hissing is evidently considered an expression of approval in some cultures, rather than an expression of disapproval as in ours.

Note that this social/personal domain is not the same as Theory of Mind, although they overlap a great deal. On one hand, we attribute intentions and goals not just to persons but also to animals, who do not have social roles (with the possible exception of pets, who are treated as "honorary" persons). On the other hand, social characteristics such as one's clan and one's rights and obligations are not a consequence of what one believes or intends: they are just bare social facts. The consequence is that we likely conceptualize people in three domains "dotted" together: the physical domain, the personal/social domain, and the domain of sentient/animate entities.

A formal consequence of this approach is that the meaning of an expression containing dot-objects and dot-actions is best treated in terms of two or more linked "planes" of meaning operating in parallel. Some inferences are carried out on the physical plane, others on the associated informational, symbolic, or social plane. Particularly through the importance of social predicates to our thought and action, such a formal treatment is fundamental to understanding human conceptualization and linguistic meaning.
3. Compositionality

A central idealization behind most theories of semantics, including those of mainstream generative grammar and much of formal logic, is classical Fregean compositionality, which can be stated roughly as (7) (see article 6 (Pagin & Westerståhl) Compositionality).
(7) (Fregean compositionality)
    The meaning of a compound expression is a function of the meanings of its parts and of the syntactic rules by which they are combined.

(A similar phrasing appears in article 4 (Abbott) Reference.) (7) is usually interpreted in the strongest possible way: the meaning of a phrase is a function only of the meanings of its constituent words, assembled in simple fashion in accordance with the syntax. This is often supplemented with two further assumptions. The first, mentioned in section 1, is that semantics is derived from syntax (perhaps proof-theoretically); the second is that the principles of semantic composition mirror those of syntactic composition rule for rule (for instance in Montague Grammar).

Early work in Conceptual Semantics (Jackendoff 1983) adopted a position close to (7): heads of syntactic phrases correspond to semantic functions of one or more arguments; syntactic subjects and complements correspond to semantic constituents that instantiate these arguments (see article 83 (Pesetsky) Argument structure). Syntactic adjuncts, which are attached differently from complements, correspond to semantic modifiers, which compose with semantic heads differently than arguments do. However, subsequent work has revealed a host of cases where such simple relations between syntactic and semantic structure cannot obtain.

One class of cases involves semantic information for which there is no evidence in the words or the syntax. (8) illustrates one variety, aspectual coercion (Talmy 1978, Verkuyl 1993, Pustejovsky 1995, Jackendoff 1997a; see also article 25 (de Swart) Mismatches and coercion). (8a) and (8b) are syntactically identical; however, (8a) implies repeated acts of jumping but (8b) does not imply repeated acts of sleeping.

(8) a. Jack jumped on the couch until the bell rang.
    b. Jack slept on the couch until the bell rang.

Strong Fregean composition would therefore require that jump (along with every other telic verb) is ambiguous between single and repeated jumping; repetition would come from the latter meaning of the word. (Note: if jump is semantically underspecified, then telicity comes from some nonlexical source, violating Fregean composition.) The problem is that telicity depends not just on the verb but on the entire verb phrase. For example, (9a) implies repeated (masochistic) action and (9b) does not. The difference is that 'run into the wall' is telic and 'run alongside the wall' is atelic, because of the paths implied by the two prepositions.

(9) a. Jack ran into the wall until the bell rang.
    b. Jack ran alongside the wall until the bell rang.

The solution proposed in the references above is that until places a temporal bound on an otherwise unbounded activity. In case the verb phrase is telic, i.e. it designates a temporally bounded event, semantic composition is licensed to reinterpret the verb phrase iteratively (i.e. it "coerces" the interpretation of the VP), so that the iterations constitute an unbounded activity. However, there is no reflex of this extra step of composition in syntactic structure. This view is confirmed by psycholinguistic experimentation (Piñango, Zurif & Jackendoff 1999); additional processing load is found in sentences like
(8a), taking place at a time and in a brain location consistent with semantic rather than syntactic processing.

Another such case is reference transfer (Nunberg 1979), in which an NP is used to refer to something related such as 'picture of NP', 'statue of NP', 'actor portraying NP' and so on:

(10) a. There's Chomsky up on the top shelf, next to Plato. [statue of or book by Chomsky]
     b. [One waitress to another:] The ham sandwich in the corner wants some more coffee. [person who ordered sandwich]
     c. I'm parked out back. I got smashed up on the way here. [my car]

Jackendoff (1992) (also Culicover & Jackendoff 2005) shows that these shifts cannot be disregarded as "merely pragmatic," for two reasons. First, a theory that is responsible for how speakers understand sentences must account for these interpretations. Second, some of these types of reference transfer have interactions with anaphoric binding, which is taken to be a hallmark of grammar. Suppose Richard Nixon went to see the opera Nixon in China. It might have happened that …

(11) Nixon was horrified to watch himself sing a foolish aria to Chou En-lai.

Here Nixon stands for the real person and himself stands for the portrayed Nixon on stage. However, such a connection is not always possible:

(12) *After singing his aria to Chou En-lai, Nixon was horrified to see himself get up and leave the opera house.

(11) and (12) are syntactically identical in the relevant respects. Yet the computation of anaphora is sensitive to which NP's reference has been shifted. This shows that reference transfer must be part of semantic composition. Jackendoff (1992) demonstrates that the meaning of reference transfer cannot be built into syntactic structure in order to derive it by Fregean composition.

Another sort of challenge to Fregean composition comes from constructional meaning, where ordinary syntax is paired with nonstandard semantic composition (see article 86 (Kay & Michaelis) Constructional meaning). Examples appear in (13): the verb is syntactically the head of the VP, but it does not select its complements. Rather, the verb functions semantically as a means or manner expression.

(13) a. Bill belched his way out of the restaurant. ['Bill went out of the restaurant belching']
     b. Laura laughed the afternoon away. ['Laura spent the afternoon laughing']
     c. The car squealed around the corner. ['The car went around the corner squealing']
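One way to picture what such a construction amounts to computationally is the following sketch, in which a form template is paired directly with a meaning template; the constructional entries themselves are discussed immediately below, and the representation here is an expository stand-in.

```python
# Sketch: a meaningful construction as a stipulated form-meaning pairing.
way_construction = {
    "form": "V Pro's way PP",
    "meaning": "go PP by/while V-ing",
}

def apply_way_construction(verb, pp):
    # Motion comes from the construction; the verb supplies means/manner.
    return f"go {pp} while {verb}-ing"

print(apply_way_construction("belch", "out of the restaurant"))
# -> 'go out of the restaurant while belch-ing'   (cf. (13a))
```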
Jackendoff (1990, 1997b) and Goldberg (1995) analyze the examples in (13) as instances of distinct meaningful constructions in English. The way-construction in (13a) is an idiom of the form V Pro's way PP, meaning 'go PP by/while V-ing'; the time-away construction in (13b) has the form V NP[time period] away, meaning 'spend NP V-ing'; the sound-motion construction in (13c) has the form V PP, meaning 'go PP while emitting sound of type V.' It is shown that there is no way to derive these meanings from standard syntactic structures; rather there are stipulated nonstandard ways to compose a VP in English (though by no means crosslinguistically).

Constructional meaning is also found in expressions with nonstandard syntax such as (14).

(14) a. The more I read, the less I understand.
     b. Into the cellar with you!
     c. One more beer and I'm leaving.
     d. rule for rule; day after day; student by student
In these cases one might be able to maintain a sort of Fregean composition, in that the special syntax directly denotes a particular sort of meaning composition. But the principles of composition here are (a) completely idiosyncratic and (b) introduce their own elements of meaning rather than just assembling the meanings of the words. This is not the spirit in which Fregean composition is usually intended.

A final set of cases that cast doubt on Fregean compositionality are those where syntactic composition vastly underdetermines semantic composition. An example is Bare Argument Ellipsis: in (15), the meaning of B's reply to A is not determined by the syntax of the reply, which is just yeah plus a bare NP. Rather, it has to do with a best pragmatic fit to A's utterance.

(15) A: I hear Ozzie's been drinking again.
     B: Yeah, scotch.
     ['Yeah, Ozzie's been drinking scotch.' – not 'Yeah, I/you hear Ozzie's been drinking scotch']

Mainstream generative theory (e.g. recently Merchant 2001) has maintained that B's reply is derived by deletion from an underlying structure which expresses the way the reply is understood and which therefore can undergo Fregean composition. However, Culicover & Jackendoff (2005) (along with a host of others, including among philosophers Stainton 2006) argue that in general it is impossible to state a canonical rule of ellipsis based on syntactic identity, and that the proper generalization must be stated over meaning relations between A's and B's utterances. This means that there is no syntactic structure from which the understood meaning of B's reply can be derived; hence Fregean composition again is violated (see article 70 (Reich) Ellipsis).

A more radical example is pidgin languages, where there is arguably no syntactic structure (or at least very little), and yet structured meanings are conveyed (Givón 1995). In these cases, as in (15), it is up to the listener to use heuristics and world knowledge to surmise the overall semantic configuration intended by the speaker. However, such rudimentary syntax is not confined to pidgins. It also appears in standard language in noun-noun compounds, where the semantic relation between the two nouns is quite varied despite the very same uninformative syntactic configuration:
(16) wheat flour = 'flour made from wheat'
     cake flour = 'flour of which cakes are made'
     dog house = 'house in which a dog characteristically lives'
     house dog = 'dog that lives in a house' (and not a doghouse!)
     garbage man = 'man who handles garbage'
     snow man = 'simulated man made of snow'
     sun hat = 'hat that protects one from the sun/that one wears in the sun'
     bike helmet = 'helmet that one wears while riding a bike'
     rocket fuel = 'fuel that powers a rocket'
     etc.

The range of semantic possibilities, though not unlimited, is quite broad; yet these examples show no syntactic contrast. Therefore the meaning cannot be derived simply by arranging the meanings of the words (see Jackendoff 2010: chapter 13 and article 80 (Olsen) Semantics of compounds).

These examples (see Jackendoff 1997a, Jackendoff 2002, and Culicover & Jackendoff 2005 for a more extensive enumeration and discussion) show that the relation between syntax and semantics is more flexible than Fregean compositionality. This might be stated as (17).

(17) (Enriched composition)
     Phrase and sentence meanings are composed from the meanings of the words plus independent principles for constructing meanings, only some of which correlate with syntactic structure. Moreover, some syntactic structures express elements of meaning (not just arrangements of elements) that are not conveyed by individual words.

Fregean composition is the simplest case of (17), in which all elements of meaning come from the words, and syntactic structure expresses only the arrangement of word meanings, not content. This works for simple examples like Pat kissed Frankie, but not for the sorts of phenomena presented above. Such phenomena are pervasive in language; they involve both pieces of meaning expressed through meaningful syntactic constructions and pieces of meaning that are expressed neither lexically nor syntactically.

There are two important consequences of adopting this view of the syntax-semantics relation. First, it is possible to recognize that much of the complexity of mainstream syntax has arisen from trying to make covert syntax (D-structure or Logical Form) rich enough to achieve Fregean compositionality. Once one acknowledges the richer possibilities for composition argued for here, it becomes possible to strip away much of this complexity from syntax. The result is a far leaner theory of syntax, partly compensated for by a richer theory of the mapping between syntax and semantics (Culicover & Jackendoff 2005). The tradeoff, however, is not even, because no defensible version of Fregean compositionality, no matter how complex the syntax, can account for any of the phenomena adduced in this section.

A second consequence of Enriched Composition is that one can now come to view language not as a system that derives meanings from sounds (say proof-theoretically), but rather as a system that expresses meanings, where meanings constitute an independent mental domain – the system of thought. This is consistent with the view of Conceptual Semantics laid out in section 1 above, in which Conceptual Structure and Spatial Structure are the domains of thought and are related to linguistic expression through
the interfaces with syntax and phonology. Thus the empirical phenomena studied within Conceptual Semantics provide arguments for the theory's overall worldview, one that is consistent with the constraints of the mentalistic framework.
4. References

Bach, Emmon 1986. The algebra of events. Linguistics & Philosophy 9, 5–16.
Bloom, Paul 2000. How Children Learn the Meanings of Words. Cambridge, MA: The MIT Press.
Bloom, Paul et al. (eds.) 1996. Language and Space. Cambridge, MA: The MIT Press.
Carnap, Rudolf 1939. Foundations of Logic and Mathematics. Chicago, IL: The University of Chicago Press.
Cheney, Dorothy L. & Robert M. Seyfarth 2007. Baboon Metaphysics. The Evolution of a Social Mind. Chicago, IL: The University of Chicago Press.
Chomsky, Noam 1995. The Minimalist Program. Cambridge, MA: The MIT Press.
Culicover, Peter W. & Ray Jackendoff 2005. Simpler Syntax. Oxford: Oxford University Press.
Declerck, Renaat 1979. Aspect and the bounded/unbounded (telic/atelic) distinction. Linguistics 17, 761–794.
Fauconnier, Gilles 1985. Mental Spaces. Aspects of Meaning Construction in Natural Language. Cambridge, MA: The MIT Press.
Fellbaum, Christiane (ed.) 1998. WordNet. An Electronic Lexical Database. Cambridge, MA: The MIT Press.
Fillmore, Charles 1982. Towards a descriptive framework for deixis. In: R. Jarvella & W. Klein (eds.). Speech, Place, and Action. New York: Wiley, 31–59.
Fodor, Jerry A. 1975. The Language of Thought. Cambridge, MA: Harvard University Press.
Fodor, Jerry A. 1998. Concepts. Where Cognitive Science Went Wrong. Oxford: Oxford University Press.
Frege, Gottlob 1892/1952. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25–50. English translation in: P. Geach & M. Black (eds.). Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1952, 56–78.
Givón, Talmy 1995. Functionalism and Grammar. Amsterdam: Benjamins.
Goldberg, Adele E. 1995. Constructions. A Construction Grammar Approach to Argument Structure. Chicago, IL: The University of Chicago Press.
Gruber, Jeffrey S. 1965. Studies in Lexical Relations. Ph.D. dissertation. MIT, Cambridge, MA. Reprinted in: J. S. Gruber. Lexical Structures in Syntax and Semantics. Amsterdam: North-Holland, 1976, 1–210.
Hauser, Marc D. 2000. Wild Minds. What Animals Really Think. New York: Henry Holt.
Hinrichs, Erhard W. 1985. A Compositional Semantics for Aktionsarten and NP Reference in English. Ph.D. dissertation. Ohio State University, Columbus, OH.
Jackendoff, Ray 1983. Semantics and Cognition. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1987. Consciousness and the Computational Mind. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1990. Semantic Structures. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1991. Parts and boundaries. Cognition 41, 9–45. Reprinted in: R. Jackendoff, Meaning and the Lexicon. The Parallel Architecture 1975–2010. Oxford: Oxford University Press, 138–173.
Jackendoff, Ray 1992. Mme. Tussaud meets the binding theory. Natural Language and Linguistic Theory 10, 1–31.
Jackendoff, Ray 1996a. The architecture of the linguistic-spatial interface. In: P. Bloom et al. (eds.). Language and Space. Cambridge, MA: The MIT Press, 1–30. Reprinted in: R. Jackendoff, Meaning and the Lexicon. The Parallel Architecture 1975–2010. Oxford: Oxford University Press, 112–134.
Jackendoff, Ray 1996b. The proper treatment of measuring out, telicity, and possibly even quantification in English. Natural Language and Linguistic Theory 14, 305–354. Reprinted in:
R. Jackendoff, Meaning and the Lexicon. The Parallel Architecture 1975–2010. Oxford: Oxford University Press, 175–221.
Jackendoff, Ray 1997a. The Architecture of the Language Faculty. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1997b. Twistin' the night away. Language 73, 534–559. Reprinted in: R. Jackendoff, Meaning and the Lexicon. The Parallel Architecture 1975–2010. Oxford: Oxford University Press, 250–277.
Jackendoff, Ray 2002. Foundations of Language. Brain, Meaning, Grammar, Evolution. Oxford: Oxford University Press.
Jackendoff, Ray 2007. Language, Consciousness, Culture. Essays on Mental Structure. Cambridge, MA: The MIT Press.
Koch, Christof 2004. The Quest for Consciousness. A Neurobiological Approach. Englewood, CO: Roberts.
Kosslyn, Stephen M. 1980. Image and Mind. Cambridge, MA: Harvard University Press.
Lakoff, George 1987. Women, Fire, and Dangerous Things. What Categories Reveal about the Mind. Chicago, IL: The University of Chicago Press.
Landau, Barbara & Ray Jackendoff 1993. 'What' and 'where' in spatial language and spatial cognition. Behavioral and Brain Sciences 16, 217–238.
Landauer, Thomas et al. (eds.) 2007. Handbook of Latent Semantic Analysis. Mahwah, NJ: Erlbaum.
Langacker, Ronald 1987. Foundations of Cognitive Grammar, vol. 1. Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Lerdahl, Fred & Ray Jackendoff 1983. A Generative Theory of Tonal Music. Cambridge, MA: The MIT Press.
Levinson, Stephen C. 2003. Space in Language and Cognition. Explorations in Cognitive Diversity. Cambridge: Cambridge University Press.
Lewis, David 1972. General semantics. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 169–218.
Macnamara, John 1982. Names for Things. A Study of Human Learning. Cambridge, MA: The MIT Press.
Marr, David 1982. Vision. A Computational Investigation into the Human Representation and Processing of Visual Information. San Francisco, CA: Freeman.
McKay, Ryan, Robyn Langdon & Max Coltheart 2005. 'Sleights of mind'. Delusions, defenses, and self-deception. Cognitive Neuropsychiatry 10, 305–326.
Merchant, Jason 2001. The Syntax of Silence. Sluicing, Islands, and the Theory of Ellipsis. Oxford: Oxford University Press.
Montague, Richard 1973. The proper treatment of quantification in ordinary English. In: J. Hintikka, J. Moravcsik & P. Suppes (eds.). Approaches to Natural Language. Dordrecht: Reidel, 221–242.
Murphy, Gregory L. 2002. The Big Book of Concepts. Cambridge, MA: The MIT Press.
Neisser, Ulric 1967. Cognitive Psychology. Englewood Cliffs, NJ: Prentice Hall.
Nunberg, Geoffrey 1979. The non-uniqueness of semantic solutions. Polysemy. Linguistics & Philosophy 3, 143–184.
Paivio, Allan 1971. Imagery and Verbal Processes. New York: Holt, Rinehart & Winston.
Piñango, Maria M., Edgar Zurif & Ray Jackendoff 1999. Real-time processing implications of enriched composition at the syntax-semantics interface. Journal of Psycholinguistic Research 28, 395–414.
Pinker, Steven 1989. Learnability and Cognition. The Acquisition of Argument Structure. Cambridge, MA: The MIT Press.
Pinker, Steven 2007. The Stuff of Thought. Language as a Window into Human Nature. New York: Viking.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press.
Putnam, Hilary 1975. The meaning of 'meaning'. In: K. Gunderson (ed.). Language, Mind, and Knowledge. Minneapolis, MN: University of Minnesota Press, 131–193.
Russell, Bertrand 1905. On denoting. Mind 14, 479–493.
Searle, John R. 1995. The Construction of Social Reality. New York: Free Press.
Stainton, Robert J. 2006. Words and Thoughts. Subsentences, Ellipsis, and the Philosophy of Language. Oxford: Oxford University Press.
Talmy, Leonard 1978. The relation of grammar to cognition. A synopsis. In: D. Waltz (ed.). Theoretical Issues in Natural Language Processing. New York: ACM, 14–24. Revised and enlarged version in: L. Talmy. Toward a Cognitive Semantics, vol. 1. Concept Structuring Systems. Cambridge, MA: The MIT Press, 2000, 21–96.
Talmy, Leonard 1988. Force-dynamics in language and cognition. Cognitive Science 12, 49–100.
Talmy, Leonard 2000. Toward a Cognitive Semantics. Cambridge, MA: The MIT Press.
Tarski, Alfred 1956. The concept of truth in formalized languages. In: A. Tarski. Logic, Semantics, Metamathematics. Translated by J. H. Woodger. 2nd edn., ed. J. Corcoran. London: Oxford University Press, 152–197.
van der Zee, Emile & Jon Slack (eds.) 2003. Representing Direction in Language and Space. Oxford: Oxford University Press.
Verkuyl, Henk 1993. A Theory of Aspectuality. The Interaction between Temporal and Atemporal Structure. Cambridge: Cambridge University Press.
Wertheimer, Max 1923. Laws of organization in perceptual forms. Reprinted in: W. D. Ellis (ed.). A Source Book of Gestalt Psychology. London: Routledge & Kegan Paul, 1938, 71–88.
Wierzbicka, Anna 1996. Semantics. Primes and Universals. Oxford: Oxford University Press.
Wittgenstein, Ludwig 1953. Philosophical Investigations. Translated by G.E.M. Anscombe. Oxford: Blackwell.
Ray Jackendoff, Medford, MA (USA)
31. Two-level Semantics: Semantic Form and Conceptual Structure

1. Introduction
2. Polysemy problems
3. Compositionality and beyond: Semantic underspecification and coercion
4. More on SF variables and their instantiation at the CS level
5. Summary and outlook
6. References
Abstract

Semantic research of the last decades has been shaped by an increasing interest in conceptuality, that is, in emphasizing the conceptual nature of the meanings conveyed by natural language expressions. Among the multifaceted approaches emerging from this tendency, the article focuses on discussing a framework that has become known as »Two-level Semantics«. The central idea it pursues is to assume and justify two basically distinct, but closely interacting, levels of representation that spell out the meaning of linguistic expressions: Semantic Form (SF) and Conceptual Structure (CS). The distinction of SF vs. CS representations is substantiated by its role in accounting for related parallel distinctions including 'lexical vs. contextually specified meaning', 'grammar-based vs. concept-based restrictions', 'storage vs. processing' etc. The SF vs. CS distinction is discussed on the basis of semantic problems regarding polysemy, underspecification, coercion, and inferences.
1. Introduction

1.1. The turn to conceptuality

Looking back at the major trends of linguistic research in the 80's and 90's, we observe a remarkable inclination to tackle semantic issues by emphasizing the conceptual nature of the meanings conveyed by linguistic expressions. Several models and frameworks of linguistic semantics developed at that time marked off their specific view on meaning by programmatically labeling the structure they focus on as conceptual (cf. article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure; article 30 (Jackendoff) Conceptual Semantics; article 27 (Talmy) Cognitive Semantics) and by elevating concepts, conceptualization, and Conceptual System to key words of semantic theorizing. The approach presented in this article is another outcome of these efforts, which implies that it shows commonalities with as well as differences from the approaches mentioned above. The semantic issues which have been under debate since that time are summarized in (1) by listing the major topics and the crucial questions they have given rise to:

(1) a. compositionality: How far do we get by holding to the Frege Principle?
    b. lexicalism: What can provide a better account of the internal meaning structure of lexical items – semantic decomposition or meaning postulates?
    c. meaning variation: How do we account for polysemy and underspecification?
    d. cognitivism: How can we avoid "uninterpreted markerese" by drawing on semantic primes which are (i) compatible with our linguistic intuition, (ii) reconstructible elements of our conceptual knowledge, and which (iii) can be traced back to our perceptual abilities?
    e. modularity: How can we spell out and test the claim that our linguistic behavior results from the interaction of largely autonomous mental systems and subsystems?
    f. interpretations: What are the respective roles of word knowledge and world knowledge in specifying what is commonly dubbed "sentence meaning" vs. "utterance meaning" vs. "communicative sense"?
The answers to (1a–f) as provided by various frameworks differ to a certain extent, though on closer inspection they will presumably turn out not to be strictly incompatible. However, typical features of theoretical innovations in linguistics – terminological rank growth, a lack of concern in dealing with equivocations, and confinement to selections of data and/or problems that are supportive of a given approach – have so far impeded detailed comparisons between the competing approaches (but see Taylor 1994, 1995; Geeraerts 2010). Space limitations prevent us from delving into this endeavor here. Instead, the article attempts to convey some of the motives and tenets of what has become known as Two-level Semantics (which, incidentally, is not a registered trademark created by the adherents of the approach, but a label it received from reviewers) and restricts reference to kindred views to that of Conceptual Semantics expounded in Jackendoff (1996, 2002; article 30 of this volume).
1.2. Basic assumptions

Two-level Semantics is not at variance with the other frameworks in recognizing the conceptual nature of, and in pursuing a mentalistic approach to, linguistic meaning. The major difference between the former and the latter is hinted at in the subtitle, which presents the distinction of two levels of representation, i.e. Semantic Form (SF) vs. Conceptual Structure (CS), as the central issue this approach claims to deal with. The relations assumed to hold between SF and CS have in common that they induce certain asymmetries, but they differ in the viewpoints that give rise to these distinctions. In the following, we briefly discuss a selection of features that have been proposed to distinguish SF representations from CS representations. To clarify the significance of these rather general claims, the goals and the problems connected with the assumptions will be commented on in more concrete terms.
(2) SF ⊂ CS
    In substance, SF representations may be conceived of as those subsets of CS representations that are systematically connected to, and hence covered by, lexical items and their combinatorial potential to form more complex expressions.

Strictly speaking, SF and CS here stand for two sets of elements (inventories) which make up the respective representations. Due to the conditions specified in (3) and (4) below, SF representations and CS representations do not qualify as members of the same set – the former represent linguistic knowledge, the latter non-linguistic knowledge. The relationship expressed in (2) comprises two aspects. The uncontroversial one is the subset-set relation SF ⊂ CS, which follows from the widely held view that for every linguistic expression e in language L there is a CS representation c assignable to it via SF(e), but not vice versa. It is obviously not the case that for every actual or latent CS item c there is an expression e in L with an SF(e) which makes c communicable to other speakers of L. Thus, (2) presupposes the existence of non-lexicalized concepts.

The problematic aspect of (2) is this: The view that CS representations are mental structures that mediate between language and the world as construed by the human mind implies that the Conceptual System provides representations whose contents originate in heterogeneous cognitive subsystems and which therefore have to be homogenized to yield knowledge structures that can be accessed and processed on the conceptual level. The conditions based on which, say, perceptual features stemming from vision, touch, proprioception etc. are conceptualized to figure in CS representations are far from clear. We will call this the "homogenization problem" posed by CS representations.

(3) grammar-based vs. concept-based
    SF representations account for the fact that the meanings of linguistic expressions come with grammatically determined kinds of packaging in terms of morphosyntactic categories and semantic types, while the elements of CS representations, due to their mental source and intermodal homogeneity, lack grammar-based wrappings.
The distinction in (3) is not challenged in principle, but it is under debate whether or not the types of grammatical packaging in which the meanings of linguistic expressions are conveyed yield a sufficient condition to postulate SF as a representation level of its own. So, e.g., R. Jackendoff (article 30) does not absolutely exclude such a level in conceding: "If it proves necessary to posit an additional level of »linguistic semantic structure« that is devoted specifically to features relevant for grammatical expression […], the addition of such an extra component would not at all change the content of Conceptual Structure, which is necessary to drive inference and the connection to perception". Basically, however, he sticks to the view "that in fact such an extra component is unnecessary" (Jackendoff, this volume, p. 695). Let's call this the "justification problem" posed by the assumption of SF representations.

(4) linguistic vs. non-linguistic origin
    SF representations form an integral part of the information cluster represented by the lexical entries of a given language L, whereas CS representations are taken to belong to, or at least to be rooted in, the non-linguistic mental systems based on which linguistic expressions are interpreted and related to their denotations.

The distinction referred to in (4) by locating the roots of SF and CS representations in different though mutually accessible mental subsystems is the view taken by adherents of Two-level Semantics, cf. Bierwisch (1983, 1996, 1997, 2007); Bierwisch & Lang (1989a); Bierwisch & Schreuder (1992) for earlier works. Article 16 (Bierwisch) Semantic features and primes focuses on defining SF as an interface level whose basic elements, combinatorial rules, and well-formedness constraints directly reflect the conditions on which lexicon-based meanings of morpho-syntactically categorized, regularly combined linguistic expressions are composed and interpreted. While article 16 may well be taken as a state-of-the-art report on arguments in favor of assuming SF as a level of representation, much less attention is paid to CS representations that are supposed to connect the former with "the full range of mental structures representing the content to be expressed" (Bierwisch, this volume, p. 322). So we face problems connected with the intermodal validity and the cross-modal origin of CS representations: (i) how to relate linguistically designated SF representations with conceptually homogenized CS representations? (ii) how to trace the latter back to their respective cognitive sources that are determined by crucially differing sensory modalities?

(5) storage vs. processing
    SF representations are linguistic knowledge structures that are accessibly stored in long-term memory, whereas CS representations are activated and compiled in working memory, cf. article 108 (Kelter & Kaup) Conceptual knowledge, categorization, and meaning.
distinction of SF vs. CS but also in providing criteria for deciding what requirements the representations at issue have to meet. The methodologically most relevant conclusion drawn by Kelter & Kaup (article 108) reads as follows: “researchers should acknowledge the fact that concepts and word meaning are different knowledge structures.” The claim in (5) suggests that if it is the SF of lexical items that is stored in long-term memory, the entries should be confined to representing what may be called “context-free lexical meanings”, whereas CS representations compiled and processed in working memory should take charge of what may be called “contextually specified (parts of) utterance meanings”. The difference between the two types of meaning representations indicates the virtual semantic underspecification of the former and the possible semantic enrichment of the latter. There is a series of recent experimental studies designed and carried out along these lines which – in combination with evidence from corpus data, linguistic diagnostics etc. – is highly relevant for the theoretical issues raised by SF vs. CS representations. Experiments reported by Stolterfoht, Gese & Maienborn (2010) and Kaup, Lüdtke & Maienborn (2010) succeeded in providing processing evidence that supports the distinction of, e.g., primary adjectives vs. adjectivized participles vs. verbal participles, that is, evidence for packaging categories relevant to SF representations. In addition, these studies reveal the processing costs of contextualizing semantically underspecified items, a result that supports the view that contextualizing the interpretation of a given linguistic expression e is realized by building up an enriched CS representation on the basis of SF(e).
1.3. SF vs. CS – an illustration from everyday life

To round off the picture outlined so far, we illustrate the features listed in (2)–(5) in favor of the SF vs. CS distinction by an example we are well acquainted with, viz. the representations involved in handling numbers, number symbols, and numerals in everyday life. Note that each of the semiotic objects in (6)–(8) below represents in some way the numerical concept »18«. However, how numerical concepts between »10« and »20« are stored, activated, and operated on in our memory is poorly understood as yet, so the details regarding the claim in (5) must be left open. Suffice it to agree that »18« stands for the concept we make use of, say, in trying to mentally add up the sum to be paid for our purchases in the shopping trolley. With this proviso in mind, we now look at the representations of the concept »18« in (6)–(8) to find out their interrelations.

(6) a. |||| |||| |||| |||
    b. ::: ::: :::

(7) a. XVIII
    a'. IIXX (rarely occurring alternative)
    b. 18

(8) a. eighteen, achtzehn    (8 + 10)          English, German
    b. dix-huit, shi ba      (10 + 8)          French, Mandarin
    c. okto-kai-deka         ((8)-and-(10))    Greek
    d. diez y ocho           ((10)-and-(8))    Spanish
    e. vosem-na-dcat'        ((8)-on-(10))     Russian
    f. duo-de-viginti        ((2)-of-(20))     Latin
    g. ocht-deec             (8 + (2 × 5))     Irish
    h. deu-naw               (2 × 9)           Welsh
(6) shows two iconic non-verbal representations of a quantity whose correlation with the concept »18« and/or with the numerals in (8) rests on the ability to count and the availability of numerals. The tallying systems in (6) are simple but inefficient for doing arithmetic and hence hardly suitable to serve as semantic representations of the numerals in (8).

(7) shows two symbolic non-verbal representations of »18«, generated by distinct writing systems for numbers. The Roman number symbols are partially iconic in that they encode addition by iterating the special symbols for one, ten, hundred, or thousand up to three times, and partially symbolic in that placing the symbol of a small number in front of the symbol of a larger number indicates subtraction, cf. (7a, a'). The lack of a symbol for null prevented the creation of a positional system, and the lack of means to indicate multiplication or division impeded calculation. Both were obstacles to progress in mathematics. Thus, Roman number symbols may roughly render the lexical meanings of (8a–f) but not those of (8g–h) and all other variants involving multiplication or division.

The Indo-Arabic system of number symbols exemplified by 18 in (7b) is a positional system without labels, based on exponents of ten (10⁰, 10¹, 10², …, 10ⁿ). As a representational system for numbers it is recursive and potentially infinite in yielding unambiguous and well-distinguished chains of symbols as output. Thus, knowing the system implies knowing that 18 ≠ 81 or that 17 and 19 are the direct predecessor and successor of 18, respectively, even if we do not have pertinent number words at our disposal to name them. Moreover, it is this representation of numbers that we use when we do arithmetic with paper and pencil or by pressing the keys of an electronic calculator. Enriched with auxiliary symbols for arithmetical operations and for marking their scope of application, as well as furnished with conventions for writing equations etc., this notational system is a well-defined means to reduce the use of mathematical expressions to representations of their Conceptual Structures, that is, to the CS representations they denote, independent of any natural language in which these expressions may be read aloud or dictated. Let’s call this enriched system of Indo-Arabic number symbols the “CS system of mathematical expressions”.

Now, what about the SF representations of numerals? Though all number words in (8) denote the concept »18«, it is obvious that their lexical meanings differ in the way they are composed, cf. the second column in (8). As regards their combinatorial category, the number words in (8) are neither determinative nor copulative compounds, nor are they conjoined phrases. They are perhaps best categorized as juxtapositions with or without connectives, cf. (8c–f) and (8a–b, g–h), respectively. The unique feature of complex number words is that the relations between their numeral constituents are nothing but encoded fundamental arithmetic operations (addition and multiplication are preferred; division and subtraction are less frequent). Thus, the second column in (8) shows SF representations of the number words in the first column couched in terms of the CS system of mathematical expressions. The latter are construable as functor-argument structures with arithmetic operators (‘+’, ‘–’, ‘×’ etc.) as functors, quantity constants for digits as arguments, and parentheses (…) as boundaries marking lexical building blocks.
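The functor-argument construal just described can be made concrete in a small program sketch. The encoding below is purely illustrative (ours, not part of the formalism under discussion): it renders some of the SF decompositions in (8) as expression trees over arithmetic functors and digit constants, and maps each tree onto its language-independent CS value by evaluation.

# Illustrative sketch: SF representations of complex numerals as
# functor-argument structures; evaluating a tree yields the CS value,
# i.e. the concept »18«. Names and encoding are simplifying assumptions.
from dataclasses import dataclass

@dataclass(frozen=True)
class Num:            # quantity constant (digit or simple numeral)
    value: int

@dataclass(frozen=True)
class Op:             # arithmetic functor applied to two arguments
    functor: str      # '+', '-', or '*'
    left: object
    right: object

def evaluate(sf):
    """Map an SF tree onto the number concept it denotes at the CS level."""
    if isinstance(sf, Num):
        return sf.value
    ops = {'+': lambda a, b: a + b,
           '-': lambda a, b: a - b,
           '*': lambda a, b: a * b}
    return ops[sf.functor](evaluate(sf.left), evaluate(sf.right))

# SF decompositions taken from the second column of (8):
lexicon = {
    'eighteen':       Op('+', Num(8), Num(10)),   # (8 + 10)
    'dix-huit':       Op('+', Num(10), Num(8)),   # (10 + 8)
    'duo-de-viginti': Op('-', Num(20), Num(2)),   # ((2)-of-(20))
    'deu-naw':        Op('*', Num(2), Num(9)),    # (2 × 9)
}

# Distinct SF trees, one and the same CS value:
assert {evaluate(sf) for sf in lexicon.values()} == {18}

The trees differ exactly where the number words differ morphologically, while evaluation collapses them into one concept – a program-level analogue of distinct SFs sharing a single CS representation.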
Now let’s see what all this tells us about the distinctions in (2)–(5) above.
The subset–set relation SF ⊂ CS mentioned in connection with (2) also holds for the number symbols in (7b). The CS system of mathematical expressions is capable of representing all partitions of 18 that draw on fundamental arithmetic operations. Based on this, the CS system at stake covers the internal structures of complex number words, cf. (8), as well as those of equations at the sentence level like 18 = 3 × 6; 18 = 2 × 9; 18 = 72 : 4 etc.

By contrast, the subset of SF representations for numerals is restricted in two respects. First, not all admissible partitions of a complex number like 18 are designated as SF of a complex numeral lexicalized to denote »18«. The grammar of number words in L is interspersed with (certain types of) L-specific packing strategies, cf. Hurford (1975), Greenberg (1978). Second, since the ideal relationship between systems of number symbols and systems of numerals is a one-to-one correspondence, the non-ambiguity required of the output of numeral systems practically forbids creation or use of synonymous number names (except for the distinct numerals used for e.g. 1995 when speaking of years or of prices in €).

There is still another conclusion to be drawn from (6)–(8) in connection with (2). The CS system of mathematical expressions is a purposeful artifact created and developed to solve the “homogenization problem” raised by CS representations for the well-defined field of numbers and arithmetic operations on them. First, the mental operations of counting, adding, multiplying etc., which the system is designed to represent, have been abstracted from practical actions, viz. from lining up things, bundling up things, bundling up bundles of things etc. Second, the CS representations of mathematical expressions provided by the system are unambiguous, complete (that is, fully specified and containing neither gaps nor variables to be instantiated by elements from outside the system), and independent of the particular languages in which they may be verbalized.

The lexicon-based packaging and contents of the components of SF representations claimed in (3) and (4) are also corroborated by (6)–(8). The first point to note is the L-specific ways in which (i) numerals are categorized in morpho-syntactic terms and (ii) their lexical meanings are composed. The second point is this: Complex numerals differ from regular (determinative or copulative) compounds in that the relations between their constituents are construed as encodings of fundamental arithmetical operations, cf. (8a–h). This unique feature of the subgrammar of number words also yields a strong argument wrt. the “justification problem” posed by the assumption of lexicon-based SF representations.

The claims in (3) and (4) concerning the non-linguistic nature of CS representations are supported by the fact that e.g. 18 is an intermodally valid representation of the concept »18« as it covers both the perception-based iconic representations of »18« in (6) and the lexicon-based linguistic expressions denoting »18« in (8). Thus, the unique advantage of the CS system of mathematical expressions is founded on the representational intermodality and the conceptual homogeneity it has achieved in the history of mathematical thinking. No other science is more dependent on the representations of its subject than mathematics.
Revealing as this illustration may be, the insights it yields cannot simply be extended to the lexicon and grammar of a natural language L beyond the subgrammar of number words. The correlations between systems of number names and their SF representations in terms of the CS system of mathematical expressions form a special case which results from the creation of a non-linguistic semiotic artifact, viz. a system to represent number concepts under controlled laboratory conditions. The meanings, the combinatorial
potential and hence the SF representations of lexical items outside the domain of numeric tools are far less strictly codified than those of numerals. Otherwise, the controversial issues listed in (1) would not emerge. The overwhelming majority of SF representations of lexical items have to account for ambiguity, polysemy, underspecification, context-dependency etc., that is, for phenomena which require the use of appropriate variables at the SF level to be instantiated by pieces of information available at the CS level.
1.4. Aims and limitations

Having outlined some perspectives and problems connected with the assumption of two separate but interacting levels of semantic representation, we conclude this introductory survey with some remarks on the weight one may attach to the pros and cons discussed so far.

First, regarding the justification problem raised by (3) there is a truism: the representations assigned to linguistic meaning depend on the meaning attributed to linguistic representations. In other words, in view of our limited knowledge of the principles based on which linguistic expressions and semantic interpretations are mutually assigned, we cannot get along without auxiliary terminology such as tier, layer, plane, domain etc. Thus, the term level of representation is just a heuristic aid that serves as a gathering place for distinctions considered to be necessary and worth systematizing. Any further assessment is premature.

Second, the crucial point is not the number of levels of linguistic structure formation we postulate but the validity of the arguments based on which such levels are substantiated. It is above all this guideline that characterizes the efforts subsumable under the label Two-level Semantics. There have been proposals to increase the number of levels, cf. Dölling (2001, 2003, 2005a); Schwarz (1992), as well as criticisms regarding the mapping operations assumed to apply between SF and CS, cf. Blutner (1995, 1998, 2004), Meyer (1994), Taylor (1994, 1995). Given the situation defined by the questions in (1), Two-level Semantics may be considered a series of attempts along the lines of (2)–(5) to achieve a more fine-grained picture of what we are used to calling “semantic interpretation”. These attempts were, and still are, driven and guided by the following leitmotif:

(9) The semantic interpretation of a sentence s in isolation, as well as of its utterance in use, requires differentiating and interrelating those portions of its meaning that are lexicon-based and those possibly available portions of meaning that are context-based, such that the latter may serve as specifications of the former.

Third, in view of the fact that lexical SF representations are discussed in detail by M. Bierwisch (article 16), we will pay more attention to compositionality issues (§3) and CS representations and the way they account both for the semantic issues pointed out in (1) and for the various problems raised in connection with the distinctions in (2)–(5) above (§4).

Fourth, Two-level Semantics shares several objectives with the framework presented in article 30 (Jackendoff) Conceptual Semantics but prefers different solutions. There is agreement on the guiding role of compositionality and the need for decomposition. Jackendoff’s requirement that “Utterance meanings must serve as a formal basis for
inference” (Jackendoff, this volume, p. 691) is accepted as contextualized inferencing at the CS level, but in addition there are built-in inferences at the SF level. The two-level framework acknowledges the import of categorization and contextualization but places emphasis on the grammatical nature of SF as indicated in (3) and (4) above. On this view, the principles governing SF representations concern not only the internal meaning structure and the grammatical packaging of lexical items but also general conditions on the lexical system of L, e.g. grammatical categories, lexicalization patterns, options to be chosen as the basis of agreement etc. By way of illustration, note the following. The English collective noun (i) married couple has two equivalents in German: (ii) Ehepaar, which is also a collective noun, and (iii) Eheleute, which, though based on a plural-only noun, behaves like a regular individual plural and has no direct counterpart in English; cf. Dölling (1994), Lang (1994). Now, while all three are absolutely alike at the CS level in denoting a set of two individuals as husband and wife, they differ at the SF level in the way they are sensitive to number agreement and selectional restrictions, cf. (10–13):

(10) a. Die Eheleute hassen [3P.Pl] einander/sich gegenseitig.
     b. Das Ehepaar hasst [3P.Sg] *einander/*sich gegenseitig.
     c. Das Ehepaar *ist/*sind [3P.Sg/Pl] beide Linkshänder.
     d. Die Eheleute sind [3P.Pl] beide Linkshänder.
(11) a. The married couple hate [3P.Pl] each other/are [3P.Pl] both left-handers.
     b. Each one of the married couple hates [3P.Sg] the other.

(12) a. The married couple is [3P.Sg] waiting for their visa.
     b. The married couple are [3P.Pl] waiting for their visas.

(13) a. Das Ehepaarᵢ wartet [3P.Sg] auf seinᵢ/*ihrᵢ Visum.
     b. Die Eheleuteᵢ warten [3P.Pl] auf ihreᵢ Visa.

The antecedent of reciprocals like einander or each other must denote a set of two (or more) elements. In both languages, the antecedent is usually a plural NP or an and-coordination of NPs; with collective nouns, however, there are language-particular constraints. In German, agreement features for person, number, and gender are assigned on the basis of some morpho-syntactic correspondence between antecedent and target. A singular collective noun as subject requires a verb in the singular and excludes reciprocals like einander as complement, cf. (10b,c; 13a), whereas plural NPs or and-coordinated NPs as subjects usually come with plural verbs and allow for reciprocals as complements, cf. (10a,d; 13b). In British English, however, committee-type singular nouns as subjects may spread agreement features on a morpho-syntactic or on a semantic basis, cf. (11a,b; 12a,b). Cases of singular agreement like (12a) are conceptualized as referring to a single entity, cases of plural override like (12b) are conceptualized as referring to the individual members of the set. What is an option in English is an obligatory lexical choice in German. As lexical items, English singular collective nouns are unspecified for inducing morpho-syntactic or semantic agreement and for co-occurring with reciprocals; German singular collective nouns, however, are basically unavailable for plural agreement and/or reciprocals since number agreement in German strictly operates on morpho-syntactic
matching. In sum, although having the same SF, the collective nouns married couple and Ehepaar differ in their impact on sentence formation. Moreover, since SF forms a constitutive part of L as a natural language, it is subject to a series of pragmatically based felicity conditions on communication. None of these aspects of SF as a linguistic level applies to CS representations.

The article attempts to show that the distinction of SF vs. CS representations may turn out to be a useful heuristic means in dealing with the issues listed in (1) as well as a promising research strategy to connect semantic theorizing with empirical methods of analyzing semantic processing along the lines of (5). Guided by the leitmotif in (9), §2 deals with some unsolved problems of polysemy. §3 explores the SF vs. CS distinction from the angle of compositionality, and in §4 we turn to contextualization by discussing case studies of variables in SF representations and their instantiation at the CS level. In doing so, we also examine how inferences are accounted for by SF and CS representations, respectively.
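The division of labor just described can be caricatured in a few lines of code. The sketch below is merely illustrative – the feature names and licensing rules are our simplifying assumptions, not the lexical entries themselves: all three items share one CS denotation, while SF-level grammatical features decide the availability of plural agreement and reciprocals, cf. (10)–(11).

# Toy sketch: same CS denotation, different SF-level grammatical packaging.
# Feature names and licensing rules are invented for illustration.
ENTRIES = {
    'married couple': {'cs': 'set_of_2', 'number': 'sg',
                       'agreement': {'morphosyntactic', 'semantic'}},  # British English option
    'Ehepaar':        {'cs': 'set_of_2', 'number': 'sg',
                       'agreement': {'morphosyntactic'}},
    'Eheleute':       {'cs': 'set_of_2', 'number': 'pl',
                       'agreement': {'morphosyntactic'}},
}

def allows_plural_verb(noun):
    e = ENTRIES[noun]
    return e['number'] == 'pl' or 'semantic' in e['agreement']

def allows_reciprocal(noun):
    # Reciprocals need an antecedent supporting plural agreement, cf. (10).
    return allows_plural_verb(noun)

for noun in ENTRIES:
    print(noun, allows_plural_verb(noun), allows_reciprocal(noun))
# married couple True True   -- 'The married couple hate each other', (11a)
# Ehepaar False False        -- '*Das Ehepaar hasst einander', (10b)
# Eheleute True True         -- 'Die Eheleute hassen einander', (10a)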
2. Polysemy problems

2.1. Institution nouns

Meaning multiplicity on the lexical level comprises three basic types: homonymy, polysemy, and indeterminacy (or vagueness). Bierwisch (1983), in a way the birth certificate of the SF vs. CS distinction, draws on institution nouns such as school, university, museum, parliament etc. to illustrate systematic polysemy, that is, a lexical item with one meaning representation acquiring further representations that differ from the first in predictable ways based on conceptual relations. (14a–d) below shows some of the readings that school may assume. The readings are numbered and the concepts they represent are added in italicized caps; normal caps in (15) show the invariant SF representation for the lexeme school, which may be contextually specified at the CS level by applying certain functions to (15) that eventually yield the utterance meanings of (14a–d) as represented in (16a–d).

(14) a. The school made a major donation.    school₁ ⊂ INSTITUTION
     b. The school has a flat roof.          school₂ ⊂ BUILDING
     c. He enjoys school very much.          school₃ ⊂ PROCESS
     d. The school took a staff outing.      school₄ ⊂ PERSONNEL

(15) SF(school) = λX [purpose X W] with W = processes_of_learning_and_teaching

(16) a. λX [INSTITUTION X & SF(school)]
     b. λX [BUILDING X & SF(school)]
     c. λX [PROCESS X & SF(school)]
     d. λX [PERSONNEL X & SF(school)]
Taken together, (14)–(16) show a way of (i) keeping the lexical meaning of the lexeme school constant and avoiding problematic ambiguity assumptions and (ii) still accounting for the range of semantic variation the lexeme school may cover at the CS level. The
conceptual interpretations of school in (16) are determined by selectional restrictions, cf. (14a–d), and come with distinctive grammatical features: so e.g. school in the PROCESS reading has no regular plural, and in German the prepositions in Max geht auf die/in die/zur Goethe-Schule clearly select the INSTITUTION, BUILDING and PROCESS reading, respectively.

So far, so good. Methodologically, however, the analysis of these institution nouns poses some problems. First of all, we do not have reliable principles yet to find the SF of a polysemous lexeme, which makes it difficult to motivate a collection of templates that would account for the specifications in (16). Moreover, it is unclear (i) whether the members of the concept family associated with the noun school all draw on the abstract SF the same way (as suggested by (15–16)) or (ii) whether some of the concepts are more closely interconnected than others. Finally, it is unclear what conceptual (sub-)system is taken to serve as the source for the specifications in (16). To show the importance of these issues and their impact on the SF vs. CS distinction, some brief comments might be in order.

The SF proposed in (15) takes school as a sort of artifact by drawing on the feature purpose X W, which is not implausible as it inheres in all artifact-denoting nouns. However, (15) ignores the social relevance attributed to the purpose W = processes_of_learning_and_teaching or to the purposes W', W" of other institution nouns. Actually, what makes a created X into an institution is its social importance, evidenced by the fact that some purpose Wᵢ has been institutionalized by founding or keeping X. Therefore, instead of reducing the role of this feature common to all institution nouns to that of yielding a concept at the CS level, cf. INSTITUTION in (16a), the lexical semantics of these nouns should make use of it as an invariant component at the SF level. Heuristically, the starting point for construing the SF of school and the CS specifications in (16) might be the lexical meaning of institution, which is something like ‘a legal entity that organizes purposeful events to be performed and/or received by authorized groups of persons in specific locations’ such that it (i) also covers abstract instances like the institution of marriage and (ii) provides the basis for (16a–d) as metonymy-based conceptual shifts. The learned word institution, no doubt an element of the adult lexicon, has a lexical meaning that is sufficiently abstract to allow for each and every one of the conceptual specifications of school in (16); its conceptual basis is a sort of world knowledge that rests on what may be called “created advanced level concepts”, which in turn define a widely unexplored domain of the conceptual system. In contrast, the conceptual subsystem of spatial orientation is a domain we know a bit more about, as it crucially draws on human perception and thus on “natural basic level concepts”. So it is not a surprise that a number of pioneering works in the realm of conceptual structure deal with spatial issues. Since these studies provide better illustrations of the SF vs. CS distinction, we will focus on them in the next sections.

Another problem with this approach to systematic polysemy is the fact that, despite their ontological and/or categorial differences, the conceptual specifications of the SF(school) in (16a–d) are not absolutely incompatible but may occur in certain combinations, cf.
the gradual acceptability of the examples in (17):

(17) a. The school which has a flat roof made a major donation.
     b. ??The school, which has a flat roof, made a major donation.
     c. ??The school, which has a flat roof, went out for a staff outing.
     d. The school has a flat roof and *it/the school went out for a staff outing.
Whereas the INSTITUTION and the BUILDING readings are somewhat compatible, the BUILDING and the PERSONNEL readings are not; as regards the (type of) reading of the antecedent, anaphoric pronouns are less tolerant than relative pronouns or repeated DPs. The data in (17) show that the conceptual specifications of SF (school) differ in ways that are poorly understood as yet; cf. Asher (2007) for some discussion. The semantics of institution nouns, for a while the signature tune of Two-level Semantics, elicited a certain amount of discussion and criticism, cf. Herzog & Rollinger (1991), Bierwisch & Bosch (1995). The problems expounded in these volumes are still unsolved but they sharpened our view of the intricacies of the SF vs. CS distinction.
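Before turning to spatial terms, the mechanics of (14)–(16) can be summarized in a brief program sketch. The encoding is purely illustrative – the predicate names and the selection mechanism merely stand in for whatever contextual information actually triggers the conceptual shifts:

# Illustrative sketch of the two-level treatment of 'school' in (14)-(16):
# one invariant SF plus a family of conceptual shifts applied at CS.
SF_SCHOOL = {'purpose': 'processes_of_learning_and_teaching'}  # cf. (15)

def shift(sort):
    """Return a CS template adding a sortal predicate to SF, cf. (16)."""
    return lambda sf: {'sort': sort, **sf}

CS_TEMPLATES = {
    'INSTITUTION': shift('INSTITUTION'),    # (16a)
    'BUILDING':    shift('BUILDING'),       # (16b)
    'PROCESS':     shift('PROCESS'),        # (16c)
    'PERSONNEL':   shift('PERSONNEL'),      # (16d)
}

# Toy selectional restrictions standing in for the contexts in (14a-d):
SELECTS = {
    'made a donation':     'INSTITUTION',
    'has a flat roof':     'BUILDING',
    'is enjoyed':          'PROCESS',
    'took a staff outing': 'PERSONNEL',
}

def utterance_meaning(predicate):
    """Contextually specify SF(school) at the CS level."""
    return CS_TEMPLATES[SELECTS[predicate]](SF_SCHOOL)

print(utterance_meaning('has a flat roof'))
# {'sort': 'BUILDING', 'purpose': 'processes_of_learning_and_teaching'}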
2.2. Locative prepositions

In many languages the core inventory of adpositions encodes spatial relations to localize some x (called theme, figure or located object) wrt. the place occupied by some y (called relatum, ground or reference object), where x and y may pairwise range over objects, substances, and events. Regarding the conceptual basis of these relations, locative prepositions in English and related languages are usually subdivided into topological (in, at, on), directional (into, onto), dimensional (above, under, behind), and path-defining (along, around) prepositions. The semantic problems posed by these lexical items can best be illustrated with in, which supposedly draws on spatial containment, pure and simple, and which is therefore taken to be the prime example of a topological preposition. To illustrate how SF(in) is integrated into a lexical entry with information on Phonetic Form (PF), Grammatical Features (GF), Argument Structure (AS) etc., we take German in as a telling example: It renders English in vs. into with distinct cases which in turn correspond to the values of the feature [α Dir(ectional)] subcategorizing the internal argument y, and to further syntactic distinctions. The entry in (18) is taken from Bierwisch (1988: 37); examples are added in (19). The interdependence of the values for the case feature [α Obl(ique)] and for the category feature [α Dir] is indicated by means of the meta-variable α ∈ {+, –} and by the conventions that (i) –α inverts the value of α and (ii) (α W) means that W is present if α = + and absent if α = –.

(18) Lexical entry of the German preposition in:
     PF:  /in/
     GF:  [–V, –N, α Dir]
     AS:  λy λx, with the case feature [–α Obl] assigned to the internal argument y
     SF:  [(α fin) [loc x] ⊂ [loc y]]

(19) a. Die Straße/Fahrt führt in die Stadt.  [+ Dir, – Obl] = Acc, “x is a path ending in y”
        The street/journey leads into the city.
        /in/; [–V, –N, + Dir]; λy λx [(fin) [loc x] ⊂ [loc y]]; y: [– Obl]
     b. Die Straße/Fahrt ist in der Stadt.  [– Dir, + Obl] = Dat, “x is located in y”
        The street/journey is in the city.
        /in/; [–V, –N, – Dir]; λy λx [[loc x] ⊂ [loc y]]; y: [+ Obl]
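To see how the meta-variable α works, the entry in (18) can be unfolded mechanically. The following sketch is ours, with invented representational details; it merely instantiates α and derives the two variants in (19):

# Illustrative sketch (not Bierwisch's formalism): one entry, two variants.
def entry_in(alpha):
    """Instantiate the entry (18) of German 'in' for alpha in {'+', '-'}."""
    assert alpha in '+-'
    neg = '-' if alpha == '+' else '+'      # the convention: -alpha inverts alpha
    return {
        'PF': '/in/',
        'GF': f'[-V, -N, {alpha}Dir]',
        'case_of_y': f'[{neg}Obl]',         # [-alpha Obl]: Acc vs. Dat
        'SF': ('[[fin [loc x]] ⊂ [loc y]]'  # (alpha fin): present iff alpha = +
               if alpha == '+' else '[[loc x] ⊂ [loc y]]'),
    }

print(entry_in('+'))   # directional 'in' + accusative, cf. (19a)
print(entry_in('-'))   # locative 'in' + dative, cf. (19b)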
Now let’s take a closer look at the components of SF. The variables x and y represent entities ranging over the domains of objects, substances, or events. loc is an SF functor-constant of category N/N such that loc x assigns x the place it occupies in the domain it is an element of. The SF constant fin yields the final part of [loc x], thereby transforming the external argument of in into a path. The SF-constant ⊂ “specifies a particular relation between places, in the case of in simply (improper) inclusion” (Bierwisch 1988: 34). Confining our review to objects, the SF of in assigned to (19b) might thus be paraphrased as “the place occupied by the street x is (improperly) included in the place occupied by the city y” (op. cit.).

While it is widely accepted that the semantics of locative in should be based on spatial inclusion, the relativizing attribute “(improper)” in the explication of the SF-constant ⊂ quoted above is indicative of a hidden controversial issue. In fact, much ink has been spilled on the problem of how to determine the lexical meaning of in by keeping to the spatial inclusion approach. The discussion was ignited by groups of data that seem to challenge the [[loc x] ⊂ [loc y]] analysis of the preposition in in some way.

(20) a. The amount of oxygen in the air is diminishing.
     b. The balloons in the air quickly escaped.
     c. The air in the balloons quickly escaped.

(21) a. The water in the vase should be replaced.
     b. The flowers in the vase are wilted.
     c. The cracks in the vase cannot be repaired.
     d. I did not notice the splinter in his hand.
Whereas the approach under review might capture the examples in (20) by letting x and y range over substances (a) or objects and substances (b, c), the differences of (20a vs. b) and of (20b vs. c) in the interpretation of loc and ⊂ remain out of its reach. Obviously, (20a–c) differ in the way the place is assigned to x and to y by loc, but are alike in clearly requiring that ⊂ has to be interpreted as proper inclusion. The examples in (21) show that the place assigned to the relatum by the functor loc is not confined to the material boundaries of the object y but may vary to some extent. In (21a–c) the interpretation of in the vase involves function-based enrichment, e.g. by means of gestalt-psychological laws of closure, to account for the containment relation between x and y, which is proper in (21a), partial in (21b), and privative in (21c). The PP in (21d) is ambiguous, i.e. unspecified wrt. “x being materially included in y (as a foreign body)” or “x being functionally included in a cupped y (to prevent x from getting lost)”. The discussion of such data produced a series of theoretical revisions of the semantic analysis of topological prepositions. Wunderlich & Herweg (1991) propose (22) as a
general schema for the SF of locative prepositions, thereby abandoning the problematic functor ⊂ and revising the functor loc:

(22) λy λx (loc (x, prep*(y))), where loc localizes the object x in the region determined by the preposition p and prep* is a variable ranging over p-based regions.

Bierwisch (1996: 69) replaces SF(in) in (18) with λy λx [x [loc [int y]]], commenting “x loc p identifies the condition that the location of x be (improperly) included in p” and “int y identifies a location determined by the boundaries of y, that is, the interior of y”. Although this proposal avoids some of the problems with the functor ⊂, the puzzling effect of “(improperly) included” remains, and so does the definition of int y as yielding “the interior of y”. Herweg (1989) advocates an abstract SF(in) which draws on proper spatial inclusion such that the examples in (21) are semantically marked due to violating the “Presupposition of Argument Homogeneity”. The resulting truth value gap triggers certain function-based accommodations at the CS level that account for the interpretations of (21a–d). Hottenroth (1991), in a detailed analysis of French dans, rejects the idea that SF(dans) might draw on imprecise region-creating constants like int y. Instead, SF(dans) should encode the conditions on the relatum in prototypical uses of dans. The standard reference region of dans is a three-dimensional empty closed container (bottle, bag, box etc.). If the relatum of dans does not meet one or more of these characteristics, the reference region is conceptually adapted by means of certain processing principles (laws of closure, mental demarcation of unbounded y, conceptual switching from 3D to 2D etc.). In view of data like those in (21), Carstensen (2001) proposes to do away with the region account altogether and to replace it with a perception-based account of prepositions that draws on the conceptual representation of changes of focused spatial attention.

To sum up, the brief survey of developments in the semantic analysis of prepositions may also be taken as proof of the heuristic productivity emanating from the SF vs. CS distinction. Among polysemous verbs, the verb to open has gained much attention, cf. Bierwisch (article 16). Based on a French-German comparison, Schwarze & Schepping (1995) discuss what type of polysemy is to be accounted for at which of the two levels. Functional categories (determiners, complementizers, connectives etc.), whose lexical meanings lack any support in perception and are hence purely operative, have seldom been analyzed in terms of the SF vs. CS distinction so far; but cf. Lang (2004) for an analysis that accounts for the abstract meanings of and, but etc. and their contextual specification by inferences drawn from the structural context, the discourse context, and/or from world knowledge. Clearly, the ‘poorer’ the lexical meaning of such a synsemantic lexical item, the more will its semantic contribution need to be enriched by means of contextualization.
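The import of region-creating functors may be illustrated with a toy model. The sketch below is ours; the one-dimensional interval geometry and the relaxation to partial containment are deliberate simplifications, not any of the cited analyses. It mimics the contrast between proper containment in (21a) and partial containment in (21b):

# Toy sketch of region-based locative semantics, cf. (22).
def place(obj):                    # the functor loc: the interval obj occupies
    return obj['extent']           # (low, high) in one dimension

def in_region(y):                  # in*(y): the interior of y
    return place(y)

def included(x_place, region):     # strict SF reading: (improper) inclusion
    (xl, xh), (rl, rh) = x_place, region
    return rl <= xl and xh <= rh

def overlaps(x_place, region):     # CS-level relaxation: partial containment
    (xl, xh), (rl, rh) = x_place, region
    return xl < rh and rl < xh

vase    = {'extent': (0.0, 0.3)}
water   = {'extent': (0.05, 0.25)}
flowers = {'extent': (0.1, 0.7)}   # stick out of the vase, cf. (21b)

# 'The water in the vase': proper inclusion suffices at SF.
assert included(place(water), in_region(vase))
# 'The flowers in the vase': strict inclusion fails; the containment
# relation has to be accommodated at the CS level, e.g. as partial
# containment via function-based enrichment.
assert not included(place(flowers), in_region(vase))
assert overlaps(place(flowers), in_region(vase))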
3. Compositionality and beyond: Semantic underspecification and coercion

Two-level Semantics was first mainly concerned with polysemy problems of the kind illustrated in the previous section. Emphasis was laid on developing an adequate theory
of lexical semantics that would be able to deal properly and on systematic grounds with the distinction of word knowledge and world knowledge. A major tenet of Two-level Semantics as a lexicon-based theory of natural language meaning is that the internal decompositional structure of lexical items determines their external combinatorial properties, that is, their external syntactic behavior. This is why compositionality issues are of eminent interest to Two-level Semantics; cf. (1a). There is wide agreement among semanticists that, given the combinatorial nature of linguistic meaning, some version of the principle of compositionality – as formulated, e.g., in (23) – must certainly hold. But in view of the complexity and richness of natural language meaning, there is also consensus that compositional semantics is faced with a series of challenges and problems; see article 6 (Pagin & Westerståhl) Compositionality.

(23) Principle of compositionality: The meaning of a complex expression is a function of the meanings of its parts and the way they are syntactically combined.

Rather than weakening the principle of compositionality or abandoning it altogether, Two-level Semantics seeks to cope with the compositionality challenge by confining compositionality to the level of Semantic Form. That is, SF is understood as comprising exactly those parts of natural language meaning that are (i) context-independent and (ii) compositional, in the sense that they are built in parallel with syntactic structure. This leaves space to integrate non-compositional aspects of meaning constitution at the level of Conceptual Structure. In particular, the mapping of SF-representations onto CS-representations may include non-local contextual information and thereby qualify as non-compositional. Of course, the operations at the CS level as well as the SF – CS mapping operations are also combinatorial and can therefore be said to be compositional in a broader sense. Yet their combinatorics is not bound to mirror the syntactic structure of the given linguistic expression and thus does not qualify as compositional in a strict sense. This substantiates the assumption of two distinct levels of meaning representation as discussed in §1. Thus, Two-level Semantics’ account of the richness and flexibility of natural language meaning constitution consists in assuming a division of labor between a rather abstract, context-independent and strictly compositionally determined SF and a contextually enriched CS that also includes non-compositionally derived meaning components. Various solutions have been proposed for implementing this general view of the SF vs. CS distinction. These differ mainly in (a) the syntactic fine-tuning of the compositional operations and the abstractness of the corresponding SF-representations, and in (b) the way of handling non-compositional meaning aspects in terms of, e.g., coercion operations. These issues will be discussed in turn.
3.1. Combinatory meaning variation

Assumptions concerning the spell-out of the specific mechanisms of compositionality are generally guided by parsimony. That is, the fewer semantic operations warranting compositionality are postulated, the better. On this view, it would be attractive to have a single semantic operation, presumably functional application, figuring as the semantic counterpart to syntactic binary branching. An illustration is given in (24):
Given the lexical entries for the locative preposition in and the proper noun Berlin in (24a) and (24b) respectively, functional application of the preposition to its internal argument yields (24c) as the compositional result corresponding to the semantics of the PP.

(24) a. in: λy λx (loc (x, in*(y)))
     b. Berlin: berlin
     c. [PP in [DP Berlin]]: λy λx (loc (x, in*(y))) (berlin) ≡ λx (loc (x, in*(berlin)))

Functional application is suitable for syntactic head-complement relationships as it reveals a correspondence between the syntactic head-non-head relationship and the semantic functor-argument relationship. In (24c), for instance, the preposition in is both the syntactic head of the PP and the semantic functor, which takes the DP as its argument. Syntactic adjuncts, on the other hand, cannot be properly accounted for by functional application as they lack a comparable syntax-semantics correspondence. In syntactic head-adjunct configurations the semantic functor, if any, is not the syntactic head but the non-head; for an overview of the different solutions that have been put forth to cope with this syntax-semantics imbalance see article 54 (Maienborn & Schäfer) Adverbs and adverbials. Different scholars working in different formal frameworks have suggested remarkably convergent solutions, according to which the relevant semantic operation applying to syntactic head-adjunct configurations is predicate conjunction. This might be formulated, for instance, in terms of a modification template MOD as given in (25); cf., e.g. Higginbotham’s (1985) notion of θ-identification, Bierwisch’s (1997) adjunction schema, Wunderlich’s (1997b) argument sharing, or the composition rule of predicate modification in Heim & Kratzer (1998).

(25) Modification template MOD:
     MOD: λQ λP λx (P(x) & Q(x))

The template MOD takes a modifier and an expression to be modified (= modifyee) and turns it into a conjunction of predicates. More specifically, an (intersective) modifier adds a predicate that is linked up to the referential argument of the expression to be modified. In (26) and (27) illustrations are given for nominal modification and verbal modification, respectively. In (26), the semantic contribution of the modifier is added as an additional predicate of the noun’s referential argument. In (27), the modifier provides an additional predicate of the verb’s eventuality argument.

(26) a. house: λz (house (z))
     b. [PP in Berlin]: λu (loc (u, in*(berlin)))
     c. [NP [NP house] [PP in Berlin]]:
        λQ λP λx (P(x) & Q(x)) (λz (house (z))) (λu (loc (u, in*(berlin))))
        ≡ λx (house (x) & loc (x, in*(berlin)))

(27) a. sleep: λz λe (sleep (e) & agent (e, z))
     b. [PP in Berlin]: λu (loc (u, in*(berlin)))
     c. [VP [VP sleep] [PP in Berlin]]:
        λQ λP λx (P(x) & Q(x)) (λz λe (sleep (e) & agent (e, z))) (λu (loc (u, in*(berlin))))
        ≡ λz λe (sleep (e) & agent (e, z) & loc (e, in*(berlin)))

The semantic template MOD thus provides the compositional semantic counterpart to syntactic head-adjunct configurations. There are good reasons to assume that, besides functional application, some version of MOD is required when it comes to spelling out the basic mechanisms of compositionality. The template MOD in (25) captures a very fundamental insight about the compositional contribution of intersective modifiers. Nevertheless, scholars working within the Two-level Semantics paradigm have emphasized that a modification analysis along the lines of MOD fails to cover the whole range of intersective modification; cf., e.g., Maienborn (2001, 2003) for locative adverbials, Dölling (2003) for adverbial modifiers in general, Bücking (2009, 2010) for nominal modifiers. Modifiers appear to be more flexible in choosing their compositional target, both in the verbal domain and in the nominal domain. Besides supplying an additional predicate of the modifyee’s referential argument, as in (26) and (27), modifiers may also relate less directly to their host argument. Some illustrations are given in (28)–(30). (For the sake of simplicity the data are presented in English.)
(28) a. The cook prepared the chicken in a Marihuana sauce.
     b. The bank robbers escaped on bicycles.
     c. Paul tickled Maria on her neck.                       (cf. Maienborn 2003)

(29) a. Anna dressed Max’s hair unobtrusively.                (cf. Dölling 2003: 530)
     b. Ede reached the summit in two days.                   (cf. Dölling 2003: 516)

(30) a. the fast processing of the data                       (cf. Bücking 2009: 94)
     b. the preparation of the chicken in a pepper sauce      (cf. Bücking 2009: 102)
     c. Georg’s querying of the men                           (cf. Bücking 2010: 51)
The locative modifiers in (28) differ from the general MOD pattern as illustrated in (27) in that they do not locate the whole event but only one of its integral parts. For instance, in (28b) it’s not the escape that is located on bicycles but – according to the preferred reading – the agent of this event, viz. the bank robbers. In the case of (28c), the linguistic structure does not even tell us what is located on Maria’s neck. It could be Paul’s hand but also, e.g., a feather he used for tickling Maria. Maienborn (2001, 2003) calls these modifiers “event internal modifiers” and sets them apart from “event external modifiers” such as in (27), which serve to holistically locate the verb’s eventuality argument. Similar observations are made by Dölling (2003) wrt. cases like (29). Sentence (29a) is ambiguous. It might be interpreted as expressing that Anna performed the event of dressing Max’s hair in an unobtrusive manner. This is what the application of MOD would result in. But (29a) has another reading, according to which it is not the event of hair dressing that is unobtrusive but Max’s resulting hair-style. Once again, the modifier’s contribution does not apply directly to the verb’s eventuality argument but to some referent related to it. The same holds true for (29b), where the temporal adverbial cannot relate to the punctual event of Ede reaching the summit but only to its preparatory phase.
Finally, Bücking (2009, 2010) discusses a series of cases in the nominal domain which also show a less direct relationship between the modifier and its host argument than the one established by MOD; cf. (25). The modifier fast in (30a), for instance, may be interpreted event-externally, expressing that the overall duration of the processing was short. But (30a) also has an event-internal interpretation, according to which the subevents of processing the data were performed in a fast manner (whereas the whole processing might have taken a long time). In a similar vein, Georg need not necessarily be the agent of the querying in (30c). Bücking argues that the prenominal genitive establishes a more indirect relationship to the nominal referent, such that a more abstract control relation between Georg and the query would suffice; cf. the one provided by the context in (31).

(31) Georg wanted to know how men’s buying behavior is influenced by the weather. He therefore instructed his research assistants to interview men under varying weather conditions. Georg’s querying of the men is still considered a milestone in consumer research. (cf. Bücking 2010: 51)

The conclusion to be drawn from these and similar studies is that modifiers show a remarkable flexibility in relating to their compositionally determined host argument, thus giving rise to a wide spectrum of meaning variations. Is there a way to treat this observation compositionally? The proposals developed by Bücking, Dölling and Maienborn basically amount to liberalizing MOD such that it may license the particular kind of semantic underspecification observed above. That is, besides linking the semantic contribution of the modifier directly to the verb’s or noun’s referential argument, as in (25), there should be a less direct variant that could be spelled out as in (32).

(32) Modification template MOD’:
     MOD’: λQ λP λx (P(x) & R (x, v) & Q(v))

MOD’ introduces a free variable v that is linked to the modifyee’s referential argument x by means of a relational variable R. Both v and R are so-called SF-parameters, i.e. free variables that remain underspecified at the level of SF and will only be instantiated at the level of CS. Applying MOD’ to a sentence such as (28c), repeated as (33), yields the following SF:

(33) Paul tickled Maria on her neck.
     SF: ∃e (tickle (e) & agent (e, paul) & patient (e, maria) & R (e, v) & loc (v, on*(maria’s neck)))

According to the SF in (33), an entity v which is involved in the tickling event is located on Maria’s neck. This is as far as the compositional semantics of event-internal modifiers takes us. The identification of v and its exact role in e can only be spelled out at the CS level by taking into account contextually available world knowledge. This would include, e.g., knowledge about the spatial configuration required for tickling, viz. contact, as well as knowledge about suitable and/or plausible instruments employed for tickling. A potential conceptual spell-out is given in (34); cf. Maienborn (2003: 490ff) for details.
(34) Paul tickled Maria on her neck.
     SF: ∃e (tickle (e) & agent (e, paul) & patient (e, maria) & R (e, v) & loc (v, on*(maria’s neck)))
     CS: ∃ex (tickle (e) & agent (e, paul) & patient (e, maria) & instr (e, x) & feather (x) & loc (x, on*(maria’s neck)))

This conceptual spell-out provides a plausible utterance meaning for sentence (34). It goes beyond the compositionally determined meaning by exploiting our conceptual knowledge that tickling is performed with some instrument which needs to have spatial contact to the object being tickled. Consequently, the SF-parameter R can be identified as the instrument relation, and the parameter v may be instantiated, e.g., by a feather. Although not manifest at the linguistic surface, such conceptually inferred units are plausible potential instantiations of the compositionally introduced SF-parameter v. (Dölling and Maienborn use abduction as a formal means of deriving a contextually specified CS from a semantically underspecified SF; cf. Hobbs et al. (1993). We will come back to the SF-CS mapping in §4.)

Different proposals have been developed for implementing the notion of a more liberal and flexible combinatorics, such as MOD’, into the compositional machinery. Maienborn (2001, 2003) argues that MOD’ is only licensed in particular structural environments: Event-internal modifiers have a base adjunction site in close proximity to the verb, whereas event-external adjuncts adjoin at VP-level. These distinct structural positions provide the key to a compositional account. Maienborn thus formulates a more fine-tuned syntax-semantics interface condition that subsumes MOD and MOD’ under a single compositional rule MOD*.

(35) Modification template MOD*:
     MOD*: λQ λP λx (P(x) & R (x, v) & Q(v))
     Condition on the application of MOD*: If MOD* is applied in a structural environment of categorial type X, then R = part-of; otherwise (i.e. in an XP-environment) R is the identity function.

If MOD* is applied in an XP-environment, then R is instantiated as identity, i.e. v is identified with the referential argument of the modified expression, thus yielding the standard variant MOD. If applied in an X-environment, R is instantiated as the part-of relation, which pairs entities with their integral constituents. Thus, in Maienborn’s account the observed meaning variability is traced back to a grammatically constrained semantic indeterminacy that is characteristic of modification. Dölling (2003) takes a different track by assuming that the SF-parameter R is not rooted in modification but is of a more general nature. Specifically, he suggests that R is introduced compositionally whenever a one-place predicate enters the composition. By this move, the SF of a complex expression is systematically extended by a series of SF-parameters, which guarantee that the application of any one-place predicate to its argument is systematically shifted to the conceptual level. On Dölling’s account, the SF of a complex linguistic expression is maximally abstract and underspecified, with SF-parameters delineating possible (though not necessarily actual) sites of meaning variation. Differences aside, the studies of Dölling, Maienborn and other scholars working in the Two-level Semantics paradigm emphasize that potential sources for semantic
indeterminacy are not only to be found in the lexicon but may also emerge in the course of composition, and they strive to model this combinatory meaning variation in terms of a rigid account of lexical and compositional semantics. A key role in linking linguistic and extra-linguistic knowledge is taken by so-called SF-parameters. These are free variables that are installed under well-defined conditions at SF and are designed to be instantiated at the level of CS. SF-parameters are a means of triggering and controlling the conceptual enrichment of a grammatically determined meaning representation. They delineate precisely those gaps within the Semantic Form that call for conceptual specification and they impose sortal restrictions on possible conceptual fillers. SF-parameters can thus be seen as well-defined windows through which compositional semantics allows linguistic expressions to access and constrain conceptual structures.
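The workings of MOD, MOD’ and the subsequent CS instantiation can be emulated in a small sketch. The representation below is merely illustrative – predications are encoded as tuples, and the knowledge base is reduced to a single rule standing in for abductive inference over world knowledge:

# Illustrative sketch of the templates MOD (25) and MOD' (32).
from itertools import count

fresh = count()

def MOD(modifier, modifyee):
    """(25): predicate the modifier directly of the referential argument x."""
    return lambda x: modifyee(x) + modifier(x)

def MOD_prime(modifier, modifyee):
    """(32): attach the modifier to an SF-parameter v related to x via R."""
    def combined(x):
        v = f'v{next(fresh)}'             # free variable, underspecified at SF
        return modifyee(x) + [('R', x, v)] + modifier(v)
    return combined

tickle = lambda e: [('tickle', e), ('agent', e, 'paul'), ('patient', e, 'maria')]
on_neck = lambda u: [('loc', u, "on*(maria's neck)")]

# Event-external reading via MOD, cf. (27):
print(MOD(lambda e: [('loc', e, 'in*(berlin)')], tickle)('e'))

# Event-internal reading via MOD', the SF of (33):
sf_33 = MOD_prime(on_neck, tickle)('e')
print(sf_33)

def conceptual_spellout(sf):
    """CS level: instantiate R as 'instr' and v as a feather, cf. (34).
    (A one-rule stand-in for abduction over world knowledge.)"""
    cs = []
    for pred in sf:
        if pred[0] == 'R':
            _, e, v = pred
            cs += [('instr', e, v), ('feather', v)]
        else:
            cs.append(pred)
    return cs

print(conceptual_spellout(sf_33))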
3.2. Non-compositional meaning adjustments

Conceptual specification of a compositionally determined, underspecified, abstract meaning skeleton, as illustrated in the previous section, is the core notion that characterizes the Two-level Semantics perspective on the semantics-pragmatics interface. Its focus is on the conceptual exploitation of a linguistic expression’s regular meaning potential. A second focus typically pursued within Two-level Semantics concerns the possibilities of a conceptual solution of combinatory conflicts arising in the course of composition. These are combinatory adjustment operations by which a strictly speaking ill-formed linguistic expression gets an admissible yet irregular interpretation. In the literature such non-compositional rescue operations are generally discussed under the label of “coercion”. An example is given in (36).

(36) The alarm clock stood intentionally on the table.

The sentence in (36) does not offer a regular integration for the subject-oriented adverbial intentionally, i.e., the subject NP the alarm clock does not fulfill the adverbial’s selectional restriction for an intentional subject. Hence, a compositional clash results, and the sentence is ungrammatical. Nevertheless, although deviant, there seems to be a way to rescue the sentence so that it becomes acceptable and interpretable. In the case of (36), a possible repair strategy would be to introduce an actor who is responsible for the fact that the alarm clock stands on the table. This move would provide a suitable anchor for the adverbial’s semantic contribution. Thus, we understand (36) as saying that someone put the alarm clock on the table on purpose. That is, in case of a combinatorial clash, there seems to be a certain leeway for non-compositional adjustments of the compositionally derived meaning. The defective part is “coerced” into the right format. Coercion phenomena are a topic of intensive research in current semantics. Up to now the primary focus has been on the widely ramified notion of aspectual coercion (e.g. Moens & Steedman 1988; Pulman 1997; de Swart 1998; Dölling 2003, 2010; Egg 2005) and on cases of so-called “complement coercion” as in Peter began the book (e.g. Pustejovsky 1995; Egg 2003; Asher 2007); see article 25 (de Swart) Mismatches and coercion for an overview. The framework of Two-level Semantics is particularly suited to investigate these borderline cases at the semantics-pragmatics interface because of its
comparatively strong assumptions and predictions about this interface in terms of SF- and CS-representations, and about the kind of knowledge available at each level. To give an example, one issue emphasized by Dölling (2010) is that it is not only grammatical conflicts that trigger coercion operations (as predominantly assumed in the literature), but that such operations may also be employed for solving conflicts or expectations that arise from world knowledge. If we take for instance a variant of sentence (36) such as (37), there is no immediate need for a non-compositional rescue operation anymore. The subject NP the children fulfills the adverbial’s selectional restriction for an intentional subject; hence, the sentence can be interpreted strictly compositionally with the children as intentional subjects. Nevertheless sentence (37) still has a second reading – viz. the only possible reading for (36) – according to which someone else, e.g. their teacher, put the children on the table on purpose.

(37) The children stood intentionally on the table. (2 readings)
Dölling (2010) draws the conclusion that rather than being borderline cases with somehow irregular interpretations, so-called coercion phenomena are just another instance of semantic underspecification; cf. §3.1. Thus, he would propose to derive an abstract, underspecified SF for both (36) and (37), and to defer its specification to the level of CS. On the other hand, the following data are problematic for a radical underspecification account such as Dölling’s.

(38) *The alarm clock stood voluntarily on the table.
(39) The children stood voluntarily on the table. (1 reading)
Sentence (38) is ungrammatical. There is no way of rescuing it along the lines of (36). Although from a conceptual perspective it would make equally good sense to interpret (38) as expressing that someone put the alarm clock voluntarily on the table, there is no such rescue option available. Apparently the linguistic system prevents such a resort. In the same vein, sentence (39) only has one reading, according to which it is the children’s will to stand on the table but not that of another person. These observations suggest that the additional readings available for (36) and (37) are not fully regular interpretations but coerced ones. They show the need for scrutinizing on a much broader empirical basis the conspiracy of grammatical, conceptual and pragmatic factors that license and constrain the coercion phenomena; see also the different viewpoints on this issue put forward by Dölling (2005b), Rothstein (2005) and Maienborn (2005a,b). A comparatively new kind of evidence that might help clarify matters is provided by psycholinguistic studies; see Pylkkänen & McElree (2006) for a state-of-the-art report on coercion. The short discussion of (36)–(39) gives a slight impression of the wide range of options currently tested in sharpening our understanding of the semantics-pragmatics interface and the implications they have for our assumptions about compositionality. The matter of how much grammar gets into meaning constitution and what else may join it to establish a full-fledged utterance meaning of natural language expressions is still far from being settled.
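The pattern of judgments in (36)–(39) can be summarized in a toy decision procedure. The lexical flag that licenses coercion for intentionally but not for voluntarily is our simplifying assumption; it merely restates the empirical contrast rather than explaining it:

# Toy sketch of the coercion contrast in (36)-(39); flags are illustrative.
LEXICON = {
    'intentionally': {'selects': 'intentional_subject', 'licenses_coercion': True},
    'voluntarily':   {'selects': 'intentional_subject', 'licenses_coercion': False},
}

INTENTIONAL = {'the children'}            # toy sortal knowledge

def interpret(subject, adverbial):
    adv = LEXICON[adverbial]
    readings = []
    if subject in INTENTIONAL:            # regular, compositional reading
        readings.append(f'{subject} stand on the table on purpose')
    if adv['licenses_coercion']:          # coerced reading via implicit agent
        readings.append(f'someone {adverbial} put {subject} on the table')
    return readings or ['* ungrammatical: selectional clash, no repair']

print(interpret('the alarm clock', 'intentionally'))  # 1 coerced reading, cf. (36)
print(interpret('the children', 'intentionally'))     # 2 readings, cf. (37)
print(interpret('the alarm clock', 'voluntarily'))    # *, cf. (38)
print(interpret('the children', 'voluntarily'))       # 1 reading, cf. (39)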
4. More on SF variables and their instantiation at the CS level

As pointed out in section 2.1, it was mainly the conceptual subsystem of spatial cognition that stimulated pioneering investigations within Two-level Semantics. Therefore, it may be appropriate to report some of the analyses proposed in the realm of dimensional designation of spatial objects, cf. Bierwisch & Lang (1989a); Bierwisch (1996, 1997); Bierwisch & Schreuder (1992); Lang (1990, 1994, 2001); Lang, Carstensen & Simmons (1991). It is the complex interaction of two major grammatical modules, viz. gradation/comparison and dimension assignment, which makes facts and insights in this field especially rewarding to semanticists. In order to discover the full range of relevant data, the basic assumption of Two-level Semantics (quoted at the outset of section 3), i.e. that the internal componential structure of lexical items determines their external combinatorial properties, has been converted into a heuristic guideline: Eliciting the combinatorics of dimension assignment (DA) terms for spatial objects by means of tasks like naming object extents or guessing objects by their dimensions etc. will reveal both the lexical meaning of each DA term and the structural pattern determining the lexical field which the DA term is an element of.
4.1. Variables in SF representations of spatial dimension terms

In Bierwisch & Lang (1989a), SF representations of German and English dimensional adjectives are taken to be complex 3-place predicates. Their general format is shown in (40); the variables in (40) are distinguished by the type of operators that bind them.
(40) λc λx [quant [dim x] = [v ± c]]

First, there are variables in argument places that are subject to λ-abstraction, λ-conversion and other binding operations: (i) an object x that is assigned a dimension d, with d ∈ {dim} and dim being a metavariable on dimension assignment parameters, cf. (42) below; (ii) a difference value c which is added to (+), or subtracted from (–), the comparison value v. Second, the variable v is a free variable which – depending on the respective structural context within the clause – may assume one of the following values: (iii) "0" if c contains a Measure Phrase, "norm of the class which x belongs to" if the DA term is an AP in the positive without complement, or "content of the comparative phrase" if it is part of a comparative construction. The admissible specifications of the comparison value v are subject to some general conditions which are motivated by CS but have been formulated as conditions on well-formed SF representations; for details justifying that solution, cf. Bierwisch & Lang (1989b). The operator quant is an SF functor constant which selects the type of scale induced by dim and triggers existential quantification of the value c in accordance with the Unspecified Argument Rule (cf. Lang 1985; Bierwisch 1989: 76), such that the SF of, e.g., The pole is long comes out as in (41), where def.pole' abbreviates the meaning of the subject the pole:
(41) ∃c [[quant max def.pole'] = [Norm_pole + c]]

So much for the SF variables that come with dimension terms and for their instantiation in structural contexts provided by the morphosyntax of the sentence at issue. After a brief look at the elements instantiating the metavariable dim, we will discuss a type of SF variable that is rooted in the lexical field structure of DA terms. Conceived of as a basic module of cognition, dimension assignment to spatial objects involves entities and operations at three levels. The perceptual level provides the sensory input from vision and other senses; the conceptual level serves as a filter system reducing perceptual distinctions to the level that our everyday knowledge of space needs; and the semantic level accounts for the ways in which conceptually approved features are encoded in categorized lexemes and arranged in lexical fields. DA basically draws on Dimension Assignment Parameters (DAP) that are provided by two frames of reference, which determine the dimensional designation of spatial objects:

(42) a. The Inherent Proportion Schema (IPS) yields proportion-based gestalt features by identifying the object's extents as maximal, minimal, and across axis, respectively.
b. The Primary Perceptual Space (PPS) yields contextually determined position features of spatial objects by identifying the object's extents as aligned with the vertical axis, with the observer axis, and/or with an across axis in between.
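Purely for illustration, the division of labor among the variables in (40) can be pictured in a small Python sketch (all function names and values here are ours, not part of the theory): the comparison value v is instantiated according to the three structural contexts in (iii), and the SF of The pole is long is then checked in the style of (41).

# Illustrative sketch: instantiating the free variable v of (40) by structural
# context; the three options follow (iii) above. All names are hypothetical.

def instantiate_v(context, norm=None, comparative_value=None):
    """Return the comparison value v for a dimensional adjective."""
    if context == "measure_phrase":      # e.g. "The pole is 2 m long"
        return 0
    if context == "bare_positive":       # e.g. "The pole is long"
        return norm                      # norm of the class which x belongs to
    if context == "comparative":         # e.g. "longer than the mast"
        return comparative_value         # content of the comparative phrase
    raise ValueError("unknown structural context")

# (41): ∃c [[quant max x] = [Norm + c]] -- some positive difference value c.
def is_long(extent, v):
    return any(extent == v + c for c in (0.5, 1.0, 1.5))  # crude ∃c over samples

v = instantiate_v("bare_positive", norm=3.0)   # norm length of poles, say 3 m
print(is_long(4.5, v))                         # True: 4.5 = 3.0 + 1.5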
The DAP listed in small caps in (42) occur in two representational formats that reflect the SF vs. CS distinction. In SF representations, the DAP figure as functor constants of category N/N in the SF of L-particular dimension terms that instantiate {dim} within the general schema in (40). In CS representations, elements of the DAP inventory figure as conceptual features in so-called Object Schemata (cf. 4.2 below) that contain the conceptually defining as well as the contextually specified spatial features of the object at issue. Lang (2001) shows that the lexical field of spatial dimension terms in a language L is determined by the share it has in IPS and PPS, respectively. While reference to the vertical is ubiquitous, the lexical coverage of DA terms amounts to the following typology: proportion-based languages (Mandarin, Russian) adhere to IPS, observer-based ones (Korean, Japanese) adhere to PPS, and mixed-type ones (English, German) draw on an overlap between IPS and PPS. The semantic effects of this typology are inter alia reflected by the respective across terms: in P-based and in O-based languages, they are lexically distinct and referentially unambiguous; in mixed-type languages like English, they lack both of these properties.
Note the referential ambiguity of the English across term wide in (44.1) and its contextualized interpretations in (44.2 – 4) when referring to a board sized 100 × 30 × 3 cm in the spatial settings I–III shown in (43):
(43) [Figure: the board (100 × 30 × 3 cm) in three spatial settings I–III, with its extents labeled a, b, c. I: a = long, b = wide; II: a = wide, b = high; III: a = wide, b = deep.]
(44) 1. The board is wide enough, but too thin. [I: wide = b; II & III: wide = a]
2. The board is long and wide enough, but too thin. [wide = b as in setting I]
3. The board is high and wide enough, but too thin. [wide = a as in setting II]
4. The board is deep and wide enough, but too thin. [wide = a as in setting III]
As regards the manner of DA, note the following pairwise differences: (43 I) and (44.2) refer to the board as such by confining its DA to P-based gestalt properties, whereas (43 II, III) and (44.3, 4) account for the board's increasing integration into the surrounding spatial context. This in turn entails that (44.2) can be applied to setting II or III as well, but (44.4 and 3) may not be applied to setting II and I, respectively. Now let us look at the relationship between object extents and DA terms. Whereas the coupling of extent c and the term thin (or its antonym thick) is constant in I–III, the across term wide can refer to a or to b. The choice is determined by the situational context, cf. (43 I–III), and/or the linguistic context available, cf. (44.1–4). In short, the English across term wide selects an object extent d that is orthogonal to an object extent d', with d' ∈ {max, vert, obs}. The set includes those dimensions from IPS (max) and from PPS (vert, obs) that are independently assignable to object extents. The inherent relativity of wide requires its SF to contain – in addition to the schema in (40) – an ∃-bound variable d' to be instantiated in the situational and/or the linguistic context:

(45) λc [λx [∃d' [[quant across ⊥ d' x] = [v ± c]]]], with d' ∈ {max, vert, obs}

Without contextual clues about d', wide is ambiguous or unspecified between referring to extent a or to extent b, cf. (44.1). In the spatial settings in (43 I–III), the relevant extent d' is visible; in the sentences (44.2–4), d' is linguistically accessible. The intermodal equivalence of visual and verbal contexts wrt. selecting the constant that replaces d' provides a strong argument for the view that the specification of the object extent which wide refers to takes place at the CS level. It is CS representations that provide the visual and/or linguistic information on the basis of which the selectional restriction "d' ∈ {max, vert, obs}" in (45) can be operative, cf. (43) and (44).
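The context-dependent instantiation of d' can likewise be sketched in code (Python; the dictionary encoding of extents and all names are our simplifications, not part of the analysis):

# Illustrative sketch: resolving the ∃-bound d' of "wide" in (45) from context.
# Extents carry the DAP independently assigned to them; "wide" then picks an
# extent distinct from (intuitively: orthogonal to) the anchor extent d'.

ADMISSIBLE_D = {"max", "vert", "obs"}

def interpret_wide(extents, d_prime):
    """extents: axis label -> DAP; d_prime: the DAP supplied by the visual or
    linguistic context. Returns the axis that "wide" refers to."""
    if d_prime not in ADMISSIBLE_D:
        raise ValueError("d' must come from {max, vert, obs}")
    anchor = [ax for ax, dap in extents.items() if dap == d_prime]
    if not anchor:
        return None            # no anchor: "wide" stays unspecified, cf. (44.1)
    return [ax for ax in extents if ax not in anchor][0]

setting_I  = {"a": "max", "b": "O"}      # board as such; "O" stands for Ø
setting_II = {"a": "max", "b": "vert"}   # b contextually aligned with vertical
print(interpret_wide(setting_I, "max"))    # 'b' -- wide = b, cf. (44.2)
print(interpret_wide(setting_II, "vert"))  # 'a' -- wide = a, cf. (44.3)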
However, the restriction on d' is not just an idiosyncratic feature of the lexical item wide/small but a condition on DA terms in L that follows from its typological make-up as a P/O-mixed-type language. Correspondingly, P-based languages restrict across terms to IPS by requiring "d' ∈ {max}", and O-based languages to PPS by requiring "d' ∈ {obs}"; cf. Lang (2001) for details. Now, having located the source of the referential ambiguity of wide – small at the SF level, and having identified CS as the level where the ambiguity is resolved provided that suitable context information is available, we want to know how the spatial settings shown in (43) and verbally described in (44) can be homogenized at the level of CS representations.
4.2. Object Schemata as CS representations

A suitable way of representing concepts of spatial objects is by means of a matrix with 3 rows and up to 3 columns, called an Object Schema (OS), cf. Lang (1989, 1990); Lang, Carstensen & Simmons (1991). An OS contains entries which represent spatial properties of objects in three tiers. The 1st row represents an object's (i) dimensionality by variables for object axes, i.e. a, a b, or a b c, ordered by their relative salience such that within the general OS for buildings the entry vert in a vs. b vs. c differentiates the OS of a sky-scraper from that of an apartment house or of a bungalow; (ii) boundedness by < … > to set apart undimensionable objects (sky, weather) or objects named by mass nouns (air, water); (iii) integration of axes by (…) to distinguish a disk < (a b) c > from a pole < a (b c) > and a ball < (a b c) >. The 2nd row lists the object's gestalt and position properties by primary entries like max, min, vert, obs, which stand either for (i) axial concepts induced by DA terms whose SF contains max, min, vert, obs or for (ii) concepts activated by non-linguistic, i.e. visual or tactile, input on the object at issue. Empty cells with Ø in the 2nd row mark object extents that may be designated by several distinct DAP, depending on the position properties attributed to the object at hand. The 3rd row (separated by a horizontal line) displays the results of contextualizing the entries in the 2nd row and hence the contextually specified DA of the object at issue. The mapping between DAP as SF functor constants in small caps and their counterparts in OS as CS entries in lower case letters involves two operations, defined as follows:

(46) a. Identification: P ⇒ p, with P ∈ {max, min, across, vert, obs, …}, p ∈ {max, min, across, vert, obs, …} and p is a 3rd row entry in OS
b. Specification: Q ⇒ p, with Q ∈ {vert, obs, across, …}, p ∈ {max, Ø, vert, …} and p is licensed as a landing site for Q in OS

(47) below shows the distinct OS serving as CS representations of the board in the settings in (43) as well as of the utterance meanings of the sentences in (44). To elucidate (i) the intermodal equivalence of the context information available from (43) or (44) and (ii) how it is reflected in the corresponding OS, the setting numbers and
the pertinent DA terms for a and b have been added in (47). The respective extent chosen as d' to anchor across in the OS at issue and/or to interpret wide in (44.2–4) is indicated for each setting.

(47) [Object Schemata of the board in settings I–III; shown are the contextually specified 3rd-row entries for the extents a, b, c. I: max – across – min (a = long, b = wide; d' = a); II: across – vert – min (a = wide, b = high; d' = b); III: across – obs – min (a = wide, b = deep; d' = b).]
The OS in (47), as CS representations of (43) and (44), capture all semantic aspects of DA discussed so far, but they deserve some further remarks. First, (47-I) results from primary identification à la (46a), indicated by matching entries in the 2nd and 3rd row, while (47-II and III) are instances of contextual specification as defined in (46b). Second, the typological characteristics of a P/O-mixed-type language are met, as d' for wide may be taken from IPS as in (47 I) or from PPS as in (47 II and III). Third, the rows of an OS, which contain the defining spatial properties and possibly also some contextual specifications, can be taken as a heuristic cue for designing the SF representations of object names that lexically reflect the varying degree of integration into spatial contexts we observe in (43–44), e.g. board (freely movable) < notice-board (hanging) < windowsill (bottom part of a window) – in this respect OS may be seen as an attempt to capture what Bierwisch (article 16 in this volume) calls "dossiers". Fourth, Lang, Carstensen & Simmons (1991) presents a Prolog system of DA using OS enriched by sidedness features, and Lang (2001) proposes a detailed catalogue of types of spatial objects with their OS, accounting for primary entries and for contextually induced orientation or perspectivization. Fifth, despite their close interaction by means of the operations in (46), DAP as elements of SF representations and OS entries as CS elements are subject to different constraints, which is another reason to keep them distinct. The entries in an OS are subject to conditions of conceptual compatibility that inter alia define the set of admissible complex OS entries, listed as vertically arranged pairs in (48):

(48) max     max    max    Ø       Ø      Ø
     across  vert   obs    across  vert   obs
An important generalization is that (48) holds independently of the way in which the complex entry happens to come about. So, the combination of max and vert in the same column may result from primary identification in the 2nd row, cf. The pole is 2 m tall, where the SF of tall contains max & vert x as a conjunction of DAP, or from contextual specification, cf. The pole is 2 m high, where vert is added in the 3rd row. The semantic structure of DA terms is therefore constrained by compatibility conditions at the CS level, but within this scope it is cross-linguistically open to different lexicalization patterns and to variation in what is covered by the SF of single DA terms.
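To make the interplay of the operations in (46) with the compatibility pairs in (48) concrete, here is a minimal illustrative sketch (Python; representing an OS column as a pair of 2nd- and 3rd-row entries is our simplification, not part of the theory):

# Illustrative sketch: the mapping operations (46a, b) over OS columns,
# constrained by the admissible complex entries of (48). 'O' stands for Ø.

ADMISSIBLE = {("max", "across"), ("max", "vert"), ("max", "obs"),
              ("O", "across"), ("O", "vert"), ("O", "obs")}

def identify(column, p):
    """(46a) Identification: copy the 2nd-row entry p into the 3rd row."""
    assert column[0] == p, "identification requires a matching 2nd-row entry"
    return (p, p)

def specify(column, q):
    """(46b) Specification: let q land on a licensed 2nd-row entry."""
    if (column[0], q) in ADMISSIBLE:
        return (column[0], q)
    raise ValueError(f"{q} may not land on {column[0]}")

print(identify(("max", None), "max"))     # ('max', 'max'): a = long, setting I
print(specify(("O", None), "vert"))       # ('O', 'vert'): b = high, setting II
print(specify(("max", None), "across"))   # ('max', 'across'): a = wide, setting II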
Finally, whereas an OS may contain one or more Ø entries, or entries that have a share in both IPS and PPS (as does e.g. across), the DA of spatial objects by linguistic means is subject to the following uniqueness constraint:

(49) In an instance of naming distinct axial extents a, b, c of some object x by enumerating DA terms, each DAP and each extent may occur only once.

Reminiscent of the Θ-criterion, (49) excludes e.g. (i) *The board is long and wide enough, but too small or (ii) *The pole is 2 m long and 2 m high/tall as ill-formed. Though disguised by distinct lexical labels, wide and small in (i) are conflicting occurrences of the DAP across, whereas long and high/tall in (ii) compete for one and the same extent a. The uniqueness constraint in (49) exemplifies one of the pragmatic felicity conditions on linguistic communication; cf. §1.4 above. Structurally, (49) follows from the homogeneity condition on the conjuncts in coordinate structures; theoretically, (49) is an outcome of the Gricean Maxim of Manner, especially of the sub-maxim "Avoid ambiguity!".
4.3. Inferences

The distinction of SF vs. CS representations, hitherto exemplified by DAP as SF constants for dimension terms and by OS as a CS format for spatial objects, respectively, is also relevant to the way inferences in the realm of spatial cognition are semantically accounted for. The SF vs. CS distinction outlined by (2)–(5) in §1.2 reappears in a division of labor between (i) inferences that draw on permanent lexical knowledge made available in SF format and (ii) inferences that are performed on contextually specified CS representations. We will illustrate this correlation by means of three groups of data.
4.3.1. Lexical antonymy

While hyponymy and synonymy are non-typical lexical relations among DA terms, various facets of antonymy seem to be indispensable to them; cf. Lang (1995). The SF of DA terms, cf. (40) and (45), is componential, as it results from decomposing the meaning of lexical items into suitable building blocks, that is, into SF components which are interrelated by meaning postulates and which therefore allow for purely lexicon-based inferences. There are two sorts: (i) schema-forming SF components (e.g. become and cause, cf. Bierwisch 2005, 2010; Wunderlich 1997a); and (ii) schema-filling SF components (e.g. the elements of {dim} in (42) and (46) or operative elements like '∃', '±' or '=' in (45)). Two DA terms are lexical antonyms if they (i) share the same DAP in forming polar opposites, (ii) assign contrary values (+ vs. –) of c to the comparison value v, and (iii) allow for converse comparatives, etc. Inferences that draw on lexical antonymy show up in entailments between sentences, cf. (50), and are codified as lexical knowledge postulates at the SF level, cf. (51). We neglect details concerning '=', abbreviate SF (the board) by B, and take N(orm value) and K(ey value) to instantiate the comparison value v in (50a) and
(50b), respectively. For the whole range of entailments and SF postulates based on DA terms see Bierwisch (1989).

(50) a. The board is short → The board is not long.
b. The board is not long enough ↔ The board is too short.
(51) a. ∃c [[quant max B] = [N – c]] ⇒ ∼ [∃c [[quant max B] = [N + c]]]
b. ∼ [∃c [[quant max B] = [K + c]]] ⇔ ∃c [[quant max B] = [K – c]]
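Read procedurally, the postulates in (51) license purely lexicon-based entailment checks. The following sketch (Python; modeling extents and the values N, K as numbers is our simplification) verifies the two patterns of (50) over a small sample space:

# Illustrative sketch: the entailments (50a, b) as consequences of the
# SF postulates (51a, b), with degrees modeled as numbers.

def short(extent, N):        return extent < N    # ∃c>0: extent = N - c
def long_(extent, N):        return extent > N    # ∃c>0: extent = N + c
def long_enough(extent, K):  return extent >= K
def too_short(extent, K):    return extent < K

samples = [(x / 10, v / 10) for x in range(0, 30) for v in range(1, 30)]
# (51a): "short" entails "not long" for every extent and norm value N.
print(all(not long_(x, n) for x, n in samples if short(x, n)))               # True
# (51b): "not long enough" is equivalent to "too short" wrt a key value K.
print(all((not long_enough(x, k)) == too_short(x, k) for x, k in samples))   # True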
4.3.2. Contextually induced dimensional designation

Valid inferences like those in (52) are accounted for, and invalid ones like those in (53) are avoided, by drawing on the information provided by, or else lacking in, contextually specified OS.

(52) a. The board is 1 m wide and 0.3 m high → The board is 1 m long and 0.3 m wide
b. The pole is 2 m tall/2 m high → The pole is 2 m long
(53) a. The wall is wide and high enough ↛ The wall is long and wide enough
b. The tower is 10 m tall/high ↛ *The tower is 10 m long.

The valid inferences result from the operation of de-specification, which is simply the reverse of the operation of contextual specification defined in (46b):

(54) De-specification:
a. For any OS for x with a vertical entry < p, q >, there is an OS' with < p, p >.
b. For any OS for x with a vertical entry < Ø, q >, there is an OS' with < Ø, across >.
The inferences in (53a, b) are ruled out as invalid because the OS under review do not contain the type of entries needed for (54) to apply.
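A minimal sketch of (54) (Python; the pair encoding of OS columns follows the sketch after (48) and is our own simplification):

# Illustrative sketch: de-specification (54) as the reverse of (46b),
# licensing inferences like (52a). 'O' stands for Ø.

def despecify(column):
    p, q = column
    if p != "O" and q is not None:     # (54a): < p, q >  =>  < p, p >
        return (p, p)
    if p == "O" and q is not None:     # (54b): < Ø, q >  =>  < Ø, across >
        return (p, "across")
    return column

# (52a): "1 m wide and 0.3 m high" licenses "1 m long and 0.3 m wide":
print(despecify(("max", "across")))    # ('max', 'max'): wide reverts to long
print(despecify(("O", "vert")))        # ('O', 'across'): high reverts to wide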
4.3.3. Commensurability of object extents

Note that the DA terms long, wide and/or thick are not hyponyms of big, despite the fact that big may refer to the [v + c] of one, two or all three extents of a 3D object, depending on the OS of the objects at issue. When objects differing in dimensionality are compared by using the DA term big, the dimensions it covers are determined by the common share of the OS involved, cf. (55):

(55) a. My car is too big for the parking space. (too long and/or too wide)
b. My car is too big for the garage door. (too wide and/or too high)
So it is above all the two mapping operations between SF and CS representations as defined in (46a, b) and exemplified by DAP and OS that account for the whole range of seemingly complicated facts about DA to spatial objects.
5. Summary and outlook

In this article we have reported on some pros and cons related to distinguishing SF and CS representations and illustrated them with data and facts from a selection of semantic phenomena. Now we briefly outline the state of the art in more general terms and take a look at the desiderata that define the agenda for future research. The current situation can be summarized in three statements: (i) the SF vs. CS distinction brings clear-cut advantages, as shown by the examples in §§ 2–4; (ii) we still lack reliable heuristic strategies for identifying the appropriate SF of a lexical item; (iii) it is difficult to define the scope of variation a given SF can cover at the CS level. What we urgently need is independent evidence for the basic assumption underlying the distinction: SF representations and CS representations differ in nature, as they are subject to completely different principles of organization. By correlating the SF vs. CS distinction with distinctions relevant to other levels of linguistic structure formation, cf. (2)–(5) in section 1, the article has taken some steps in that direction. One of them is to clarify the differences between SF and CS that derive from their linguistic vs. non-linguistic origin; cf. (4). The linguistic basis of the SF representations of DA terms, for instance, is manifested (i) in the DAP constants' interrelation by postulates underlying lexical relations, (ii) in their participation in certain lexicalization patterns (e.g. proportion-based vs. observer-based), and (iii) in their being subject to the uniqueness constraint in (49), which is indicative of the semioticity of the system it applies to, whereas the Conceptual System CS is not a semiotic one. Pursuing this line of research, phenomena specific to natural languages like idiosyncrasies, designation gaps, collocations, connotations, folk etymologies etc. should be scrutinized for their possible impact on establishing SF as a linguistically determined level of representation. The non-linguistic basis of CS representations, e.g. the OS involved in DA, is manifested (i) in the fact that OS entries are exclusively subject to perception-based compatibility conditions; (ii) in their function of integrating input from the spatial environment regardless of the channel it comes in through; and (iii) in their property of allowing for valid inferences to be drawn on entries that are induced as contextual specifications. To deepen our understanding of CS representations, presumptions like the following deserve to be investigated on a broader spectrum and in more detail: (i) CS representations may be underspecified in certain respects, cf. the role of 'Ø' in OS, but as they are not semiotic entities they are not ambiguous; (ii) the compatibility conditions defining admissible OS suggest that the following relation may hold wrt. the well-formedness of representations: sortal restrictions ⊂ selectional restrictions; (iii) CS representations have to be contingent, since contradictory entries cause the system of inferences to break down, whereas contradictions at the SF level trigger accommodation activities. As the agenda above suggests, a better understanding of the interplay of linguistic and non-linguistic aspects of meaning constitution along the lines developed here is to be expected particularly from interdisciplinary research combining methods and insights from linguistics, psycholinguistics, neurolinguistics and cognitive psychology.
6. References

Asher, Nicholas 2007. A Web of Words: Lexical Meaning in Context. Ms. Austin, TX: University of Texas.
Bierwisch, Manfred 1983. Semantische und konzeptuelle Interpretation lexikalischer Einheiten. In: R. Růžička & W. Motsch (eds.). Untersuchungen zur Semantik. Berlin: Akademie Verlag, 61–99.
Bierwisch, Manfred 1988. On the grammar of local prepositions. In: M. Bierwisch, W. Motsch & I. Zimmermann (eds.). Syntax, Semantik und Lexikon. Berlin: Akademie Verlag, 1–65.
Bierwisch, Manfred 1989. The semantics of gradation. In: M. Bierwisch & E. Lang (eds.). Dimensional Adjectives: Grammatical Structure and Conceptual Interpretation. Berlin: Springer, 71–261.
Bierwisch, Manfred 1996. How much space gets into language? In: P. Bloom et al. (eds.). Language and Space. Cambridge, MA: The MIT Press, 31–76.
Bierwisch, Manfred 1997. Lexical information from a minimalist point of view. In: Ch. Wilder, H.-M. Gärtner & M. Bierwisch (eds.). The Role of Economy Principles in Linguistic Theory. Berlin: Akademie Verlag, 227–266.
Bierwisch, Manfred 2005. The event structure of cause and become. In: C. Maienborn & A. Wöllstein (eds.). Event Arguments: Foundations and Applications. Tübingen: Niemeyer, 11–44.
Bierwisch, Manfred 2007. Semantic Form as interface. In: A. Späth (ed.). Interfaces and Interface Conditions. Berlin: de Gruyter, 1–32.
Bierwisch, Manfred 2010. become and its presuppositions. In: R. Bäuerle, U. Reyle & T. E. Zimmermann (eds.). Presupposition and Discourse. Bingley: Emerald, 189–234.
Bierwisch, Manfred & Peter Bosch (eds.) 1995. Semantic and Conceptual Knowledge (Arbeitspapiere des SFB 340 Nr. 71). Heidelberg: IBM Deutschland.
Bierwisch, Manfred & Ewald Lang (eds.) 1989a. Dimensional Adjectives: Grammatical Structure and Conceptual Interpretation. Berlin: Springer.
Bierwisch, Manfred & Ewald Lang 1989b. Somewhat longer – much deeper – further and further. Epilogue to the Dimensional Adjective Project. In: M. Bierwisch & E. Lang (eds.). Dimensional Adjectives: Grammatical Structure and Conceptual Interpretation. Berlin: Springer, 471–514.
Bierwisch, Manfred & Rob Schreuder 1992. From concepts to lexical items. Cognition 42, 23–60.
Blutner, Reinhard 1995. Systematische Polysemie: Ansätze zur Erzeugung und Beschränkung von Interpretationsvarianten. In: M. Bierwisch & P. Bosch (eds.). Semantic and Conceptual Knowledge (Arbeitspapiere des SFB 340 Nr. 71). Heidelberg: IBM Deutschland, 33–67.
Blutner, Reinhard 1998. Lexical pragmatics. Journal of Semantics 15, 115–162.
Blutner, Reinhard 2004. Pragmatics and the lexicon. In: L. R. Horn & G. Ward (eds.). The Handbook of Pragmatics. Malden, MA: Blackwell, 488–514.
Bücking, Sebastian 2009. Modifying event nominals: Syntactic surface meets semantic transparency. In: A. Riester & T. Solstad (eds.). Proceedings of Sinn und Bedeutung (= SuB) 13. Stuttgart: University of Stuttgart, 93–107.
Bücking, Sebastian 2010. Zur Interpretation adnominaler Genitive bei nominalisierten Infinitiven im Deutschen. Zeitschrift für Sprachwissenschaft 29, 39–77.
Carstensen, Kai-Uwe 2001. Sprache, Raum und Aufmerksamkeit. Tübingen: Niemeyer.
Dölling, Johannes 1994. Sortale Selektionsbeschränkungen und systematische Bedeutungsvariabilität. In: M. Schwarz (ed.). Kognitive Semantik – Cognitive Semantics. Tübingen: Narr, 41–60.
Dölling, Johannes 2001. Systematische Bedeutungsvariationen: Semantische Form und kontextuelle Interpretation (Linguistische Arbeitsberichte 78). Leipzig: University of Leipzig.
Dölling, Johannes 2003. Flexibility in adverbal modification: Reinterpretation as contextual enrichment. In: E. Lang, C. Maienborn & C. Fabricius-Hansen (eds.). Modifying Adjuncts. Berlin: de Gruyter, 511–552.
Dölling, Johannes 2005a. Semantische Form und pragmatische Anreicherung: Situationsausdrücke in der Äußerungsinterpretation. Zeitschrift für Sprachwissenschaft 24, 159–225.
Dölling, Johannes 2005b. Copula sentences and entailment relations. Theoretical Linguistics 31, 317–329.
Dölling, Johannes 2010. Aspectual coercion and eventuality structure. To appear in: K. Robering & V. Engerer (eds.). Verbal Semantics.
Egg, Markus 2003. Beginning novels and finishing hamburgers. Remarks on the semantics of 'to begin'. Journal of Semantics 20, 163–191.
Egg, Markus 2005. Flexible Semantics for Reinterpretation Phenomena. Stanford, CA: CSLI Publications.
Geeraerts, Dirk 2010. Theories of Lexical Semantics. Oxford: Oxford University Press.
Greenberg, Joseph H. 1978. Generalizations about numerical systems. In: J. H. Greenberg (ed.). Universals of Human Language, vol. 3: Word Structure. Stanford, CA: Stanford University Press, 249–295.
Heim, Irene & Angelika Kratzer 1998. Semantics in Generative Grammar. Oxford: Blackwell.
Herweg, Michael 1989. Ansätze zu einer semantischen Beschreibung topologischer Präpositionen. In: Chr. Habel, M. Herweg & K. Rehkämper (eds.). Raumkonzepte in Verstehensprozessen. Interdisziplinäre Beiträge zu Sprache und Raum. Tübingen: Niemeyer, 99–127.
Herzog, Otthein & Claus-Rainer Rollinger (eds.) 1991. Text Understanding in LILOG: Integrating Computational Linguistics and Artificial Intelligence (Lecture Notes in Artificial Intelligence 546). Berlin: Springer.
Higginbotham, James 1985. On semantics. Linguistic Inquiry 16, 547–593.
Hobbs, Jerry R., Mark Stickel, Douglas Appelt & Paul Martin 1993. Interpretation as abduction. Artificial Intelligence 63, 69–142.
Hottenroth, Monika-Priska 1991. Präpositionen und Objektkonzepte. Ein kognitiv orientiertes, zweistufiges Modell für die Semantik lokaler Präpositionen. In: G. Rauh (ed.). Approaches to Prepositions. Tübingen: Narr, 77–107.
Hurford, James R. 1975. The Linguistic Theory of Numerals. Cambridge: Cambridge University Press.
Jackendoff, Ray 1996. The architecture of the linguistic-spatial interface. In: P. Bloom et al. (eds.). Language and Space. Cambridge, MA: The MIT Press, 1–30.
Jackendoff, Ray 2002. Foundations of Language. Brain, Meaning, Grammar, Evolution. Oxford: Oxford University Press.
Kaup, Barbara, Jana Lüdtke & Claudia Maienborn 2010. 'The drawer is still closed': Simulating past and future actions when processing sentences that describe a state. Brain & Language 112, 159–166.
Lang, Ewald 1985. Symmetrische Prädikate: Lexikoneintrag und Interpretationsspielraum. Eine Fallstudie zur Semantik der Personenstandslexik. Linguistische Studien des ZISW, Reihe A 127, 75–113.
Lang, Ewald 1989. The semantics of dimensional designation of spatial objects. In: M. Bierwisch & E. Lang (eds.). Dimensional Adjectives: Grammatical Structure and Conceptual Interpretation. Berlin: Springer, 263–417.
Lang, Ewald 1990. Primary perceptual space and inherent proportion schema. Journal of Semantics 7, 121–141.
Lang, Ewald 1994. Semantische vs. konzeptuelle Struktur: Unterscheidung und Überschneidung. In: M. Schwarz (ed.). Kognitive Semantik – Cognitive Semantics. Tübingen: Narr, 25–40.
Lang, Ewald 1995. Das Spektrum der Antonymie. In: G. Harras (ed.). Die Ordnung der Wörter. Kognitive und lexikalische Strukturen. Berlin: de Gruyter, 30–98.
Lang, Ewald 2001. Spatial dimension terms. In: M. Haspelmath et al. (eds.). Language Typology and Universals. An International Handbook. Vol. 2. Berlin: de Gruyter, 1251–1275.
Lang, Ewald 2004. Schnittstellen bei der Konnektoren-Beschreibung. In: H. Blühdorn, E. Breindl & U. H. Waßner (eds.). Brücken schlagen. Grundlagen der Konnektorensemantik. Berlin: de Gruyter, 45–92.
Lang, Ewald, Kai-Uwe Carstensen & Geoff Simmons 1991. Modelling Spatial Knowledge on a Linguistic Basis. Theory – Prototype – Integration. Berlin: Springer.
Maienborn, Claudia 2001. On the position and interpretation of locative modifiers. Natural Language Semantics 9, 191–240.
Maienborn, Claudia 2003. Event-internal modifiers: Semantic underspecification and conceptual interpretation. In: E. Lang, C. Maienborn & C. Fabricius-Hansen (eds.). Modifying Adjuncts. Berlin: de Gruyter, 475–509.
Maienborn, Claudia 2005a. On the limits of the Davidsonian approach: The case of copula sentences. Theoretical Linguistics 31, 275–316.
Maienborn, Claudia 2005b. Eventualities and different things: A reply. Theoretical Linguistics 31, 383–396.
Meyer, Ralf 1994. Probleme von Zwei-Ebenen-Semantiken. Kognitionswissenschaft 4, 32–46.
Moens, Mark & Marc Steedman 1988. Temporal ontology and temporal reference. Computational Linguistics 14, 15–28.
Pulman, Stephen G. 1997. Aspectual shift as type coercion. Transactions of the Philological Society 95, 279–317.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press.
Pylkkänen, Liina & Brian McElree 2006. The syntax-semantics interface: On-line composition of meaning. In: M. A. Gernsbacher & M. Traxler (eds.). Handbook of Psycholinguistics. 2nd edn. Amsterdam: Elsevier, 537–577.
Rothstein, Susan 2005. Response to 'On the limits of the Davidsonian approach: The case of copula sentences' by Claudia Maienborn. Theoretical Linguistics 31, 375–381.
Schwarz, Monika 1992. Kognitive Semantiktheorie und neuropsychologische Realität. Repräsentationale und prozedurale Aspekte der semantischen Kompetenz. Tübingen: Niemeyer.
Schwarze, Christoph & Marie-Theres Schepping 1995. Polysemy in a two-level semantics. In: U. Egli et al. (eds.). Lexical Knowledge in the Organization of Language. Amsterdam: Benjamins, 283–300.
Stolterfoht, Britta, Helga Gese & Claudia Maienborn 2010. Word category conversion causes processing costs: Evidence from adjectival passives. Psychonomic Bulletin & Review 17, 651–656.
de Swart, Henriëtte 1998. Aspect shift and coercion. Natural Language and Linguistic Theory 16, 347–385.
Taylor, John A. 1994. The two-level approach to meaning. Linguistische Berichte 149, 3–26.
Taylor, John A. 1995. Models for word meaning: The network model (Langacker) and the two-level model (Bierwisch) in comparison. In: R. Dirven & J. Vanparys (eds.). Current Approaches to the Lexicon. Frankfurt/M.: Lang, 3–26.
Wunderlich, Dieter 1997a. cause and the structure of verbs. Linguistic Inquiry 28, 27–68.
Wunderlich, Dieter 1997b. Argument extension by lexical adjunction. Journal of Semantics 14, 95–142.
Wunderlich, Dieter & Michael Herweg 1991. Lokale und Direktionale. In: A. von Stechow & D. Wunderlich (eds.). Semantik – Semantics. Ein internationales Handbuch der zeitgenössischen Forschung – An International Handbook of Contemporary Research (HSK 6). Berlin: de Gruyter, 758–785.
Ewald Lang, Berlin (Germany)
Claudia Maienborn, Tübingen (Germany)
32. Word meaning and world knowledge

1. Introduction
2. Core abstract theories
3. Linking word meaning with the theories
4. Distinguishing lexical and world knowledge
5. References
Abstract

Lexical semantics should be in part about linking the meanings of words with underlying theories of the world. But for this to be even remotely possible, the theories need to be informed by the insights of cognitive and other linguists about the conceptual structure on which language is based. They have to be axiomatizations of a kind of abstract topology that, for example, includes the domains of composite entities (things made of other things), scalar notions, change of state, and causality. Theories of each of these domains are sketched briefly, and it is shown how three very common polysemous words can be defined or characterized in terms of these theories. Finally, there is a discussion of what sort of boundary one can hope to draw between lexical knowledge and other world knowledge.
1. Introduction

We use words to talk about the world. Therefore, to understand what words mean, we should have a prior explication of how we view the world. Suppose we have a formal logical theory of some domain, or some aspect of the world, that is, a set of predicates intended to capture the concepts in that domain and a set of axioms or rules that constrain the possible meanings of those predicates. Then a formal theory of lexical semantics in that domain would be a matter of writing axioms to relate predicates corresponding to the words in the domain to the predicates in the underlying theory of the domain. For example, the word "until" might be anchored in a formal theory of time that provides an axiomatization of intervals and a before relation. (See article 29 (Gawron) Frame Semantics for a similar view, where frames correspond to the domain theory.) For the last forty years researchers in artificial intelligence have made efforts to encode various aspects of world knowledge formally. These efforts have primarily been in commonsense physics, in the areas of space, time, and qualitative physics, and in commonsense psychology, in concepts related to belief and intention. A good review of this work that is old but has not lost its relevance is Davis (1990). Most of this work has focused on narrow areas of commonsense knowledge. But there have been several large-scale efforts to encode knowledge of many domains, most notably Cyc (Lenat & Guha 1990; Cycorp 2008). One might think that this work could form the basis of an effort toward a formal theory of lexical semantics anchored in world knowledge. However, these theories for the most part were not designed with language in mind, and in particular what is missing is precisely some of the linguists' insights described in the previous several articles of this volume. All of this seriously undercuts the utility for lexical semantics of Cyc and similar large ontologies, and indeed of most of the small-scale theories as well. In trying to link words and world, there are a number of bad ways to go about it. For example, we could take our theory of the world to be quantum mechanics and attempt to define, say, verbs of motion in terms of the primitives provided by that theory. A less obviously wrong approach, and one that has sometimes been tried, is to adopt Euclidean 3-space as the underlying model of space and attempt to define, say, spatial prepositions in terms of that. More common is a serious misstep, with respect to language, that many large-scale ontologies take at the start. Cyc begins by enforcing a rigid distinction between tangible and intangible entities, and in other hierarchical ontologies the top-level split is between physical and abstract entities. Yet this distinction plays very little
role in language. We can be in a room, in a social group, in the midst of an activity, in trouble, and in politics. We can move a chair from the desk to the table, move money from one bank account to another, move a discussion from religion to politics, and move an audience to tears. A fundamental distinction between tangibles and intangibles rules out the possibility of understanding the sense of "in" or "move" common to all these uses.

Our effort, by contrast, has sought to exploit the insights of linguists such as Gruber (1965), the generative semanticists, Johnson (1987), Lakoff (1987), Jackendoff (see article 30 (Jackendoff) Conceptual Semantics), and Talmy (see article 27 (Talmy) Cognitive Semantics: An overview). Johnson, Lakoff, Talmy and others have used the term "image schemas" to refer to a conceptual framework that includes topological relations but excludes, for example, Euclidean notions of magnitude and shape. We have been developing core theories that formalize something like the image schemas, and we have been using these to define or characterize words. Among the theories we have developed are theories of composite entities, or things made of other things, the figure-ground relation, scalar notions, change of state, and causality. The idea behind these abstract core theories is that they capture a wide range of phenomena that share certain features. The theory of composite entities, for example, is intended to accommodate natural physical objects like volcanos, artifacts like automobiles, complex events and processes like concerts and photosynthesis, and complex informational objects like mathematical proofs. The theory of scales captures commonalities shared by distance, time, numbers, money, and degrees of risk, severity, and happiness. The most common words in English (and other languages) can be defined or characterized in terms of these abstract core theories. Specific kinds of composite entities and scales, for example, are then defined as instances of these abstract concepts, and we thereby gain access to the rich vocabulary the abstract theories provide.

We can illustrate the link between word meaning and core theories with the rather complex verb "range". A core theory of scales provides axioms involving predicates such as scale, in s. The converse is not necessarily true; the composite scale may have more structure than that inherited from its component scales. We need composite scales to deal with complex scalar predicates, such as damage. When something is damaged, it no longer fulfills its function in a goal-directed system. It needs to be repaired, and repairs cost. Thus, there are (at least) two ways in which damage can be serious: first in the degradation of its function, second in the cost of its repair. These are independent scales. Damage that causes a car not to run may cost next to nothing to fix, and damage that only causes the car to run a little unevenly may be very expensive.

It is very useful to be able to isolate the high and low regions of a scale. We can do this with operators called Hi and Lo. The Hi region of a scale includes its top; the Lo region includes its bottom. The points in the Hi region are all greater than any of the points in the Lo region. Otherwise, there are no general topological constraints on the Hi and Lo regions. In particular, the bottom of the Hi region and the top of the Lo region may be indeterminate with respect to the elements of the scale.
The Hi and Lo operators provide us with a coarse-grained structure on scales, useful when greater precision is not necessary or not possible. The absolute form of adjectives frequently isolates the Hi and Lo regions of scales. A totally ordered Height Scale can be defined precisely, but frequently we are only interested in qualitative judgments of height. The word "tall" isolates the Hi region of the Height Scale; the word "short" isolates the Lo region. A Happiness Scale cannot be defined precisely. We cannot get much more structure for a Happiness Scale than what is
given to us by the Hi and Lo operators. The Hi and Lo operators can be iterated, to give us the concepts "happy", "very happy", and so on. In any given context, the Hi and Lo operators will identify different regions of the scale. That is, the inferences we can draw from the fact that something is in the Hi region of a scale are context-dependent; indeed, inferences are always context-dependent. But two important constraints on the Hi and Lo regions relate them to distributions and functionality. The Hi and Lo regions must be related to common distributions of objects on the scale in an as-yet nonexistent qualitative theory of distributions. If something is significantly above average for the relevant set, then it is in the Hi region. The regions must also be related to goal-directed behavior; often something is in the Hi region of a scale precisely because that property aids or defeats the achievement of some goal in a plan. For example, saying that a talk is long often means that it is longer than the audience's attention span, and thus the goal of conveying information is defeated. Often when we call someone tall, we mean tall enough or too tall for some purpose.
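As a rough illustration of how the Hi operator might be tied to distributions, consider the following sketch (Python; the mean-plus-deviation threshold is merely our stand-in for the as-yet nonexistent qualitative theory of distributions mentioned above):

# Illustrative sketch: a context-dependent Hi region, defined relative to
# the distribution of the relevant comparison set.

from statistics import mean, stdev

def hi(value, comparison_set, margin=1.0):
    """In the Hi region: significantly above average for the relevant set."""
    return value > mean(comparison_set) + margin * stdev(comparison_set)

heights = [160, 165, 170, 172, 175, 178, 180, 183]   # cm, one relevant set
print(hi(195, heights))                      # True: "tall" relative to this set
print(hi(195, [190, 195, 200, 205, 210]))    # False: not tall among these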
2.3. Change of state

A predicate of central importance is the predicate change. This is a relation between situations, or conditions, or predications, and indicates a change of state. A change from p being true of x to q being true of x, using an ontologically promiscuous notation that reifies states and events (see Hobbs 1985; article 34 (Maienborn) Event semantics), can be represented

change(e1, e2) ∧ p'(e1, x) ∧ q'(e2, x)

This says that there is a change from the situation e1 of p being true of x to the situation e2 of q being true of x. A very common pattern involves a change of location:

change(e1, e2) ∧ at'(e1, x, y, s) ∧ at'(e2, x, z, s)

That is, there is a change from the situation e1 of x being at y in s to the situation e2 of x being at z in s. When there is a change, generally there is some entity involved in both the start and end states; there is something that is changing, namely x in the above formulas. The predicate change possesses a limited transitivity. There was a change from Bill Clinton being a law student to Bill Clinton being President, because they are two parts of the same ongoing process, even though he was governor in between. There was a change from Bill Clinton being President to George W. Bush being President. But we probably do not want to say there was a change from Bill Clinton being a law student to George W. Bush being President. They are not part of the same process. A state cannot change into the same state without going through an intermediate different state. The concept of change is linked with time in the obvious way. If state e1 changes into state e2, then e2 cannot be before e1. My view is that the relation between change and time is much deeper, cognitively. The theory of change of state suggests a view of the world as consisting of a large number of more or less independent, occasionally interacting processes, or histories, or sequences of events. x goes through a series of changes, and y
32. Word meaning and world knowledge goes through a series of changes, and occasionally there is a state that involves a relation between the two. We can then view the time line as an artificial construct, a regular sequence of imagined abstract events – think of them as ticks of a clock in the National Institute of Science and Technology – to which other events can be related by chains of copresence. Thus, I know I went home at six o’clock because I looked at my watch, and I had previously set my watch by going to the NIST Web site. In any case, there is no need to choose between such a view of time and one that takes time as basic. They are inter-definable in a straightforward fashion (Hobbs et al. 1987). For convenience, we define one-argument predicates changeFrom and changeTo, suppressing one or the other argument of change.
2.4. Cause

Our treatment of causality (Hobbs 2005) rests on a distinction between causal complexes and the predicate cause. When we flip a switch and the light comes on, we say that flipping the switch caused the light to come on. But many other factors were involved. The wiring and the light bulb had to be intact, the power had to be on in the city, and so forth. We say that all these other states and events constitute the causal complex for the effect. A causal complex for an effect is the set of all the eventualities that must happen or hold in order for the effect to occur. The two principal properties of causal complexes are that when all the eventualities happen, the effect happens, and that every eventuality in the causal complex is required for the effect to happen. These are strictly true, and the notion of causal complex is not a defeasible one. The "cause" of an effect, by contrast, is a distinguished element within the causal complex, one that cannot normally be assumed to hold. It is often the action that is under the agent's immediate control. It is only defeasibly true that when a cause occurs the effect also occurs. This inference can be defeated because some of the other states and events in the causal complex that normally hold do not hold in this particular case. The notion of cause is much more useful in commonsense reasoning, because we can rarely if ever enumerate all the eventualities in a causal complex. Most of our commonsense causal knowledge is expressed in terms of the predicate cause. The concept cause has the expected properties, such as defeasible transitivity and consistency with temporal ordering. But we should not expect to have a highly developed theory of causality per se. Rather we should expect to see causal information distributed throughout our knowledge base. For example, there is no axiom of the form

(∀ e1, e2) cause(e1, e2) ≡ …

defining cause. But there will be many axioms of the forms

p'(e1, x) ⊃ q'(e2, x) ∧ cause(e1, e2)
r'(e3, x) ⊃ p'(e1, x) ∧ cause(e1, e3)

expressing causal connections among specific states and events; e.g., p-like events cause q-like events, or r-like events are caused by p-like events. We don't know precisely what causality is, but we know lots and lots of examples of things that cause other things.
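The contrast between the strict causal complex and the defeasible cause can be rendered with the light-switch example (a Python sketch; the set-based encoding is our simplification):

# Illustrative sketch: causal complex vs. cause for "flipping the switch
# caused the light to come on".

causal_complex = {"flip_switch", "wiring_intact", "bulb_intact", "power_on"}
cause = "flip_switch"     # the element that cannot normally be assumed to hold

def effect_occurs(eventualities):
    """Strict: the effect occurs iff the whole causal complex holds."""
    return causal_complex <= eventualities

def defeasibly_expect_effect(happened, assumed_normal):
    """Defeasible: a cause licenses the effect unless a normal condition fails."""
    return cause in happened and causal_complex <= (happened | assumed_normal)

normal = {"wiring_intact", "bulb_intact", "power_on"}
print(effect_occurs({"flip_switch"} | normal))                           # True
print(defeasibly_expect_effect({"flip_switch"}, normal))                 # True
print(defeasibly_expect_effect({"flip_switch"}, normal - {"power_on"}))  # False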
Some would urge that causes and effects can only be events, but it seems to me that we want to allow states as well, since in

The slipperiness of the ice caused John to fall.

the cause is a state. Moreover, intentional agents are sometimes taken to be the unanalyzed causes of events. In

John lifted his arm.

John is the cause of the change of position of his arm, and we probably don't want to have to coerce this argument into some imagined event taking place inside John. Physical forces may also act as causes, as in

Gravity causes the moon to circle the earth.
The world is laced with threads of causal connection. In general, two entities x and y are causally connected with respect to some behavior p of x if, whenever p happens to x, there is some corresponding behavior q that happens to y. Attachment of physical objects is one variety of causal connection. In this case, p and q are both move: if x and y are attached, moving x causes y to move. Containment is similar. A particularly common variety of causal connection between two entities is one mediated by the motion of a third entity from one to the other. This might be called, somewhat facetiously, a "vector boson" connection. In particle physics, a vector boson is an elementary particle that transfers energy from one point to another. Photons, which really are vector bosons, mediate the causal connection between the sun and our eyes. Other examples of such causal connections are rain drops connecting a state of the clouds with the wetness of our skin and clothes, a virus transmitting disease from one person to another, and utterances passing information between people. Containment, barriers, openings, and penetration are all with respect to paths of causal connection. Force is causality with a scalar structure (see article 27 (Talmy) Cognitive Semantics: An overview). The event structure underlying many verbs exhibits causal chains. Instruments, for example, are usually vector bosons. In the sentence

John pounded the nail with a hammer for Bill.

the underlying causal structure is that the agent John causes a change in location of the instrument, the hammer, which causes a change in location of the object or theme, the nail, which causes or should cause a change in the mental or emotional state of the beneficiary, Bill.

Agent –cause–> change(at(Instr, x, s), at(Instr, Object, s))
      –cause–> change(at(Object, y1, s), at(Object, y2, s))
      –cause–> change(p1(Beneficiary), p2(Beneficiary))
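In code, the chain and the role assignments it supports might be sketched as follows (Python; the encoding is ours and purely illustrative):

# Illustrative sketch: the causal chain of "John pounded the nail with a
# hammer for Bill", with thematic roles read off chain positions.

chain = [
    {"mover": "hammer", "from": "x",  "to": "nail"},   # Agent moves Instrument
    {"mover": "nail",   "from": "y1", "to": "y2"},     # Instrument moves Theme
    {"mover": "Bill",   "from": "p1", "to": "p2"},     # Theme affects Beneficiary
]

def thematic_roles(agent, chain):
    """Assign roles by position in the causal chain, as suggested above."""
    return {"Agent": agent,
            "Instrument": chain[0]["mover"],
            "Theme": chain[1]["mover"],
            "Beneficiary": chain[2]["mover"]}

print(thematic_roles("John", chain))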
Much of case grammar and work on thematic roles can be seen as a matter of identifying where the arguments of verbs fit into this kind of causal chain when we view the verbs as instantiating this abstract frame (see Jackendoff 1972; article 18 (Davies) Thematic roles; article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure). In addition, in this theory we define such concepts as enable, prevent, help, and obstruct. There are also treatments of attempts, success, failure, ability, and difficulty. With this vocabulary, we are in a position to characterize more precisely the intuitive notions of state, event, action, and process. A state is a static property that does not involve a change (at the relevant granularity), such as an at relationship, at(x, y, s). To be up, for example, is a state. An event is a change of state, a common variety of which is a change of location:

change(e1, e2) ∧ at'(e1, x, y, s) ∧ at'(e2, x, z, s)

For example, the verb "rise" denotes a change of location of something to a higher point. An action is the causing of an event by an intentional agent:

cause(a, e) ∧ change'(e, e1, e2) ∧ at'(e1, x, y, s) ∧ at'(e2, x, z, s)

The verb "raise" denotes an action by someone of effecting a change of location of something to a higher point. A process is a sequence of events or actions. For example, to fluctuate is to undergo a sequence of risings and fallings, and to pump is to engage in a sequence of raisings and lowerings. We can coarsen the granularity on processes so that the individual changes of state become invisible, and the result is a state. This is a transformation of perspective that is effected by the progressive aspect in English. Thus, fluctuating can be viewed as a state. Detailed expositions of all the core theories can be found at http://www.isi.edu/hobbs/csk.html
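The four notions can be given a toy operationalization (Python; the tuple encoding is our simplification of the reified notation):

# Illustrative sketch: state, event, action, and process over reified terms.

def is_state(x):   return x[0] == "at"       # e.g. ("at", "x", "y", "s")
def is_event(x):   return x[0] == "change"   # ("change", e1, e2)
def is_action(x):  return x[0] == "cause" and is_event(x[2])
def is_process(x):
    return isinstance(x, list) and all(is_event(e) or is_action(e) for e in x)

up     = ("at", "x", "high", "s")                                   # a state
rise   = ("change", ("at", "x", "y", "s"), ("at", "x", "z", "s"))   # an event
raise_ = ("cause", "agent", rise)                                   # an action
pump   = [raise_, raise_]                                           # a process

print(is_state(up), is_event(rise), is_action(raise_), is_process(pump))
# True True True True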
3. Linking word meaning with the theories

Once we have in place the core theories that capture world knowledge at a sufficiently abstract level, we can begin to construct the axioms that link word meaning to the theories. We illustrate here how that would go, using the words "have", "remove", and "remain". Words have senses, and for each sense the linkage will be different. Here we examine the word senses in WordNet (Miller 1995) and FrameNet (Baker, Fillmore & Cronin 2003), since they are the most heavily used lexical resources in computational linguistics. The word sense numbers correspond to their order in the Web interfaces to the two resources:

http://wordnet.princeton.edu/
http://framenet.icsi.berkeley.edu
3.1. “Have” In WordNet the verb “have” has 19 senses. But they can be grouped into three broad “supersenses”. In its first supersense, X has Y means that X is in some relation to Y. The WordNet senses this covers are as follows:
1. a broad sense, including have a son, having a condition hold, and having a college degree
2. having a feature or property, i.e., the property holding of the entity
3. a sentient being having a feeling or internal property
4. a person owning a possession
7. have a person related in some way: have an assistant
9. have left: have three more chapters to write
12. have a disease: have influenza
17. have a score in a game: have three touchdowns
The supersense can be characterized by the axiom

have-s1(x, y) ⊃ relatedTo(x, y)

In these axioms, supersenses are indexed with s, WordNet senses with w, and FrameNet senses with f. Unindexed predicates are from core theories. The individual senses are then specializations of the supersense, where more domain-specific predicates are explicated in more specialized domains. For example, sense 4 relates to the supersense as follows:

have-w4(x, y) ≡ possess(x, y)
have-w4(x, y) ⊃ have-s1(x, y)

where the predicate possess would be explicated in a commonsense theory of economics, relating it to the privileged use of the object. Similarly, have-w3(x, y) links with the supersense but has the restrictions that x is sentient and that the "relatedTo" property is the predicate-argument relation between the feeling and its subject. The second supersense of "have" is "come to be in a relation to". This is our changeTo predicate. Thus, the definition of this supersense is

have-s2(x, y) ≡ changeTo(e) ∧ have-s1'(e, x, y)

The WordNet senses this covers are as follows:

10. be confronted with: we have a fine mess
11. experience: the stocks had a fast run-up
14. receive something offered: have this present
15. come into possession of: he had a gift from her
16. undergo, e.g., an injury: he had his arm broken in the fight
18. have a baby
In these senses the new relation is initiated, but the subject does not necessarily play a causal or agentive role. The particular change involved is specialized in the WordNet senses to a confronting, a receiving, a giving birth, and so on. The third supersense of "have" is "cause to come to be in a relation to". The axiom defining this is

have-s3(x, y) ≡ cause(x, e) ∧ have-s2'(e, x, y)

The WordNet senses this covers are

5. cause to move or be in a certain position or condition: have your car ready
6. consume: have a cup of coffee
8. organize: have a party
13. cause to do: she had him see a doctor
19. have sex with
In all these cases the subject initiates the change of state that occurs. FrameNet has five simple transitive senses for "have". Their associated frames are

1. Have associated
2. Possession
3. Ingestion
4. Inclusion
5. Birth
The first sense corresponds to the first WordNet supersense:

have-f1(x, y) ≡ have-s1(x, y)

The second sense is WordNet sense 4:

have-f2(x, y) ≡ have-w4(x, y)

The third sense is WordNet sense 6. The fourth sense is a partOf relation; it is a specialization of WordNet sense 2:

have-f4(x, y) ≡ partOf(x, y)
have-f4(x, y) ⊃ have-w2(x, y)

The fifth sense is WordNet sense 18.
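As an illustration of how such sense axioms might be operationalized, consider this sketch (Python; the knowledge-base encoding and all function names are ours):

# Illustrative sketch: supersense and sense predicates for "have" over a
# toy knowledge base of ground facts.

def have_s1(x, y, kb):            # supersense 1: X is in some relation to Y
    return ("relatedTo", x, y) in kb

def have_w4(x, y, kb):            # WordNet sense 4: ownership
    return ("possess", x, y) in kb

def have_f2(x, y, kb):            # FrameNet Possession = WordNet sense 4
    return have_w4(x, y, kb)

kb = {("possess", "john", "car"), ("relatedTo", "john", "car")}
print(have_f2("john", "car", kb))                                 # True
# The specialization axiom have-w4 ⊃ have-s1 is respected by this kb:
print(have_w4("john", "car", kb) and have_s1("john", "car", kb))  # True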
3.2. “Remove” If x removes y from z, then x causes a change from the state in which y is at z. remove(x, y, z) ⊃ cause(x, e1) ∧ changeFrom'(e1, e2) ∧ at'(e2, y, z, s) This is the “supersense” covering all of the WordNet and FrameNet senses of “remove”. WordNet lists 8 senses of “remove”. In WordNet sense 1, at is instantiated as physical location. In sense 2, at is instantiated as position in an organization, as in “The board removed the VP of operations.” In sense 3, y is somehow dysfunctional, as in removing trash. In sense 4, at is instantiated as the membership relation in a set; y is removed from set z. In sense 5, the change is functional or strategic, as in a general removing his troops from a vulnerable position. In sense 6, x and y are identical, as in “He removed himself from the contest.” In sense 7, at is instantiated as “alive”, as in “The Mafia don removed his enemy.” In sense 8, y is abstract and dysfunctional, as in removing an obstacle.
FrameNet has two senses of the word. The first is the general meaning, our supersense. In the second sense, x is a person, y is clothes, and z is a body. Note that the supersense gives the topological structure of the meaning of the verb. The various senses are then generated from that by instantiating the at relation to something more specific, or by adding domain constraints to the arguments x, y and z.
3.3. “Remain” There are four WordNet senses of the verb “remain”: 1. 2. 3. 4.
Not change out of an existing state: He remained calm. Not change out of being at a location: He remained at his post. Entities in a set remaining after others are removed: Three problems remain. A condition remains in a location: Some smoke remained after the fire was put out.
The first sense is the most general and subsumes the other three. We can characterize it by the axiom

remain-w1(x, e) ⊃ arg(x, e) ∧ ¬changeFrom(e)

That is, if x remains in condition e, then e is a property of x (or x is an argument of e), and there is no change from state e holding. By the properties of changeFrom it follows that x is in state e, as is presupposed.
In the second sense, the property e of x is being in a location.

remain-w2(x, e) ≡ remain-w1(x, e) ∧ at'(e, x, y)

The fourth sense is a specialization of the second sense in which the entity x that remains is a state or condition.

remain-w4(x, e) ≡ remain-w2(x, e) ∧ state(x)

The third sense is the most interesting to characterize. As in the fourth WordNet sense of “remove”, there is a process that removes elements from a set, and what remains is the set difference between the original and the set of elements that are removed. In this axiom x remains after process e.

remain-w3(x, e) ≡ remove-w4'(e, y, s2, s1) ∧ setdiff(s3, s1, s2) ∧ member(x, s3)

That is, x remains after e if and only if e is a removal event by some agent y of a subset s2 from s1, s3 is the set difference between s1 and s2, and x is a member of s3.
There are four FrameNet senses of “remain”. The first is the same as WordNet sense 1. The second is the same as WordNet sense 3. The third and fourth are two specializations of WordNet sense 3, one in which the removal process is destructive and one in which it is not.
There are two nominalizations of the verb “remain” – “remainder” and “remains”. All of their senses are related to WordNet sense 3. The first WordNet noun sense is the most general.
remainder-w1(x, e) ≡ remain-w3(x, e)

That is, x is the remainder after process e if and only if x remains after e. The other three senses result from specialization of the removal process to arithmetic division, arithmetic subtraction, and the purposeful cutting of a piece of cloth.
The noun “remains” refers to what remains (w3) after a process of consumption or degradation.
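Because remain-w3 is defined purely set-theoretically, its content can be checked in a few lines of code. The following sketch is merely illustrative; the function name and the toy sets are invented.

    def remains_after(s1, s2):
        # remain-w3: if a removal event takes the subset s2 out of s1,
        # what remains is the set difference s3 = s1 - s2.
        return s1 - s2

    problems = {"p1", "p2", "p3", "p4", "p5"}
    removed = {"p1", "p4"}
    s3 = remains_after(problems, removed)
    print(sorted(s3))   # ['p2', 'p3', 'p5']  -- "three problems remain"
    print("p2" in s3)   # True: p2 is a member of s3, so p2 remains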
3.4. The nature of word senses
The most common words in a language are typically the most polysemous. They often have a central meaning indicating their general topological structure. Each new sense introduces inferences that cannot be reliably determined just from a core meaning plus contextual factors. They tend to build up along what Brugman (1981), Lakoff (1987) and others have called a radial category structure (see article 28 (Taylor) Prototype theory). Sense 2 may be a slight modification of sense 1, and senses 3 and 4 different slight modifications of sense 2.
It is easy to describe the links that take us from one sense to an adjacent one in the framework presented here. Each sense corresponds to a predicate which is characterized by one or more axioms involving that predicate. A move to an adjacent sense happens when incremental changes are made to the axioms. As we have seen in the examples of this section, the changes are generally additions to the antecedents or consequents of the axioms. The principal kinds of additions are embedding in change and cause, as we saw in the supersenses of “have”; the instantiation of general predicates like relatedTo and at to more specific predicates in particular domains, as we saw in all three cases; and the addition of domain-specific constraints on arguments, as in restricting y to be clothes in remove-f2.
A good account of the lexical semantics of a word should not just catalog various word senses. It should detail the radial category structure of the word senses, and for each link, it should say what incremental addition or modification resulted in the new sense.
Note that radial categories provide us with a logical structure for the lexicon, and also no doubt a historical one, but not a developmental one. Children often learn word senses independently and only later if ever realize the relation among the senses. See article 28 Prototype theory for further discussion of issues with respect to radial categories.
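By way of illustration, such a radial category structure can be recorded as a small labeled graph, one edge per link, with the incremental change as the label. The encoding below is a hypothetical sketch, not a proposal from the article; the link descriptions paraphrase examples from this section.

    # Hypothetical radial category graph: (source sense, derived sense, change).
    RADIAL_LINKS = [
        ("have-s1", "have-s2", "embed the relation under changeTo"),
        ("have-s2", "have-s3", "embed the change under cause"),
        ("have-s1", "have-w4", "instantiate relatedTo as possess"),
        ("remove-w1", "remove-f2", "restrict y to clothes and z to a body"),
    ]

    def derived_from(sense):
        """All senses reachable from a given sense in one incremental step."""
        return [(target, change) for source, target, change in RADIAL_LINKS
                if source == sense]

    for target, change in derived_from("have-s1"):
        print(f"have-s1 -> {target}: {change}")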
4. Distinguishing lexical and world knowledge
It is perhaps natural to ask whether a principled boundary can be drawn between linguistic knowledge and knowledge of the world. To make this issue more concrete, consider the following seven statements:
(1) If a string w1 is a noun phrase and a string w2 is a verb phrase, then the concatenation w1w2 is a clause.
(2) The transitive verb “moves” corresponds to the predication move2(x, y), providing a string describing x occurs as its subject and a string describing y occurs as its direct object.
(3) If an entity x moves (in sense move2) an entity y, then x causes a change of state or location of y.
(4) If an entity y changes to a new state or location, it is no longer in its old state or location.
(5) If a physical object x moves a physical object y through a fluid medium, then x must apply force to y against the resistance of the medium.
(6) The function of a barge is to move freight across water.
(7) A barge moved the wreckage of Flight 1549 to New Jersey.
Syntax consists in part of rules like (1), or generalizations of them. One could view the lexicon as consisting of axioms expressing information like (2), specifying for each word sense and argument realization pattern what predication is conveyed, perhaps together with some generalizations of such statements. (Lexical knowledge of other languages would be encoded as similar axioms, sometimes linking to the same underlying predicates, sometimes different.) Axioms expressing information like (3) link the lexical predicates with underlying domain theories, in this case, theories of the abstract domains of causality and change of state. Axioms expressing facts like (4) are internal to domain theories, in this case, the theory of the abstract domain of change of state. Axioms expressing general facts like (5) are part of a commonsense or scientific theory of physics, which can be viewed as a specialization and elaboration of the abstract theories. Axioms expressing facts like (6) encode telic information about artifacts. Statement (7) is a specific, accidental fact about the world.
Many have felt that the viability of lexical semantics as a research enterprise requires a principled distinction between lexical knowledge and world knowledge, presumably somewhere below axioms like (2) and above facts like (7). Many of those who have believed that no such distinction is possible have concluded that lexical semantics is impossible, or at least can only be very limited in its scope. For example, in his discussion of meaning, Bloomfield (1933: 139–140) rules out the possibility of giving definitions of most words:

In order to give a scientifically accurate definition of meaning for every form of a language, we should have to have a scientifically accurate knowledge of everything in the speakers’ world. While this may be possible for certain scientifically well-understood terms like “salt”, we have no precise way of defining words like “love” or “hate”, which concern situations that have not been accurately classified – and these latter are in the great majority.
He concludes that

The statement of meanings is therefore a weak point in language-study, and will remain so until human knowledge advances very far beyond its present state.
Lexical semantics, on this view, is impossible because it would require a theory of the world. Bloomfield goes on to talk about such phenomena as synonymy and antonymy, and leaves issues of meaning at that. More recently, Fodor (1980) similarly argued that lexical semantics would need a complete and correct scientific theory of the world to proceed, and is consequently impossible in the foreseeable future.
A counterargument is that we do not need a scientifically correct theory of the world, because people do not have one as they use language to convey meaning. We rather need to capture people’s commonsense theories of the world. In fact, there are a number of
interesting engineering efforts to encode commonsense and scientific knowledge needed in specific applications or more broadly. Large ontologies of various domains, such as biomedicine and geography, are being developed for the Semantic Web and other computational uses. Cyc (Lenat & Guha 1990) has been a large-scale effort to encode commonsense knowledge manually since the middle 1980s; it now contains millions of rules. The Open Mind Common Sense project (Singh 2002) aims at accumulating huge amounts of knowledge rapidly by marshaling millions of “netizens” to make contributions; for example, a participant might be asked to complete the sentence “Water can …” and reply with “Water can put out fires.” Many of these projects, including Cyc, involve a parallel effort in natural language processing to relate their knowledge of the world to the way we talk about the world.
Might we do lexical semantics by explicating the meanings of words in terms of such theories? Fodor (1983) can be read as responding to this possibility. He argues that peripheral processes like speech recognition and syntactic processing are encapsulated in the sense that they require only limited types of information. Central processes like fixation of belief, by contrast, can require any knowledge from any domain. He gives the example of the power of analogical reasoning in fixation of belief. The body of knowledge that can be appealed to in analogies cannot be circumscribed; analogies might involve mappings from anything to anything else. Scientific study of modular processes is feasible, but scientific study of global processes is not. No scientific account of commonsense reasoning is currently available or likely to be in the foreseeable future; by implication, reasoning about commonsense world knowledge is not currently amenable to scientific inquiry, nor is a lexical semantics that depends on it.
Syntax is amenable to scientific study, but only, according to Fodor, because it is informationally encapsulated. Thus, the debate on this issue often centers on the modularity of syntax. Do people do syntactic analysis of utterances in isolation from world knowledge? Certainly at time scales at which awareness functions, there is no distinction in the processing of linguistic and world knowledge. We rarely if ever catch ourselves understanding the syntax of a sentence we hear without understanding much about its semantics. For example, in Chomsky’s famous grammatical sentence, “Colorless green ideas sleep furiously,” there is no stage in comprehension at which we are aware that “colorless” and “green” are adjectives, but haven’t yet realized they are contradictory.
Moreover, psychological studies seem to indicate that syntactic processing and the use of world knowledge are intricately intertwined. Much of this work has focused on the use of world knowledge to resolve references and disambiguate ambiguous prepositional phrase attachments. Tanenhaus & Brown-Schmidt (2008) review some of this research that makes use of methods of monitoring eye movements to track comprehension. For example, they present evidence that subjects access the current physical context while they are processing syntactically ambiguous instructions and integrate it with the language immediately. In terms of our examples, they are using facts like (1) and facts like (7) together.
The authors contend that their results “are incompatible with the claim that the language processing includes subsystems (modules) that are informationally encapsulated, and thus isolated from high-level expectations.” Often the line between linguistic and world knowledge is drawn to include selectional constraints within language. Hagoort et al. (2004) used electroencephalogram and functional magnetic resonance imaging data to investigate whether there was any difference between the temporal course of processing true sentences like “Dutch trains are
yellow and very crowded”, factually false but sensible sentences like “Dutch trains are white and very crowded”, and sentences that violate selectional constraints like “Dutch trains are sour and very crowded.” The false sentences and the selectionally anomalous sentences showed a virtually identical peak of activity in the left inferior prefrontal cortex. The authors observed that there is “strong empirical evidence that lexical semantic knowledge and general world knowledge are both integrated in the same time frame during sentence interpretation, starting at ~300ms after word onset.” However, there is a difference in frequency profile between the two conditions, consisting of a measurable increase in activity in the 30–70 Hz range (gamma frequency) for the false sentences, and an increase in the 4–7 Hz range (theta frequency) in the anomalous condition. The authors conclude that “semantic interpretation is not separate from its integration with nonlinguistic elements of meaning,” but that nevertheless “the brain keeps a record of what makes a sentence hard to interpret, whether this is word meaning or world knowledge.” Thus, if the brain makes a distinction between linguistic and world knowledge, it does not appear to be reflected in the temporal course of processing language.
The most common argument in linguistics and related fields for drawing a strict boundary between lexicon and world is a kind of despair of the possibility of a scientific study of world knowledge. Others have felt it is possible to identify lexically relevant domains of world knowledge that are accessible to scientific study. Linguists investigating “lexical conceptual structure” (e.g., see article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure) are attempting to discover generalizations in how the way an entity occurs in the underlying description of a situation or event in terms of abstract topological predicates influences the way it is realized in the argument structure in syntax. For example, do verbs that undergo the dative alternation all have a similar underlying abstract structure? Does the causative always involve embedding an event as the effect in a causal relation, where the cause is the agent or an action performed by the agent? The hypothesis of this work is that facts like (2), which are linguistic, depend crucially on facts like (3), which have a more world-like flavor. However, this does not mean that we have identified a principled boundary between linguistic and world knowledge. One could just as well view this as a strategic decision about how to carve out a tractable research problem.
Pustejovsky (1995) pushes the line between language and world farther into the world. He advocates representing what he calls the “qualia structure” of words, which includes facts about the constituent parts of an entity (Constitutive), its place in a larger domain (Formal), its purpose and function (Telic), and the factors involved in its origin (Agentive). One can then, for example, use the Telic information to resolve a metonymy like “She began a cigarette” into its normal reading of “She began smoking a cigarette,” rather than any one of the many other things one could do with a cigarette – eating it, rolling it, tearing it apart, and so on. His framework is an attempt to relate facts like (2), about what arguments can appear with what predicates, with facts like (6), about the functions and other properties of things.
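As a rough illustration of how the Telic quale can drive such a coercion, consider the following sketch. The toy lexicon, the treatment of “begin”, and all names are invented for the example; Pustejovsky’s own machinery is considerably richer.

    # Hypothetical qualia entries; the values are merely illustrative strings.
    QUALIA = {
        "cigarette": {"Formal": "artifact", "Constitutive": "tobacco, paper",
                      "Telic": "smoking", "Agentive": "manufacturing"},
        "novel":     {"Formal": "artifact", "Telic": "reading",
                      "Agentive": "writing"},
    }

    def coerce(verb, noun):
        # "begin" selects an event; an entity-denoting object is coerced
        # into the event named by the noun's Telic quale.
        if verb == "begin":
            return f"begin {QUALIA[noun]['Telic']} the {noun}"
        return f"{verb} the {noun}"

    print(coerce("begin", "cigarette"))  # begin smoking the cigarette
    print(coerce("begin", "novel"))      # begin reading the novel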
In several places in his book, Pustejovsky suggests that it is important to see his qualia structures as part of lexical semantics, and hence linguistics, as opposed to general commonsense knowledge that is not linguistic. But he never makes a compelling argument to this effect. All of his qualia structures and coercion mechanisms are straightforward to express in a logical framework, so there are no formal reasons for the distinction. I think it is best to see this particular carving out of
knowledge and interpretation processes, as with the study of lexical conceptual structures, as a strategic decision to identify a fruitful and tractable research problem.
Pustejovsky’s work is an attempt to specify the knowledge that is required for interpreting at least the majority of nonstandard uses of words. Kilgarriff (2001) tests this hypothesis by examining the uses of nine particular words in a 20-million word corpus. Of 2276 instances, 41 were judged to be nonstandard in that they did not correspond to any of the entries for the word in a standard dictionary. Of these, only two nonstandard uses were derivable from Pustejovsky’s qualia structures. The others required deeper commonsense knowledge or previous acquaintance with collocations. Kilgarriff’s conclusion is that “Any theory that relies on a distinction between general and lexical knowledge will founder.” (Kilgarriff 2001: 325)
Some researchers in natural language processing have argued that lexical knowledge should be distinguished from other knowledge because it results in more efficient computation or more efficient comprehension and production. One example concerns hyperonymy relations, such as that car(x) implies vehicle(x). It is true that some kinds of inferences lend themselves more to efficient computation than others, and inferences involving only monadic predicates are one example. But where this is true, it is a result not of their content but of structural properties of the inferences, and these cut across the lexical-world distinction. Any efficiency realized in inferring vehicle(x) can be realized in inferring expensive(x) as well.
All of statements (1)–(7) are facts about the world, because sentences and their structure and words and their roles in sentences are things in the world, as much as barges, planes, and New Jersey. There is certainly knowledge we have that is knowledge about words, including how to pronounce and spell words, predicate-argument realization patterns, alternation rules, subcategorization patterns, grammatical gender, and so on. But words are part of the world, and one might ask why this sort of knowledge should have any special cognitive status. Is it any different in principle from the kind of knowledge one has about friendship, cars, or the properties of materials? In all these cases, we have entities, properties of entities, and relations among them. Lexical knowledge is just ordinary knowledge where the entities in question are words.
There are no representational reasons for treating linguistic knowledge as special, provided we are willing to treat the entities in our subject matter as first-class individuals in our logic (cf. Hobbs 1985). There are no procedural reasons for treating linguistic knowledge as special, since parsing, argument realization, lexical decomposition, the coercion of metonymies, and so on can all be implemented straightforwardly as inference. The argument that parsing and lexical decomposition, for example, can be done efficiently on present-day computers, whereas commonsense reasoning cannot, does not seem to apply to the human brain; psycholinguistic studies show that the influence of world knowledge kicks in as early as syntactic and lexical knowledge does, and yields the necessary results just as quickly.
We are led to the conclusion that any drawing of lines is for the strategic purpose of identifying a coherent, tractable and fruitful area of research. Statements (1)–(6) are examples from six such areas.
Once we have identified and explicated such areas, the next question is what connections or articulations there are among them; Pustejovsky’s research and work on lexical conceptual structures are good examples of people addressing this question. However, all of this does not mean that linguistic insights can be ignored. The world can be conceptualized in many ways. Some of them lend themselves to a deep treatment
of lexical semantics, and some of them impede it. Put the other way around, looking closely at language leads us to a particular conceptualization of the world that has proved broadly useful in everyday life. It provides us with topological relations rather than with the precision of Euclidean 3-space. It focuses on changes of state rather than on correspondences with an a priori time line. A defeasible notion of causality is central in it. It provides means for aggregation and shifting granularities. It encompasses those properties of space that are typically transferred to new target domains when what looks like a spatial metaphor is invoked. More specific domains can then be seen as instantiations of these abstract theories. Indeed, Euclidean 3-space itself is such a specialization.
Language provides us with a rich vocabulary for talking about the abstract domains. The core meanings of many of the most common words in language can be defined or characterized in these core theories. When the core theory is instantiated in a specific domain, the vocabulary associated with the abstract domain is also instantiated, giving us a rich vocabulary for talking about and thinking about the specific domain. Conversely, when we encounter general words in the contexts of specific domains, understanding how the specific domains instantiate the abstract domains allows us to determine the specific meanings of the general words in their current context.
We understand language so well because we know so much. Therefore, we will not have a good account of how language works until we have a good account of what we know about the world and how we use that knowledge. In this article I have sketched a formalization of one very abstract way of conceptualizing the world, one that arises from an investigation of lexical semantics and is closely related to the lexical decompositions and image schemas that have been argued for by other lexical semanticists. It enables us to capture formally the core meanings of many of the most common words in English and other languages, and it links smoothly with more precise theories of specific domains.
I have profited from discussions of this work with Gully Burns, Peter Clark, Tim Clausner, Christiane Fellbaum, and Rutu Mulkar-Mehta. This work was performed in part under the IARPA (DTO) AQUAINT program, contract N61339-06-C-0160.
5. References
Baker, Colin F., Charles J. Fillmore & Beau Cronin 2003. The structure of the FrameNet database. International Journal of Lexicography 16, 281–296.
Bloomfield, Leonard 1933. Language. New York: Holt, Rinehart & Winston.
Brugman, Claudia 1981. The Story of ‘Over’. MA thesis. University of California, Berkeley, CA.
Cycorp 2008. http://www.cyc.com/. December 9, 2010.
Davis, Ernest 1990. Representations of Commonsense Knowledge. San Mateo, CA: Morgan Kaufmann.
Fodor, Jerry A. 1980. Methodological solipsism considered as a research strategy in cognitive science. Behavioral and Brain Sciences 3, 63–109.
Fodor, Jerry A. 1983. The Modularity of Mind. An Essay on Faculty Psychology. Cambridge, MA: The MIT Press.
Ginsberg, Matthew L. (ed.) 1987. Readings in Nonmonotonic Reasoning. San Mateo, CA: Morgan Kaufmann.
Gruber, Jeffrey C. 1965/1976. Studies in Lexical Relations. Ph.D. dissertation. MIT, Cambridge, MA. Reprinted in: J. S. Gruber. Lexical Structures in Syntax and Semantics. Amsterdam: North-Holland, 1976, 1–210.
Hagoort, Peter, Lea Hald, Marcel Bastiaansen & Karl Magnus Petersson 2004. Integration of word meaning and world knowledge in language comprehension. Science 304(5669), 438–441.
Hobbs, Jerry R. 1985. Ontological promiscuity. In: Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics (= ACL). Chicago, IL: ACL, 61–69.
Hobbs, Jerry R. 1992. Metaphor and abduction. In: A. Ortony, J. Slack & O. Stock (eds.). Communication from an Artificial Intelligence Perspective. Theoretical and Applied Issues. Berlin: Springer, 35–58.
Hobbs, Jerry R. 2004. Abduction in natural language understanding. In: L. Horn & G. Ward (eds.). Handbook of Pragmatics. Malden, MA: Blackwell, 724–741.
Hobbs, Jerry R. 2005. Toward a useful notion of causality for lexical semantics. Journal of Semantics 22, 181–209.
Hobbs, Jerry R., William Croft, Todd Davies, Douglas Edwards & Kenneth Laws 1987. Commonsense metaphysics and lexical semantics. Computational Linguistics 13, 241–250.
Hobbs, Jerry R., Mark Stickel, Douglas Appelt & Paul Martin 1993. Interpretation as abduction. Artificial Intelligence 63, 69–142.
Jackendoff, Ray 1972. Semantic Interpretation in Generative Grammar. Cambridge, MA: The MIT Press.
Johnson, Mark 1987. The Body in the Mind. The Bodily Basis of Meaning, Imagination, and Reason. Chicago, IL: The University of Chicago Press.
Kilgarriff, Adam 2001. Generative lexicon meets corpus data. The case of nonstandard word uses. In: P. Bouillon & F. Busa (eds.). The Language of Word Meaning. Cambridge: Cambridge University Press, 312–328.
Lakoff, George 1987. Women, Fire, and Dangerous Things. What Categories Reveal About the Mind. Chicago, IL: The University of Chicago Press.
Lenat, Douglas B. & Ramanathan V. Guha 1990. Building Large Knowledge-based Systems. Representation and Inference in the Cyc Project. Reading, MA: Addison-Wesley.
McCarthy, John 1980. Circumscription. A form of non-monotonic reasoning. Artificial Intelligence 13, 27–39.
Miller, George 1995. WordNet. A lexical database for English. Communications of the ACM 38, 39–41.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press.
Rosch, Eleanor 1975. Cognitive representations of semantic categories. Journal of Experimental Psychology 104, 192–233.
Shoham, Yoav 1987. Nonmonotonic logics. Meaning and utility. In: J. McDermott (ed.). Proceedings of the International Joint Conference on Artificial Intelligence (= IJCAI) 10. San Mateo, CA: Morgan Kaufmann, 388–393.
Singh, Push 2002. The public acquisition of commonsense knowledge. In: Proceedings of AAAI Spring Symposium on Acquiring (and Using) Linguistic (and World) Knowledge for Information Access. Palo Alto, CA: AAAI. http://web.media.mit.edu/~push/AAAI2002-Spring.pdf. December 15, 2010.
Tanenhaus, Michael K. & Sarah Brown-Schmidt 2008. Language processing in the natural world. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences 363, 1105–1122.
Jerry R. Hobbs, Marina del Rey, CA (USA)
VII. Theories of sentence semantics
33. Model-theoretic semantics
1. Truth-conditional semantics
2. Possible worlds semantics
3. Model-theoretic semantics
4. References
Abstract
Model-theoretic semantics is a special form of truth-conditional semantics. According to it, the truth-values of sentences depend on certain abstract objects called models. Understood in this way, models are mathematical structures that provide the interpretations of the (non-logical) lexical expressions of a language and determine the truth-values of its (declarative) sentences. Originally designed for the semantic analysis of mathematical logic, model-theoretic semantics has become a standard tool in linguistic semantics, mostly through the impact of Richard Montague’s seminal work on the analogy between formal and natural languages. As such, it is frequently (and loosely) identified with possible worlds semantics, which rests on an identification of sentence meanings with regions in Logical Space, the class of all possible worlds. In fact, the two approaches have much in common and are not always easy to keep apart. In a sense, (i) model-theoretic semantics can be thought of as a restricted form of possible worlds semantics, where models represent possible worlds; in another sense, (ii) model-theoretic semantics can be seen as a wild generalization of possible worlds semantics, treating Logical Space as variable rather than given. Consequently, the present introductory exposition of model-theoretic semantics also covers possible worlds semantics – hopefully helping to disentangle the relationship between the two approaches. It starts with a general discussion of truth-conditional semantics (section 1), the main purpose of which is to provide some motivation and background. Model-theoretic semantics is then approached from the possible worlds point of view (section 2), highlighting the similarities between the two approaches that give rise to perspective (i). The final section 3 turns to model theory as providing a mathematical reconstruction of possible worlds semantics, ultimately arriving at the more abstract perspective (ii).
1. Truth-conditional semantics
The starting point for the truth-conditional approach to semantics is the tight connection between meaning and truth. From a pre-theoretic point of view, linguistic meaning may appear a multi-faceted blend of phenomena, partly subjective and private, partly social and inter-subjective, mostly vague, slippery, and apparently hard to define in precise terms. Nevertheless, there are a few unshakable, yet substantial (non-tautological) insights into meaning. Among these is the contention that differences in truth-value necessitate differences in meaning:
Most Certain Principle [MCP] (cf. Cresswell 1982: 69)
If S1 and S2 are declarative sentences such that, under given circumstances, S1 is true whereas S2 is not, then S1 and S2 differ in meaning.

The MCP certainly has the flavour of an a priori truth. If it is one, an adequate analysis of meaning ought to take care of it by making a conceptual connection with truth; in fact, this connection might lie at the heart of any adequate theoretical reconstruction of meaning. If so, meaning is primarily sentence meaning; for sentences are the bearers of truth, i.e. the only expressions that can be true (or false). We will say that the truth value of a (declarative) sentence S is 1 [under given circumstances] if S is true [under these circumstances]; and that its truth value […] is 0 otherwise. Then the lesson to be learnt from the MCP is that, given arbitrary circumstances, the meaning of a sentence S determines its truth value under these circumstances: nothing, i.e. no sentence S', could have that meaning without having the same truth value. Whatever the meanings of (declarative) sentences may be, then, they must somehow determine their truth conditions, i.e. the circumstances under which the sentences are true. According to truth-conditional semantics, anything beyond this most marked trait of sentence meanings ought to be left out of consideration (or to pragmatics), thus paving the way for the following ‘minimalist’ semantic axiom:

Basic Principle of Truth-Conditional Semantics [BP] (cf. Wittgenstein 1922: 4.431)
Any two declarative sentences agree in meaning just in case they agree in their truth conditions.

Since by (our) definition, truth conditions are circumstances of truth, the BP implies that sentences that differ in meaning also differ in truth value under at least some circumstances. Truth-conditional semantics owes much of its flavour, as well as some of its problems, to this immediate consequence of the BP, to which we will return in section 2.
According to the BP, the meanings of sentences covary with their truth conditions. It is important to realize (and easy to overlook) that a statement of covariance is not an equation. The – common – identification of sentence meanings with truth conditions requires further justification. Some motivation for it may be sought in functionalist considerations, according to which truth conditions suffice to explain the extralinguistic rôles attributed to sentence meaning in social interaction and cognitive processes. Though we cannot go into these matters here, we will later see examples of covarying entities – the material models introduced in Section 2.3 – that are less likely to be identified with sentence meanings than truth conditions in the sense envisaged here. Bearing this in mind, we will still read the BP as suggesting that the meanings of sentences may as well be taken to be their truth conditions.
In its most radical form, truth-conditional semantics seeks to derive the meanings of non-sentential expressions from sentence meanings, as the contributions these expressions make to the truth conditions of the sentences in which they occur; cf. Frege (1884: 71). Without any further specification and constraints, this characterisation of non-sentential meaning is bound to be incomplete. To begin with, the contribution an expression makes to the truth conditions of a sentence in which it occurs obviously depends on
the position in which it occurs: the contribution of an expression A occupying position p in a sentence S to the meaning of S must be understood as the contribution of A as occurring in p; for the sake of definiteness, positions p can be identified with (certain) partial, injective functions on the domain of expressions, inserting occupants into hosts, so that an (occupant) expression A occurs in position p of a (host) expression B just in case p(A) = B. Positions in sentences are structural positions that need not coincide with positions in surface strings; in particular, the contributions made by the parts of structurally ambiguous (surface) sentences depend on the readings of the latter, which have different structures, hosting their parts in different structural positions. Given this sensitivity to syntactic structure, the truth-conditional contributions of a sentence’s parts can hardly be determined by looking at that sentence in isolation. Otherwise minimal pairs like the following, taken from Abbott & Hauser (1995: 6, fn. 10), would create problems when it comes to the truth-conditional contributions of their parts:

(1) Someone is buying a car.
(2) Someone is selling a car.

(1) and (2) have the same truth conditions and the same overall structure, in which the underlined verbs occupy the same position. Since the two sentences are otherwise identical, it would seem that the two alternatives make the same contribution. If so, this contribution cannot be their meaning – after all, the two underlined verb forms are not synonymous and generally do differ in their truth-conditional contributions, as in:

(3) Mary is buying a car.
(4) Mary is selling a car.

Hence, lest the two verbs come out as synonymous, their meanings can only be identified with their truth-conditional contributions if the latter are construed globally, taking into account all positions in which the verbs may occur. Generalising from this example, it emerges that sameness of meaning among any non-sentential expressions A and B (of the same category) requires sameness of truth-conditional contribution throughout the structural positions of all sentences. Since A and B occupy the same structural position in two sentences SA and SB if SB derives from SA by putting B in place of A, this condition on synonymy may be cast in terms of a:

Substitution Principle [Subst]
If two non-sentential expressions of the same category have the same meaning, either may replace the other in all positions within any sentence without thereby affecting the truth conditions of that sentence.

The italicized restriction is made so as not to rule out the possibility that two expressions make the same truth-conditional contribution without being grammatically equivalent. As argued in Gazdar et al. (1985: 32), likely and probable may be cases in point – given the grammatical contrast between likely to leave vs. *probable to leave.
Taken together, the BP and Subst form the backbone of truth-conditional semantics: the meanings of sentences coincide with their truth conditions; and synonymy among non-sentential expressions is restricted by the synonymies among the sentences in which they occur. Subst leaves open what kinds of entities the meanings of non-sentential expressions are. Keeping the ‘minimalist’ spirit behind the passage from the MCP to the BP, it is tempting to identify them by strengthening Subst to a biconditional, thereby arriving at a synonymy criterion, according to which two (non-sentential) expressions have the same meaning if they may always replace each other within a sentence without affecting the truth conditions of that sentence. Ontologically speaking, this synonymy criterion has meaning supervene on truth conditions; cf. Kupffer (2007). However, we will depart from radical truth-conditional semantics and not take it for granted that this criterion always applies; possible counterexamples will be addressed in due course.
The above considerations also support a more general version of Subst according to which various synonymous (non-sentential) expressions may be replaced simultaneously. As a result, the meaning of any complex expression can be construed as depending only on the meanings of its immediate parts – thus giving rise to a famous principle, which we note here for future reference:

Principle of Compositionality [Compo]
The meaning of a complex expression functionally depends on the meanings of its immediate parts and the way in which they are combined.

Like Subst, Compo does not say what kinds of objects non-sentential meanings are. In fact, a permutation argument has it that they are hopelessly underdetermined: if a truth-conditional account specified some objects x and y as the meanings of some (non-sentential) expressions A and B, then an alternative account could be given that would also be in line with Subst, but according to which x and y have changed their places. We will take a closer look at this argument in section 3.
Even though Compo and Subst are not committed to a particular concept of non-sentential meaning, some choices might be more natural than others. Indeed, Subst falls out immediately if truth-conditional contributions and thus meanings (of non-sentential expressions) are constructed by abstraction, as classes of expressions that make the same truth-conditional contributions in all positions in which they occur: if the meaning of an expression A is identified with the set of all expressions with which A can be replaced without any change in truth conditions, then Subst trivially holds. However, there is a serious problem with this strategy: if (non-sentential) meanings were classes of intersubstitutable expressions, they would be essentially language-dependent. But if meanings were just substitution classes, then meaning-preserving translation between distinct languages would be impossible; and even more absurdly, all expressions of a language would have to change their meanings each time a new word is added to it! This certainly speaks against a construal of meanings in terms of abstraction, even though synonymy classes have proved useful as representatives of meanings in the algebraic study of compositionality; cf. Hodges (2001).
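For concreteness, Compo can be mimicked in a few lines of code: the interpretation of a complex expression is computed solely from the interpretations of its immediate parts and the mode of combination. The fragment below is a toy sketch; the mini-grammar, the “world”, and all names are invented for the purpose and do not belong to any particular theory.

    # A toy compositional interpreter.
    WORLD = {"Mary": {"sings"}, "John": set()}      # who does what

    LEXICON = {
        "Mary":  "Mary",                            # a term: its referent
        "John":  "John",
        "sings": lambda x: "sings" in WORLD[x],     # predicate: individuals -> truth values
        "and":   lambda p, q: p and q,              # conjunction: a binary truth function
    }

    def interpret(tree):
        if isinstance(tree, str):                   # lexical item
            return LEXICON[tree]
        op, *parts = tree
        if op == "S":                               # predication: (S, term, predicate)
            term, pred = parts
            return interpret(pred)(interpret(term))
        if op == "Coord":                           # coordination: (Coord, S1, conj, S2)
            s1, conj, s2 = parts
            return interpret(conj)(interpret(s1), interpret(s2))

    print(interpret(("S", "Mary", "sings")))        # True
    print(interpret(("Coord", ("S", "Mary", "sings"),
                     "and", ("S", "John", "sings"))))  # False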
If contributing to truth conditions and conforming to Subst were all there is to non-sentential meanings, then there would be no fact of the matter as to what kinds of objects they are; moreover, there would be no obvious criteria for deciding whether two expressions of different languages have the same meaning. Hence a substantial theory of non-sentential meaning cannot live on Subst alone. One way to arrive at such a theory is by finding a natural analogue to the MCP that goes beyond the realm of sentences. The most prominent candidate is what may be dubbed the:

Rather Plausible Principle [RPP]
If T1 and T2 are terms such that, under given circumstances, T1 refers to something to which T2 does not refer, then T1 and T2 differ in meaning.

A term in the sense of the RPP is an expression that may be said to refer to something. Proper names, definite descriptions, and pronouns are cases in point. Thus, in one of its usages, (i) the name Manfred refers to a certain person; under the current actual circumstances, (ii) the description Regine’s husband refers to the same person; and in an email to Manfred Kupffer, I can use (iii) the pronoun you to refer to him. Of course, in different contexts, (iii) can also refer to another person; hence the two terms mentioned in the RPP must be understood as being used in the same context. Under different (past or counterfactual) circumstances, (ii) may refer to another person, or to no one at all; the RPP only mentions the reference of terms under the same circumstances, and in the latter case thus applies vacuously. For the purpose of this survey we will also assume that names with more than one bearer, like (i), are lexically ambiguous; the terms quantified over in the RPP must therefore be taken as disambiguated expressions rather than surface forms.
Why should the RPP be less certain than the MCP? One possible reason comes from ontology, the study of what there is: the exact subject matter of sentences containing referring terms is not necessarily clearly defined, let alone obvious. The reader is referred to the relevant literature on the inscrutability of reference; Williams (2005) contains a good, if biased, survey. Putting ontological qualms aside, we may conclude from the RPP that whatever the meanings of terms may be, they must somehow determine their reference conditions, i.e. the referents as depending on the circumstances. According to a certain brand of truth-conditional semantics, anything beyond this most marked trait of term meanings ought to be left out of consideration (or to pragmatics), thus giving rise to a stronger version of our ‘minimalist’ semantic axiom:

Extended Basic Principle of Truth-Conditional Semantics [EBP]
Any two declarative sentences agree in meaning just in case they agree in their truth conditions; and any two terms agree in meaning just in case they agree in their reference conditions.

Like the BP, the EBP is merely a statement of covariance, not an equation. The – common – identification of term meanings with reference conditions also requires further ‘external’ evidence. However, as in the case of the BP, we will read the EBP as suggesting that term meanings be reference conditions. Following this suggestion, and given Subst, the EBP may then be used to determine the meanings of at least some non-sentential expressions in a non-arbitrary and at the same time language-independent way, employing a heuristics based on:
Frege’s Functional Principle [FFP]
If not specified by the EBP, the meaning of an expression A is the meaning of (certain) expressions in which A occurs as functionally depending on A’s occurrence.

Since the functional meanings of parts can be thought of as contributions to host meanings, FFP can be seen as a version of Frege’s (1884: x) Context Principle that is not committed to the radical form of truth-conditional semantics: ‘nach der Bedeutung der Wörter muss im Satzzusammenhange, nicht in ihrer Vereinzelung gefragt werden’ [≈ ‘one must ask for the meanings of words in their sentential context, not in isolation’]. Obviously, FFP is more specific; it also deviates from the original in having term extensions determined directly, in accordance with the EBP – and Frege’s (1892) later policy.
According to FFP, the meaning of an expression A that is not a sentence or a term turns out to be a function assigning meanings of host expressions (in which A occurs) to meanings of positions (in which A occurs) – where the latter can be made out in case the positions coincide with adjacent expressions whose meanings have been independently identified – either by the EBP, or by previous applications of the same strategy. As a case in point, the meanings of (coordinating) conjunctions C (like and and or) may be determined from their occurrences in unembedded coordinations of sentences A and B, i.e. sentences of the form A C B; these occurrences are completely determined by the two coordinated sentences and may thus be identified with the ordered pairs (A,B). Hence, following FFP, the meaning of a conjunction comes out as a function from pairs of sentence meanings to sentence meanings, i.e. a function assigning truth conditions to pairs of truth conditions. In particular, the meaning of and assigns the truth conditions of any sentence A and B to the pair consisting of the truth conditions of A and the truth conditions of B. In a similar vein, by looking at simple predications of the form T P (where T is a term), the meanings of predicates P come out as functions assigning sentence meanings to term meanings. Since the latter are determined by the EBP, predicate meanings are thus construed as functions from reference conditions to truth conditions. This result may in turn be used to obtain the meanings of transitive verbs V, which may take terms T as objects to form predicates, the meanings of which have already been identified; hence, by another application of FFP, the meaning of a transitive verb comes out as a function assigning predicate meanings to reference conditions (= the meanings of terms in object position).
In one form or another, the heuristics just sketched has been applied widely, and rather successfully, in linguistic semantics. Following it, a large variety of expressions can be assigned meanings whose primary function is to contribute to truth and reference conditions without depending on any particular language. Nevertheless, for a variety of reasons, following FFP (in the way indicated here) does not settle the issue of determining meaning in a fully principled, non-arbitrary way:
– As the above examples show, the meanings of expressions depend on a choice of primary occurrences, i.e. positions in which they can be read off the immediate context. However, in general there is no way of telling which of a given number of grammatical constructions ought to serve this purpose; and the choice may influence the meanings obtained by applying FFP.
Quantifier phrases – expressions like nobody; most linguists; every atom in the universe; etc. – are cases in point. Customarily they are analyzed by putting them in subject position and thus interpreting them as functions from predicate meanings (determined in the way above) to sentence meanings (determined
by the EBP). However, they also occur in a host of other environments, e.g. as direct objects. If the latter were chosen as their primary occurrences, their meanings would come out as functions from transitive verb meanings (determined in the way above) to predicate meanings (determined in the way above); and if both environments were taken together, they would be functions defined on both predicate and verb meanings, etc. Although there are one-one correspondences between these different construals of quantifier phrases, it would seem that none of them can lay claim to being more natural than the others. (A schematic sketch of two such construals, and of the correspondence between them, is given below.)
– Once determined via FFP (in the way indicated here), there is no guarantee that the meaning of an expression will behave compositionally beyond the primary occurrences chosen. For instance, it is well known that the meaning of and determined along the above lines can be put to work in predicate coordinations such as sing and dance – basically due to the existence of corresponding clausal paraphrases of the form x sings and x dances; but it is not (or not so easily) adapted to cover ‘collective’ readings of coordinate proper names as in John and Mary are performing a duet.
– Due to limits of expressiveness of the language under scrutiny, the range of meanings covered by the primary occurrences of a given expression may be restricted in various ways, sometimes by merely accidental gaps in the lexicon. For instance, if constructed strictly along the above lines, the predicate meanings will depend on what terms there are, and what they mean (i.e. their reference conditions, according to the EBP). However, in order to make the meanings thus determined general enough, their construction often crucially involves a certain amount of idealisation. Thus, in order to have two predicates from different languages come out as synonymous, their meanings would have to have the same domains; but then the meanings of the terms of the two languages need not be precisely the same: maybe one of them contains a name for some peculiar object that is hard to describe – a certain grain of sand, say – in which case it is not obvious that, in the other language, there is a corresponding term with the same reference conditions. In fact the idealisations made in model-theoretic semantics frequently go far beyond the inclusion of hard-to-describe objects as referents. As a result, the synonymy criterion mentioned above cannot always be upheld. We will return to this point.
These limitations notwithstanding, there can be no doubt that FFP has led to considerable advances in natural language semantics. Earlier approaches, trying to proceed in a bottom-up fashion, had encountered serious difficulties when it came to distinguishing the various modes of composing phrasal from lexical meanings. By changing the direction of analysis, FFP solves these problems and at the same time offers a much more differentiated picture of the varieties of meaning: combination mostly proceeds by functional application, and while earlier approaches were confined to (combinations of) content words, FFP strives to make all expressions amenable to semantic analysis, including functional ones such as determiners and conjunctions.
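The first of the limitations above can be made concrete by sketching the two construals of a quantifier phrase and the one-one correspondence (a type lift) between them. The toy domain, the verb, and all names in the following Python fragment are invented; this is a sketch under stated assumptions, not a fragment of any particular theory.

    DOMAIN = {"ann", "bob", "carl"}

    def everybody_subj(pred):
        # Subject-position construal: from predicate meanings (functions from
        # individuals to truth values) to sentence meanings (truth values).
        return all(pred(x) for x in DOMAIN)

    def lift_to_object(q):
        # The corresponding object-position construal: from transitive-verb
        # meanings (curried: object -> subject -> truth value) to predicate
        # meanings. This lift is the one-one correspondence mentioned above.
        return lambda verb: (lambda subj: q(lambda obj: verb(obj)(subj)))

    admires = lambda obj: (lambda subj: (subj, obj) != ("carl", "carl"))
    everybody_obj = lift_to_object(everybody_subj)

    print(everybody_subj(lambda x: True))   # True
    print(everybody_obj(admires)("ann"))    # True: ann admires everybody
    print(everybody_obj(admires)("carl"))   # False: carl does not admire carl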
2. Possible worlds semantics
2.1. Logical Space
According to the EBP, the meanings of sentences and terms are of a conditional nature: given the circumstances, they determine a truth value or a referent. Following semantic
tradition, we will refer to the object that the meaning of an expression determines (for given circumstances) as the extension of the expression (for these circumstances); cf. Carnap (1947). In other words, the extension of a sentence is its truth value; and the extension of a term is its referent. It is a remarkable fact that the above strategy for determining meanings may be used to generalize the notion of extension to a wide range of expressions far beyond sentences and terms. In order to do so, FFP needs to be adapted so as to describe the extension of an expression A as a function assigning extensions of host expressions to extensions of positions in which A occurs. We thus have what may be called the:

Extensional Version of Frege’s Functional Principle
If not specified by the EBP, the extension of an expression A is the contribution A makes to the extensions of (certain) expressions in which A (immediately) occurs.

As a case in point, the extensions of coordinating conjunctions may be determined from their occurrences in coordinations of sentences A and B; as we have seen, these occurrences are completely determined by the two sentences coordinated and may thus be identified with the ordered pairs (A,B). Hence, following the extensional version of FFP, the extension of a conjunction comes out as a (binary) truth table, i.e. a function assigning truth values (= extensions of coordinations) to pairs of truth values (= extensions of the coordinated sentences). In particular, the extension of and assigns the truth value of any sentence of the form A and B to the pair consisting of the truth value of A and the truth value of B, i.e. it assigns 1 to the pair (1,1) and 0 to the other three pairs of truth values. In a similar vein, setting out with simple predications, the extensions of predicates come out as functions assigning truth values (= sentence extensions) to individuals (= term extensions). In other words, predicate extensions turn out to be characteristic functions of sets of individuals, i.e. functions from individuals to truth values. This result may in turn be used to obtain the extensions of transitive verbs, which come out as curried binary relations, i.e. functions assigning characteristic functions (= predicate extensions) to individuals (= the extensions of terms in object position).
[NB: The characteristic function of a set M of individuals is that function f that assigns 1 to any member of M and 0 to all other individuals:

f = {(u, 1) | u ∈ M} ∪ {(u, 0) | u ∈ U\M},

where U is the set of all individuals. Given a binary relation R among individuals, i.e. a set of ordered pairs of individuals, the curried version of R is the function that assigns the characteristic function of {y ∈ U | (y, x) ∈ R} to any individual x ∈ U. It is not hard to see that sets of individuals and their characteristic functions stand in a one-one correspondence; similarly, binary relations among individuals correspond to their curried versions. Given this connection, we will not always distinguish between the extensions of predicates and transitives and the corresponding sets and binary relations.]
In a similar vein, the extensions of quantifier phrases can be identified as (characteristic functions of) sets of predicate extensions; and once nouns are regarded as synonymous with predicates, the extensions of determiners turn out to be (curried) relations among predicate extensions.
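These constructions are easy to spell out over a toy universe. The following sketch is purely illustrative: the Python encodings of characteristic functions, currying, the truth table, and the sample extensions are assumptions of the example, not part of the theory.

    U = {"ann", "bob", "carl"}                  # the set of all individuals
    PERSONS = {"ann", "bob"}

    def char_fn(M):
        # Characteristic function of M: 1 on members of M, 0 on U \ M.
        return lambda u: 1 if u in M else 0

    def curry(R):
        # Curried version of a binary relation R (a set of pairs (y, x)):
        # assigns to each x the characteristic function of {y | (y, x) in R}.
        return lambda x: char_fn({y for (y, x2) in R if x2 == x})

    AND = {(1, 1): 1, (1, 0): 0, (0, 1): 0, (0, 0): 0}   # binary truth table

    def everybody(pred):
        # Holds of a predicate extension iff the predicate holds of every
        # person, i.e. iff the corresponding set is a superset of PERSONS.
        return 1 if all(pred(u) == 1 for u in PERSONS) else 0

    sleeps = char_fn({"ann", "bob", "carl"})    # a sample predicate extension
    knows = curry({("ann", "bob")})             # a sample transitive-verb extension

    print(everybody(sleeps))             # 1: its set is a superset of PERSONS
    print(AND[(1, everybody(sleeps))])   # 1
    print(knows("bob")("ann"))           # 1: ann stands in the relation to bob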
In particular, the members of the extension of everybody are precisely the supersets of the set of all persons; the extension of no is the relation of disjointness among sets of individuals; etc. Since the truth value of a sentence with a
quantifier in subject position may depend on the value of the predicate extension for nameless individuals, the aforementioned idealised generalisation of predicate extensions to functions operating on arbitrary individuals turns out to be crucial for this step.
The above extensional adaptation is much more limited in its application than FFP in general: it only works as long as extensions behave compositionally; otherwise it is bound to fail; and there are a number of environments that cannot be analyzed by compositionally combining extensions. Still, a considerable portion of the grammatical constructions of English is extensional in that the extensional version of FFP may be applied to them, thus extending the notion of extension to ever more kinds of expressions. In the present and the following section we will be exclusively concerned with extensional constructions and, for convenience, pretend that all constructions are extensional.
The generalisation of extensions to expressions of (almost) arbitrary categories leads to a natural theoretical construction of meanings in terms of extensions and circumstances. The key to this construction lies in a generalisation of the EBP, which comes for free for those expressions whose extensions have been constructed in the way indicated, i.e. using the extensional version of FFP. To see this, we have to be more specific about truth conditions and reference conditions (in the sense intended here): since they are truth values and referents as depending on circumstances, we will henceforth take them to be functions assigning the former to the latter. This identification obviously presupposes that, whatever circumstances may be, they collectively form a set that may serve as the domain of these functions. Though we will not be entirely specific as to the nature of this set, we assume that it is vast, and that its members do not miss a detail:
Circumstances may be as hypothetical as can be. Circumstances are as specific as can be.
Vastness reflects the raison d’être of circumstances in semantic theory: according to the EBP, they serve to differentiate the meanings of non-synonymous sentences and terms by providing them with distinct truth values and referents, respectively. In order to fulfill this task in an adequate way, actual circumstances – those met in the actual world – obviously do not suffice, as a random example shows: (5) The physicist who discovered the X-rays died in Munich. (6) The first Nobel laureate in physics died in Munich. As the educated reader knows, it so happens that the German physicist Wilhelm Conrad Röntgen discovered a phenomenon known as X-rays in 1895, was subsequently awarded the first Nobel prize in physics in 1901, and died in Munich in 1923. Hence the sentences (5) and (6) are both true; and their subjects refer to the same person. In fact, (5) and (6) are true under any actual circumstances; likewise, their subjects share their referent under any actual circumstances. Nevertheless, (5) and (6) are far from being synonymous, and neither are the definite descriptions in their subject position. In order to reflect these semantic differences in their truth and reference conditions, respectively, circumstances are called for under which (5) and (6) differ in truth value and, consequently, their subjects do not corefer. Such circumstances are not hard to imagine; the inventive reader
is invited to concoct his or her pertinent favourite scenario. Since it is hardly foreseeable what kinds of differences are needed in order to separate two non-synonymous sentences or terms by circumstances under which their extensions do not coincide, we will assume that any conceivable circumstances count as evidence for non-synonymy, however far-fetched or bizarre they may be. The following pair, inspired by Macbeath (1982), is a case in point:

(7) Dr Who ate himself.
(8) Dr Who is his own father.

Lest both (7) and (8) come out as false for all circumstances and thus as synonymous, there had better be circumstances that contradict current science, thereby reflecting the meaning difference between the two sentences. Detail is an assumption made mostly for convenience and definiteness; but once made, it cannot be ignored. It captures the idea that circumstances are chosen in such a way as to determine the truth values of any sentence whatsoever. As far as actual circumstances are concerned, this is a rather natural assumption. My present circumstances are such that I am sitting in a train traveling through northern regions of Germany at a fairly high speed. I have no idea what precisely the speed is, nor do I know where exactly the train is right now, how many passengers are on it, how old the engineer is, etc. But all of these details, and innumerable others, are resolved by these circumstances, which are as specific as can be. The point of Detail is that counterfactual (= non-actual) circumstances are just as specific. Hence if I had been on a space shuttle instead of riding a train, the circumstances that I would have been in would have included a specific velocity, a specific number of co-passengers, etc. To be sure, these details could be filled in in different ways, all of which correspond to different counterfactual circumstances. Given Detail, then, counterfactual circumstances are unlike the ordinary conception of worlds of imagination or fiction: stories and novels usually, nay always, leave open a lot of details – like the exact number of hairs on the protagonist’s head, etc. If anything, worlds of fiction correspond to sets of circumstances; for instance, ‘the world of Sherlock Holmes’ corresponds to the set of (counterfactual) circumstances that are in accordance with whatever Conan Doyle’s stories say, and disagree on all kinds of unsettled details. As was said above, we assume that the totality of all actual and counterfactual circumstances forms a rather large set which is called Logical Space; cf. Wittgenstein (1922: 3.4 & passim). We will follow logical and semantic tradition and refer to the elements of Logical Space as possible worlds rather than circumstances, even though the term is slightly misleading in that it not only suggests lack of detail (as was just noted) but also a certain grandeur, which the members of Logical Space need not have; it will be left open here whether possible worlds are all-inclusive agglomerations of facts, or whether they could also correspond to more mundane (!), medium-sized situations. What is important is that Logical Space is sufficiently rich to differentiate the meanings of arbitrary non-synonymous sentences and terms. Yet even the macroscopic Vastness of Logical Space and the microscopic Detail of its worlds cannot guarantee that any two sentences that appear to differ in meaning also differ in their truth conditions.
Thus, e.g., (9)–(10) are true of precisely the same possible worlds, but arguably differ in meaning:
(9) General Beauregard Lee is a woodchuck.
(10) General Beauregard Lee is a groundhog, and either he lives in Georgia, or he does not live in Georgia.

(9) and (10) are logically equivalent, i.e. their truth values coincide in all possible worlds. By definition, logical equivalence marks the limit of truth-conditional analysis of (sentence) meaning: the relation holds between two sentences if no possible worlds, however remote and however detailed they may be, can distinguish them. Consequently, whatever the felt differences between logically equivalent sentences like (9) and (10), they cannot be accounted for by truth-conditional semantics alone, because they are not reflected in Logical Space. We define the intension of an expression A as the function that assigns to each possible world the extension of A at (i.e.: for) that world. Since the intensions of sentences are functions characterising sets of possible worlds, they may be construed as forming a part of the power set of Logical Space. Using standard terminology, we will refer to the members of this power set, i.e. the subsets of Logical Space, as propositions and say that a sentence S expresses a proposition p just in case p is (or, more precisely: is characterised by) the intension of S. Like all power sets, the set of propositions has a straightforward algebraic structure induced by the familiar Boolean operations (union, intersection, and complement) and the concomitant subset relation; this is why the power set of Logical Space is also known as the algebra of propositions. As far as sentences and terms are concerned, the EBP says that synonymy may be equated with sameness of intension. A straightforward, though tedious, inductive argument shows that this equation generalises to arbitrary expressions A with extensions that have been constructed by applying (the extensional version of) FFP; for reasons of space we leave this for the reader to verify. We will end this section with a typical analysis of a simple example within the framework of possible worlds semantics:

(11) Every boy fancies Mary and Jane pouts.

The truth conditions of (11) will be specified by spelling out what it takes to make (11) true at an arbitrary world w (which we will now keep fixed). The lexical starting points should be clear from the above remarks and observations: the extension of the conjunction is a truth table; the extensions of the proper names are their bearers; those of the noun and the intransitive verb are (characteristic functions of) sets of individuals; the transitive verb has a (curried) binary relation as its extension; and the extension of the quantificational determiner is a (curried) relation between (characteristic functions of) sets of individuals – to wit, the relation of subsethood. Writing ‘‖A‖w’ for the extension of an expression A at world w, we thus have:

(12) a. ‖every‖w = λP. λQ. ⊢P ⊆ Q⊣
 b. ‖boy‖w = λx. ⊢x is a boy in w⊣
 c. ‖fancies‖w = λx. λy. ⊢y fancies x in w⊣
 d. ‖Mary‖w = Mary; ‖Jane‖w = Jane
 e. ‖and‖w = λu. λv. u × v
 f. ‖pouts‖w = λx. ⊢x pouts in w⊣
A few words on the notation used in the equations (12) are in order:
– As usual in formal semantics, ‘λx. ...x...’ denotes the function assigning to x whatever ‘…x…’ denotes.
– If ‘…’ is a statement, then ‘⊢...⊣’ is its truth value, i.e. ‘⊢...⊣’ is short for ‘the truth value that is identical to 1 just in case …’.
– In (12a), relations are identified with characteristic functions and, moreover, the succession order of arguments follows the surface bracketing of quantifier phrases, according to which the first argument (‘P’) corresponds to the extension of the noun.
– (12e) exploits the fact that truth values are numbers that can be subjected to arithmetical operations – in this case multiplication.
– For simplicity, the obvious temporal dependence of the right-hand sides of the equations (12b) and (12c) has been suppressed and may be thought of as being supplied by the utterance context.

Since the constituents of (11) all constitute primary occurrences of one of their parts, their extensions can be derived by functional application:

(13) a. ‖every boy fancies Mary‖w = ‖every‖w (‖boy‖w) (‖fancies‖w (‖Mary‖w)) = ⊢{x | x is a boy in w} ⊆ {y | y fancies Mary in w}⊣
 b. ‖Jane pouts‖w = ‖pouts‖w (‖Jane‖w) = ⊢Jane pouts in w⊣
 c. ‖every boy fancies Mary and Jane pouts‖w = ‖and‖w (‖Jane pouts‖w) (‖every boy fancies Mary‖w) = ⊢Jane pouts in w⊣ × ⊢{x | x is a boy in w} ⊆ {y | y fancies Mary in w}⊣
– i.e. the product of the two truth values determined in (13a) and (13b), which is 1 just in case Jane pouts in w and the boys all fancy Mary in w. This appears to be a correct characterisation of the truth conditions (literally) expressed by (11).
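The compositional recipe behind (12)–(13) can be made concrete in a few lines of code. The following Haskell sketch fixes a toy universe and one world, hard-wiring the worldly facts into the lexical entries; Bool stands in for the truth values {0, 1}, and the individuals and facts (who is a boy, who fancies whom) are invented for illustration only and are no part of the fragment itself.

```haskell
-- A minimal executable rendering of (12)-(13), assuming a toy universe
-- and one fixed world w baked into the lexical entries.
type T = Bool                          -- truth values (type t)
data Ind = Mary | Jane | Bill | Tom    -- individuals (type e)
  deriving (Eq, Enum, Bounded, Show)

univ :: [Ind]
univ = [minBound .. maxBound]

-- lexical extensions at the fixed world w, cf. (12)
boy :: Ind -> T                        -- type (e,t)
boy x = x `elem` [Bill, Tom]

fancies :: Ind -> Ind -> T             -- type (e,(e,t)); first argument = object
fancies x y = (y, x) `elem` [(Bill, Mary), (Tom, Mary)]

pouts :: Ind -> T
pouts x = x == Jane

every :: (Ind -> T) -> (Ind -> T) -> T -- subsethood, cf. (12a)
every p q = all (\x -> not (p x) || q x) univ

and' :: T -> T -> T                    -- truth-value 'multiplication', cf. (12e)
and' = (&&)

-- compositional derivation of (11) by functional application, cf. (13)
sentence :: T
sentence = and' (pouts Jane) (every boy (fancies Mary))

main :: IO ()
main = print sentence                  -- True in this toy world
```

Note that the code mirrors the bracketing of (13) exactly: the determiner extension applies to the noun extension first, and the result applies to the predicate extension.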
2.2. Material models

As long as they can be derived compositionally from the extensions of their immediate parts, the extensions of complex expressions are fully determined by the extensions of their ultimate parts and the way the latter are combined. So in order to specify the extensions of arbitrary (extensional) expressions, it suffices to specify (a) the extensions of all (extensional) lexical expressions and (b) the way the (extensional) grammatical constructions combine them. If these constructions constitute the primary contexts of expressions analysed in terms of (the extensional version of) FFP, (b) is always a matter of functional application. For instance, since the primary occurrences of predicates P are simple predications T P, the extensions of the latter come out as the result of applying the extensions of the former to those of the subjects T:

(14) ‖T P‖w = ‖P‖w (‖T‖w)
Other (extensional) constructions may require some ingenuity on the semanticist’s part in order to determine the precise way in which extensions combine. Quantified objects are an infamous case in point. As the reader may verify, for any possible world w, the extension of a predicate of the form V Q, where V is a transitive verb and Q is a quantifier phrase (in direct object position), can be given by the following equation:

(15) ‖V Q‖w = λx. ‖Q‖w (λy. ‖V‖w (y)(x))

Like (14), equation (15) completely specifies the extension of a certain kind of complex expression in terms of the extensions of its immediate parts; and it does so in a perfectly general manner, for arbitrary worlds w. By definition, this is so for all extensional constructions, which are extensional precisely in that the extensions of the expressions they combine determine the extensions of the resulting phrases, thus giving rise to general equations like (14) or (15). In particular, then, the combinatorial part (b) of the specification of extensions does not depend on the particular circumstances envisaged. It thus turns out that the world dependence of extensions is entirely a matter of lexical specification (a). Hence the rôle worlds play in the determination of extensions can be seen as being restricted to the extensions of lexical expressions – the rest is compositionality, captured by general equations like (14) and (15). To the extent that the determination of extensions is the only rôle possible worlds w play in (compositional extensional) semantics, they could be identified with functions assigning extensions to lexical expressions. If A is a lexical expression, we would thus have:

(16) ‖A‖w = Fw(A)

In (16) Fw is a function assigning to any expression in its domain the extension of that expression at world w. What precisely is the domain of Fw? It has just been noted that it only contains lexical expressions – but does it include all of them? A brief reflection shows that this need not even be so; for some of the equations of the form (16) do not depend on a particular choice of w. For instance, as noted above, under any circumstances w the extension of a (sentence-) coordinating conjunction and is a binary function on truth values; the same goes for other ‘logical’ particles:

(17) a. ‖and‖w = λu. λv. u × v [= (12e)]
 b. ‖or‖w = λu. λv. (u + v) – (u × v)
 c. ‖not‖w = λu. 1 – u
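Since truth values are officially the numbers 0 and 1, the equations in (17) can be checked by brute force over the four argument combinations. A minimal sketch, with function names of our own choosing:

```haskell
-- Truth values as the numbers 0 and 1; the connectives of (17)
-- rendered as arithmetic on Int, checked exhaustively over {0,1}.
andW, orW :: Int -> Int -> Int
andW u v = u * v                 -- (17a): multiplication is conjunction
orW  u v = (u + v) - (u * v)     -- (17b): addition minus overlap is disjunction

notW :: Int -> Int
notW u = 1 - u                   -- (17c): complement in {0,1} is negation

main :: IO ()
main = print [ (u, v, andW u v, orW u v, notW u) | u <- [0, 1], v <- [0, 1] ]
```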
The equations in (17) imply that, in any worlds w and w', the extensions of and, or, and not remain stable: ‖and‖w = ‖and‖w', ‖or‖w = ‖or‖w', and ‖not‖w = ‖not‖w'. This being so, the specifications of the world-dependent extensions of lexical expressions may safely skip such logical words. A similar case can be made for quantificational determiners and the is of identity:

(18) a. ‖every‖w = λP. λQ. ⊢P ⊆ Q⊣ [= (12a)]
 b. ‖no‖w = λP. λQ. ⊢P ∩ Q = ∅⊣
 c. ‖is‖w = λx. λy. ⊢x = y⊣
Again, the extensions specified in (18) do not seem to depend on the particular world w. However, there is a subtle difference to the equations in (17), which comes out by closer inspection of the ranges of the λ-bound variables: whereas ‘u’ and ‘v’ in (17) always stand for the truth values 0 and 1, in (18) ‘x’ and ‘y’ stand for individuals and ‘P’ and ‘Q’ for sets of them – but which, and how many, individuals there are, depends on the particular world w. As a consequence, the equations in (18) do depend on w, a fact which is obscured by our loose notation. On the other hand, the latter three equations only depend on what the set of all individuals in the world w is. In other words, their truth does not depend on the particularities of the world but (at most) on the domain of individuals. In this respect, logical words as analysed in (17) and (18) are atypical: the extensions of the vast majority of lexical expressions do vary across Logical Space and also depend on more than what the domain of individuals is. As far as their extensions are concerned, lexical equations like (19), rather than (17) and (18), represent the typical case:

(19) ‖boy‖w = λx. ⊢x is a boy in w⊣ [= (12b)]
One may thus distinguish three kinds of lexical expressions: (i) truth-functional ones, whose extension remains the very same across Logical Space; (ii) combinatorial ones, whose extension depends only on what the domain of individuals is; and (iii) non-logical ones, whose extensions may depend on all sorts of worldly facts. It has become customary to reflect this distinction in the assignments of world-dependent extensions to lexical expressions:

Definition
Given a possible world w and a language L, the material model (for L based on w) is the pair Mw = (Uw, Fw) consisting of the domain of individuals Uw in w and the lexical interpretation function Fw which assigns to every non-logical lexical expression A of L the extension of A at w.

The above definition is to be understood as presupposing a prior and independent specification of the non-logical vocabulary NL of the language L under scrutiny – and thus implies that the interpretation functions of any two material models share their domain, which is precisely the set NL. It should then be clear how the extensions of arbitrary expressions A of a language L can be characterised in terms of, and as depending on, material models Mw = (Uw, Fw). This characterisation can be given in the form of an inductive definition, which starts out with the lexical material and then moves on step by step, following the grammatical construction principles of L, which derive (analysed and disambiguated) complex expressions from their (immediate) parts, thereby creating an increase in syntactic complexity. The following sample clauses offer a glimpse of the overall structure and content of such a definition for an extensional fragment E of English:

(20) For any expression A of E and any material model Mw = (Uw, Fw) for E, the extension of A relative to Mw – |A|Mw – is determined by the following induction (on the grammatical complexity of A):
(i-a) |and|Mw = λu. λv. u × v, where u∈{0, 1} and v∈{0, 1}
 …
(ii-a) |every|Mw = λP. λQ. ⊢P ⊆ Q⊣, where P⊆Uw and Q⊆Uw
 …
(iii) |A|Mw = Fw(A), if A∈NE
(iv-a) |D N|Mw = |D|Mw (|N|Mw), if D N is a quantifier phrase, where D is a quantificational determiner and N is a count noun;
 …

In order to complete (20), clauses (i) and (ii) would have to take care of all logical words of E. We will now give a semantic characterisation of logical words that helps to draw the line between the non-logical part NE – or NL in general – and the rest of the lexicon. We have already seen that one characteristic feature of logical words is that their extension is largely world-independent: it may depend on which individuals there are in the world but it does not depend on any particular worldly facts. However, having a world-independent intension (even stricto sensu) is not sufficient for being a logical word. Indeed, there are good reasons for taking the intensions of proper names as constant functions over Logical Space; cf. Kripke (1972), where terms with constant intensions are called rigid designators. Still, a name like Jesus can hardly be called logical even if its bearer turned out to be present in every possible world. What, then, makes a word, or an expression in general, logical? The (rather standard) answer given here rests on the intuition that the extension of a logical word can be described in purely structural terms. As a case in point, the determiner no is logical because its extension may be described as the relation that holds between two sets of individuals just in case they do not overlap – or, in terms of characteristic functions: that function which, successively applied to two characteristic functions (of sets of individuals), yields (the truth value) 1 just in case these functions do not both assign 1 to any individual. If one thinks of functions as configurations of arrows leading from arguments to values, then the extension of no – and that of a logical word in general – can be described solely in terms of the abstract arrangement of these arrows, without mentioning any particular individuals or other extensions – apart from the truth values. In other words, the descriptions do not depend on the identity of the individuals, which may be anything – or replaced by any other individuals. The idea, then, is to characterise logical words as having extensions that are stable under any replacement of individuals. In order to make this intuition precise, a bit of notation should come in handy. It is customary to classify the extensions of expressions according to whether they derive (i) from the EBP, or (ii) by application of FFP: (i) the extensions of sentences are said to be of type t, those of terms are of type e; (ii) if the extension of an expression operates on extensions of some type a resulting in extensions of some type b, it is said to be of type (a,b). In other words, (i) t is the type of truth values; e is the type of individuals (or entities); (ii) (a,b) is the type of (total) functions from a to b. More precisely, t is a canonical label of the set of truth values, etc.; ‘t’ is mnemonic for truth value, ‘e’ abbreviates entity. In this notation, the extensions of sentences, terms, coordinating conjunctions, predicates, transitives, quantificational phrases, and determiners are of types t; e; (t,(t,t)); (e,t); (e,(e,t)); ((e,t),t); and ((e,t),((e,t),t)), respectively. It should be noted that, unlike the extensions of the expressions, their types remain the same throughout Logical Space (and thus across all material models). The function τL assigning to each expression of a
language L the unique type of its extensions is called the type assignment of L; we take τL to be part of the specification of the language L (and will usually suppress the subscript). Now for the characterisation of logicality. Given (not necessarily distinct) possible worlds w and w', a replacement (of w by w') is a bijective function from the domain of individuals of w to the domain of individuals of w' (which must therefore have the same cardinality). Replacements may then be generalized from individuals (of type e) to all types of extensions: given a replacement ρ (from w to w'), the following recursive equations define corresponding functions ρa, for each type a:
– ρe = ρ;
– ρt = {(0,0), (1,1)} [= λx. x, where ‘x’ ranges over truth values];
– ρ(a,b) = λf. {(ρa(x), ρb(y)) | f(x) = y} [where ‘f’ ranges over functions of type (a,b)].

In other words, the generalised replacement leaves truth values untouched – because they define the structure of extensions like characteristic functions – and maps functional extensions to corresponding functions, replacing arrows from x to y by arrows between substitutes; it is readily seen that, for any type a, ρa is the identical mapping on the extensions of type a if (and only if) ρ is the identical mapping on the domain of individuals. Given the above generalisation of replacements from individuals to objects of arbitrary types, a logical word A can be defined as one whose extensions cannot be affected by replacements. More precisely, if f is an intension of type a (i.e. a function from W to extensions of type a), then f is (replacement-) invariant just in case for any worlds w and w' and any replacements ρ and ρ' of w and w', respectively, by some world w'', it holds that ρa(f(w)) = ρ'a(f(w')); and logical words may then be characterised as lexical expressions A with invariant intensions: ρτ(A)(‖A‖w) = ρ'τ(A)(‖A‖w'), for any worlds w and w' and replacements ρ and ρ' of w and w', respectively, by some world w''. The definition implies that ρτ(A)(‖A‖w) = ‖A‖w, whenever A is a logical word and ρ is a replacement from a world w to itself (or any world with the same domain of individuals) – which is a classical criterion of logicality, going back to Lindenbaum & Tarski (1935), and rediscovered by a number of scholars since; cf. MacFarlane (2008: Section 5) for a survey. Extensive treatments of replacements in (models of) Logical Space can be found in Fine (1977) and Rabinowicz (1979). Derivatively, we will also call extensions of type a invariant if they happen to be the values of invariant intensions of type a. Invariant extensions are always of a peculiar, combinatorial kind. In particular, and ignoring all too small universes (of cardinalities 1 and 2), extensions of type e are never invariant; there are only two invariant extensions of type (e,t), viz. the empty one and the universe; and four of type (e,(e,t)), viz. identity, distinctness, and the two trivial (empty and universal) binary relations; extensions of type ((e,t),t) are invariant just in case they classify sets of individuals according to their cardinalities; etc. As a consequence, the extensions of logical words are severely limited by the invariance criterion. Still, it takes more for an intension to be invariant than just having invariant extensions; in particular, a function that assigns different invariant extensions to worlds with the same domain cannot be invariant.
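The recursive lifting of a replacement through the types is itself a small algorithm. The following Haskell sketch implements it over a finite universe encoded as the integers 0 to n−1, together with the derived invariance test for extensions at a fixed world (i.e. stability under replacements from a world to itself); the encoding and all names are ours, invented for illustration.

```haskell
-- A sketch of generalised replacements: extensional types and extensions
-- over a finite universe [0..n-1], with functions represented by their graphs.
data Type = E | T | Fun Type Type deriving (Eq, Show)
data Ext  = Ind Int | Tv Bool | Graph [(Ext, Ext)] deriving (Eq, Show)

-- rho_e = rho, rho_t = id, and an arrow from x to y is replaced by an
-- arrow between the substitutes of x and y
lift :: (Int -> Int) -> Type -> Ext -> Ext
lift rho E (Ind i)           = Ind (rho i)
lift _   T tv                = tv
lift rho (Fun a b) (Graph g) = Graph [ (lift rho a x, lift rho b y) | (x, y) <- g ]
lift _   _ e                 = e

-- graphs are compared as unordered sets of arrows
eqExt :: Ext -> Ext -> Bool
eqExt (Graph g) (Graph h) = sub g h && sub h g
  where sub g' h'           = all (\p -> any (agree p) h') g'
        agree (x,y) (x',y') = eqExt x x' && eqExt y y'
eqExt x y = x == y

-- invariance over a fixed universe: stability under all permutations
invariant :: Int -> Type -> Ext -> Bool
invariant n a x = all (\rho -> eqExt (lift (rho !!) a x) x) (perms [0 .. n-1])
  where perms [] = [[]]
        perms xs = [ p : r | p <- xs, r <- perms (filter (/= p) xs) ]

main :: IO ()
main = print ( invariant 2 (Fun E T) (Graph [(Ind 0, Tv True), (Ind 1, Tv True)])
             , invariant 2 (Fun E T) (Graph [(Ind 0, Tv True), (Ind 1, Tv False)]) )
-- (True, False): the universal set is invariant, a singleton is not
```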
More generally, if the domains U and U' of two material models M and M' have the same cardinality, the invariance condition on logical words of the second kind (ii) ensures that, intuitively speaking, they have analogous extensions. Thus, e.g., a determiner whose extension relative to M is the subset-relation on U will denote the subset-relation on U' in M'.
However, if U and U' have different cardinalities, logicality does not exclude that the extension varies between, say, subsethood on U and disjointness on U'. Analogous observations can be made in connection with the logical operations associated with syntactic constructions. Unfortunately, there seems to be no generally agreed identification procedure of extensions across varying domains to exclude such cardinality-sensitive shifts; cf. Machover (1994: 1081ff); Casanovas (2007). We will briefly return to this point in Section 3.1. The reader may now verify for him- or herself that all extensions specified in clauses (i) and (ii) of (20) are indeed invariant. It may also be noted that the combinations specified in clause (iv) are invariant once they are themselves construed as extensions: the compositional combination of extensions generally proceeds by way of purely structural operations, corresponding to invariant functional extensions. The reader is reminded that we restrict attention to extensional constructions; non-extensional compositionality will be addressed in the following section. The gaps in (20), then, are meant to be filled in as follows. The extensions of truth-functional logical words, i.e. those whose type consists of ts, commas, and brackets only, are specified in clause (i); and this specification is entirely independent of any particular world. The extensions of all other logical words (which, for lack of a better term, we continue to call combinatorial) are specified in clause (ii); their intensions are invariant, and their extensions in general depend on the domain of individuals. Finally, the ways in which the extensions of complex expressions depend on the extensions of their immediate parts correspond to invariant (and generally domain-dependent) functional extensions and are specified in clause (iv), according to the syntactic constructions involved. This ends our general characterisation of how extensions are determined according to material models. Using the obvious correspondence with (12), (20) may then be completed to recapitulate the above analysis of (11), replacing possible worlds w with corresponding material models Mw and arriving at the following equation:

(21) |every boy fancies Mary and Jane pouts|Mw = ⊢Jane pouts in w⊣ × ⊢{x | x is a boy in w} ⊆ {y | y fancies Mary in w}⊣

(21) easily follows from [a completion of] (20). In fact, once the relevant inductive clauses are adapted in the fashion illustrated above, equation (22a) holds quite generally, for any expression A of a language L and any world w and material model Mw; and since (22a) holds for all expressions A (of a given language L), (22b) is true of all expressions A and B (of the same language):

(22) a. |A|Mw = ‖A‖w
 b. ‖A‖w = ‖B‖w iff |A|Mw = |B|Mw

In other words, two expressions are synonymous just in case their extensions coincide across all material models. As a consequence, material models can also be used to characterise sense relations in quite the same way as possible worlds were above. For instance, a sentence S implies a sentence S' in the sense of the above definition just in case the intension of the former is a subset of that of the latter, i.e. {w | ‖S‖w = 1} ⊆ {w | ‖S'‖w = 1}, which
by (22a) is the case iff {w | |S|Mw = 1} ⊆ {w | |S'|Mw = 1}. Similar arguments apply to the other sense relations defined above. Let us define the L-intension |A| of an expression A (of a language L) as the function that assigns A’s extension (according to Mw) to each material model: |A|(Mw) = |A|Mw, i.e. |A| = λMw. |A|Mw. Observation (22) shows that L-intensions are as good as intensions when it comes to determining extensions; and we have seen that they may also be used to reconstruct sense relations defined via intensions, basically on account of (22b). Moreover, the Boolean structure of the set of propositions expressible in L (= those that happen to be the intensions of a sentence of L) turns out to be isomorphic to the Boolean structure of L-propositions (= the L-intensions of the sentences of L): as the patient reader may want to verify, the mapping from ‖S‖ to |S| is one-one (injective) and preserves algebraic (Boolean) structure in that ‖S‖ ⊆ ‖S'‖ iff |S| ⊆ |S'|, ‖S‖ ∩ ‖S'‖ is mapped to |S| ∩ |S'|, etc. The tight relation between Logical Space and the set of material models can also be gleaned from considering worlds that cannot be distinguished by the expressions of a given language L:

Definition
If w and w' are possible worlds and L is a language, then w is L-indistinguishable from w' – in symbols: w ≡L w' – iff ‖A‖w = ‖A‖w', for any expression A of L.

In general, L-indistinguishability will not collapse into identity. The special case in which it does will be marked for future reference:

Definition
A language L is discriminative iff no two distinct possible worlds w and w' are L-indistinguishable.

Hence, in a discriminative language L any two distinct possible worlds w and w' can be distinguished by some expression A: ‖A‖w ≠ ‖A‖w'; it is not hard to see that in this case the material models stand in a one-one correspondence with Logical Space. In particular, if L contains a term the referent of which at a given world is that world itself, L will certainly be discriminative; arguably, the definite description the world is such a term, thus rendering English discriminative. In any case, for almost all languages L, any two L-indistinguishable worlds give rise to the same material model:

(23) If w ≡L w', then Mw = Mw'.

As a consequence, material models stand in a one-one correspondence to the equivalence classes induced by L-indistinguishability (which is obviously an equivalence relation). Since L-intensions and material models are suited to playing the role of worlds in determining extensions and to characterising Boolean structure, one may wonder whether they could replace worlds and intensions in general. In other words, is it possible to do (extensional) semantics without Logical Space altogether? In particular, could meaning be defined in terms of material models instead of possible worlds? Ontological thrift seems to support this option: on the face of it, material models are rather down-to-earth mathematical structures as compared to the dubious figments of imagination that
possible worlds may appear to the semantic layman. However, this impression is mistaken: though material models are mathematical structures, they consist of precisely the stuff that possible worlds are made of, viz. fictional individuals under fictional circumstances. Still, replacing intensions with L-intensions may turn out to be, if not ontologically less extravagant, at least theoretically more parsimonious, sparing us a baroque theory of Logical Space and its structure. However, this is not obvious either:
– So far material models have been defined in terms of Logical Space, since the extension a material model assigns to an expression A depends on details about w. If material models are to replace worlds altogether, an independent characterisation of them would have to be given. For instance, we have seen that no possible world separates (9) from (10), particularly because there is no intensional difference between woodchuck and groundhog. Hence these two nouns have the same L-intension: this is a consequence of the above definition of a material model, relating it to the world w on which it is based; it is reflected in the general equation (20iii). Now, if the very notion of a material model is going to be defined independently of Logical Space, then the coincidence of |woodchuck|Mw and |groundhog|Mw in all material models Mw would have to be guaranteed in some other way, without reference to the worlds these models are based on. Hence some restriction to the effect that no model can assign different extensions to the nouns under consideration would have to be formulated. And even if in this particular case the restriction should appear unwelcome, there are other cases in which similar restrictions are certainly needed – like the infamous inclusion relation between the extensions of bachelor and unmarried. Arguably, at the end of the day the restrictions to be imposed on the material models amount to a theory of Logical Space; cf. Etchemendy (1990: 23f) for a similar point.
– Whereas Logical Space is absolute, material models are language-dependent. In particular (and disregarding neurotic cases), for any two distinct languages L and L', the sets of material models of L and of L' will not overlap. Hence if material models were to replace worlds in semantic theory, expressions from different languages would always have different intensions. In particular, sentences would never be true under the same circumstances, if circumstances corresponded to models. To make up for this obvious inadequacy, some procedure for translating material models across languages would be called for. For instance, a material model for English assigning a certain set of individuals to the noun groundhog would have to be translated into a material model for German assigning the same set to the noun Murmeltier. In general, then, circumstances would correspond to equivalence classes of material models of different languages. Obviously, at the end of the day the construction of these equivalence classes again recapitulates the construction of Logical Space; presumably, a one-one correspondence between material models and possible worlds requires an ideal ‘language’ of Logical Space, as envisaged by Wittgenstein (1922).

It thus appears that Logical Space can only be eliminated from (extensional) semantics at the price of a theory of something very much like Logical Space.
The upshot is that neither ontological objectionability nor theoretical economy is a sufficient motive for reversing the conceptual priority of Logical Space over the set of material models. On the other hand, this does not mean that material models ought to be dispensed with. On
the contrary, they may be seen as compressions of possible worlds, reducing them to the barest necessities of semantic analysis.
2.3. Intensionality

The adaptation of FFP to derive extensions for expressions other than sentences and terms only works in so-called extensional contexts, i.e. (syntactic) constructions in which extensionally equivalent parts may replace each other without affecting the extension of the host expression. However, a number of environments prove to be non-extensional. Clausal complements to attitude verbs like think and know are classical cases in point: their contribution to the extension of the predicate cannot be their own extension, which is merely a truth value; if it were, materially equivalent clausal complements (= those with identical truth values) would be substitutable salva veritate (= preserving the truth [value] of the host sentence) – which they are not. E.g., (24) and (25) may differ in truth value even if the underlined clauses do not.

(24) John thinks Mary is home.
(25) John thinks Ann is pregnant.

This failure of substitutivity obviously blocks the derivation of the extension of think via the extensional version of FFP: if it were a function f assigning the predicate extension to the extension of the complement clause, then, as soon as the complement clauses have the same truth value (and are thus co-extensional), f would have to assign the same extension to the predicates:

(26) If: ‖Mary is home‖w = ‖Ann is pregnant‖w, then: f(‖Mary is home‖w) = f(‖Ann is pregnant‖w).

The argument (26) shows that an attitude verb like think cannot have an extension f that operates on the extension (= truth value) of its complement clause. It does not show that attitude verbs do not have extensions, nor that FFP cannot be used to determine them. Rather, since FFP seeks to derive the extension of an expression in terms of the contribution made by its natural environment (or primary context), the lesson from (26) ought to be that this contribution must consist in more than a truth value. On the other hand, given Subst, it is safe to assume that it is the intension of the complement clause; cf. Frege (1892). After all, by the BP, any two intensionally equivalent sentences are synonymous and may thus replace each other in any (sentential) position – including non-extensional ones – without affecting the truth conditions of the host sentence. In particular, any two sentences of the forms T thinks S and T thinks S' (where T is a term) will be synonymous if the sentences S and S' have the same intension. But then at any possible world, the extensions of the predicates thinks S and thinks S' coincide, and thus so do their intensions. In other words, there can be no difference in the extensions of the predicates without a difference in the intensions of the complement clauses, which is to say that the former functionally depend on the latter. Following the spirit of FFP, then, one can think of the intensions of the embedded clauses as the contributions they
make to the extension of the predicate and thus take the extension of the attitude verb to be the function that assigns the extension of the host predicate to this contribution. Hence, at any world w, the extension of thinks comes out as a function that assigns sets of individuals to sets of possible worlds such that the following equation holds:

(27) ‖thinks‖w (‖S‖)(‖T‖w) = ‖T thinks S‖w

(27) illustrates a common strategy of assigning extensions to expressions that create non-extensional contexts (= those in which extensional substitution fails): given compositionality, they are taken to denote functions operating on the intensions of the expressions in these contexts. This is why non-extensional contexts are usually referred to as intensional. The above strategy of assigning extensions to expressions that create non-extensional contexts requires a richer system of types than the one introduced in the previous section (to which we will from now on refer as extensional types). More specifically, the extensions of expressions are classified according to whether they derive (i) from the EBP, or (ii) by application of FFP: (i) the extensions of sentences are of type t, those of terms are of type e; (ii) if the extension of an expression operates on extensions of some type a resulting in extensions of some type b, it is of type (a,b); and it is of type ((s,a),b) if it operates on intensions of expressions whose extensions are of type a, resulting in extensions of type b. Hence (i) t is the type of truth values; e is the type of individuals (or entities); (ii) (a,b) is the type of (total) functions from a to b; and (s,a) is the type of (total) functions from Logical Space to a. The notation goes back to R. Montague (1970); ‘s’ is reminiscent of sense, Frege’s (1892) term for (something close to) intension. In this notation, the extension of an attitude verb like think is of type ((s,t),(e,t)). Once the type of attitude verbs has been determined, the compositional derivation of the truth values of attitude reports is straightforward:

(28) ‖Jane doubts that every boy fancies Mary‖w
 = ‖doubts that every boy fancies Mary‖w (‖Jane‖w)
 = ‖doubts‖w (‖every boy fancies Mary‖)(‖Jane‖w)
… which is the truth value 1 just in case in world w, Jane stands in a certain relation Dw – the extension of doubt at w – to the set of worlds in which the boys form a subset of the individuals fancied by Mary. An adequate lexical analysis of attitude verbs should imply that standing in Dw to any set p of worlds is incompatible with standing in Bw to it, where Bw is the extension of believe:

(29) [λp. λx. ‖doubt‖w (p)(x) × ‖believe‖w (p)(x)] = ∅

Compositional derivations like (28) suggest that they can again be simulated with material models in lieu of worlds, as in the case of extensional constructions. In fact, this only requires models to also assign extensions of intensional types containing ‘s’ to certain lexical expressions (like attitude verbs) and to allow the extensions of complex expressions to be obtained by combining the extensions and/or intensions of their immediate parts. In other words, the general definition of the notion of a material model can be kept as is, under the assumption that the reference to types is adapted so as to include the intensional ones; and on top of the clauses in (20), the recursive procedure for determining
extensions relative to material models of Ê may also contain clauses like the following, where Ê is a more inclusive fragment of English than E:

(iv-c) |V S|Mw = |V|Mw (λw'. |S|Mw'), if V S is a predicate, where V is an attitude verb and S is a clausal complement.
Equations like (iv-c) show that the programme of eliminating Logical Space in favour of the set of material models cannot be upheld beyond the realm of extensional constructions. For, unlike the intension of the embedded clause, λw'. |S|Mw', the corresponding L-intension, λMw'. |S|Mw', cannot serve as an argument to the extension |V|Mw of the attitude verb: the latter is the value of a lexical extension assignment Fw, which itself is a component of Mw, which in turn is a member of the domain of λMw'. |S|Mw' – which cannot be, for set-theoretic reasons. It should be noted that the intension of the embedded clause is defined in terms of its extensions relative to all other material models. While this does not present any technical problem, it does make the material models less self-contained than their extensional counterparts, which contain all the information needed to determine the extensions of all expressions. If need be, this seeming defect can be remedied by tucking in all of Logical Space and its inhabitants as components of material models:

Definition
Given a possible world w* and a language L, the intensional material model (for L based on w*) is the quadruple M̂w* = (W, Û, w*, F̂) consisting of the set W of all possible worlds; the domain function Û assigning to each possible world w the domain of individuals Uw; the world w* itself; and the lexical interpretation function F̂ which assigns to every non-logical lexical expression A of L the intension of A.

Two complications arising from this definition are worth noting:
– Universal and existential quantification over possible worlds are prime candidates for logical operations of type ((s,t),t), the former yielding the truth value 1 only if applied to Logical Space itself, whereas the latter is true of all but the empty set of possible worlds. Since the extensions of lexical expressions need not be of extensional types, the logicality border needs some adjustment. For the time being we will leave this matter open, returning to it in Section 3.2 with a natural extension of the replacement approach to logicality.
– Lest clauses like (iv-c) should make reference to anything outside the intensional model, the lexical interpretation function is defined for all worlds w. As a consequence, two distinct intensional material models only differ in their 3rd component, and each determines the extensions of all expressions at all possible worlds. Hence the procedure for determining the extensions is more general than expected; its precise format will also be given in Section 3.2.

Intensional material models are rather redundant objects, which is why they are not used in real-life semantics; we have mainly defined them here for future reference. Once again, it is obvious that set theory precludes Logical Space as it appears in them, from
being replaced by the set of all intensional material models. On the other hand, its rôle might be played by the set of extensional material models, i.e. those that only cover the extensional part of L – containing only its lexical expressions with extensions of extensional types, and its extensional syntactic constructions.
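To see the intensional types in action, here is a toy Haskell rendering of the analysis around (27)–(28): worlds form a small enumerated set, a proposition of type (s,t) is a function from worlds to truth values, and the attitude verb takes such a proposition to a predicate extension. The quantification over a stipulated set of doxastic alternatives is one standard (Hintikka-style) way of cashing out a relation like Dw or Bw, not something the present chapter commits to; all worlds, individuals, and facts below are invented.

```haskell
-- A toy rendering of the intensional types of Section 2.3; the doxastic
-- state 'dox' and the sample intension are pure stipulation.
data World = W1 | W2 | W3 deriving (Eq, Enum, Bounded, Show)
data Ind   = John | Mary deriving (Eq, Show)

type Prop = World -> Bool              -- intensions of sentences: type (s,t)

worlds :: [World]
worlds = [minBound .. maxBound]

-- worlds compatible with what x believes in w (stipulated for illustration)
dox :: Ind -> World -> [World]
dox John _ = [W1, W2]
dox Mary _ = [W1]

-- extension of 'thinks' at w: type ((s,t),(e,t)), operating on an intension
thinks :: World -> Prop -> Ind -> Bool
thinks w p x = all p (dox x w)

-- a sample intension: true at W1 and W2 only (stipulated)
maryIsHome :: Prop
maryIsHome w = w /= W3

main :: IO ()
main = print [ thinks w maryIsHome John | w <- worlds ]  -- [True,True,True]
```

Note how substituting a materially equivalent but intensionally distinct complement would change the value of thinks, exactly the failure of extensional substitutivity diagnosed in (26).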
3. Model-theoretic semantics

3.1. Extensional model space

One rather obvious strategy for constructing (extensional) models within set theory is to start with material models and replace their individuals by (arbitrary) set-theoretic objects; as it turns out, this can be done by a straightforward generalisation of the replacements used in the above characterisation of logical words (cf. Section 2.2). However, since the individuals to be replaced also function as the building blocks of extensions, the latter will have to be generalised first. Given any non-empty set U, the U-extensions of type e are the elements of U; the U-extensions of type t are the truth values; and if a and b are extensional types, then the U-extensions of type (a,b) are the (total) functions from U-extensions of type a to U-extensions of type b. Hence, U-extensions (of extensional types) are to the elements of U what ordinary extensions are to the individuals of a given possible world; and clearly, ordinary extensions come out as Uw-extensions, where Uw happens to be the domain of individuals in w.

Definition
Given a language L, a formal model (for L) is a pair M = (U, F) consisting of a non-empty set U (= the universe of M) and a function F which assigns to every non-logical lexical expression A of L a U-extension of type τL(A).

Using the recursive procedure applied earlier, we can extend any bijection ρ between any (not necessarily distinct) non-empty sets U and Ũ of the same cardinality, to a family of functions ρa taking U-extensions to corresponding Ũ-extensions (where a is an extensional type): ρe = ρ; ρt(x) = x, if x is a truth value; and ρb(f(x)) = ρ(a,b)(f)(ρa(x)), whenever f and x are U-extensions of types (a,b) and a, respectively. It is readily verified that replacements assign structural analogues to the U-extensions they are applied to. For instance, if f is a U-extension of some type (a,t), then ρ(a,t)(f) is (i.e., characterises) the set of all Ũ-extensions of the form ρa(x), where x is a U-extension of type a characterised by f; in particular, and given that ρa is a bijection, f and ρ(a,t)(f) are of the same cardinality. It is also worth noticing that the images of invariant extensions under replacements are themselves invariant, in an obvious sense:

Observations
Let Uw be the domain of individuals of some world w, X an invariant Uw-extension of some type a, and ρ a bijection from Uw to a set U of the same cardinality. Then:
(*) ρ'a(X) = ρ"a(ρa(X)), for any bijections ρ' and ρ" from Uw and U to some set U' of the same cardinality, respectively;
(**) ρa(X) = ρ'a(X), for any bijection ρ' from Uw to U.

The proofs of (*) and (**) are rather straightforward and thus left to the readers.
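Anticipating the construction made precise below, the following sketch transports a single (e,t)-type lexical extension from a ‘material’ universe to a pure-set-style universe of numbers along a bijection ρ, following the recipe F*(A) := ρτ(A)(F(A)); the two-element universe, the one-word lexicon, and all names are invented for illustration.

```haskell
-- A sketch of transporting an (e,t)-type extension along a bijection rho:
-- rho_{(e,t)}(f) is f . rhoInv, i.e. an arrow from x to f x becomes an
-- arrow from rho x to f x. rhoInv must invert rho.
data Critter = Phil | Lee deriving (Eq, Show)  -- 'material' individuals

fWoodchuck :: Critter -> Bool                  -- F(woodchuck) over {Phil, Lee}
fWoodchuck _ = True

rho :: Critter -> Int                          -- a bijection onto {0, 1}
rho Phil = 0
rho Lee  = 1

rhoInv :: Int -> Critter                       -- its inverse
rhoInv 0 = Phil
rhoInv _ = Lee

-- F*(woodchuck) = rho_{(e,t)}(F(woodchuck))
fWoodchuckStar :: Int -> Bool
fWoodchuckStar = fWoodchuck . rhoInv

main :: IO ()
main = print (map fWoodchuckStar [0, 1])       -- [True,True]: structure preserved
```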
Generalised replacements may now be put to use to substitute material models by structurally identical set-theoretic objects. If M = (U, F) and M̃ = (Ũ, F̃) are formal models, a bijection ρ from U to Ũ is called a model-isomorphism (from M to M̃) just in case F̃(A) = ρτ(A)(F(A)), whenever A∈NL. If there exists a model-isomorphism from M to M̃, M is said to be isomorphic to M̃ – in symbols: M ≅ M̃. Obviously this relational concept is reflexive as well as symmetric and transitive. Clearly, even if two models are isomorphic, not every bijection between their domains is an isomorphism; but in general there exists more than one isomorphism between them. Given a formal model M = (U, F) and any bijection ρ from U to a set U* of the same cardinality, there exists a formal model M* = (U*, F*) such that ρ is a model-isomorphism from M to M* (and thus M ≅ M*): M* can be constructed by putting: F*(A) := ρτ(A)(F(A)), whenever A ∈ NL. Hence given a material model Mw = (Uw, Fw), an isomorphic formal model M = (U, F) can be constructed from any set U of the same cardinality as Uw, by choosing an arbitrary bijection ρ from Uw to U. In this case M is said to represent Mw. It should be noted that, if U = Uw, the resulting formal model need not be a material model, because the structure imposed on its individuals may go against their very nature. For example, if U happens to contain the world w itself, then replacing it with Tom, Dick or Harry could put w in the extension of bachelor, which may not be a genuine possibility for a world to be. If a material model for a language L is represented by a formal model, the recursive procedure for determining extensions of arbitrary expressions relative to the former carries over to the latter:

(30) If ρ is a model-isomorphism from the material model Mw = (Uw, Fw) (of a language L) to the formal model M = (U, F), then the extension [[A]]M of an expression A is determined by the following induction on A’s complexity:
(i) [[A]]M = |A|Mw [= ‖A‖w], if A is a truth-functional lexical item;
(ii) [[A]]M = ρτ(A)(|A|Mw), if A is a combinatorial lexical item;
(iii) [[A]]M = F(A) [= ρτ(A)(Fw(A))], if A ∈ NL;
(iv) [[A]]M = ρτ(A)(G(ρτ(B1)⁻¹([[B1]]M), ..., ρτ(Bn)⁻¹([[Bn]]M))), if A is a complex expression with immediate constituents B1, …, Bn such that |A|Mw = G(|B1|Mw, ..., |Bn|Mw).
The final clause relates to the compositional interpretation of L in terms of invariant functions that combine the extensions of the parts of complex expressions. In general, these semantic operations cannot be gleaned from the material models Mw or the extensions |A|Mw determined relative to them; rather, they must be specified independently, as was done above for an extensional fragment of English. Given such a specification, the extensions defined in (30) turn out to be independent of ρ; the verification of this fact – which essentially turns on the above observation (**) on invariance – is left to the reader. The following observation about isomorphic formal models M and M̃ (for any language L) is fundamental to model-theoretic semantics:

(31) Let ρ be a model-isomorphism from M to M̃. Then for all expressions A of L, ρτ(A)([[A]]M) = [[A]]M̃.
(31) follows from (30) by a straightforward inductive argument; clause (iv) makes use of the above observation (**) on invariance. The special case of (31) where M = M̃ = Mw
(for some world w) and ρ is the identical mapping on Uw, reveals that, as far as material models are concerned, the extensions defined in (30) are the familiar ones (and still independent of ρ):

(32) If Mw is a material model for a language L and A is an expression of L, then [[A]]Mw = |A|Mw = ‖A‖w. [cf. Section 2.2, (22a)]

Another immediate consequence of (31) concerns (declarative) sentences S (of a given language L), for which τL(S) = t and hence ρτ(S)([[S]]M) = [[S]]M:

(33) If the formal model M (for a language L) represents the material model Mw, then [[S]]M = [[S]]Mw, for all sentences S of L.

This observation becomes important in the set-theoretic reconstruction of Logical Space, to which we now turn. As usual in set theory, we will refer to those objects that may occur as elements of sets without being sets themselves, as urelements; and to sets that can be constructed without the aid of urelements (and whose existence is thus guaranteed by set-theoretic principles alone) as pure sets. In view of ontological reservations against possible worlds, such pure sets ought to replace the universes of models:

Definition
If L is a language, then (L’s) Ersatz Space is the class of all formal models M = (U, F) of L such that U is a pure set and M represents a material model.

We will henceforth refer to the elements of L’s Ersatz Space as ersatz models (for L). Since the inventory of Logical Space consists of urelements, no material model is an ersatz model. On the other hand, every material model is represented by some ersatz model and thus makes a vicarious appearance in Ersatz Space, which is why the latter as a whole may be seen as a representation of Logical Space. It should also be noted that, according to our definitions, the material models and the ersatz models do not exhaust the class of all formal models for a language; and that merely having a pure set as its universe does not make a formal model an ersatz model. Clearly, Ersatz Space is far too big to form a set and therefore calls for a background theory that includes proper classes alongside ordinary sets; cf. Mendelson (1997: 225ff) for a survey of a pertinent set-theoretical framework. Since we have been assuming that Logical Space (as we characterised it) is a set, one cannot expect the ersatz models (for a given language L) to stand in a one-one correspondence with possible worlds or material models. However, the very construction of Ersatz Space via representation suggests that the objects corresponding to worlds or material models are not the ersatz models themselves, but their isomorphicity-classes, i.e. the (proper) classes of the form:

(34) |M0|≅ := {M | M is an ersatz model for L & M ≅ M0}, where M0 is an ersatz model for L.

It is therefore natural to inquire into the relation between the classes characterised in (34) and the members of Logical Space. We have already seen that the latter may themselves be represented by material models, and that
this representation is a perfect match if the language L is discriminative. However, even if the material models correspond to the possible worlds in a one-one fashion, there is no guarantee that so do the classes of ersatz models in (34). More precisely, even if (35) holds of any worlds w and w' and the corresponding material models Mw and Mw', the analogous implication (36) about the latter and their representations in Ersatz Space need not be true:

(35) If w ≠ w', then Mw ≠ Mw'.
(36) If Mw ≠ Mw', then |Mw|≅ ≠ |Mw'|≅.

In other words, distinct material models may be represented by the same ersatz models – which will be the case precisely if L allows for distinct, but isomorphic material models in the first place. Discriminativity does not exclude this possibility: it only implies some extensional differences between any two material models; but these differences could be made up for by the replacements used in representing material models in Ersatz Space. In order to guarantee a perfect match between Logical Space and Ersatz Space, a stronger condition on L is needed than discriminativity. The natural candidate is completeness, which is defined like discriminativity, except that it is not based on indistinguishability but on the weaker notion of equivalence:

Definitions
If w and w' are possible worlds and L is a language, then w is L-equivalent to w' – in symbols: w ≈L w' – iff ‖S‖w = ‖S‖w', for any declarative sentence S of L. A language L is complete iff no two distinct possible worlds w and w' are L-equivalent.

Unlike L-indistinguishability, L-equivalence is not affected by replacements in that it only concerns the truth values of sentences rather than the extensions of arbitrary expressions. The following observations about arbitrary worlds w and w' and languages L are not hard to establish:

(37) a. If w ≡L w', then w ≈L w'.
 b. If L is complete, then L is discriminative.
 c. If L is complete and w ≠ w', then |Mw|≅ ≠ |Mw'|≅.

In effect, (37c) says that, via the isomorphicity classes, the Ersatz Space of a complete language matches Logical Space. Thus, completeness plays a similar rôle for the adequacy of ersatz models as does discriminativity in the case of material models. However, whereas discriminative languages are easy to find, completeness seems a much rarer property. Of course, if a language is incomplete, its ersatz models could still match Logical Space in the sense of (37c), but then again their isomorphicity classes may equally well contain, and thus conflate, representations of distinct worlds. However, since the differences between distinct worlds that correspond to the same ersatz models are – by definition – inexpressible in the language under investigation, this imperfect match
between Logical Space and Ersatz Space does not necessarily conflict with the general programme of replacing possible worlds with formal models. As far as its potential in descriptive semantics goes, then, Ersatz Space does seem to earn its name. However, as the reader will have noticed, its very definition still appears to be of no avail when it comes to soothing ontological worries: by employing the concept of representation, it depends on material models – and thus presupposes Logical Space. To overcome this embarrassment, a definition of Ersatz Space is needed that does not rely on Logical Space. The usual strategy is one of approximation, starting out from a maximally wide model space and gradually restricting it by eliminating those formal models that do not represent any material counterparts; the crucial point is that these restrictions be formulated without reference to Logical Space, i.e. in the language of pure set theory, or with reference only to urelements that are less dubious than the inhabitants of Logical Space. We will refer to the natural starting point of this enterprise as L’s Model Space, and identify it with the class of all pure models (for a given language L), i.e. all formal models whose universe is a pure set. It should be noted that the concept of a pure model only depends on L’s syntax and type assignment, both of which in principle may be given in terms of pure set theory. In order to exploit Model Space for semantic purposes, the procedure for determining the extensions of expressions must be generalised from ersatz models to arbitrary pure models. Hence, for any pure model M = (U, F), the extensions of logical expressions need to be specified, as well as the combinations of extensions corresponding to the syntactic constructions. As in the case of Logical Space and Ersatz Space, this should not pose any particular problems. In fact, these extensions and combinations are invariant and should only depend on the universe U. Thus, for those U that happen to be of the same cardinality as the universe Uw of some material model Mw = (Uw, Fw), their specifications may be taken over from Ersatz Space, which is bound to contain at least some pure model with universe U (and isomorphic to Mw). For all other models they would have to be suitably generalised. Thus, e.g., it is natural to define the extension of the English determiner every as characterising the subset relation on a given universe U, even if the latter is larger than any of the universes Uw encountered in Logical Space; similarly, the extension of a quantifier phrase D N, where D is a determiner and N a count noun, can be determined by functional application – which, suitably restricted, is an invariant U-extension of type (((e,t),((e,t),t)),((e,t),((e,t),t))); etc. At the end of the day, then, the formal models resemble their material and ersatz counterparts. In particular, the specification of extensions is strikingly similar:

(38) For any expression A of E and any formal model M = (U, F) for E, the extension of A relative to M – [[A]]M – is determined by the following induction (on the grammatical complexity of A):
(i-a) [[and]]M = λu. λv. u × v, where u∈{0, 1} and v∈{0, 1}
 …
(ii-a) [[every]]M = λP. λQ. ⊢P ⊆ Q⊣, where P⊆U and Q⊆U
 …
(iii) [[A]]M = F(A), if A∈NE
(iv-a) [[D N]]M = [[D]]M ([[N]]M), if D N is a quantifier phrase, where D is a quantificational determiner and N is a count noun;
 …
33. Model-theoretic semantics if D N is a quantifier phrase, where D is a quantificational determiner and N is a count noun; … … Given specifications of extensions in the style (38), one may start approximating the Ersatz Space of a language L by restricting its Model Space. A common strategy to this end is to eliminate models that have certain sentences come out false, viz. analytic sentences that owe their truth to their very meaning. As a case in point, one may require of appropriate models for English that the extension of (39) be the truth value 1. For the sake of the example it may be assumed that the extension of a predicative adjective is a set of individuals and that it is passed on to the predicate (Copula + Adjective); cf. Heim & Kratzer (1998: 61ff). (39) No bachelor is married. More generally, a set Σ of sentences of L may be used to characterise the class of all pure models M (for L) such that [[S]]M = 1, for all S∈Σ. Ever since Carnap (1952), sentences of a language L that are used to characterise the appropriateness of formal models L are called meaning postulates; and we will refer to a set of sentences used for this purpose as as a postulate system. Obviously, the truth of a meaning postulate like (39) may guarantee the truth or falsity of certain other sentences: (40) Every bachelor is not married. (41) Some bachelor is married. Once (39) is adopted as a meaning postulate, there is no need to also take on (40), or to rule out (41) separately, because the truth values of these two sentences come out as intended in any formal model for English relative to which (39) is true. Alternatively, if (40) is taken as a meaning postulate, (39) and (41) come out as desired. As this example suggests, when it comes to approximating Ersatz Space by meaning postulates, there need not be a general criterion for preferring one postulate system over another. And it is equally obvious that the effect of a meaning postulate, or a postulate system, may be achieved by other means. Thus, e.g., the effect of adopting (39) as a meaning postulate may also be obtained by the following constraint on appropriate formal models M = (U, F) for E: (42) F(bachelor) ∩ F(married) = ∅ In other words, the effect of adopting (39) or (40) as a meaning postulate is to establish a certain sense relation between the noun bachelor and the participle married (which we take to be a lexical item, if only for the sake of the example). The relation is known as incompatibility and holds between any two expressions just in case their extensions of some type (a,t) cannot overlap. For future reference, we also define the relation of compatibility as it holds between box and wooden – and in general between expressions A and B if their extensions may overlap. Interestingly, compatibilities do not have to be
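The interplay between the recursion (38), the meaning postulate (39) and the constraint (42) can also be rendered computationally. The following Haskell fragment is a purely illustrative sketch: the integer universe, the restriction of F to one-place predicates, and all identifiers are simplifying assumptions of our own and carry no theoretical weight.

import Data.List (intersect)

type Entity = Int

-- A pure formal model M = (U, F), with F restricted to
-- (e,t)-extensions, represented here as lists of entities.
data Model = Model { universe :: [Entity]
                   , lexF     :: String -> [Entity] }

-- (38ii-a): 'every' characterises the subset relation on U.
every :: [Entity] -> [Entity] -> Bool
every p q = all (`elem` q) p

-- Truth of the postulate (39), 'No bachelor is married', in M.
postulate39 :: Model -> Bool
postulate39 m = not (any (`elem` lexF m "married") (lexF m "bachelor"))

-- The corresponding constraint (42) on appropriate models.
constraint42 :: Model -> Bool
constraint42 m = null (lexF m "bachelor" `intersect` lexF m "married")

demo :: Model
demo = Model [1 .. 6] f
  where f "bachelor" = [1, 2]
        f "married"  = [3, 4]
        f _          = []

main :: IO ()
main = print (postulate39 demo, constraint42 demo)  -- (True,True)

On any such finite model the two tests agree, mirroring the observation that the postulate (39) and the constraint (42) have the same filtering effect on Model Space.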
Obviously, any meaning postulate S can be replaced by a corresponding constraint on formal models to the effect that S come out true; and given a procedure for determining the extensions of arbitrary expressions, this constraint can be given directly in terms of the extensions of the (non-logical) lexical expressions S contains. On the other hand, not every constraint on formal models needs to correspond to a meaning postulate, or even a postulate system; cf. Zimmermann (1985) for a concrete example. On top of restrictions on the range of possible extensions of lexical expressions, more global constraints may rule out pure models that do not belong to Ersatz Space for cardinality reasons.

In general, then, the set-theoretic reconstruction of Logical Space consists in formulating suitable constraints on the members of Model Space, i.e. the pure models. Taken together, these constraints define a class K of appropriate models (according to the constraints). Ideally, this class should coincide with Ersatz Space. The closer K gets to this ideal, i.e. the more it coincides with Ersatz Space in semantically relevant respects, the more descriptively adequate K will be. Thus, e.g., the set of K-valid sentences – those that are true according to all M∈K – should be the same as the set of sentences that are true throughout Logical Space. If two formal models are isomorphic, both represent precisely the same material models, and hence there is no reason for ruling out (or counting in) one but not the other. Appropriateness constraints are thus subject to the following meta-constraint on classes K of appropriate models:

(43) If M ≅ M', then M∈K iff M'∈K.

In mathematical jargon, (43) says that the class K of appropriate models is closed under isomorphism: if M and M' are isomorphic formal models (for some language L), then M is appropriate according to a given set of constraints just in case M' is appropriate according to the same constraints. Two things about (43) are noteworthy. First, as a consequence of (31), constraints formulated in terms of meaning postulates (or postulate systems) always satisfy (43): any two isomorphic models make the same sentences true. Secondly, if the constraints are jointly satisfiable at all (which they should be), the appropriate models always form a proper class; this is so because for any formal model M = (U, F) and any set U' of the same cardinality as U, there is an isomorphic model of the form M' = (U', F') – and hence any pure set whatsoever will be a member of some universe of an appropriate formal model.

One interesting consequence of (43) is that, no matter how closely a given system of constraints and/or meaning postulates may approximate Logical Space, it will never be able to pin down a specific model, let alone specific extensions of all expressions. In fact, given any term A (of some language L) and any pure set x, there will be an appropriate model Mx = (Ux, Fx) (for L) according to which [[A]]Mx = x; Mx may be constructed from any given appropriate model M = (U, F) by replacing [[A]]M with x (and, in case x∈U, simultaneously x with [[A]]M) – thereby preserving appropriateness, by (43).
This only reflects the strategy of having inhabitants of Logical Space represented by arbitrary set-theoretic objects and is thus hardly surprising. However, a similar line of thought does
give rise to interesting consequences for the relation between reference and truth. This is the gist of the following permutation argument, made famous by Putnam (1977, 1980) with predecessors including Newman (1928: 137ff), Jeffrey (1964: 82ff), Field (1975), and Wallace (1977); see Devitt (1983), Lewis (1984), Abbott (1997), Williams (2005: 89ff), and Button (2011) for critical discussion of its impact.

Given a model M for a language L, the extensions according to any isomorphic model M* may be characterised directly by an inductive specification in terms of M, without mention of M*. For the present purposes, it suffices to consider the special case in which M = (U, F) and M* = (U*, F*) are models for our extensional fragment of English and share their universe U = U*. More specifically, since any bijection π on U is a model isomorphism from M to a model M* = (U*, F*), the following induction characterises the extensions relative to M* entirely in terms of M:

(44) For any expression A of E and any formal model M = (U, F) for E, the permuted extension of A relative to M – //A//M – is determined by the following induction (on the grammatical complexity of A):
 (i-a) //and//M = λu. λv. u × v, where u∈{0,1} and v∈{0,1}
 …
 (ii-a) //every//M = λP. λQ. ⊢P ⊆ Q⊣, where P⊆U and Q⊆U
 …
 (iii) //A//M = πτ(A)(F(A)), if A∈NE
 (iv-a) //D N//M = //D//M(//N//M), if D N is a quantifier phrase, where D is a quantificational determiner and N is a count noun
 …

Obviously, //A//M = πτ(A)([[A]]M) = [[A]]M*, for any expression A of E. Moreover, there is a striking similarity between (44) and the specification (38) of the extensions relative to M. In fact, the only difference lies in clause (iii): according to (44), non-logical words are assigned the π-image of the U-extension they are assigned according to (38). The formulation of (44iii) may suggest that the values //A//M somehow depend on the permutation π when, of course, they are perfectly independent U-extensions in their own right – just like the values [[A]]M: any instantiation of either (38iii) or (44iii) will assign some specific U-extension to some specific expression. In particular, then, (44) is no more complicated or roundabout than (38); it is just different. Yet, as far as the specification of the truth values of sentences S of L is concerned, the two agree, since //S//M = πt([[S]]M) = [[S]]M. The construction (44) can also be carried out if M = Mw is a material model. Of course, in this case the permutation model M* cannot be expected to be a material model too, but then again it is not needed for the definition of the //A//M-values anyway. All that is needed is a permutation π of the domain of w. (44) will then deliver exactly the same truth valuation of L as (38). The comparison between (44) and (38) thus illustrates that truth is independent of reference in that the latter is not determined by the former. In particular, then, although the extensions of terms help determine the truth values of the sentences in which they occur, this rôle hopelessly underdetermines them:
if reference is merely contribution to truth, then reference is arbitrary; else, reference has to be grounded independently.
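The argument invites a small computational illustration. In the following sketch (again ours, with an invented two-predicate lexicon), a permutation of a toy universe shifts the extensions of the non-logical words as in (44iii), while the truth value of a sentence remains fixed:

type Entity = Int

-- A bijection π on the toy universe [1 .. 4]:
piE :: Entity -> Entity
piE x = case x of { 1 -> 2; 2 -> 1; 3 -> 4; 4 -> 3; _ -> x }

-- Original (e,t)-extensions F(A) of two non-logical words ...
boy, smokes :: [Entity]
boy    = [1, 2]
smokes = [1, 2, 3]

-- ... and their permuted counterparts πτ(A)(F(A)), as in (44iii).
boy', smokes' :: [Entity]
boy'    = map piE boy
smokes' = map piE smokes

every :: [Entity] -> [Entity] -> Bool
every p q = all (`elem` q) p

-- 'Every boy smokes' gets the same truth value either way, since
-- π is a bijection: P ⊆ Q iff π[P] ⊆ π[Q].
main :: IO ()
main = print (every boy smokes, every boy' smokes')  -- (True,True)

Reference has shifted (boy' differs from boy), yet truth has not; this is the sense in which the extensions of terms are hopelessly underdetermined by their contribution to truth values.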
3.2. Intensional Model Space

Beyond the realm of extensionality the strategy of approximating Logical Space in terms of formal models is bound to reach its limits. Even when restricted to the extensional part of a language L and constrained by meaning postulates or otherwise, Model Space is a proper class and thus cannot serve as the domain of any function. As a consequence, the set-theoretic reconstruction of intensional semantics cannot have intensional models assign L-intensions to expressions. However, this does not mean that formal models cannot be adapted to intensional languages. Rather, they would each have to come with their own set-theoretic reconstruction of Logical Space, consisting of an arbitrary (non-empty) set W representing the worlds and a system Û of arbitrary (non-empty) universes, i.e. one set per world (representation):

Definition
A formal ontology is a pair (W, Û), where W is a non-empty set (= the worlds according to (W, Û)) and Û is a function with domain W such that Û(w) is a non-empty set whenever w∈W (= the individuals of w, according to (W, Û)).

In the literature on modal logic, the requirement of non-emptiness is sometimes weakened, applying to the union of all domains rather than to each individual domain; cf. Fine (1977: 144). Given a formal ontology (W, Û), one may generalise U-extensions from extensional to arbitrary types. Due to the world-dependence of individual domains, these generalised extensions also depend on which element w of W is chosen. More precisely, (W, Û, w)-extensions may be defined by induction on (the complexity of) their types: (i) (W, Û, w)-extensions of type t are truth values; (ii) (W, Û, w)-extensions of type e are members of Û(w); (iii) whenever a and b are types, (W, Û, w)-extensions of type (a,b) are functions assigning (W, Û, w)-extensions of type b to all (W, Û, w)-extensions of type a; and (iv) (W, Û, w)-extensions of type (s,a) are functions ƒ with domain W assigning to any w'∈W a (W, Û, w')-extension of type a (where, again, a is a type). At first glance, this definition may appear gruesome, but closer inspection shows that it merely mimics (a straightforward generalisation of) the replacements defined in the previous part; we invite the reader to check this for her- or himself. (W, Û, w)-extensions give rise to (W, Û)-intensions of any type a as functions assigning a (W, Û, w)-extension of type a to any w∈W. We thus arrive at the following:

Definition
Given a language L, an intensional formal model (for L) is a quadruple M̂ = (W, Û, w*, F̂), where (W, Û) is a formal ontology; w* is a member of W (= the actual world according to M̂); and F̂ is a function which assigns to every non-logical lexical expression A of L a (W, Û)-intension of type τL(A) (= the lexical interpretation function according to M̂). Moreover, we say that an intensional formal model M̂ = (W, Û, w*, F̂) is based on the ontology (W, Û) and call the members of W and of ∪w∈W Û(w) the worlds and individuals
according to M̂, respectively. In an obvious sense, intensional formal models are to intensional material models what (extensional) formal models are to (extensional) material models, replacing any dubious entities by arbitrary set-theoretic constructions. And like the extensional ones, intensional formal models can be used to determine the extensions of arbitrary expressions. To see this, we may adapt procedure (38) to intensional formal models M̂ for a more inclusive English fragment Ê by adding pertinent conditions to determine extensions in the presence of non-extensional constructions. Since the latter make reference to the intensions of (at least some of) the expressions, the semantic recursion must specify the extensions of all expressions at all worlds (as already observed at the end of section 2):

(45) For any expression A of Ê, any intensional formal model M̂ = (W, Û, w*, F̂) for Ê, and any world w (according to M̂), the extension [[A]]M̂,w of A at w relative to M̂ is determined by the following induction:
 (i-a) [[and]]M̂,w = λu. λv. u × v, where u∈{0,1} and v∈{0,1}
 …
 (ii-a) [[every]]M̂,w = λP. λQ. ⊢P ⊆ Q⊣, where P ⊆ Û(w) and Q ⊆ Û(w)
 …
 (ii-c) [[necessarily]]M̂,w = λp. ⊢p = W⊣, where p ⊆ W
 …
 (iii) [[A]]M̂,w = F̂(A)(w), if A∈NÊ
 (iv-a) [[D N]]M̂,w = [[D]]M̂,w([[N]]M̂,w), if D N is a quantifier phrase, where D is a quantificational determiner and N is a count noun
 …
 (iv-c) [[V S]]M̂,w = [[V]]M̂,w(λw'. [[S]]M̂,w'), if V S is a predicate, where V is an attitude verb and S is a clausal complement
 …
 (iv-d) [[A S]]M̂,w = [[A]]M̂,w(λw'. [[S]]M̂,w'), if A S is a sentence, where A is a sentential adverb and S is a sentence
 …
According to (45), an intensional formal model M̂ = (W, Û, w*, F̂) for a language L assigns to each expression A of L both an extension [[A]]M̂,w* and an intension λw. [[A]]M̂,w, which we will write as '[[A]]M̂' and '∧[[A]]M̂', respectively. Routine calculations now show that the truth conditions of sentences relative to intensional models closely resemble their truth conditions in Logical Space. This may be illustrated by (46a–c), which give the truth conditions of the same attitude report with respect to a possible world w0 of Logical Space ℒ, arbitrary worlds w∈W, and the actual world w* of a model M̂ = (W, Û, w*, F̂), respectively:
(46) a. ‖Jane doubts that every boy fancies Mary‖w0 = 1 iff
 (‖Jane‖w0, {w'∈ℒ | ‖boy‖w' ⊆ {x∈Uw' | (x, ‖Mary‖w') ∈ ‖fancies‖w'}}) ∈ ‖doubts‖w0
 b. [[Jane doubts that every boy fancies Mary]]M̂,w = 1 iff
 ([[Jane]]M̂,w, {w'∈W | [[boy]]M̂,w' ⊆ {x∈Û(w') | (x, [[Mary]]M̂,w') ∈ [[fancies]]M̂,w'}}) ∈ [[doubts]]M̂,w
 c. [[Jane doubts that every boy fancies Mary]]M̂ = 1 iff
 ([[Jane]]M̂, {w'∈W | [[boy]]M̂,w' ⊆ {x∈Û(w') | (x, [[Mary]]M̂,w') ∈ [[fancies]]M̂,w'}}) ∈ [[doubts]]M̂
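The effect of the recursion (45) can again be made concrete in a small executable sketch. The following Haskell fragment is merely illustrative and rests on assumptions of our own (a four-element set of worlds, an invented lexical intension rainF); it mimics clause (45iii), the clause (45ii-c) for necessarily, and the combination (45iv-d):

type World = Int

worlds :: [World]          -- W, with 0 in the role of the actual world w*
worlds = [0 .. 3]

type Prop = World -> Bool  -- a (W, Û)-intension of type (s,t)

-- (45iii) in miniature: a lexical intension, evaluated at a world.
rainF :: Prop
rainF w = w < 3            -- an invented intension: true at worlds 0-2 only

-- (45ii-c): [[necessarily]] tests whether a proposition holds at
-- every world, i.e. whether p = W.
necessarily :: Prop -> Bool
necessarily p = all p worlds

-- (45iv-d): the adverb applies to the sentence's intension; the
-- evaluation world w is idle, as the calculation (47) below predicts.
necS :: Prop -> World -> Bool
necS s _w = necessarily s

main :: IO ()
main = print (rainF 0, necS rainF 0)  -- (True,False)

The contingent rainF is true at the actual world but fails to be necessary, in line with the notion of a contingent proposition discussed below.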
(45ii-c) gives the interpretation of a rather restrictive reading of the modal sentence adverb necessarily, according to which it expresses universal quantification over possible worlds. It is understood that τÊ(necessarily) = ((s,t),t) and that the modal adverb combines with sentences according to clause (iv-d). The following calculation shows that the truth value of a sentence introduced by necessarily does not depend on the world:

(47) [[Necessarily S]]M̂,w
 = [[necessarily]]M̂,w(λw'. [[S]]M̂,w')
 = [λp. ⊢p = W⊣](λw'. [[S]]M̂,w')
 = ⊢λw'. [[S]]M̂,w' = W⊣
which is indeed independent of w. These truth conditions reflect the peculiar reading (45ii-c) of necessarily as an unrestricted universal quantifier, which arguably reconstructs a logical or metaphysical construal of modality. Despite its limited linguistic value, this interpretational clause illustrates that logical words may come with extensions of non-extensional types, thus calling for a generalisation of the notion of logicality (as already observed at the end of section 2). As in the extensional case, we can give a characterisation in terms of replacements. Since extensions of non-extensional types take all possible worlds into account, replacements also have to act simultaneously on all worlds and domains of a given model:

Definition
If (W, Û) is a formal ontology, then a (W, Û)-replacement is a pair ρ = (ρs, ρe) of functions with domain W such that:
– ρs is a bijection on W;
– ρe(w) is a bijection of Û(w) onto Û(ρs(w)), whenever w∈W.

It ought to be noted that the second condition requires Û(w) and Û(ρs(w)) to have the same cardinality whenever w∈W. In parallel to replacements of U-extensions of extensional types, (W, Û)-replacements ρ = (ρs, ρe) may then be generalised to
(W, Û, w)-extensions of arbitrary types (for any w∈W); as in the extensional case, a (W, Û)-intension ƒ of some type a is invariant iff ρa(ƒ(w)) = ρ'a(ƒ(w)), for all (W, Û)-replacements ρ and ρ' and worlds w∈W such that ρs(w) = ρ's(w) – and intensional formal models M̂ = (W, Û, w*, F̂) are required to assign invariant intensions to logical words: ρτ(A)([[A]]M̂,w) = ρ'τ(A)([[A]]M̂,w), for any worlds w∈W and (W, Û)-replacements ρ and ρ' such that ρs(w) = ρ's(w). While this requirement reconstructs and extends the earlier approach to logical words in terms of replacements, it does not guarantee any homogeneity of their intensions across models: the same lexical item may be interpreted as conjunction in one model and as disjunction in another one – and still count as logical according to this definition. To rule out this possibility, a global notion of logicality is needed, which can be defined in terms of (intensional) model isomorphisms.
Definition
If M̂1 = (W1, Û1, w1, F̂1) and M̂2 = (W2, Û2, w2, F̂2) are intensional formal models for a language L, then a model isomorphism from M̂1 to M̂2 is a pair ρ = (ρs, ρe) of bijections from W1 to W2 and from ∪Û1 to ∪Û2, respectively, such that:
– ρs(w1) = w2;
– ρτ(A)(F̂1(A)(w)) = F̂2(A)(ρs(w)), whenever w∈W1 and A∈NL.
The second equation in this definition generalises the functions ρ = (ρs, ρe) to objects of arbitrary types. Omitting the obvious details, this is to be understood as parallel to the corresponding generalisation of replacements given further above. Aiming at a more global notion of logicality, the extensions of logical words A are required to be invariant across models: ρτ(A)([[A]]M̂1,w) = [[A]]M̂2,ρs(w), for any intensional formal models M̂1, worlds w (of M̂1), and model isomorphisms ρ to some (isomorphic) model M̂2. It is then readily seen that this global requirement implies that logical words are assigned invariant intensions in the earlier, local sense. The 'globalisation' of invariance is needed if the extensions of logical words are to be specified independently of the details of individual models. Readers may check for themselves that the extensions specified in (45ii) are indeed stable under model isomorphisms. The same goes for the syntactic operations interpreted in (45iv), which correspond to invariant intensions, combining the intensions of their parts. As a case in point, according to (45iv-c), embedding a that-clause under an attitude verb is interpreted as functional application, which itself corresponds to a function of type (((s,t),(e,t)),((s,t),(e,t))) – the identical mapping, which is certainly (globally) invariant.

With the definition of (global) invariance and the ensuing characterisation of logicality, the specification (45) of extensions relative to intensional formal models is complete. The rest of the technical apparatus laid out in the previous section carries over rather smoothly to the intensional case. In particular, one may now define pure intensional models as those intensional formal models M̂ = (W, Û, w*, F̂) that are based on pure sets W of worlds w associated with pure domains Û(w) – and are thus pure sets themselves. Collecting them all, we obtain Intensional Model Space. It is readily verified that each pure intensional model interprets the extensional part of the language in exactly the same way as a corresponding extensional model. It would thus seem that, quite generally, the effect of a meaning postulate S on (extensional) Model Space
may be achieved by restricting Intensional Model Space to those models M̂ for which [[S]]M̂ = 1. However, this would not guarantee the general validity of S given that, apart from its actual world w*, each such M̂ comes with its own logical space W of possible worlds. Hence even though a sentence S may be true at the actual world w* (and thus according to M̂), it need not be true throughout W. In other words, even if S is true, it may express a contingent proposition within M̂, i.e. one that is neither empty nor coincides with M̂'s logical space: [[S]]M̂,w* = 1, but ∅ ≠ {w∈W | [[S]]M̂,w = 1} ≠ W. As a consequence, in such models (for Ê) sentences of the form Necessarily S would come out false. To avoid this absurdity, a meaning postulate S should not only rule out intensional models according to which S is actually false, but all those according to which S is false at some world. Though this may be achieved by prefixing S with the modal adverb necessarily, a more principled, language-independent way to guarantee the intended effect of meaning postulates is to adapt the definition of validity to non-extensional languages:

Definition
Let S be a sentence of a language L, and let M̂ and K̂ be a pure intensional model with possible worlds W and a class of pure intensional models for L, respectively.
– S is valid in M̂ – in symbols: M̂ ⊨ S – iff ∧[[S]]M̂ = W.
– S is K̂-valid – in symbols: ⊨K̂ S – iff S is valid in every member of K̂.
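As a toy illustration of this definition (ours, with an invented two-verb lexicon over integer worlds and individuals), validity amounts to truth at every world of the model, so that a postulate in the spirit of (48) below can be checked as follows:

type World = Int
type Prop  = World -> Bool

-- M̂ ⊨ S iff ∧[[S]] = W, i.e. S holds at every world of the model.
valid :: [World] -> Prop -> Bool
valid ws s = all s ws

-- Invented world-dependent (e,t)-extensions for 'rises' and 'falls':
rises, falls :: World -> Int -> Bool
rises w x = even (w + x)
falls w x = odd  (w + x)

-- 'Nothing both rises and falls', checked as a validity rather than
-- as mere truth at the actual world:
postulateRF :: [World] -> [Int] -> Bool
postulateRF ws dom =
  valid ws (\w -> not (any (\x -> rises w x && falls w x) dom))

main :: IO ()
main = print (postulateRF [0 .. 3] [1 .. 5])  -- True for this toy lexicon

Checking validity rather than mere truth at the actual world is exactly what blocks the absurdity noted above.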
Like (extensional) Model Space, Intensional Model Space can be taken as a starting point for the set-theoretic reconstruction of Logical Space, gradually reducing the abundance of models by suitable constraints. To this end, meaning postulates employed in extensional semantics may be adapted in the way indicated, but they would have no effect on the interpretation of expressions of non-extensional types, for which additional postulates may be formulated. Thus, e.g., it has been argued that, due to the intensionality of their subject positions, certain verbs of change have extensions of type ((s,e),t); cf. Montague (1973), Löbner (1979), Lasersohn (2005). These analyses try to explain Partee's Paradox, i.e. the failure of the inference from The temperature is ninety and The temperature is rising to Ninety is rising. The incompatibility of rise and fall may then be captured by adopting the following sentence as a meaning postulate – or by a corresponding meta-linguistic constraint:

(48) Nothing both rises and falls.

Typically, however, the semantic relations between non-extensional expressions are beyond the expressive power of the object language. Hence, as in the extensional case, meta-linguistic constraints may be added if need be. The cardinality of Logical Space is a case in point:

(49) W is infinite.

Constraints like (49), and other ones concerning the class of possible worlds as a whole, have no analogues among the constraints on Model Space, because they do not rule
out particular worlds, but entire constellations of possible worlds instead. In order to guarantee the infinity of Logical Space, the constraints on Model Space would have to conspire in some way, but no single constraint on appropriate pure extensional models could express (49).

Like those applying to Model Space, the constraints and meaning postulates narrowing down Intensional Model Space may be seen as a means to approximate Logical Space by classes of models defined by appropriateness constraints. In the previous section we met a natural meta-constraint on such classes of pure (extensional) models, viz. that they be closed under isomorphism: constraints should only rule out models that do not represent possible constellations of facts, i.e. worlds; and since any two isomorphic models represent the same worlds, they stand and fall together. Closure under isomorphism is solely motivated by the fact that for the adequacy of set-theoretic representations, only set-theoretic structure counts. Clearly, then, closure under isomorphism equally applies to constraints ruling out pure intensional models. However, there is another, straightforward meta-constraint on narrowing down Intensional Model Space that does not have a counterpart in extensional Model Space. It is readily seen that the actual world of an intensional formal model M̂ never offers a reason for ruling out a particular model: a pure intensional model violates a meaning postulate just in case its variants do, i.e. those models that only differ from it in their actual world. This is so because meaning postulates only concern Logical Space as a whole and do not bear on actual facts. The same goes for meta-theoretic constraints like (49), which all preserve closure under variation: if an intensional formal model satisfies one of them, then so do all its variants. Clearly, this is no coincidence: two variants only differ in their actual world, and so ruling out one but not the other amounts to deciding which one assigns more appropriate extensions, i.e. truth values, referents, etc., while all that matters in semantics are truth conditions, reference conditions, etc. Hence, in addition to closure under isomorphism (50), classes K̂ of appropriate pure intensional models should also satisfy closure under variation (51):

(50) If M̂ ≅ M̂', then M̂∈K̂ iff M̂'∈K̂.
(51) If M̂ is a variant of M̂', then M̂∈K̂ iff M̂'∈K̂.

Meta-constraint (51) clearly brings out one difference between Intensional Model Space and extensional Model Space, where the very notion of a variant does not appear to make sense. However, though both may be construed as approximations to Logical Space, there is a more fundamental difference in approach. Each member of Ersatz Space – the goal of the approximation – is meant to represent some (i.e., at least one) member of Logical Space; so the constraints on Model Space serve solely to eliminate models that do not represent any genuine possibility in that they do not correspond to any member of Logical Space. The approximation process has reached its limit once the remaining class of models contains precisely the representations of all the worlds in Logical Space. In a sense, this is also true of Intensional Model Space. However, since each of its members comes with its own representation of Logical Space (as well as a representation of its actual world), this representation, too, must be appropriate.
Hence, in the case of Intensional Model Space, the approximation process only reaches its limit once the remaining intended models are precisely those based on ontologies that represent
Logical Space as a whole; obviously, by (50), this will also guarantee that each possible world of Logical Space is represented by (at least) one of the remaining models. The difference between Ersatz Space and the space of intended models comes out particularly clearly in discriminative languages. Though in both cases the goal of the approximation process is a mathematical structure that stands in a one-one relation to Logical Space, this structure manifests itself in totally different ways. In the extensional case, it is formed by bundling together pure extensional models into isomorphicity classes; at the end of approximation day, the class of all of them, Ersatz Space, stands in a one-one relation to Logical Space. In the intensional case, each model comes with a representation of Logical Space of its own; when all is said and done, each of these representations will be an isomorphic image of its archetype. Borrowing standard terminology from plural semantics, one may thus say that Ersatz Space approximates Logical Space collectively, as a whole, whereas the space of intended models does so distributively, in each of its members.

Meta-theoretic though it may seem, this difference between extensional and intensional model theory in the approach to Logical Space does have repercussions upon descriptive issues. To see this, let us follow Zimmermann (1999: 544ff) and consider the realm of sense relations. In many cases, it seems as if two given expressions do not stand in any interesting semantic relation, even though their extensions are of the same type. The following pairs are cases in point:

(52) a. teacher : smoker
 b. Mary is asleep : Jane pouts
 c. expensive : green

As a matter of fact, many teachers smoke and some don't, some smokers are teachers and many are not; but then one could certainly imagine that teachers never smoke, or that smokers cannot be teachers, or that only teachers smoke, or even that only smokers may become teachers. Similarly, Mary may be awake while Jane is pouting, but then again she may also sleep etc. In other words, the extensions of the respective expressions in (52) may relate to each other in any way possible – they are logically independent of each other. Certainly logical independence is a sense relation, holding between the expressions in virtue of their meaning. And even though it might not be a particularly thrilling one, if semantic theory strives for completeness, it should not miss it. As it turns out, Ersatz Space takes care of it without further ado: extensions vary across pure (extensional) models in any possible way (within their type), and hence for any distribution of these extensions and any relation between them, there will be a pure model representing this relation. This even goes for pairs like bachelor and married, before unwelcome models in which the two overlap are eliminated. Yet as long as the corresponding models are not accidentally eliminated, the full range of the distribution of the extensions of the pairs in (52) will be preserved. In that sense logical independence is captured by extensional Model Space without further ado. Not so for Intensional Model Space, where each member is equipped with its own version of Logical Space. To be sure, any conceivable relation between the extensions of the expressions under (52) will be represented by some pure intensional models extending corresponding pure extensional models.
Unfortunately, though, Intensional Model Space also allows for the intensions of these expressions to vary as widely as possible. In particular, there are pure intensional models according to which teacher and smoker have all kinds of weird extensions at any possible
world: disjoint, identical, empty, … Certainly these models would have to be eliminated by suitable constraints or postulates before Intensional Model Space can lay any claim to approximating Logical Space. Given the extreme frequency of such 'unmarked' cases as the ones under (52), this is certainly not a trivial task. From the latter point of view, the attempt is successful if Intensional Model Space has been narrowed down enough so that Logical Space may be conceived of as the common structure of the possible worlds across the class of remaining intended models; the more restrictive this class is, the closer this enterprise gets to a full reconstruction of Logical Space. From the realistic perspective, Intensional Model Space may be understood as a mathematical model of what is known about Logical Space, with (abridged) formal intensional models representing (epistemic) possibilities of what the structure of Logical Space may be. This assessment is based on Zimmermann (1999); cf. Lasersohn (2000: 87ff) for some criticism. A fuller account of reducing Logical Space to Intensional Model Space can be found in Menzel (1990).
3.3. Variants

We have seen that constraints on (extensional or intensional) model space may go well beyond what is expressible in the object language. However, they may be expressible in other languages, and particularly in the ones used in indirect interpretation. This technique, which goes back to Montague (1970), proceeds by assigning meanings to natural language expressions by translating them into logical formulae, which are themselves interpreted model-theoretically (a schematic sketch follows at the end of this subsection). As it turns out, formal languages of higher-order type logic are particularly suited for this purpose, since they allow for step-by-step translation procedures covering all expressions and sub-expressions of natural language, thereby inducing compositional meaning assignments; cf. Janssen (1983) for technical details. At the same time, these languages tend to be more expressive than the natural language sources (cf. 3.4). Consequently, they may be used to formulate more powerful restrictions on model space than the directly expressible meaning postulates considered above. Ample illustration of this – rather popular – technique of indirect meaning postulates may be found in Dowty (1979). As a case in point, the property of being a referentially transparent (or first-order reducible) transitive verb can be expressed in type logic but not in pertinent fragments of English; cf. Zimmermann (1985) for details.

Another respect in which traditional model-theoretic approaches to natural language semantics may diverge concerns the location of logical material. In the above sketch, logicality crops up as a feature of (i) certain lexical items (logical words) and (ii) the logical operations on meanings that correspond to the grammatical constructions combining syntactic material. The former are determined by the models, but the latter are assigned as part of a model-independent global compositional interpretation procedure. In principle, we could have made them part of the models too, though, thereby making more space for variations between models – as in the classical account of Montague (1970), where, however, logicality restrictions are not made explicit. An even more radical assimilation between (i) logical words and (ii) logical operations is obtained when the latter are represented by underlying 'functional' morphemes, which opens the possibility of keeping the logical combinations proper to a minimum. This approach is taken in LF-based type-driven interpretation, as made popular by Heim & Kratzer (1998).
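The two-step architecture of indirect interpretation may be sketched as follows; the fragment is a drastic simplification of our own devising (covering a single construction) and should not be mistaken for Montague's actual system:

-- A schematic sketch: translate first, interpret second.
data Formula = Every String String   -- 'every N V' in the logic

-- Step 1: translation of (analysed) English into the logic.
translate :: [String] -> Maybe Formula
translate ["every", n, v] = Just (Every n v)
translate _               = Nothing

-- Step 2: model-theoretic interpretation of the logic.
type Entity = Int
interpret :: (String -> [Entity]) -> Formula -> Bool
interpret f (Every n v) = all (`elem` f v) (f n)

lexF :: String -> [Entity]   -- an invented lexical interpretation
lexF "boy"    = [1, 2]
lexF "smokes" = [1, 2, 3]
lexF _        = []

main :: IO ()
main = print (fmap (interpret lexF) (translate (words "every boy smokes")))
-- Just True

The point of the detour is that constraints, and in particular meaning postulates, can then be stated on the intermediate logical terms, whose expressive power exceeds that of the translated fragment.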
3.4. Mathematical Model Theory

Model-theoretic interpretation originated in mathematical logic, where it is not used to approximate Logical Space but rather accounts for semantic variation among various (fragments of) formal languages; cf. Etchemendy (1990) for more on this difference in perspective, which incidentally does not affect any mathematical technicalities and results. Though a large part of the latter concerns certain varieties of first-order predicate logic, these results may have repercussions on the model-theoretic semantics of natural language, especially if they concern questions of expressiveness. The two most fundamental results of mathematical model theory – usually derived as corollaries to the completeness of first-order logic – are cases in point:

– According to the Compactness Theorem, a set Σ of (closed) first-order formulae can only imply a particular (closed first-order) formula if the latter already follows from a finite subset of Σ. As a consequence, no first-order formula can express that the universe is infinite (though it may imply this) – because such a formula would be implied by the set of (first-order expressible) sentences that say that there are at least n objects, but not by any of its finite subsets (the argument is spelled out below).
– According to the Löwenheim-Skolem Theorem, a set Σ of (closed) first-order formulae that is true in a model with an infinite universe is true in models with universes of arbitrary infinite cardinalities. In particular, no collection of first-order formulae can express that the universe has a particular infinite cardinality.

According to a fundamental result of abstract model theory due to Lindström (1969), the above two theorems characterise first-order logic in that any language exceeding its expressive power is bound to fail at least one of them. There is little doubt that natural languages have the resources to express infinity; if they can also be shown to be able to express, say, the countability of the universe, Lindström's Theorem would imply that the notion of validity in them cannot be axiomatised – cf. Ebbinghaus et al. (1994) for the technical background. The model-theoretic study of higher-order logics is comparatively less well developed, a major theme being the distinction between standard and non-standard models, the latter being introduced to restore axiomatisability – at the price of losing the kinds of idealisations in the set-theoretic construction of denotations mentioned and motivated in section 2; a survey of the most fundamental results of higher-order model theory can be found in van Benthem & Doets (1983).
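The compactness argument just mentioned can be spelled out in a few lines of standard notation (a routine reconstruction, not specific to this article). For each n, let φ≥n be the first-order sentence asserting the existence of at least n objects:

\varphi_{\ge n} \;:=\; \exists x_1 \cdots \exists x_n \bigwedge_{1 \le i < j \le n} x_i \neq x_j , \qquad \Sigma \;:=\; \{\varphi_{\ge n} \mid n \ge 2\}.

If some first-order sentence ψ expressed infinity, it would be implied by Σ; by compactness, ψ would then already follow from a finite subset Σ0 ⊆ Σ, and since Σ0 is satisfied in some sufficiently large finite model, ψ would wrongly come out true there.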
4. References

Abbott, Barbara 1997. Models, truth and semantics. Linguistics & Philosophy 20, 117–138.
Abbott, Barbara & Larry Hauser 1995. Realism, model theory, and linguistic semantics. Paper presented at the Annual Meeting of the Linguistic Society of America, New Orleans, January 8, 1995. http://cogprints.org/256/1/realism.htm. December 9, 2010.
Benthem, Johan van & Kees Doets 1983. Higher-order logic. In: D. M. Gabbay & F. Guenthner (eds.). Handbook of Philosophical Logic, vol. I. Dordrecht: Reidel, 275–329.
Button, Tim 2011. The metamathematics of Putnam's model-theoretic arguments. Erkenntnis 74, 321–349.
Carnap, Rudolf 1947. Meaning and Necessity. Chicago, IL: The University of Chicago Press.
Carnap, Rudolf 1952. Meaning postulates. Philosophical Studies 3, 65–73.
Casanovas, Enrique 2007. Logical operations and invariance. Journal of Philosophical Logic 36, 33–60.
Cresswell, Maxwell J. 1982. The autonomy of semantics. In: S. Peters & E. Saarinen (eds.). Processes, Beliefs, and Questions. Dordrecht: Kluwer, 69–86.
Devitt, Michael 1983. Realism and the renegade Putnam: A critical study of Meaning and the Moral Sciences. Noûs 17, 291–301.
Dowty, David 1979. Word Meaning and Montague Grammar. Dordrecht: Kluwer.
Ebbinghaus, Heinz-Dieter, Jörg Flum & Wolfgang Thomas 1994. Mathematical Logic. 2nd edn. Berlin: de Gruyter.
Etchemendy, John 1990. The Concept of Logical Consequence. Cambridge, MA: The MIT Press.
Field, Hartry 1975. Conventionalism and instrumentalism in semantics. Noûs 9, 375–405.
Fine, Kit 1977. Properties, propositions, and sets. Journal of Philosophical Logic 6, 135–191.
Frege, Gottlob 1884/1950. Die Grundlagen der Arithmetik: eine logisch-mathematische Untersuchung über den Begriff der Zahl. Breslau: W. Koebner, 1884. English translation: J. Austin. The Foundations of Arithmetic: A Logico-mathematical Enquiry into the Concept of Number. 1st edn. Oxford: Blackwell, 1950.
Frege, Gottlob 1892/1980. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25–50. English translation in: P. Geach & M. Black (eds.). Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980, 56–78.
Gazdar, Gerald, Ewan Klein, Geoffrey Pullum & Ivan A. Sag 1985. Generalized Phrase Structure Grammar. Cambridge, MA: The MIT Press.
Heim, Irene & Angelika Kratzer 1998. Semantics in Generative Grammar. Oxford: Blackwell.
Hodges, Wilfried 2001. Formal features of compositionality. Journal of Logic, Language and Information 10, 7–28.
Janssen, Theo M. V. 1983. Foundations and Applications of Montague Grammar. Doctoral dissertation. University of Amsterdam.
Jeffrey, Richard J. 1964. Review of sections IV–V of E. Nagel et al. (eds.). Logic, Methodology, and Philosophy of Science (Stanford, CA 1962). Journal of Philosophy 61, 79–88.
Kripke, Saul A. 1972. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Kluwer, 253–355.
Kupffer, Manfred 2007. Contextuality as supervenience. In: M. Aloni, P. Dekker & F. Roelofsen (eds.). Proceedings of the Sixteenth Amsterdam Colloquium. Amsterdam: ILLC, 139–144. http://www.illc.uva.nl/AC2007/uploaded_files/proceedings-AC07.pdf. December 9, 2010.
Lasersohn, Peter 2000. Same, models and representation. In: B. Jackson & T. Matthews (eds.). Proceedings from Semantics and Linguistic Theory (= SALT) X. Ithaca, NY: Cornell University, 83–97.
Lasersohn, Peter 2005. The temperature paradox as evidence for a presuppositional analysis of definite descriptions. Linguistic Inquiry 36, 127–134.
Lewis, David K. 1984. Putnam's paradox. Australasian Journal of Philosophy 62, 221–236.
Lindenbaum, Adolf & Alfred Tarski 1935/1983. Über die Beschränktheit der Ausdrucksmittel deduktiver Theorien. Ergebnisse eines mathematischen Kolloquiums 7, 15–22. English translation in: J. Corcoran (ed.). Logic, Semantics, Metamathematics. 2nd edn. Indianapolis, IN: Hackett Publishing Company, 1983, 384–392.
Lindström, Per 1969. On extensions of elementary logic. Theoria 35, 1–11.
Löbner, Sebastian 1979. Intensionale Verben und Funktionalbegriffe. Tübingen: Niemeyer.
Macbeath, Murray 1982. "Who was Dr Who's father?". Synthese 51, 397–430.
MacFarlane, John 2008. Logical constants. In: E. Zalta (ed.). The Stanford Encyclopedia of Philosophy (Fall 2008 Edition). Online publication. http://plato.stanford.edu/entries/logical-constants. December 9, 2010.
Machover, Moshé 1994. Review of G. Sher. The Bounds of Logic. A Generalized Viewpoint (Cambridge, MA 1991). British Journal for the Philosophy of Science 45, 1078–1083.
Mendelson, Elliott 1997. An Introduction to Mathematical Logic. 4th edn. London: Chapman & Hall.
Menzel, Christopher 1990. Actualism, ontological commitment, and possible world semantics. Synthese 85, 355–389.
Montague, Richard 1970. Universal grammar. Theoria 36, 373–398.
Montague, Richard 1973. The proper treatment of quantification in ordinary English. In: J. Hintikka, J. M. E. Moravcsik & P. Suppes (eds.). Approaches to Natural Language. Dordrecht: Kluwer, 221–242.
Newman, Maxwell H. A. 1928. Mr. Russell's 'causal theory of perception'. Mind 37, 137–148.
Putnam, Hilary 1977. Realism and reason. Proceedings of the American Philosophical Association 50, 483–498.
Putnam, Hilary 1980. Models and reality. Journal of Symbolic Logic 45, 464–482.
Rabinowicz, Wlodek 1979. Universalizability. A Study in Morals and Metaphysics. Dordrecht: Reidel.
Wallace, John 1977. Only in the context of a sentence do words have any meaning. Midwest Studies in Philosophy 2, 144–164.
Williams, John R. G. 2005. The Inscrutability of Reference. Doctoral dissertation. University of St Andrews.
Wittgenstein, Ludwig 1922. Tractatus logico-philosophicus. Logisch-philosophische Abhandlung. London: Kegan Paul, Trench, Trubner & Co.
Zimmermann, Thomas E. 1985. A note on transparency postulates. Journal of Semantics 4, 67–77.
Zimmermann, Thomas E. 1999. Meaning postulates and the model-theoretic approach to natural language semantics. Linguistics & Philosophy 22, 529–561.
Thomas Ede Zimmermann, Frankfurt/Main (Germany)
34. Event semantics

1. Introduction
2. Davidsonian event semantics
3. The Neo-Davidsonian turn
4. The stage-level/individual-level distinction
5. Reconsidering states
6. Psycholinguistic studies
7. Conclusion
8. References
Abstract

Since entering the linguistic stage in the late sixties, Davidsonian event semantics has taken on an important role in linguistic theorizing. Davidson's (1967) central claim is that events are spatiotemporal things, i.e., concrete particulars with a location in space and time. This enrichment of the underlying ontology has proven to be of great benefit in explaining numerous combinatorial and inferential properties of natural language expressions. This article will trace the motivation, development, and applications of event semantics during the past decades and provide a picture of current views on the role of events in natural language meaning. Besides introducing the classical Davidsonian paradigm and providing an ontological characterization of events, the article discusses the Neo-Davidsonian turn with its broader perspective on eventualities and the use of thematic roles and/or decompositional approaches. Further topics are the stage-level/individual-level distinction, the somewhat murky category of states and some results of recent psycholinguistic studies that have tested the insights of Davidsonian event semantics.
1. Introduction

Since entering the linguistic stage in the late sixties, Davidsonian event semantics has taken on an important role in linguistic theorizing. The central claim of Donald Davidson's seminal (1967) work "The logical form of action sentences" is that events are spatiotemporal things, i.e., concrete particulars with a location in space and time. This enrichment of the underlying ontology has proven to be of great benefit in explaining numerous combinatorial and inferential properties of natural language expressions. Most prominent among the many remarkable advances achieved within the Davidsonian paradigm since then have been the progress made in the theoretical description of verb semantics, including tense and aspect, and the breakthrough in analyzing adverbial modification. Numerous monographs and collections attest to the extraordinary fruitfulness of the Davidsonian program; see, e.g., Rothstein (1998), Tenny & Pustejovsky (2000), Higginbotham, Pianesi & Varzi (2000), Lang, Maienborn & Fabricius-Hansen (2003), Austin, Engelberg & Rauh (2004), Maienborn & Wöllstein (2005), Dölling, Heyde-Zybatov & Schäfer (2008), to mention just a few collections from the last decade.

In the course of the evolution of the Davidsonian paradigm, two moves have turned out to be particularly influential in terms of expanding and giving new direction to this overall approach. These are, first, the "Neo-Davidsonian turn" initiated by Higginbotham (1985, 2000) and Parsons (1990, 2000), and, secondly, Kratzer's (1995) merger of event semantics with the stage-level/individual-level distinction. The Neo-Davidsonian approach has lately developed into a kind of standard for event semantics. It is basically characterized by two largely independent assumptions. The first assumption concerns the arity of verbal predicates. While Davidson introduced event arguments as an additional argument of (some) verbs, Neo-Davidsonian accounts take the event argument of a verbal predicate to be its only argument. The relation between events and their participants is accounted for by the use of thematic roles. The second Neo-Davidsonian assumption concerns the distribution of event arguments: they are considered to be much more widespread than originally envisaged by Davidson. That is, Neo-Davidsonian approaches typically assume that it is not only (action) verbs that introduce Davidsonian event arguments, but also adjectives, nouns, and prepositions. Thus, event arguments are nowadays widely seen as a trademark for predicates in general. For this broader notion of events, which includes, besides events proper, i.e., accomplishments and achievements in Vendler's (1967) terms, also processes and states, Bach (1986) coined the term "eventuality".

The second milestone in the development of the Davidsonian program is Kratzer's (1995) event semantic treatment of the so-called stage-level/individual-level distinction, which goes back to Carlson (1977) and, as a precursor, Milsark (1974, 1977). Roughly speaking, stage-level predicates (SLPs) express temporary or accidental properties, whereas individual-level predicates (ILPs) express (more or less) permanent or inherent properties. On Kratzer's (1995) account, the SLP/ILP-distinction basically boils down to the presence or absence of an extra event argument. Stage-level predicates are taken to have such an additional event argument, while individual-level predicates lack it.
This difference in argument structure is then exploited syntactically by the assumption, e.g., of different subject positions for SLPs and ILPs; see Diesing (1992). Since then, interest has been directed towards the role of event arguments at the syntax/semantics interface.
These developments are accompanied by a newly found interest in the linguistic and ontological foundation of events. To the extent that more attention is paid to less typical events than the classical "Jones buttering a toast" or "Brutus stabbing Caesar", which always come to the Davidsonian semanticist's mind first, there is a growing awareness of the vagueness and incongruities lurking behind the notion of events and its use in linguistic theorizing. A particularly controversial case in point is the status of states. The question of whether state expressions can be given a Davidsonian treatment analogous to process and event expressions (in the narrow sense) is still open to debate. All in all, Davidsonian event arguments have become a very familiar "all-purpose" linguistic instrument over the past decades, and recent years have seen a continual extension of possible applications far beyond the initial focus on verb semantics and adverbials, including also a growing body of psycholinguistic studies that aim to investigate the role of events in natural language representation and processing.

This article will trace the motivation, development, and applications of event semantics during the past decades and provide a picture of current views on the role of events in natural language meaning. Section 2 introduces the classical Davidsonian paradigm, providing an overview of its motivation and some classical and current applications, as well as an ontological characterization of events and their linguistic diagnostics. Section 3 discusses the Neo-Davidsonian turn with its broader perspective on eventualities and the use of thematic roles. This section also includes some notes on decompositional approaches to event semantics. Section 4 turns to the stage-level/individual-level distinction, outlining the basic linguistic phenomena that are grouped together under this label and discussing the event semantic treatments that have been proposed as well as the criticism they have received. Section 5 returns to ontological matters by reconsidering the category of states and asking whether indeed all of them, in particular the referents introduced by so-called "statives", fulfill the criteria for Davidsonian eventualities. And, finally, section 6 presents some experimental results of recent psycholinguistic studies that have tested the insights of Davidsonian event semantics. The article concludes with some final remarks in section 7.
2. Davidsonian event semantics

2.1. Motivation and applications

On the standard view in pre-Davidsonian times, a transitive verb such as to butter in (1a) would be conceived of as introducing a relation between the subject Jones and the direct object the toast, thus yielding the logical form (1b).

(1) a. Jones buttered the toast.
 b. butter (jones, the toast)

The only individuals that sentence (1a) talks about according to (1b) are Jones and the toast. As Davidson (1967) points out, such a representation does not allow us to refer explicitly to the action described by the sentence and specify it further by adding, e.g., that Jones did it slowly, deliberately, with a knife, in the bathroom, at midnight. What, asks Davidson, does it refer to in such a continuation? His answer is that action verbs introduce an additional hidden event argument that stands for the action proper. Under
this perspective, a transitive verb introduces a three-place relation holding between the subject, the direct object and an event argument. Davidson's proposal thus amounts to replacing (1b) with the logical form in (1c).
(1) c.
This move paves the way for a straightforward analysis of adverbial modification. If verbs introduce a hidden event argument, then standard adverbial modifiers may be simply analyzed as first-order predicates that add information about this event; cf. article 54 (Maienborn & Schäfer) Adverbs and adverbials on the problems of alternative analyses and further details of the Davidsonian approach to adverbial modification. Thus, Davidson’s classical sentence (2a) takes the logical form (2b). (2) a. Jones buttered the toast in the bathroom with the knife at midnight. b. ᭚e [butter (jones, the toast, e) & in (e, the bathroom) & instr (e, the knife) & at (e, midnight)] According to (2b), sentence (2a) expresses that there was an event of Jones buttering the toast, and this event was located in the bathroom. In addition, it was performed by using a knife as an instrument, and it took place at midnight. Thus, the verb’s hidden event argument provides a suitable target for adverbial modifiers. As Davidson points out, this allows adverbial modifiers to be treated analogously to adnominal modifiers: Both target the referential argument of their verbal or nominal host. Adverbial modification is thus seen to be logically on a par with adjectival modification: what adverbial clauses modify is not verbs but the events that certain verbs introduce. Davidson (1969/1980: 167)
One of the major advances achieved through the analysis of adverbial modifiers as first-order predicates on the verb’s event argument is its straightforward account of the characteristic entailment patterns of sentences with adverbial modifiers. For instance, we want to be able to infer from (2a) the truth of the sentences in (3). On a Davidsonian account this follows directly from the logical form (2b) by virtue of the logical rule of simplification; cf. (3'). See, e.g., Eckardt (1998, 2002) on the difficulties that these entailment patterns pose for a classical operator approach to adverbials such as advocated by Thomason & Stalnaker (1973), see also article 54 (Maienborn & Schäfer) Adverbs and adverbials. (3) a. b. c. d. e.
Jones buttered the toast in the bathroom at midnight. Jones buttered the toast in the bathroom. Jones buttered the toast at midnight. Jones butterd the toast with the knife. Jones buttered the toast.
(3') a. ᭚e [butter (jones, the toast, e) & in (e, the bathroom) & at (e, midnight)] b. ᭚e [butter (jones, the toast, e) & in (e, the bathroom)] c. ᭚e [butter (jones, the toast, e) & at (e, midnight)]
805
806
VII. Theories of sentence semantics d. ᭚e [butter (jones, the toast, e) & instr (e, the knife)] e. ᭚e [butter (jones, the toast, e)] Further evidence for the existence of hidden event arguments can be adduced from anaphoricity, quantification and definite descriptions among others: Having introduced event arguments, the anaphoric pronoun it in (4) may now straightforwardly be analyzed as referring back to a previously mentioned event, just like other anaphoric expressions take up object referents and the like. (4) It happened silently and in complete darkness. Hidden event arguments also provide suitable targets for numerals and frequency adverbs as in (5). (5) a. Anna has read the letter three times/many times. b. Anna has often/seldom/never read the letter. Krifka (1990) shows that nominal measure expressions may also be used as a means of measuring the event referent introduced by the verb. Krifka’s example (6) has a reading which does not imply that there were necessarily 4000 ships that passed through the lock in the given time span but that there were 4000 passing events of maybe just one single ship. That is, what is counted by the nominal numeral in this reading are passing events rather than ships. (6) 4000 ships passed through the lock last year. Finally, events may also serve as referents for definite descriptions as in (7). (7) a. the fall of the Berlin Wall b. the buttering of the toast c. the sunrise See, e.g., Bierwisch (1989), Grimshaw (1990), Zucchi (1993), Ehrich & Rapp (2000), Rapp (2007) for event semantic treatments of nominalizations; cf. also article 51 (Grimshaw) Deverbal nominalization. Engelberg (2000: 100ff) offers an overview of the phenomena for which event-based analyses have been proposed since Davidson’s insight was taken up and developed further in linguistics. The overall conclusion that Davidson invites us to draw from all these linguistic data is that events are things in the real world like objects; they can be counted, they can be anaphorically referred to, they can be located in space and time, they can be ascribed further properties. All this indicates that the world, as we conceive of it and talk about it, is apparently populated by such things as events.
2.2. Ontological properties and linguistic diagnostics

Semantic research over the past decades has provided impressive confirmation of Davidson's (1969/1980: 137) claim that "there is a lot of language we can make systematic sense of if we suppose events exist". But, with Quine's dictum "No entity without identity!" in mind, we have to ask:
What kind of things are events? What are their identity criteria? And how are their ontological properties reflected through linguistic structure? None of these questions has received a definitive answer so far, and many versions of the Davidsonian approach have been proposed, with major and minor differences between them. Focussing on the commonalities behind these differences, it still seems safe to say that there is at least one core assumption in the Davidsonian approach that is shared more or less explicitly by most scholars working in this paradigm. This is that eventualities are, first and foremost, particular spatiotemporal entities in the world. As LePore (1985: 151) puts it, "[Davidson's] central claim is that events are concrete particulars – that is, unrepeatable entities with a location in space and time."

As the past decades' discussion of this issue has shown (see, e.g., the overviews in Lombard 1998, Engelberg 2000, and Pianesi & Varzi 2000), it is nevertheless notoriously difficult to turn the above ontological outline into precise identity criteria for eventualities. For illustration, I will mention just two prominent attempts. Lemmon (1967) suggests that two events are identical just in case they occupy the same portion of space and time. This notion of events seems much too coarse-grained, at least for linguistic purposes, since any two events that just happen to coincide in space and time would, on this account, be identical. To take Davidson's (1969/1980: 178) example, we wouldn't be able to distinguish the event of a metal ball rotating around its own axis during a certain time from an event of the metal ball becoming warmer during the very same time span. Note that we could say that the metal ball is slowly becoming warmer while it is rotating quickly, without expressing a contradiction. This indicates that we are dealing with two separate events that coincide in space and time.

Parsons (1990), on the other hand, attempts to establish genuinely linguistic identity criteria for events: "When a verb-modifier appears truly in one source and falsely in another, the events cannot be identical." (Parsons 1990: 157). This, by contrast, yields a notion of events that is too fine-grained; see, e.g., the criticism by Eckardt (1998: § 3.1) and Engelberg (2000: 221–225). What we are still missing, then, are ontological criteria of the appropriate grain for identifying events. This is the conclusion Pianesi & Varzi (2000) arrive at in their discussion of the ontological nature of events:

[…] the idea that events are spatiotemporal particulars whose identity criteria are moderately thin […] has found many advocates both in the philosophical and in the linguistic literature. […] But they all share with Davidson's the hope for a 'middle ground' account of the number of particular events that may simultaneously occur in the same place.
Pianesi & Varzi (2000: 555)
We can conclude, then, that the search for ontological criteria for identifying events will probably continue for some time. In the meantime, linguistic research will have to build on a working definition that is up to the demands of natural language analysis. What might also be crucial for our notion of events (besides their spatial and temporal dimensions) is their inherently relational character. Authors like Parsons (1990), Carlson (1998), Eckardt (1998), and Asher (2000) have argued that events necessarily involve participants serving some function. In fact, the ability of Davidsonian analyses to make explicit the relationship between events and their participants, either via thematic roles or by some kind of decomposition (see sections 3.2 and 3.3 below), is certainly one of the major reasons among linguists for the continuing popularity of such analyses. This
feature of Davidsonian analyses is captured by the statement in (8), which I will adopt as a working definition for the subsequent discussion; cf. Maienborn (2005a).

(8) Davidsonian notion of events:
Events are particular spatiotemporal entities with functionally integrated participants.

(8) may be taken to be the core assumption of the Davidsonian paradigm. Several ontological properties follow from it. As spatiotemporal entities in the world, events can be perceived, and they have a location in space and time. In addition, given the functional integration of participants, events can vary in the way that they are realized. These properties are summarized in (9):

(9) Ontological properties of events:
a. Events are perceptible.
b. Events can be located in space and time.
c. Events can vary in the way that they are realized.

The properties in (9) can, in turn, be used to derive well-known linguistic event diagnostics:

(10) Linguistic diagnostics for events:
a. Event expressions can serve as infinitival complements of perception verbs.
b. Event expressions combine with locative and temporal modifiers.
c. Event expressions combine with manner adverbials, comitatives, etc.

The diagnostics in (10) provide a way to detect hidden event arguments. As shown by Higginbotham (1983), perception verbs with infinitival complements are a means of expressing direct event perception and thus provide a suitable test context for event expressions; cf. also Eckardt (2002). A sentence such as (11a) expresses that Anna perceived the event of Heidi cutting the roses. This does not imply that Anna was necessarily aware of, e.g., who was performing the action; see the continuation in (11b). Sentence (11c), on the other hand, does not express direct event perception but rather fact perception. Whatever it was that Anna perceived, it made her conclude that Heidi was cutting the roses. A continuation along the lines of (11b) is not allowed here; cf. Bayer (1986) on what he calls the epistemic neutrality of event perception vs. the epistemic load of fact perception.

(11) a. Anna saw Heidi cut the roses.
b. Anna saw Heidi cut the roses, but she didn't recognize that it was Heidi who cut the roses.
c. Anna saw that Heidi was cutting the roses (*but she didn't recognize that it was Heidi who cut the roses).

On the basis of the ontological properties of events spelled out in (9b) and (9c), we also expect event expressions to combine with locative and temporal modifiers as well as with manner adverbials, instrumentals, comitatives and the like – that is, modifiers that elaborate on the internal functional set-up of events.
This was already illustrated by our sentence (2); see article 54 (Maienborn & Schäfer) Adverbs and adverbials for details on the contribution of manner adverbials and similar expressions that target the internal structure of events.

This is, in a nutshell, the Davidsonian view shared (explicitly or implicitly) by current event-based approaches. The diagnostics in (10) provide a suitable tool for detecting hidden event arguments and may therefore help us to assess the Neo-Davidsonian claim that event arguments are not confined to action verbs but have many further sources, to which we will turn next.
3. The Neo-Davidsonian turn

3.1. The notion of eventualities

Soon after they took the linguistic stage, it became clear that event arguments were not to be understood as confined to the class of action verbs, as Davidson originally proposed, but were likely to have a much wider distribution. A guiding assumption of what has been called the Neo-Davidsonian paradigm, developed particularly by Higginbotham (1985, 2000) and Parsons (1990, 2000), is that any verbal predicate may have such a hidden Davidsonian argument, as illustrated by the following quotations from Higginbotham (1985) and Chierchia (1995).

The position E corresponds to the 'hidden' argument place for events, originally suggested by Donald Davidson (1967). There seem to be strong arguments in favour of, and little to be said against, extending Davidson's idea to verbs other than verbs of change or action. Under this extension, statives will also have E-positions.
Higginbotham (1985: 10)

A basic assumption I am making is that every VP, whatever its internal structure and aspectual characteristics, has an extra argument position for eventualities, in the spirit of Davidson's proposal. […] In a way, having this extra argument slot is part of what makes something a VP, whatever its inner structure.
Chierchia (1995: 204)
Note that already some of the first commentators on Davidson’s proposal took a similarly broad view on the possible sources for Davidson’s extra argument. For instance, Kim (1969: 204) notes: “When we talk of explaining an event, we are not excluding what, in a narrower sense of the term, is not an event but rather a state or a process.” So it was only natural to extend Davidson’s original proposal and combine it with Vendler’s (1967) classification of situation types into states, activities, accomplishments and achievements. In fact, the continuing strength and attractiveness of the overall Davidsonian enterprise for contemporary linguistics rests to a great extent on the combination of these two congenial insights: Davidson’s introduction of an ontological category of events present in linguistic structure, and Vendler’s subclassification of different situation types according to the temporal-aspectual properties of the respective verb phrases; cf., e.g., Piñón (1997), Engelberg (2002), Sæbø (2006) for some more recent event semantic studies on the lexical and/or aspectual properties of certain verb classes.
The definition and delineation of events (comprising Vendler's accomplishments and achievements), processes (activities in Vendler's terms) and states has been an extensively discussed and highly controversial topic of studies particularly on tense and aspect. The reader is referred to the articles 48 (Filip) Aspectual class and Aktionsart and 97 (Smith) Tense and aspect. For our present purposes the following brief remarks shall suffice:

First, a terminological note: The notion "event" is often understood in a broad sense, i.e. as covering, besides events in a narrow sense, processes and states as well. Bach (1986) has introduced the term "eventuality" for this broader notion of events. In the remainder of this article I will stick to speaking of events in a broad sense unless explicitly indicated otherwise. Other labels for an additional Davidsonian event argument that can be found in the literature include "spatiotemporal location" (e.g. Kratzer 1995) and "Davidsonian argument" (e.g. Chierchia 1995).

Secondly, events (in a narrow sense), processes, and states may be characterized in terms of dynamicity and telicity. Events and processes are dynamic eventualities, states are static. Furthermore, events have an inherent culmination point, i.e., they are telic, whereas processes and states, being atelic, have no such inherent culmination point; see Krifka (1989, 1992, 1998) for a mereological characterization of events and cf. also Dowty (1979), Rothstein (2004).

Finally, accomplishments and achievements, the two subtypes of events in a narrow sense, differ wrt. their temporal extension. Whereas accomplishments such as those expressed by read the book, eat one pound of cherries, run the 100m final have a temporal extension, achievements such as reach the summit, find the solution, win the 100m final are momentary changes of state with no temporal duration. See Kennedy & Levin (2008) on so-called degree achievements expressed by verbs like to lengthen, to cool, etc. The variable aspectual behavior of these verbs – atelic (permitting the combination with a for-PP) or telic (permitting the combination with an in-PP) – is explained in terms of the relation between the event structure and the scalar structure of the base adjective; cf. (12).

(12) a. The soup cooled for 10 minutes. (atelic)
b. The soup cooled in 10 minutes. (telic)
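The telicity notions appealed to here can be given precise mereological content; the following definitions are a condensed rendering of Krifka's quantization and cumulativity (where < is proper parthood and ⊕ is mereological sum):

P is quantized (telic) iff ∀e ∀e' [P(e) & P(e') → ¬ e' < e]
P is cumulative (atelic) iff ∀e ∀e' [P(e) & P(e') → P(e ⊕ e')]

Thus no proper part of an event of reading the book is itself an event of reading the book, whereas the sum of two waiting events is again a waiting event.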
Turning back to the potential sources for Davidsonian event arguments, in more recent times not only verbs, whether eventive or stative, have been taken to introduce an additional argument, but other lexical categories as well, such as adjectives, nouns and also prepositions. Motivation for this move comes from the observation that all predicative categories provide basically the same kind of empirical evidence that motivated Davidson's proposal and thus call for a broader application of the Davidsonian analysis. The following remarks from Higginbotham & Ramchand (1997) are typical of this view:

Once we assume that predicates (or their verbal, etc. heads) have a position for events, taking the many consequences that stem therefrom, as outlined in publications originating with Donald Davidson (1967), and further applied in Higginbotham (1985, 1989), and Terence Parsons (1990), we are not in a position to deny an event-position to any predicate; for the evidence for, and applications of, the assumption are the same for all predicates.
Higginbotham & Ramchand (1997: 54)
As these remarks indicate, nowadays Neo-Davidsonian approaches often take event arguments to be a trademark not only of verbs but of predicates in general. We will come back to this issue in section 5 when we reconsider the category of states.
3.2. Events and thematic roles

The second core assumption of Neo-Davidsonian accounts, besides assuming a broader distribution of event arguments, concerns the way of relating the event argument to the predicate and its regular arguments. While Davidson (1967) introduced the event argument as an additional argument of the verbal predicate, thereby augmenting its arity, Neo-Davidsonian accounts use the notion of thematic roles for linking an event to its participants. Thus, the Neo-Davidsonian version of Davidson's logical form in (2b) for the classical sentence (2a), repeated here as (13a/b), takes the form in (13c).

(13) a. Jones buttered the toast in the bathroom with the knife at midnight.
b. ∃e [butter (jones, the toast, e) & in (e, the bathroom) & instr (e, the knife) & at (e, midnight)]
c. ∃e [butter (e) & agent (e, jones) & patient (e, the toast) & in (e, the bathroom) & instr (e, the knife) & at (e, midnight)]

On a Neo-Davidsonian view, all verbs are uniformly one-place predicates ranging over events. The verb's regular arguments are introduced via thematic roles such as agent, patient, experiencer, etc., which express binary relations holding between events and their participants; cf. article 18 (Davis) Thematic roles for details on the nature, inventory and hierarchy of thematic roles. Note that due to this move of separating the verbal predicate from its arguments and adding them as independent conjuncts, Neo-Davidsonian accounts give up to some extent the distinction between arguments and modifiers. At least it isn't possible anymore to read off the number of arguments a verb has from the logical representation. While Davidson's notation in (13b) conserves the argument/modifier distinction by reserving the use of thematic roles for the integration of circumstantial modifiers, the Neo-Davidsonian notation (13c) uses thematic roles both for arguments such as the agent Jones as well as for modifiers such as the instrumental the knife; see Parsons (1990: 96ff) for motivation and defense and Bierwisch (2005) for some criticism on this point.
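The stepwise composition of such a logical form can be sketched as follows (an illustration of the format, not a fixed compositional doctrine; in particular, whether the thematic role predicates are contributed by the verb itself or by separate syntactic heads is left open here, and tense is again ignored):

butter: λe [butter (e)]
butter the toast: λe [butter (e) & patient (e, the toast)]
Jones butter(ed) the toast: λe [butter (e) & agent (e, jones) & patient (e, the toast)]
adding the modifiers plus existential closure: ∃e [butter (e) & agent (e, jones) & patient (e, the toast) & in (e, the bathroom) & instr (e, the knife) & at (e, midnight)]

On this format every conjunct, whether argument-related or modifier-related, is added in the same way, which is precisely why the argument/modifier distinction is no longer legible from the logical form.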
3.3. Decompositional event semantics

The overall Neo-Davidsonian approach is also compatible with adopting a decompositional perspective on the semantics of lexical items, particularly of verbs; cf. articles 16 (Bierwisch) Semantic features and primes and 17 (Engelberg) Frameworks of decomposition. Besides a standard lexical entry for a transitive verb such as to close in (14a), which translates the verbal meaning into a one-place predicate close on events, one might also choose to decompose the verbal meaning into more basic semantic predicates like the classical cause, become, etc.; cf. Dowty (1979). A somewhat simplified version of Parsons' "subatomic" approach is given in (14b); cf. Parsons (1990: 120).

(14) a. to close: λy λx λe [close (e) & agent (e, x) & theme (e, y)]
b. to close: λy λx λe [agent (e, x) & theme (e, y) & ∃e' [cause (e, e') & theme (e', y) & ∃s [become (e', s) & closed (s) & theme (s, y)]]]
According to (14b) the transitive verb to close expresses an action e taken by an agent x on a theme y which causes an event e' of y changing into a state s of being closed. On this account a causative verb introduces not one hidden event argument but three.
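For illustration, applying the non-decomposed entry (14a) to a theme and an agent and existentially binding the event variable derives the logical form of a simple sentence; the derivation below uses the hypothetical example Anna closed the door:

(14a) applied to the door, then to anna: λe [close (e) & agent (e, anna) & theme (e, the door)]
existential closure: ∃e [close (e) & agent (e, anna) & theme (e, the door)]

The decomposed entry (14b) composes in exactly the same way, differing only in the internal event structure it attributes to the verb.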
See also Pustejovsky (1991, 1995) for a somewhat different conception of a decompositional event structure. Additional subevent or state arguments as introduced in (14b) might also be targeted by particular modifiers. For instance, the repetitive/restitutive ambiguity of again can be accounted for by letting again, roughly speaking, have scope over either the causing event e (= repetitive reading) or the resulting state s (= restitutive reading); cf., e.g., the discussion in von Stechow (1996, 2003), Jäger & Blutner (2003).

Of course, assuming further implicit event and state arguments, as illustrated in (14b), raises several intricate questions concerning, e.g., whether, when, and how such subevent variables that depend upon the verb's main event argument are bound. No common practice has evolved so far on how these dependent event arguments are treated compositionally. See also Bierwisch (2005) for arguments against projecting more than the highest event argument onto the verb's argument structure.

This might be the right place to add a remark on a further tradition of decompositional event semantics that goes back to Reichenbach (1947). Davidson's core idea of introducing event arguments can already be found in Reichenbach (1947), who, instead of adding an extra argument to verbal predicates, assumed a more general "event function" [p]*, by which a proposition p is turned into a characteristic property of events; see Bierwisch (2005) for a comparison of the Davidsonian, Neo-Davidsonian and Reichenbachian approaches to events. (Note that Reichenbach used the two notions "event function" and "fact function" synonymously.) Thus, Reichenbach's way of introducing an event variable for the verb to butter would lead to the representation in (15a). This in turn yields (15b) as Reichenbach's version of the logical form for the classical sentence (2a).

(15) a. [butter (x, y)]*(e)
b. ∃e [[butter (jones, the toast)]*(e) & in (e, the bathroom) & instr (e, the knife) & at (e, midnight)]

As Bierwisch (2005: 20) points out, Reichenbach's and Davidson's event variables were intended to account for roughly the same range of phenomena, including an analysis of adverbial modification in terms of conjunctively added event predicates. Note that Kamp & Reyle's (1993) use of the colon to characterize an event e by a proposition p in DRT is basically a variant of Reichenbach's event function; cf. also article 37 (Kamp & Reyle) Discourse Representation Theory. Further notational variants are Bierwisch's (1988, 1997) inst-operator e inst p and the use of curly brackets {p}(e) in Wunderlich (1997). All these are different notational versions for expressing that an event e is partially characterized by a proposition p.

Reichenbach's event function offers a way to pursue a decompositional approach to event semantics without being committed to a Parsons-style proliferation of subevent variables (with their unclear binding conditions) as illustrated in (14b). Thus, a (somewhat simplified) Bierwisch-style decomposition for our sample transitive verb to close would look like (16).

(16) to close: λy λx λe [e: cause (x, become (closed (y)))]
As these remarks show, there is a considerable range of variation exploited by current event semantic approaches as to the extent to which event and subevent variables are
used and combined with further semantic instruments such as decompositional and/or thematic role approaches.
4. The stage-level/individual-level distinction

4.1. Linguistic phenomena

A particularly prominent application field for contemporary event semantic research is provided by the so-called stage-level/individual-level distinction, which goes back to Carlson (1977) and, as a precursor, Milsark (1974, 1977). Roughly speaking, stage-level predicates (SLPs) express temporary or accidental properties, whereas individual-level predicates (ILPs) express (more or less) permanent or inherent properties; some examples are given in (17) vs. (18).

(17) Stage-level predicates
a. adjectives: tired, drunk, available, …
b. verbs: speak, wait, arrive, …

(18) Individual-level predicates
a. adjectives: intelligent, blond, altruistic, …
b. verbs: know, love, resemble, …

The stage-level/individual-level distinction is taken to be a conceptually founded distinction that is grammatically reflected. Lexical predicates are classified as being either SLPs or ILPs. In recent years, a growing list of quite diverse linguistic phenomena has been associated with this distinction. Some illustrative cases will be mentioned next; cf., e.g., Higginbotham & Ramchand (1997), Fernald (2000), Jäger (2001), Maienborn (2003: §2.3) for commented overviews of SLP/ILP diagnostics that have been discussed in the literature.
4.1.1. Subject effects

Bare plural subjects of SLPs have, besides a generic reading ('Firemen are usually available.'), also an existential reading ('There are firemen who are available.'), whereas bare plural subjects of ILPs only have a generic reading ('Firemen are usually altruistic.'):

(19) a. Firemen are available. (SLP: generic + existential reading)
b. Firemen are altruistic. (ILP: only generic reading)
4.1.2. There-coda

Only SLPs (20) but not ILPs (21) may appear in the coda of a there-construction:

(20) a. There were children sick. (SLP)
b. There was a door open.

(21) a. *There were children tall. (ILP)
b. *There was a door wooden.
4.1.3. Antecedents in when-conditionals

ILPs cannot appear as restrictors of when-conditionals (provided that all argument positions are filled with definites; cf. Kratzer 1995):

(22) a. When Mary speaks French, she speaks it well. (SLP)
b. *When Mary knows French, she knows it well. (ILP)
4.1.4. Combination with locative modifiers

SLPs can be combined with locative modifiers (23a), while ILPs don't accept locatives (23b):

(23) a. Maria was tired/hungry/nervous in the car. (SLP)
b. ??Maria was blond/intelligent/a linguist in the car. (ILP)
Adherents of the stage-level/individual-level distinction take data like (23) as strong support for the claim that there is a fundamental difference between SLPs and ILPs in the ability to be located in space; see, e.g., the following quote from Fernald (2000: 24): “It is clear that SLPs differ from ILPs in the ability to be located in space and time.”
4.1.5. Complements of perception verbs

Only SLPs, not ILPs, are admissible as small clause complements of perception verbs:

(24) a. Johann saw the king naked. (SLP)
b. *Johann saw the king tall. (ILP)
4.1.6. Depictives

SLPs, but not ILPs, may form depictive secondary predicates:

(25) a. Paul_i stood tired_i at the fence. (SLP)
b. Paul has bought the books_i used_i.

(26) a. *Paul_i stood blond_i at the fence. (ILP)
b. *Paul has bought the books_i interesting_i.
Further cross-linguistic evidence that has been taken as support for the stage-level/individual-level distinction includes the alternation of the two copula forms ser and estar in Spanish and Portuguese (see Maienborn 2005c for a critical discussion), two different subject positions for copular sentences in Scottish Gaelic (e.g. Ramchand 1996), and the Nominative/Instrumental case alternation of nominal copular predicates in Russian (e.g. Geist 2006).
In sum, the standard perspective under which all these contrasts concerning subject effects, when-conditionals, locative modifiers, and so on have been considered is that they are distinct surface manifestations of a common underlying contrast. The stage-level/individual-level hypothesis is that the distinction of SLPs and ILPs rests on a fundamental (although still not fully understood) conceptual opposition that is reflected in multiple ways in the grammatical system. The following quotation from Fernald (2000) is representative of this view:

Many languages display grammatical effects due to the two kinds of predicates, suggesting that this distinction is fundamental to the way humans think about the universe.
Fernald (2000: 4)
Given that the conceptual side of the coin is still rather mysterious (Fernald (2000: 4): “Whatever sense of permanence is crucial to this distinction, it must be a very weak notion”), most stage-level/individual-level advocates content themselves with investigating the grammatical side (Higginbotham & Ramchand (1997: 53): “Whatever the grounds for this distinction, there is no doubt of its force”). We will come back to this issue in section 4.3.
4.2. Event semantic treatments

A first semantic analysis of the stage-level/individual-level contrast was developed by Carlson (1977). Carlson introduces a new kind of entities, which he calls "stages". These are spatiotemporal partitions of individuals. SLPs and ILPs are then analyzed as predicates ranging over different kinds of entities: ILPs are predicates over individuals, and SLPs are predicates over stages. Thus, on Carlson's approach the stage-level/individual-level distinction amounts to a basic difference at the ontological level.

Kratzer (1995) takes a different direction, locating the relevant difference at the level of the argument structure of the corresponding predicates. Crucially, SLPs have an extra event argument on Kratzer's account, whereas ILPs lack such an extra argument. The lexical entries for a SLP like tired and an ILP like blond are given in (27).

(27) a. tired: λx λe [tired (e, x)]
b. blond: λx [blond (x)]
This difference in argument structure may now be exploited for selectional restrictions, for instance. Perception verbs, e.g., require an event-denoting complement; see the discussion of (11) in section 2.2. This prerequisite is only fulfilled by SLPs, which explains the SLP/ILP difference observed in (24). Moreover, the ban on ILPs in depictive constructions (see (25) vs. (26)) can be traced back to the need of the secondary predicate to provide a state argument that temporally includes the main predicate's event referent.

A very influential syntactic explanation for the observed subject effects within Kratzer's framework has been proposed by Diesing (1992). She assumes different subject positions for SLPs and ILPs: Subjects of SLPs have a VP-internal base position; subjects of ILPs are base-generated VP-externally. In addition, Diesing formulates a
so-called Mapping Hypothesis, which serves as a syntax/semantics interface condition on the derivation of a logical form. (Diesing assumes a Lewis-Kamp-Heim style tripartite logical form consisting of a non-selective quantifier Q, a restrictive clause (RC), and a nuclear scope (NS).) Diesing's (1992) Mapping Hypothesis states that VP-material is mapped into the nuclear scope, and VP-external material is mapped into the restrictive clause. Finally, Diesing takes the VP-boundary to be the place for the existential closure of the nuclear scope. The different readings for SLP and ILP bare plural subjects follow naturally from these assumptions: If SLP subjects stay in their VP-internal base position, they will be mapped into the nuclear scope and, consequently, fall under the scope of the existential quantifier. This leads to the existential reading; cf. (28a). Or they move to a higher, VP-external subject position, in which case they are mapped into the restrictive clause and fall under the scope of the generic operator. This leads to the generic reading; cf. (28b). ILP subjects, having a VP-external base position, may only exploit the latter option. Thus, they only have a generic reading; cf. (29).

(28) a. ∃e, x [NS firemen (x) & available (e, x)] (cf. Kratzer 1995: 141)
b. Gen e, x [RC firemen (x) & in (x, e)] [NS available (e, x)]

(29) Gen x [RC firemen (x)] [NS altruistic (x)]

Kratzer's account also offers a straightforward solution for the different behavior of SLPs and ILPs wrt. locative modification; cf. (23). Having a Davidsonian event argument, SLPs provide a suitable target for locative modifiers; hence, they can be located in space. ILPs, on the other hand, lack such an additional event argument, and therefore do not introduce any referent whose location could be further specified via adverbial modification. This is illustrated in (30)/(31). While combining a SLP with a locative modifier yields a semantic representation like (30b), any attempt to add a locative to an ILP must necessarily fail; cf. (31b).

(30) a. Maria was tired in the car.
b. ∃e [tired (e, maria) & in (e, the car)]

(31) a. */??Maria was blond in the car.
b. [blond (maria) & in (???, the car)]

Thus, on a Kratzerian analysis, SLPs and ILPs indeed differ in their ability to be located in space (see the above quote from Fernald), and this difference is traced back to the presence vs. absence of an event argument. Analogously, the event variable of SLPs provides a suitable target for when-conditionals to quantify over in (22a), whereas the ILP case (22b) lacks such a variable; cf. Kratzer's (1995) Prohibition against Vacuous Quantification.

A somewhat different event semantic solution for the incompatibility of ILPs with locative modifiers has been proposed by Chierchia (1995). He takes a Neo-Davidsonian perspective according to which all predicates introduce event arguments. Thus, SLPs and ILPs do not differ in this respect. In order to account for the SLP/ILP contrast in combination with locatives, Chierchia then introduces a distinction between two kinds of events: SLPs refer to location-dependent events whereas ILPs refer to location-independent events; see also McNally (1998).
The observed behavior wrt. locatives follows on the assumption that only location-dependent events can be located in space. As Chierchia (1995: 178) puts it: "Intuitively, it is as if ILP were, so to speak, unlocated. If one is intelligent, one is intelligent nowhere in particular. SLP, on the other hand, are located in space."

Despite all differences, Kratzer's and Chierchia's analyses have some important commonalities. Both consider the SLP/ILP contrast in (30)/(31) as a grammatical effect. That is, sentences like (31a) do not receive a compositional semantic representation; they are grammatically ill-formed. Kratzer and Chierchia furthermore share the general intuition that SLPs (and only those) can be located in space. This is what the difference in (30a) vs. (31a) is taken to show. And, finally, both analyses rely crucially on the idea that at least SLPs, and possibly all predicates, introduce Davidsonian event arguments.

All in all, Kratzer's (1995) synthesis of the stage-level/individual-level distinction with Davidsonian event semantics has been extremely influential, opening up a new field of research and stimulating the development of further theoretical variants and of alternative proposals.
4.3. Criticisms and further developments

In subsequent studies of the stage-level/individual-level distinction two tendencies can be observed. On the one hand, the SLP/ILP contrast has been increasingly conceived of in information structural terms. Roughly speaking, ILPs relate to categorial judgements, whereas SLPs may build either categorial or thetical judgements; cf., e.g., Ladusaw (1994), McNally (1998), Jäger (2001). On this move, the stage-level/individual-level distinction is usually no longer seen as a lexically codified contrast but rather as being structurally triggered.

On the other hand, there is growing skepticism concerning the empirical adequacy of the stage-level/individual-level hypothesis. Authors such as Higginbotham & Ramchand (1997), Fernald (2000), and Jäger (2001) argue that the phenomena subsumed under this label are actually quite distinct and do not yield such a uniform contrast upon closer scrutiny as a first glance might suggest. For instance, as already noted by Bäuerle (1994: 23), the group of SLPs that support an existential reading of bare plural subjects is actually quite small; cf. (19a). The majority of SLPs, such as tired or hungry in (32), behaves more like ILPs, i.e., they only yield a generic reading.

(32) Firemen are hungry/tired. (SLP: only generic reading)
In view of the sentence pair in (33), Higginbotham & Ramchand (1997: 66) suspect that some notion of speaker proximity might also be of relevance for the availability of existential readings.

(33) a. (Guess whether) firemen are nearby/at hand.
b. ?(Guess whether) firemen are far away/a mile up the road.

There-constructions, on the other hand, also appear to tolerate ILPs, contrary to what one would expect; cf. example (34), taken from Carlson (1977: 72).
(34) There were five men dead.

Furthermore, as Glasbey (1997) shows, the availability of existential readings for bare plural subjects – both for SLPs and ILPs – might also be evoked by the context; cf. the following examples taken from Glasbey (1997: 170ff).

(35) a. Children are sick. (SLP: no existential reading)
b. We must get a doctor. Children are sick. (SLP: existential reading)
(36) a. Drinkers were under-age. (ILP: no existential reading)
b. John was shocked by his visit to the Red Lion. Drinkers were under-age, drugs were on sale, and a number of fights broke out while he was there. (ILP: existential reading)

As these examples show, the picture of the stage-level/individual-level contrast as a clear-cut, grammatically reflected distinction becomes a lot less clear upon closer inspection. The actual contributions of the lexicon, grammar, conceptual knowledge, and context to the emergence of stage-level/individual-level effects still remain largely obscure. While the research focus of the stage-level/individual-level paradigm has been directed almost exclusively towards the apparent grammatical effects of the SLP/ILP contrast, no major efforts were made to uncover its conceptual foundation, although there has never been any doubt that a definition of SLPs and ILPs in terms of the dichotomy "temporary vs. permanent" or "accidental vs. essential" cannot be but a rough approximation. Rather than being a mere accident, this missing link to a solid conceptual foundation could be a hint that the overall perspective on the stage-level/individual-level distinction as a genuinely grammatical distinction that reflects an underlying conceptual opposition might be wrong after all.

The studies of Glasbey (1997), Maienborn (2003, 2004, 2005c) and Magri (2008, 2009) point in this direction. They all argue against treating stage-level/individual-level effects as grammatical in nature and provide alternative, pragmatic analyses for the observed phenomena. In particular, Maienborn argues against an event-based explanation, objecting that the use of Davidsonian event arguments does not receive any independent justification in terms of the event criteria discussed in section 2.2 in such stage-level/individual-level accounts. The crucial question is whether all state expressions, or at least those state expressions that express temporary/accidental properties, i.e. SLPs, can be shown to introduce a Davidsonian event argument. This takes us back to the ontological issue of a proper characterization of states.
5. Reconsidering states As mentioned in section 3.1 above, one of the two central claims of the Neo-Davidsonian paradigm is that all predicates, including state expressions, have a hidden event argument. Despite its popularity this claim has seldom been defended explicitly. Parsons (1995, 2000) is among the few advocates of the Neo-Davidsonian approach who have subjected this assumption to some scrutiny. And the conclusion he reaches wrt. state expressions is rather sobering:
Based on the considerations reviewed above, it would appear that the underlying state analysis is not compelling for any kind of the constructions reviewed here and is not even plausible for some (e.g., for nouns). There are a few outstanding problems that the underlying state analysis might solve, […] but for the most part the weight of evidence seems to go the other way.
(Parsons 2000: 88)
Parsons (2000) puts forth his so-called time travel argument to make a strong case for a Neo-Davidsonian analysis of state expressions; but cf. the discussion in Maienborn (2007). In any case, if the Neo-Davidsonian assumption concerning state expressions is right, we should be able to confirm the existence of hidden state arguments by the event diagnostics mentioned in section 2.2; cf. (10).

Maienborn (2003, 2005a) examines the behavior of state expressions wrt. these and further event diagnostics and shows that there is a fundamental split within the class of non-dynamic expressions: State verbs such as sit, stand, lie, wait, gleam, and sleep meet all of the criteria for Davidsonian eventualities. In contrast, stative verbs like know, weigh, own, and resemble do not meet any of them. Moreover, it turns out that copular constructions behave uniformly like stative verbs, regardless of whether the predicate denotes a temporary property (SLP) or a more-or-less permanent property (ILP). The behavior of state verbs and statives with respect to perception reports is illustrated in (37). While state verbs can serve as infinitival complements of perception verbs (37a-c), statives, including copula constructions, are prohibited in these contexts (37d-f). (The argumentation in Maienborn (2003, 2005a) is based on data from German. For ease of presentation I will use English examples in the following.)

(37) Perception reports:
a. I saw the child sit on the bench.
b. I saw my colleague sleep through the lecture.
c. I noticed the shoes gleam in the light.
d. *I saw the child be on the bench.
e. *I saw the tomatoes weigh 1 pound.
f. *I saw my aunt resemble Romy Schneider.

Furthermore, as (38a-c) shows, state verbs combine with locative modifiers, whereas statives do not; see (38d-g).

(38) Locative modifiers:
a. Hilda waited at the corner.
b. Bardo slept in a hammock.
c. The pearls gleamed in her hair.
d. *The dress was wet on the clothesline.
e. *Bardo was hungry in front of the fridge.
f. *The tomatoes weighed 1 pound besides the carrots.
g. *Bardo knew the answer over there.

Three remarks on locatives should be added here. First, when using locatives as event diagnostics we have to make sure to use true event-related adverbials, i.e., locative VP-modifiers.
They should not be confounded with locative frame adverbials such as those in (39). These are sentential modifiers that do not add an additional predicate to a VP's event argument but instead provide a semantically underspecified domain restriction for the overall proposition. Locative frame adverbials often yield temporal or conditional interpretations (e.g. 'When he was in Italy, Maradona was married.' for (39c)) but might also be interpreted epistemically, for instance ('According to the belief of the people in Italy, Maradona was married.'); see Maienborn (2001) for details and cf. also article 54 (Maienborn & Schäfer) Adverbs and adverbials.

(39) Locative frame adverbials:
a. By candlelight, Carolin resembled her brother.
b. Maria was drunk in the car.
c. In Italy, Maradona was married.

Secondly, we are now in a position to make more precise what is going on in sentence pairs like (23), repeated here as (40), which are often taken to demonstrate the different behavior of SLPs and ILPs wrt. location in space; cf. the discussion in section 4.

(40) a. Maria was tired/hungry/nervous in the car. (SLP)
b. ??Maria was blond/intelligent/a linguist in the car. (ILP)
Actually, this SLP/ILP contrast is not an issue of grammaticality but concerns the acceptability of these sentences under a temporal reading of the locative frame; cf. Maienborn (2004) for a pragmatic explanation of this temporariness effect.

Thirdly, sentences (38d/e) are well-formed under an alternative syntactic analysis that takes the locative as the main predicate and the adjective as a depictive secondary predicate. Under this syntactic analysis sentence (38d) would express that there was a state of the dress being on the clothesline, and this state is temporally included in an accompanying state of the dress being wet. This is not the kind of evidence needed to substantiate the Neo-Davidsonian claim that states can be located in space. If the locative were a true event-related modifier, sentence (38d) should have the interpretation: There was a state of wetness of the dress, and this state is located on the clothesline. (38d) has no such reading; cf. the discussion on this point between Rothstein (2005) and Maienborn (2005b).

Turning back to our event diagnostics, the same split within the group of state expressions that we observed in the previous cases also shows up with manner adverbials, comitatives and the like – that is, modifiers that elaborate on the internal functional structure of events. State verbs combine regularly with them, whereas statives do not, as (41) shows. Katz (2003) dubbed this the Stative Adverb Gap.

(41) Manner adverbials etc.:
a. Bardo slept calmly/with his teddy/without a pacifier.
b. Carolin sat motionless/stiff at the table.
c. The pearls gleamed dully/reddishly/moistly.
d. *Bardo was calmly/with his teddy/without a pacifier tired.
e. *Carolin was restlessly/patiently thirsty.
f. *Andrea resembled with her daughter Romy Schneider.
g. *Bardo owned thriftily/generously much money.
There has been some discussion of apparent counterexamples to this Stative Adverb Gap such as (42). While, e.g., Jäger (2001), Mittwoch (2005), Dölling (2005) or Rothstein (2005) conclude that such cases provide convincing evidence for assuming a Davidsonian argument for statives as well, Katz (2000, 2003, 2008) and Maienborn (2003, 2005a,b, 2007) argue that these either involve degree modification as in (42a) or are instances of event coercion, i.e. a sentence such as (42b) is, strictly speaking, ungrammatical but can be "rescued" by inferring some event argument to which the manner adverbial may then apply regularly; see the discussion in section 6.2. For instance, what John is passionate about in (42b) is not the state of being a Catholic but the activities associated with this state (e.g. going to mass, praying, going to confession). If no related activities come to mind for some predicate such as being a relative of Grit in (42'b), then the pragmatic rescue fails and the sentence becomes odd. According to this view, understanding sentences such as (42b) requires a non-compositional reinterpretation of the stative expression that is triggered by the lack of a regular Davidsonian event argument.

(42) a. Lisa firmly believed that James was innocent.
b. John was a Catholic with great passion in his youth.

(42') b. ??John was a relative of Grit with great passion in his youth.

See also Rothmayr (2009) for a recent analysis of the semantics of stative verbs including a decompositional account of stative/eventive ambiguities as illustrated in (43):

(43) a. Hair obstructed the drain. (stative reading)
b. A plumber obstructed the drain. (preferred eventive reading)
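In the decompositional format introduced in section 3.3, the two readings of (43) might be distinguished roughly as follows (a sketch modeled on the entries in (14)/(16), not Rothmayr's own formalization):

stative reading: λy λx λs [s: obstruct (x, y)]
eventive reading: λy λx λe [e: cause (x, become (obstructed (y)))]

On the stative reading the hair simply stands in the obstruct relation to the drain, whereas on the eventive reading the plumber brings about a change of state that results in the drain being obstructed.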
A further case of stative/eventive ambiguities is discussed by Engelberg (2005) in his study of dispositional verbs such as German helfen (help), gefährden (endanger), erleichtern (facilitate). These verbs may have an eventive or a stative reading depending on whether the subject is nominal or sentential; cf. (44). Trying to account for these readings within the Davidsonian program turns out to be challenging in several respects. Engelberg advocates the philosophical concept of supervenience as a useful device to account for the evaluative rather than causal dependency of the effect state expressed by these verbs.

(44) a. Rebecca helped Jamaal in the kitchen. (eventive)
b. That Rebecca had fixed the water pipes helped Jamaal in the kitchen. (stative)

In view of the evidence reviewed above, it seems justified to conclude that the class of statives, including all copular constructions, does not behave as one would expect if they had a hidden Davidsonian argument, regardless of whether they express a temporary or a permanent property. What conclusions should we draw from these linguistic observations concerning the ontological category of states? There are basically two lines of argumentation that have been pursued in the literature.

Maienborn takes the behavior wrt. the classical event diagnostics in (10) as a sufficiently strong linguistic indication of an underlying ontological difference and assumes that only state verbs denote true Davidsonian eventualities, i.e., Davidsonian states, whereas statives resist a Davidsonian analysis but refer instead to what Maienborn calls Kimian states, exploiting Kim's (1969, 1976) notion of temporally bound property exemplifications.
Kimian states may be located in time and they allow for anaphoric reference. Yet, in lacking an inherent spatial dimension, they are ontologically "poorer", more abstract entities than Davidsonian eventualities; cf. Maienborn (2003, 2005a, b, 2007) for details.

Authors like Dölling (2005), Higginbotham (2005), Ramchand (2005) or Rothstein (2005) take a different track. From their perspective, the observed linguistic differences call for a more liberal definition of eventualities that includes the referents of stative expressions. In particular, they are willing to give up the assumption of eventualities having an inherent spatial dimension. Hence, Ramchand (2005: 372) proposes the following alternative to the definition offered in (8):

(45) Eventualities are abstract entities with constitutive participants and with a constitutive relation to the temporal dimension.

So the issue basically is whether we opt for a narrow or a broad definition of events. Forty years after Davidson's first plea for events we still don't know for sure what kind of things event(ualitie)s actually are.
6. Psycholinguistic studies

In recent years, a growing interest has emerged in testing theoretical linguistic assumptions about event structure by means of psycholinguistic experiments. Two research areas involving events have attracted major interest within the still developing field of semantic processing; cf. articles 15 (Bott, Featherston, Radó & Stolterfoht) Experimental methods, 102 (Frazier) Meaning in psycholinguistics. These are the processing of underlying event structures and of event coercion.
6.1. The processing of underlying event structures

The first focus of interest concerns the issue of distinguishing different kinds of events in terms of the complexity of their internal structure. Gennari & Poeppel (2003) show that the processing of event sentences such as (46a) takes significantly longer than the processing of otherwise similar stative sentences such as (46b).

(46) a. The visiting scientist solved the intricate math problem. (eventive)
b. The visiting scientist lacked any knowledge of English. (stative)
This processing difference is attributed to eventive verbs having a more complex decompositional structure than stative verbs; cf. the Bierwisch-style representations in (47).

(47) a. to solve: λy λx λe [e: cause (x, become (solved (y)))]
b. to lack: λy λx λs [s: lack (x, y)]
Thus, the study of Gennari & Poeppel (2003) adduces empirical evidence for the event vs. state distinction and it provides experimental support for the psychological reality of
structuring natural language meaning in terms of decompositional representations. This is, of course, a highly controversial issue; cf. the argumentation in Fodor, Fodor & Garrett (1975), de Almeida (1999) and Fodor & LePore (1998) against decomposition, and see also the more differentiated perspective taken in Mobayyen & de Almeida (2005).

McKoon & Macfarland (2000, 2002), taking up a distinction made by Levin and Rappaport Hovav (1995), investigate two kinds of causative verbs, viz. verbs denoting an externally caused event (e.g. break) as opposed to verbs denoting an internally caused event (e.g. bloom). Whereas the former include a causing subevent as well as a change-of-state subevent, the latter only express a change of state; cf. McKoon & Macfarland (2000: 834). Thus, the two verb classes differ wrt. their decompositional complexity. McKoon and Macfarland describe a series of experiments that show that there are clear processing differences corresponding to this lexical distinction. Sentences with external causation verbs take significantly longer to process than sentences with internal causation verbs. In addition, this processing difference shows up with the transitive as well as the intransitive use of the respective verbs; cf. (48) vs. (49). McKoon and Macfarland conclude from this finding that the causing subevent remains implicitly present even if no explicit cause is mentioned in the break-case. That is, their experiments suggest that both transitive and intransitive uses of, e.g., awake in (49) are based on the same lexical semantic event structure consisting of two subevents. Conversely, if an internal causation verb is used transitively, as wilt in (48a), the sentence is still understood as denoting a single event with the subject referent being part of the change-of-state event.

(48) Internal causation verbs:
a. The bright sun wilted the roses.
b. The roses wilted.

(49) External causation verbs:
a. The fire alarm awoke the residents.
b. The residents awoke.

In sum, the comprehension of break, awake etc. requires understanding a more complex event conceptualization than that of bloom, wilt etc. This psycholinguistic finding corroborates theoretically motivated assumptions on the verbs' lexical semantic representations. See also Härtl (2008) for a more thorough and differentiated study on implicit event information. Härtl discusses whether, to what extent, and at which processing level implicit event participants and implicit event predicates are still accessible for interpretation purposes. Most notably, the studies of McKoon & Macfarland (2000, 2002) and Gennari & Poeppel (2003) provide strong psycholinguistic support for the assumption that verb meanings are represented and processed in terms of an underlying event structure.
6.2. The processing of event coercion

The second focus of psycholinguistic research on events is devoted to the notion of event coercion. Coercion refers to the forcing of an alternative interpretation when the compositional machinery fails to derive a regular interpretation. In other words, event coercion is a kind of rescue operation which solves a grammatical conflict by using additional knowledge about the involved event type; cf. also article 25 (de Swart) Mismatches and coercion.
There are two types of coercion that are prominently discussed in the literature. The first type, the so-called complement coercion, is illustrated in (50). The verb to begin requires an event-denoting complement and forces the given object-denoting complement the book into a contextually appropriate event reading. Hence, sentence (50) is reinterpreted as expressing that John began, e.g., to read the book; cf., e.g., Pustejovsky (1995), Egg (2003).

(50) John began the book.

The second kind, the so-called aspectual coercion, refers to a set of options for adjusting the aspectual type of a verb phrase according to the demands of a temporal modifier. For instance, the punctual verb to sneeze in (51a) is preferably interpreted iteratively in combination with the durative adverbial for five minutes, whereas the temporal adverbial for years forces a habitual reading of the verb phrase to smoke a morning cigarette in (51b), and the stative expression to be in one's office receives an ingressive reinterpretation due to the temporal adverbial in 10 minutes in (51c); cf., e.g., Moens & Steedman (1988), Pulman (1997), de Swart (1998), Dölling (2003), Egg (2005). See also the classification of aspectual coercions developed in Hamm & van Lambalgen (2005).

(51) a. John sneezed for five minutes.
b. John smoked a morning cigarette for years.
c. John was in his office in 10 minutes.

There are basically two kinds of theoretical accounts that have been developed for the linguistic phenomena subsumed under the label of event coercion: type-shifting accounts (e.g., Moens & Steedman 1988, de Swart 1998) and underspecification accounts (e.g., Pulman 1997, Egg 2005); cf. articles 24 (Egg) Semantic underspecification and 25 (de Swart) Mismatches and coercion. These accounts and the predictions they make for the processing of coerced expressions have been the subject of several psycholinguistic studies; cf., e.g., de Almeida (2004), Pickering, McElree & Traxler (2005) and Traxler et al. (2005) on complement coercion and Piñango, Zurif & Jackendoff (1999), Piñango, Mack & Jackendoff (to appear), Pickering et al. (2006), Bott (2008a, b), Brennan & Pylkkänen (2008) on aspectual coercion. The crucial question is whether event coercion causes additional processing costs, and if so, at which point in the course of meaning composition such additional processing takes place.

The results obtained so far still don't yield a fully stable picture. Whether processing differences are detected or not seems to depend partly on the chosen experimental methods and tasks; cf. Pickering et al. (2006). Pylkkänen & McElree (2006) draw the following interim balance: Whereas complement coercion always raises additional processing costs (at least without contextual support), aspectual coercion does not appear to lead to significant processing difficulties. Pylkkänen & McElree (2006) propose the following interpretation of these results: Complement coercion involves an ontological type conflict between the verb's request for an event argument and a given object referent. This ontological type conflict requires an immediate and time-consuming repair; otherwise the compositional process would break down. Aspectual coercion, on the other hand, only involves sortal shifts within the category of events that do not seem to affect composition and should therefore best be taken as an instance of semantic underspecification.
For a somewhat more differentiated picture on the processing of different types of aspectual coercion see Bott (2008a).
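Schematically, the two coercion types can be rendered as follows (an illustrative sketch in the spirit of the type-shifting accounts cited above; the operators and the free relation variable R are not part of any particular author's official notation). Complement coercion maps the object-denoting complement in (50) onto an event description, with R a contextually supplied relation (e.g. reading or writing):

the book ⇒ λe [R (e, the book)]
(50): ∃e [begin (john, e) & R (e, the book)]

Aspectual coercion, by contrast, shifts within the domain of events; an iteration operator for (51a) might look like this (with ⊕E the sum of the events in E):

iter (P) = λe ∃E [e = ⊕E & card (E) ≥ 2 & ∀e' [e' ∈ E → P (e')]]
(51a): ∃e [iter (λe' [sneeze (john, e')]) (e) & for (e, five minutes)]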
7. Conclusion

Although psycholinguistic research on event structure might be said to be still in its infancy, the above remarks on some pioneer studies already show that Davidsonian events are about to develop into a genuine subject of psychological research on natural language. Hidden event arguments, as introduced by Davidson (1967), have not only proven to be of great benefit in explaining numerous combinatorial and inferential properties of natural language expressions, such that they show up virtually everywhere in present-day assumptions about linguistic structure; there is also growing evidence that they are psychologically real. Admittedly, we still don't know for sure what kind of things events actually are. Nevertheless, 40 years after they appeared on the linguistic scene, Davidsonian events continue to be both an indispensable everyday linguistic instrument and a constant source of fresh insights into the constitution of natural language meaning.
8. References

de Almeida, Roberto G. 1999. What do category-specific semantic deficits tell us about the representation of lexical concepts? Brain & Language 68, 241–248.
de Almeida, Roberto G. 2004. The effect of context on the processing of type-shifting verbs. Brain & Language 90, 249–261.
Asher, Nicholas 2000. Events, facts, propositions, and evolutive anaphora. In: J. Higginbotham, F. Pianesi & A. Varzi (eds.). Speaking of Events. Cambridge, MA: The MIT Press, 123–150.
Austin, Jennifer, Stefan Engelberg & Gesa Rauh (eds.) 2004. Adverbials. The Interplay between Meaning, Context, and Syntactic Structure. Amsterdam: Benjamins.
Bach, Emmon 1986. The algebra of events. Linguistics & Philosophy 9, 5–16.
Bäuerle, Rainer 1994. Zustand – Prozess – Ereignis. Zur Kategorisierung von Verb(al)phrasen. Wuppertaler Arbeitspapiere zur Sprachwissenschaft 10, 1–32.
Bayer, Josef 1986. The role of event expressions in grammar. Studies in Language 10, 1–52.
Bierwisch, Manfred 1988. On the grammar of local prepositions. In: M. Bierwisch, W. Motsch & I. Zimmermann (eds.). Syntax, Semantik und Lexikon. Berlin: Akademie Verlag, 1–65.
Bierwisch, Manfred 1989. Event nominalizations: Proposals and problems. In: W. Motsch (ed.). Wortstruktur und Satzstruktur. Berlin: Akademie Verlag, 1–73.
Bierwisch, Manfred 1997. Lexical information from a minimalist point of view. In: Ch. Wilder, H.-M. Gärtner & M. Bierwisch (eds.). The Role of Economy Principles in Linguistic Theory. Berlin: Akademie Verlag, 227–266.
Bierwisch, Manfred 2005. The event structure of CAUSE and BECOME. In: C. Maienborn & A. Wöllstein (eds.). Event Arguments: Foundations and Applications. Tübingen: Niemeyer, 11–44.
Bott, Oliver 2008a. The Processing of Events. Doctoral dissertation. University of Tübingen.
Bott, Oliver 2008b. Doing it again and again may be difficult, but it depends on what you are doing. In: N. Abner & J. Bishop (eds.). Proceedings of the West Coast Conference on Formal Linguistics (= WCCFL) 27. Somerville, MA: Cascadilla Proceedings Project, 63–71.
Brennan, Jonathan & Liina Pylkkänen 2008. Processing events: Behavioral and neuromagnetic correlates of aspectual coercion. Brain & Language 106, 132–143.
Carlson, Gregory N. 1977. Reference to Kinds in English. Ph.D. dissertation. University of Massachusetts, Amherst, MA. Reprinted: New York: Garland, 1980.
Carlson, Gregory N. 1998. Thematic roles and the individuation of events. In: S. Rothstein (ed.). Events and Grammar. Dordrecht: Kluwer, 35–51.
Chierchia, Gennaro 1995. Individual-level predicates as inherent generics. In: G. N. Carlson & F. J. Pelletier (eds.). The Generic Book. Chicago, IL: The University of Chicago Press, 176–223.
Davidson, Donald 1967. The logical form of action sentences. In: N. Rescher (ed.). The Logic of Decision and Action. Pittsburgh, PA: University of Pittsburgh Press, 81–95. Reprinted in: D. Davidson (ed.). Essays on Actions and Events. Oxford: Clarendon Press, 1980, 105–122.
Davidson, Donald 1969. The individuation of events. In: N. Rescher (ed.). Essays in Honor of Carl G. Hempel. Dordrecht: Reidel, 216–234. Reprinted in: D. Davidson (ed.). Essays on Actions and Events. Oxford: Clarendon Press, 1980, 163–180.
Diesing, Molly 1992. Indefinites. Cambridge, MA: The MIT Press.
Dölling, Johannes 2003. Flexibility in adverbal modification: Reinterpretation as contextual enrichment. In: E. Lang, C. Maienborn & C. Fabricius-Hansen (eds.). Modifying Adjuncts. Berlin: Mouton de Gruyter, 511–552.
Dölling, Johannes 2005. Copula sentences and entailment relations. Theoretical Linguistics 31, 317–329.
Dölling, Johannes, Tatjana Heyde-Zybatow & Martin Schäfer (eds.) 2008. Event Structures in Linguistic Form and Interpretation. Berlin: de Gruyter.
Dowty, David R. 1979. Word Meaning and Montague Grammar. The Semantics of Verbs and Times in Generative Semantics and in Montague's PTQ. Dordrecht: Reidel.
Eckardt, Regine 1998. Adverbs, Events and Other Things. Issues in the Semantics of Manner Adverbs. Tübingen: Niemeyer.
Eckardt, Regine 2002. Event semantics. In: F. Hamm & T. E. Zimmermann (eds.). Semantics (Linguistische Berichte, Sonderheft 10). Hamburg: Buske, 91–128.
Egg, Markus 2003. Beginning novels and finishing hamburgers. Remarks on the semantics of 'to begin'. Journal of Semantics 20, 163–191.
Egg, Markus 2005. Flexible Semantics for Reinterpretation Phenomena. Stanford, CA: CSLI Publications.
Ehrich, Veronika & Irene Rapp 2000. Sortale Bedeutung und Argumentstruktur: ung-Nominalisierungen im Deutschen. Zeitschrift für Sprachwissenschaft 19, 245–303.
Engelberg, Stefan 2000. Verben, Ereignisse und das Lexikon. Tübingen: Niemeyer.
Engelberg, Stefan 2002. Intransitive accomplishments and the lexicon: The role of implicit arguments, definiteness, and reflexivity in aspectual composition. Journal of Semantics 19, 369–416.
Engelberg, Stefan 2005. Stativity, supervenience, and sentential subjects. In: C. Maienborn & A. Wöllstein (eds.). Event Arguments: Foundations and Applications. Tübingen: Niemeyer, 45–68.
Fernald, Theodore B. 2000. Predicates and Temporal Arguments. Oxford: Oxford University Press.
Fodor, Janet D., Jerry A. Fodor & Merrill Garrett 1975. The psychological unreality of semantic representations. Linguistic Inquiry 6, 515–532.
Fodor, Jerry A. & Ernest LePore 1998. The emptiness of the lexicon: Reflections on James Pustejovsky's "The Generative Lexicon". Linguistic Inquiry 29, 269–288.
Geist, Ljudmila 2006. Die Kopula und ihre Komplemente. Zur Kompositionalität in Kopulasätzen. Tübingen: Niemeyer.
Gennari, Silvia & David Poeppel 2003. Processing correlates of lexical semantic complexity. Cognition 89, 27–41.
Glasbey, Sheila 1997. I-level predicates that allow existential readings for bare plurals. In: A. Lawson (ed.). Proceedings of Semantics and Linguistic Theory (= SALT) VII. Ithaca, NY: Cornell University, 169–179.
Grimshaw, Jane 1990. Argument Structure. Cambridge, MA: The MIT Press.
Hamm, Fritz & Michiel van Lambalgen 2005. The Proper Treatment of Events. Oxford: Blackwell.
Härtl, Holden 2008. Implizite Informationen: Sprachliche Ökonomie und interpretative Komplexität bei Verben. Berlin: Akademie Verlag.
Higginbotham, James 1983. The logic of perceptual reports: An extensional alternative to situation semantics. Journal of Philosophy 80, 100–127.
Higginbotham, James 1985. On semantics. Linguistic Inquiry 16, 547–593.
Higginbotham, James 1989. Elucidations of meaning. Linguistics & Philosophy 12, 465–517.
Higginbotham, James 2000. On events in linguistic semantics. In: J. Higginbotham, F. Pianesi & A. Varzi (eds.). Speaking of Events. Oxford: Oxford University Press, 49–79.
Higginbotham, James 2005. Event positions: Suppression and emergence. Theoretical Linguistics 31, 349–358.
Higginbotham, James & Gillian Ramchand 1997. The stage-level/individual-level distinction and the mapping hypothesis. Oxford University Working Papers in Linguistics, Philology & Phonetics 2, 53–83.
Higginbotham, James, Fabio Pianesi & Achille Varzi (eds.) 2000. Speaking of Events. Oxford: Oxford University Press.
Jäger, Gerhard 2001. Topic-comment structure and the contrast between stage-level and individual-level predicates. Journal of Semantics 18, 83–126.
Jäger, Gerhard & Reinhard Blutner 2003. Competition and interpretation: The German adverb wieder ('again'). In: E. Lang, C. Maienborn & C. Fabricius-Hansen (eds.). Modifying Adjuncts. Berlin: Mouton de Gruyter, 393–416.
Kamp, Hans & Uwe Reyle 1993. From Discourse to Logic. Dordrecht: Kluwer.
Katz, Graham 2000. Anti Neo-Davidsonianism: Against a Davidsonian semantics for state sentences. In: C. Tenny & J. Pustejovsky (eds.). Events as Grammatical Objects. The Converging Perspectives of Lexical Semantics and Syntax. Stanford, CA: CSLI Publications, 393–416.
Katz, Graham 2003. Event arguments, adverb selection, and the Stative Adverb Gap. In: E. Lang, C. Maienborn & C. Fabricius-Hansen (eds.). Modifying Adjuncts. Berlin: Mouton de Gruyter, 455–474.
Katz, Graham 2008. Manner modification of state verbs. In: L. McNally & Ch. Kennedy (eds.). Adjectives and Adverbs. Syntax, Semantics, and Discourse. Oxford: Oxford University Press, 220–248.
Kennedy, Christopher & Beth Levin 2008. Measure of change: The adjectival core of degree achievements. In: L. McNally & Ch. Kennedy (eds.). Adjectives and Adverbs. Syntax, Semantics, and Discourse. Oxford: Oxford University Press, 156–182.
Kim, Jaegwon 1969. Events and their descriptions: Some considerations. In: N. Rescher (ed.). Essays in Honor of Carl G. Hempel. Dordrecht: Reidel, 198–215.
Kim, Jaegwon 1976. Events as property exemplifications. In: M. Brand & D. Walton (eds.). Action Theory. Proceedings of the Winnipeg Conference on Human Action. Dordrecht: Reidel, 159–177.
Kratzer, Angelika 1995. Stage-level and individual-level predicates. In: G. N. Carlson & F. J. Pelletier (eds.). The Generic Book. Chicago, IL: The University of Chicago Press, 125–175.
Krifka, Manfred 1989. Nominal reference, temporal constitution and quantification in event semantics. In: R. Bartsch, J. van Benthem & P. van Emde Boas (eds.). Semantics and Contextual Expression. Dordrecht: Foris, 75–115.
Krifka, Manfred 1990. Four thousand ships passed through the lock: Object-induced measure functions on events. Linguistics & Philosophy 13, 487–520.
Krifka, Manfred 1992. Thematic relations as links between nominal reference and temporal constitution. In: I. Sag & A. Szabolcsi (eds.). Lexical Matters. Stanford, CA: CSLI Publications, 29–53.
Krifka, Manfred 1998. The origins of telicity. In: S. Rothstein (ed.). Events and Grammar. Dordrecht: Kluwer, 197–235.
Ladusaw, William 1994. Thetic and categorical, stage and individual, weak and strong. In: M. Harvey & L. Santelmann (eds.). Proceedings of Semantics and Linguistic Theory (= SALT) IV. Ithaca, NY: Cornell University, 220–229.
Lang, Ewald, Claudia Maienborn & Cathrine Fabricius-Hansen (eds.) 2003. Modifying Adjuncts. Berlin: Mouton de Gruyter.
Lemmon, Edward J. 1967. Comments on D. Davidson's "The Logical Form of Action Sentences". In: N. Rescher (ed.). The Logic of Decision and Action. Pittsburgh, PA: University of Pittsburgh Press, 96–103.
LePore, Ernest 1985. The semantics of action, event, and singular causal sentences. In: E. LePore & B. McLaughlin (eds.). Actions and Events: Perspectives on the Philosophy of Donald Davidson. Oxford: Blackwell, 151–161.
Levin, Beth & Malka Rappaport Hovav 1995. Unaccusativity: At the Syntax-Lexical Semantics Interface. Cambridge, MA: The MIT Press.
Lombard, Lawrence B. 1998. Ontologies of events. In: S. Laurence & C. Macdonald (eds.). Contemporary Readings in the Foundations of Metaphysics. Oxford: Blackwell, 277–294.
Magri, Giorgio 2008. A Theory of Individual-level Predicates Based on Blind Scalar Implicatures (Extended version). Ms. Cambridge, MA, MIT. http://web.mit.edu/gmagri/www/. December 11, 2010.
Magri, Giorgio 2009. A theory of individual-level predicates based on blind mandatory scalar implicatures. Natural Language Semantics 17, 245–297.
Maienborn, Claudia 2001. On the position and interpretation of locative modifiers. Natural Language Semantics 9, 191–240.
Maienborn, Claudia 2003. Die logische Form von Kopula-Sätzen. Berlin: Akademie Verlag.
Maienborn, Claudia 2004. A pragmatic explanation of the stage-level/individual-level contrast in combination with locatives. In: B. Agbayani, V. Samiian & B. Tucker (eds.). Proceedings of the Western Conference on Linguistics (= WECOL) 15. Fresno, CA: CSU, 158–170.
Maienborn, Claudia 2005a. On the limits of the Davidsonian approach: The case of copula sentences. Theoretical Linguistics 31, 275–316.
Maienborn, Claudia 2005b. Eventualities and different things: A reply. Theoretical Linguistics 31, 383–396.
Maienborn, Claudia 2005c. A discourse-based account of Spanish 'ser/estar'. Linguistics 43, 155–180.
Maienborn, Claudia 2007. On Davidsonian and Kimian states. In: I. Comorovski & K. von Heusinger (eds.). Existence: Semantics and Syntax. Dordrecht: Springer, 107–130.
Maienborn, Claudia & Angelika Wöllstein (eds.) 2005. Event Arguments: Foundations and Applications. Tübingen: Niemeyer.
McKoon, Gail & Talke Macfarland 2000. Externally and internally caused change of state verbs. Language 76, 833–858.
McKoon, Gail & Talke Macfarland 2002. Event templates in the lexical representations of verbs. Cognitive Psychology 45, 1–44.
McNally, Louise 1998. Stativity and theticity. In: S. Rothstein (ed.). Events and Grammar. Dordrecht: Kluwer, 293–307.
Milsark, Gary L. 1974. Existential Sentences in English. Ph.D. dissertation. MIT, Cambridge, MA. Reprinted: Bloomington, IN: Indiana University Linguistics Club, 1976.
Milsark, Gary L. 1977. Toward an explanation of certain peculiarities of the existential construction in English. Linguistic Analysis 3, 1–29.
Mittwoch, Anita 2005. Do states have Davidsonian arguments? Some empirical considerations. In: C. Maienborn & A. Wöllstein (eds.). Event Arguments: Foundations and Applications. Tübingen: Niemeyer, 69–88.
Mobayyen, Forouzan & Roberto G. de Almeida 2005. The influence of semantic and morphological complexity of verbs on sentence recall: Implications for the nature of conceptual representation and category-specific deficits. Brain and Cognition 57, 168–171.
Moens, Marc & Mark Steedman 1988. Temporal ontology and temporal reference. Computational Linguistics 14, 15–28.
Parsons, Terence 1990. Events in the Semantics of English. A Study in Subatomic Semantics. Cambridge, MA: The MIT Press.
Parsons, Terence 1995. Thematic relations and arguments. Linguistic Inquiry 26, 635–662.
Parsons, Terence 2000. Underlying states and time travel. In: J. Higginbotham, F. Pianesi & A. Varzi (eds.). Speaking of Events. Oxford: Oxford University Press, 81–93.
Pianesi, Fabio & Achille C. Varzi 2000. Events and event talk: An introduction. In: J. Higginbotham, F. Pianesi & A. Varzi (eds.). Speaking of Events. Oxford: Oxford University Press, 3–47.
Pickering, Martin J., Brian McElree & Matthew J. Traxler 2005. The difficulty of coercion: A response to de Almeida. Brain & Language 93, 1–9.
Pickering, Martin J., Brian McElree, Steven Frisson, Lillian Chen & Matthew J. Traxler 2006. Underspecification and aspectual coercion. Discourse Processes 42, 131–155.
Piñango, Maria M., Jennifer Mack & Ray Jackendoff (to appear). Semantic combinatorial processes in argument structure: Evidence from light verbs. In: Proceedings of the Annual Meeting of the Berkeley Linguistics Society (= BLS) 32.
Piñango, Maria M., Edgar Zurif & Ray Jackendoff 1999. Real-time processing implications of enriched composition at the syntax-semantics interface. Journal of Psycholinguistic Research 28, 395–414.
Piñón, Christopher 1997. Achievements in an event semantics. In: A. Lawson & E. Cho (eds.). Proceedings of Semantics and Linguistic Theory (= SALT) VII. Ithaca, NY: Cornell University, 276–293.
Pulman, Stephen G. 1997. Aspectual shift as type coercion. Transactions of the Philological Society 95, 279–317.
Pustejovsky, James 1991. The syntax of event structure. Cognition 41, 47–81.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press.
Pylkkänen, Liina & Brian McElree 2006. The syntax-semantics interface: On-line composition of meaning. In: M. A. Gernsbacher & M. Traxler (eds.). Handbook of Psycholinguistics. 2nd edn. New York: Elsevier, 537–577.
Ramchand, Gillian 1996. Two subject positions in Scottish Gaelic: The syntax-semantics interface. Natural Language Semantics 4, 165–191.
Ramchand, Gillian 2005. Post-Davidsonianism. Theoretical Linguistics 31, 359–373.
Rapp, Irene 2007. „Was den Besuch zum Ereignis macht" – eine outputorientierte Analyse für die Verb-Nomen-Konversion im Deutschen. Linguistische Berichte 208, 407–437.
Reichenbach, Hans 1947. Elements of Symbolic Logic. New York: Macmillan.
Rothmayr, Antonia 2009. The Structure of Stative Verbs. Amsterdam: Benjamins.
Rothstein, Susan (ed.) 1998. Events and Grammar. Dordrecht: Kluwer.
Rothstein, Susan 2004. Structuring Events: A Study in the Semantics of Lexical Aspect. Oxford: Blackwell.
Rothstein, Susan 2005. States and modification: A reply to Maienborn. Theoretical Linguistics 31, 375–381.
Sæbø, Kjell Johan 2008. The structure of criterion predicates. In: J. Dölling, T. Heyde-Zybatow & M. Schäfer (eds.). Event Structures in Linguistic Form and Interpretation. Berlin: de Gruyter, 127–147.
von Stechow, Arnim 1996. The different readings of wieder 'again': A structural account. Journal of Semantics 13, 87–138.
von Stechow, Arnim 2003. How are results represented and modified? Remarks on Jäger & Blutner's anti-decomposition. In: E. Lang, C. Maienborn & C. Fabricius-Hansen (eds.). Modifying Adjuncts. Berlin: Mouton de Gruyter, 417–451.
de Swart, Henriëtte 1998. Aspect shift and coercion. Natural Language and Linguistic Theory 16, 347–385.
Tenny, Carol & James Pustejovsky (eds.) 2000. Events as Grammatical Objects. The Converging Perspectives of Lexical Semantics and Syntax. Stanford, CA: CSLI Publications.
Thomason, Richmond H. & Robert C. Stalnaker 1973. A semantic theory of adverbs. Linguistic Inquiry 4, 195–220.
Traxler, Matthew J., Brian McElree, Rihana S. Williams & Martin J. Pickering 2005. Context effects in coercion: Evidence from eye movements. Journal of Memory and Language 53, 1–25.
Vendler, Zeno 1967. Linguistics in Philosophy. Ithaca, NY: Cornell University Press.
Wunderlich, Dieter 1997. CAUSE and the structure of verbs. Linguistic Inquiry 28, 27–68.
Zucchi, Alessandro 1993. The Language of Propositions and Events. Dordrecht: Kluwer.
Claudia Maienborn, Tübingen (Germany)
35. Situation Semantics and the ontology of natural language
1. Introduction
2. Introducing situations into semantics: Empirical motivations
3. The Austinian picture
4. A wider ontological net
5. A type theoretic ontology for interaction
6. Conclusions
7. References
Abstract

Situation Semantics emerged in the 1980s with an ambitious program of reform for semantics, both in the domain of semantic ontology and with regard to the integration of context in meaning. This article takes as its initial focus the topic of a situation-based ontology, more generally discussing the approach to NL ontology that emerged from situation semantics. The latter part of the article will explain how recent work synthesizing situation semantics with type theory enables the original intuitions from situation semantics to be captured in a dynamic, computationally tractable framework.
1. Introduction

Situation Semantics emerged in the 1980s with an ambitious program of reform for semantics, both in the domain of semantic ontology and with regard to the integration of context in meaning. In their 1983 book Situations and Attitudes (Barwise & Perry 1983), Barwise and Perry argued for the preeminence of a situation-based ontology and took contexts of utterance to be situations, thereby offering the potential for a richer view of context. For situation semantics and utterance-oriented interpretation, see article 36 (Ginzburg) Situation Semantics. This article takes as its initial focus the topic of a situation-based ontology, more generally discussing the approach to NL ontology that emerged from situation semantics. The latter part of the article will explain how recent work synthesizing situation semantics with type theory enables the original intuitions from situation semantics to be captured in a dynamic, computationally tractable framework. As a semantic framework, Barwise & Perry (1983) view Situation Semantics as following on – but also crucially breaking from – the tradition of Montagovian model theoretic semantics. The strategy this latter embodies they view as being Fregean: intensions providing a logically fruitful way of explicating "[Frege's] third realm, a realm neither of ideas nor of worldly events, but of senses." (Barwise & Perry 1983: 4) Given Barwise and Perry's ambitious program, they reject aspects of the Frege-Montague programme as cognitively intractable, and argue that the ontology it postulates is unnecessarily coarse-grained. For instance, the choice of truth values as the denotata of declarative sentences they view as resting on a bad argument ('The slingshot'). The desiderata for a semantic framework Barwise and Perry put forward include the following:
– The priority of information: language has external significance, as model theoretic semantics has always emphasized, but, as cognitive scientists of various stripes emphasize, it also has mental significance, yielding information about agents' internal states; in this respect see also article 11 (Kempson) Formal semantics and representationalism. What is needed is a way of capturing the commonality between the external and the mental, a matter exacerbated when multimodal meaning (gesture, gaze, visual access) enters into the picture.
– Cognitive realizability: in common with all other biological organisms, language users are resource-bounded agents. This requires that only relatively "small" entities feature in semantic accounts, hence the emphasis on situations and their characterization in a computable fashion.
– Structured objects: semantic objects such as propositions need to be handled in a way that puts their identity conditions very much on a par with those of 'ordinary' individuals. Such entities are structured objects:

(1) The primitives of our theory are all real things: individuals, properties, relations, and space-time locations. Out of these and objects available from the set theory we construct a universe of abstract objects. (Barwise & Perry 1983: 178)

That is, structured objects arrive on the scene with certain constraints that 'define them' in terms of other entities of the ontology in a manner that is inspired by proof theoretic approaches. This way of setting up the ontology has the potential of avoiding various foundational problems that beset classical theories of properties and propositions. For propositions, these problems typically center around doxastic puzzles such as logical omniscience and its variants. An important component in fulfilling these desiderata, according to Barwise and Perry, is a theory by means of which external (and internal) reality can be represented – an ontology of some kind. The formalism that emerged came to be known as Situation Theory – its make-up and motivation constitute the focus of sections 2, 3, and 4 of the paper. These proceed in an order that reflects the evolution of Situation Theory: initially as a theory of situations, then as a theory that includes both concrete entities such as situations and abstract ones such as propositions, and finally as a more extended ontology, comprising in addition entities such as questions, outcomes, and possibilities. Section 5 of the paper concerns the emergence of a type theoretic version of the theory, within a formalism initiated by Robin Cooper. Section 6 provides some concluding remarks.
2. Introducing situations into semantics: Empirical motivations Situation Semantics owes its initial prominence to its analysis of the naked infinitive (NI) construction, exemplified in (2). Here is a construction, argued Barwise (1989b), that intrinsically requires positing situations – spatio-temporally located parts of the world. One component of this argument goes as follows: the difference in meaning between (2a) and (2b) illustrates that “logically equivalent” NIs (relative to an evaluation by a world) are not semantically equivalent. And yet, the intuitive validity of the inference from (2b) to (2c) and the inference described in (2d) shows that NIs bring with them
clear logical structure. This is a purely linguistic argument, to add to other more methodological ones, that the appropriate ontology for NL cannot be one based solely on worlds, but must include events and situations.

(2) a. Bo saw Millie enter.
b. Bo saw Millie enter and Jan leave or not leave.
c. Bo saw Jan leave or not leave.
d. Bo saw Jan not leave. So, it's not the case that Bo saw Jan leave. In fact, Bo saw Jan engaged in something inconsistent with leaving.
The account of NI clauses is based on a theory of situations characterized in terms of situation types. Here a few words on nomenclature are due. Barwise and Perry used the term 'situation' as a cover term for what have often been called 'eventualities', including events, situations, states and so forth; for detailed discussion see also article 34 (Maienborn) Event semantics. I will stick with this choice here, for historical reasons, but the wider intended extension should be noted throughout. Similar remarks apply mutatis mutandis to the term 'situation type'. Indeed this is Barwise and Perry's original name for such entities, which subsequently came to be known as states-of-affairs, infons, or SOAs. The return to the original term is intentional given the current type theoretic turn discussed in section 5. Situation types are structured objects that function as 'potential properties' situations can possess: situation types are taken to be structured from two components, a relation R, and an assignment α, which assigns real world entities to the argument roles of R, as in (3a). The notation in (3b) indicates that the situation s is of the type given by the situation type 〈〈R; α〉〉. If a situation fails to be correctly classified by a situation type σ, this is notated as in (3c); ':' was traditionally notated as ⊨.

(3) a. 〈〈CALM; loc = Jerusalem〉〉
b. s : 〈〈R; α〉〉
c. s :/ 〈〈R; α〉〉

Situation types are assumed to come in positive/negative pairs, i.e. every relation/assignment pair gives rise to a positive situation type and a negative situation type. We will assume the positive ones to be (notationally) unmarked and notate the corresponding negative with an 'overline', as in (4a). Because situations are partial, there is a difference between a situation failing to be correctly classified by σ and being correctly classified by σ̄. For any situation s and situation type σ, (4b) holds, but (4c) generally fails. The intuition is that classifying s with σ̄ means that s actually possesses information which rules out σ, rather than simply lacking concrete evidence for σ. So, e.g., a situation I perceive in London, s_london, would typically neither be of the type 〈〈CALM; loc = Jerusalem〉〉, nor of the type 〈〈CALM; loc = Jerusalem〉〉‾. s_london is simply indeterminate about the issue of Jerusalem's calamity or calmness. Cooper (1998) has proposed a pair of axioms that attempt to capture this intuition. (4d) states that if a situation s supports the dual of σ, then s also supports positive information that precludes σ being the case. (4e) tells us that if a situation s supports the dual of σ, then s also supports information that defeasibly entails that σ is the case. I discuss some linguistic evidence relating to (4e) in section 4., in connection with negative polar interrogatives.
(4) a. s : 〈〈R; α〉〉‾
b. Either s : σ or s :/ σ
c. Either s : σ or s : σ̄
d. ∀s, σ [s : σ̄ implies ∃(Pos)ψ [s : ψ and ψ ⇒ σ̄]]
e. ∀s, σ [s : σ̄ implies ∃(Pos)ψ [s : ψ and ψ > σ]]
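These distinctions can be made vivid with a small computational sketch, under the simplifying (and not officially situation-theoretic) assumption that a situation can be modeled by the finite set of situation types it supports:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SitType:
    """A situation type <<R; alpha>>; pos=False encodes the overlined dual."""
    rel: str
    args: tuple
    pos: bool = True

    def dual(self):
        return SitType(self.rel, self.args, not self.pos)

def supports(s, sigma):
    """The judgement s : sigma, with s toy-modeled as the set of situation
    types it supports."""
    return sigma in s

calm_jlm = SitType("CALM", (("loc", "jerusalem"),))
s_london = frozenset({SitType("RAIN", (("loc", "london"),))})

# (4b) holds: s either supports calm_jlm or it does not; but (4c) fails,
# since s_london supports neither calm_jlm nor its dual -- it is simply
# indeterminate about Jerusalem.
print(supports(s_london, calm_jlm), supports(s_london, calm_jlm.dual()))
# False False
```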
The treatment of NIs and its wider semantic implications opened various debates, debates in which one of the main issues was: does an account of NIs require a radical overhaul of the underlying semantic ontology? Muskens (1989) showed that a Montagovian framework could offer an account if it embraced 4-valued logic. Higginbotham (see Higginbotham 1983, 1994) argued that Davidsonian event theory was sufficient to explicate NIs. Neale (1988) and Cooper (1998) subsequently provided counter arguments to Higginbotham. Cooper claimed, inter alia, that the existence of negative situation types in Situation Theory allows it to explicate cases like (5a) in terms of the perceived scene satisfying (5b), which seem beyond Higginbotham's Davidsonian account, which is limited to something like (5c):

(5) a. Ralph saw Mary not serve Bill.
b. s : 〈〈Serve; server : m, servee : b〉〉‾
c. s :/ 〈〈Serve; server : m, servee : b〉〉

However one thinks these debates played out – the reckoning must be done relative to the range of phenomena and tractability each framework can ultimately accommodate – one apparently uncontroversial outcome is the recognition that situations are needed in the ontology. Nonetheless, the question that arises is this: how significant are situations for semantics? A syntactic analogy might be the following: there is incontrovertible evidence that NL is not context free, as demonstrated e.g. by Swiss German crossing dependencies. Are situations exotica like Swiss German crossing dependencies, or are they an absolutely pervasive feature like unbounded dependencies, the inability to deal with which renders any grammar quite unviable? Barwise and Perry's claim was that the latter is the case. Their claim is that situations are at the heart of semantic use. As discussed in detail in article 36 (Ginzburg) Situation Semantics, one of the early claims of situation semantics, following Austin, was that the meaning of declarative sentences is to be explicated as relating utterance situations to described situations. This intuition can be made concrete: anaphora shows that (described) situations enter into context as a consequence of the assertion of an episodic sentence, even if the assertion is not accepted, as in (6b):

(6) a. A: Jo and Mo got married yesterday. It was a wonderful occasion.
b. A: Jo's arriving next week. B: No, that's happening in about a month.

Barwise and Perry also argued, and their arguments were sharpened by Robin Cooper (see Cooper 1993, 1996), that a given utterance can also concern an event/situation that is distinct from the described situation. Ever since Russell (1905), at least one influential school has sought to explain the meaning of singular definites using some notion of uniqueness; for detailed discussion see article 41 (Heim) Definiteness and indefiniteness. More generally, quantification presupposes a domain (cf. terms such as the domain of discourse, the universe etc). With some notable exceptions (e.g. McCawley 1979,
Lewis 1979), until Barwise and Perry's proposal, the requisite relativization was not considered a matter to be handled in semantic theory. Barwise and Perry's essential idea is that in language use more than one situation comes into the picture: they make a distinction between the described situation, the situation which roughly speaking a declarative utterance picks out, and a resource situation, so called because it is used as a resource to fix the range/reference of an NP. Cooper's argument is based on data such as (7), modelled on an example from Lewis (1979), where two domains are in play, a local one and a New Zealand one. The former is exploited in the first two sentences, after which the New Zealand domain takes over. At the point marked ǁ we are to imagine a sudden shift back to the local domain. By assuming that domains are situations we capture the fact that once a shift is made, it encompasses the entire situation, ensuring that the dog referred to is local:

(7) The dog is under the piano and the cat is in the carton. The cat will never meet our other cat because our other cat lives in New Zealand. Our New Zealand cat lives with the Cresswells and their dog. And there he'll stay because the dog would be sad if the cat went away. ǁ The cat's going to pounce on you. And the dog's coming too.

For computational work using resource situations, integrated also with visual information, see Poesio (1993). For experimental work on the resolution of definites in conversation taking a closely related perspective see Brown-Schmidt & Tanenhaus (2008).
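The resource-situation idea can be sketched in the same toy style (the encodings are assumptions, not Poesio's or Cooper's formalization): resolving a definite consults whichever resource situation is currently active, so a shift of that situation shifts the entire domain at once, as in (7).

```python
# Toy model: a resource situation is identified with the set of individuals
# it contains; 'the N' presupposes a unique N in the active resource situation.

def the(noun, resource_situation, extension):
    candidates = [x for x in resource_situation if x in extension[noun]]
    if len(candidates) != 1:
        raise ValueError("uniqueness presupposition failure")
    return candidates[0]

ext = {"dog": {"dog_local", "dog_nz"}, "cat": {"cat_local", "cat_nz"}}
local_sit = {"dog_local", "cat_local", "piano"}
nz_sit = {"dog_nz", "cat_nz"}

print(the("dog", local_sit, ext))  # dog_local: 'The dog is under the piano'
print(the("dog", nz_sit, ext))     # dog_nz: '... the dog would be sad'
```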
3. The Austinian picture

The ontology we have discussed so far comprises situations and situation types (as well as, of course, the elements that make up these entities – individuals and role-to-individual assignments). Situations are the main ingredient in a treatment of bare perceptual reports and play a significant role in underpinning NP meaning and assertion. This is essentially the ontology of Situations and Attitudes, in which there were no propositions. These were rejected as 'artifact(s) of the semantic endeavor' (Barwise & Perry 1983). As Barwise & Perry (1985) subsequently admitted, this was not a move they were required to make. Indeed proposition-like entities, more intensional than situations, are a necessary ingredient for accounts of attitude reports and illocutionary acts. Sets of situations, although somewhat more fine-grained than sets of worlds, are vulnerable to sophisticated variants of logical omniscience (see e.g. Soames' puzzle in Soames 1985). Nonetheless, Angelika Kratzer has initiated an approach, somewhat confusingly also known as Situation Semantics, that does attempt to exploit sets of situations for precisely this role and develops accounts for a wide range of linguistic phenomena, including modality, donkey anaphora, exhaustivity, and factivity. See Kratzer (1989) for an early version of this approach, and Kratzer (2008) for a detailed, recent survey. The next cheapest solution available within the Situations and Attitudes ontology would be to draft the situation types to serve as the propositional entities. Indeed, situation types are competitive in such a role: they can distinguish identity statements that involve distinct constituents (e.g., (8a) corresponds to the situation type in (8c), whereas (8b) corresponds to the situation type in (8d)), while allowing substitutivity of co-referentials and cross-linguistic equivalents, as exemplified respectively by (8e) and (8f), the Hebrew analogue of (8b):
(8) a. Enesco is identical with himself.
b. Poulenc is identical with himself.
c. 〈〈Identical; enesco, enesco〉〉
d. 〈〈Identical; poulenc, poulenc〉〉
e. He is identical with himself.
f. Poulank zehe leacmo.
Nonetheless, post-1985 situation theory did not go for the cheapest solution; as we will see in section 4., not succumbing to ontological stinginess pays off when scaling up the theory to deal with other abstract entities. Building on a conception articulated 30 years earlier by Austin (1970), Barwise & Etchemendy (1987) developed a theory of propositions in which a proposition is a structured object prop(s, σ), individuated in terms of a situation s and a situation type σ. Here the intuition is that s is the described situation (or the belief situation, in so far as it is used to describe an agent's belief, or the utterance token, in the case of the locutionary propositions discussed below), with the relationship between s and σ being the one introduced above in our discussion of NIs, leading to a straightforward notion of truth and falsity:

(9) a. prop(s, σ) is true iff s : σ (s is of type σ).
b. prop(s, σ) is false iff s :/ σ (s is not of type σ).

In saying that a proposition prop(s, σ) is individuated in terms of s and σ, the intention is to say that prop(s, σ) = prop(t, τ) if and only if s = t and σ = τ. Individuating propositions in terms of their "subject matter" (i.e. the situation type component) is familiar, but what is innovative and/or puzzling is the claim that two propositions can be distinct despite having the same subject matter. I mention three examples from the literature of cases which motivate differentiating propositions on the basis of their situational component. The first is one we saw above in the case of definiteness resolution, where the possibility of using 'the dog' is underwritten by distinct presuppositions; the difference in the presuppositions resides in the different resource situations exploited:

(10) a. prop(s_local, 〈〈UNIQUE, Dog〉〉)
b. prop(s_newzealand, 〈〈UNIQUE, Dog〉〉)

A second case is the locutionary propositions introduced by Ginzburg (2011). Ginzburg argues that characterizing both the update potential and the range of utterances that can be used to seek clarification about a given utterance u0 requires reference to the utterance token u0, as well as to its grammatical type Tu0 (see article 36 (Ginzburg) Situation Semantics for details). By defining propositions (locutionary propositions) individuated in terms of u0 and Tu0 one can simultaneously define update and clarification potential for utterances. In this case, there are potentially many instances of distinct locutionary propositions, which need to be differentiated on the basis of the utterance token – minimally any two utterances classified as being of the same type by the grammar. The original motivation for Austinian propositions was in the treatment of the Liar paradox by Barwise & Etchemendy (1987). This paradox concerns sentences like (11a,b) which, pretheoretically, are false if true and true if false. Although one approach to this
issue involves banning self reference, this is an arbitrary prohibition that runs counter to the felicity of various self referential utterances such as (11c). Moreover, as Kripke (1975) showed, Liar paradox cases can arise in certain contexts from sentences that are normally perfectly felicitous.

(11) a. This claim is false.
b. What I am saying now is false.
c. This is the last announcement about flight 345.

Very briefly, Barwise and Etchemendy's diagnosis is that the apparent paradox is similar to ones involving other implicit parameters (time zones, spatial orientation, ...), where "paradoxes" loom if perspectives are ignored:

(12) a. A (in Tashkent): It's 9pm. B (in Baghdad): No, it's 7pm. Does not license: It's 7pm and it's not 7pm.
b. (A and B facing each other) A: The cupboard is to our right. B: No, it's to our left. Does not license: The cupboard is to our right and to our left.

Similarly, for the Liar, according to Barwise and Etchemendy: the phenomenon dissolves as a paradox once one adopts the Austinian conception of propositions, which recognizes the situational relativity of propositions. In their formalization, liar utterances like (11a) express propositions which satisfy the equation in (13):

(13) fs = prop(s, 〈〈True, fs〉〉‾)

The existence of such circular propositions is ensured in Barwise and Etchemendy's account given their use of the non-well founded set theory developed by Aczel (1988), though the Austinian conception does not depend in any way on using such set theory. In Barwise and Etchemendy's model theory situations are modelled as sets of situation types. A situation s is of type σ iff σ ∈ s and, moreover, for any actual situation s and proposition p: (a) 〈〈True, p〉〉 ∈ s only if p is true, (b) 〈〈True, p〉〉‾ ∈ s only if p is false. Given this, a proposition such as (13) ends up being false – if fs is true, then 〈〈True, fs〉〉‾ ∈ s. This entails that fs is false. Once we accept the falsity of this proposition, there exist situations in which the situation type 〈〈True, fs〉〉‾ is factual. The minimal such situation is s1 = s ∪ {〈〈True, fs〉〉‾} and, hence, prop(s1, 〈〈True, fs〉〉‾) is true. This account thereby captures an intuition that liar claims are double edged. This solution crucially depends on a view of propositions as concerning situations and not worlds. As Barwise and Etchemendy explain in detail, in an alternative solution (which they label Russellian), where propositions are not relativized by a situational parameter, there is no way to accommodate the existence of propositions that are not true but whose falsehood is internal to the world. Let us take stock: the Austinian conception builds up from an ontology with situations and situation types and adds to these propositions prop(s, σ) whose truth condition involves that s is of type σ. Some empirical pluses: it enables accounts of NP situational relativity, update/clarificational potential of utterances, and the Liar (though this latter also requires non-well-founded set theory). It also enables an account of situational anaphora (see e.g. the examples in (6)). As with any theory that employs non-concrete
entities, a variety of issues arise – for critical discussion in the context of the Liar, see Moss (1989) and McGee (1991). The most obvious ones center on the vagueness of situations. For instance, how can Austinian propositions be shared? How can we be clear about the identity of propositions? Aren't we populating the world with a flood of propositions? Taking these in reverse order – technically, it is indeed true that the world is potentially populated with lots of propositions. However, like other contextual parameters, the situations which figure as possible described/belief/utterance situations are in most possible applications ones that are in some sense accessible to the relevant agent. As for sharing Austinian propositions, this is a trickier issue. The undoubted vagueness of situations means that there is a technical issue here, if one insists that successful communication presupposes agents resolving all aspects of content identically. However, this criterion is equally problematic for property terms, a difficulty that does not stop semanticists from postulating such entities as denotations of various expression types. The reason for this is that typically agents will agree on the central, defining characteristics of properties. By the same token, it is also the case that given two very similar situations s, s', by and large propositions of the form prop(s, σ), prop(s', σ) will have identical truth values. These highly sketchy comments are only intended as directions by means of which these issues can be addressed, theoretically – but of course a proper debate requires a detailed theory of situations. For such a theory see inter alia Barwise (1989a) and other papers in Barwise (1989c), and various papers in Cooper, Mukai & Perry (1990), Barwise et al. (1991) and Aczel et al. (1993). Worlds have a role to play in such a theory, typically viewed as maximal situations that resolve all issues. Whether one needs to admit possible situations is a more controversial issue. A treatment of modality, for instance, does not require this, as pointed out by e.g. Schulz (1993) and by Cooper & Poesio (1994) – the non-actuality can be encoded entirely in the situation types. Still, it is certainly possible to develop a version of situation theory that has possible situations, to the extent there are good linguistic or philosophical reasons for this, as argued by Vogel & Ginzburg (1999). One might also wish to link discussion to more empirical investigations. Indeed, for whatever it is worth, arguably, this type of representation for utterances jibes well with psychological work on memory (see e.g. Fletcher 1994 for a review), which argues that the two robust memory traces from an utterance are (a) the situational model and (b) the propositional text base. The former is a representation which integrates various modalities (e.g. visual and linguistic stimuli), whereas the latter differs from the surface form of an utterance for instance in that referents have been resolved. It is also worth pointing out that it would be quite consistent to develop an ontology which involved a mixed picture of propositions: as recognized already by Barwise & Perry (1985), one might wish to avoid positing a described situation with general sentences, such as Two and two are four or Fish swim. See Glasbey (1998) and Kim (1998) for proposals that some propositions are Austinian, whereas others (e.g.
mathematical and individual-level statements) are Russellian, i.e., do not make reference to a particular situation. But wait, I have talked about situations and propositions and their use in reference, assertion or even metacommunicative interaction – what of the attitudes? While writing Situations and Attitudes, Barwise and Perry's original hope was that replacing worlds with situations would yield an account of one of Montagovian semantics' bugbears, namely attitude reports, on which see also article 60 (Swanson) Propositional attitudes. However, this hope did not survive even past the penultimate chapter of the book. A solid result of philosophical work of the 1990s (e.g. Richard 1990 and Crimmins 1993) is that no
viable theory of propositions can on its own deliver a viable theory of the attitudes. This is because attitudes have structure not perfectly mirrored by their external content, a realization of which prompted Barwise and Perry to abandon their initial, essentially proposition-based account. The most striking illustration of this is in puzzles like Kripke's Pierre (Kripke 1979), who is unaware that the wonderful city of Londres about which he learnt as a child is the same place as the squalid London, where he currently resides. While his beliefs are perfectly rational, we can say of him that he believes that London is pretty and also does not believe that London is pretty. One possible conclusion from this (see e.g. Crimmins 1993), a way out of paradox, is that attitude reports involve implicit reference to attitudinal states: relative to information acquired in France, Pierre believes London is pretty; relative to information acquired on the ground in London, he believes the opposite. Here is yet another important role for situations in linguistic description. One way to integrate this in an account of complementation was offered in Cooper & Ginzburg (1996) and Ginzburg (1995) for declarative and interrogative attitude reports, respectively. This constitutes a compositional reformulation of the philosophical accounts cited above. The main idea is to assume that attitude predicates involve at least three arguments: an agent, an attitudinal state and a semantic entity. For instance with respect to belief, this relates an agent's belief in a proposition to facts about the agent's mental situation. This amounts to linking a positive belief attribution of proposition p relative to the mental situation ms with the existence of an internal belief state whose content is p. An example of such a mental situation is given in section 5.
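Before turning to the wider ontology, the Austinian propositions of (9) and the three-place analysis of attitude reports just described can be combined in the same toy style (all encodings assumed, not the cited accounts' actual formalizations):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Prop:
    """An Austinian proposition prop(s, sigma), individuated by both components."""
    s: frozenset      # the described / belief / utterance situation (toy-modeled)
    sigma: tuple      # a situation type

    def true(self):   # (9a): prop(s, sigma) is true iff s : sigma
        return self.sigma in self.s

# Two propositions with the same subject matter but distinct situations,
# as in (10), are distinct objects and may differ in truth value:
unique_dog = ("UNIQUE", "dog")
p_local = Prop(frozenset({unique_dog}), unique_dog)
p_nz = Prop(frozenset(), unique_dog)
print(p_local == p_nz, p_local.true(), p_nz.true())   # False True False

def believes(agent, mental_situation, p):
    """Three-place attitude relation: belief is relativized to an attitudinal
    state, so one agent can believe p in one state but not in another."""
    return p in mental_situation

# Pierre's predicament: two internally rational states, externally at odds.
pretty_london = ("pretty", "london")
p_pretty = Prop(frozenset({pretty_london}), pretty_london)
state_france = frozenset({p_pretty})   # beliefs acquired in France ('Londres')
state_london = frozenset()             # beliefs acquired in London
print(believes("pierre", state_france, p_pretty))   # True
print(believes("pierre", state_london, p_pretty))   # False
```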
4. A wider ontological net

The ontology of Situation Theory (ST) was originally designed on the basis of a rather restricted data set. One of the challenges of more recent work has been to extend this ontology in order to account for two related key domains for semantics: root clauses in conversational use and verb complementation. A large body of semantic work that has emerged since the late 1970s demonstrates that interrogative clauses possess denotations (questions) distinct in semantic type from declarative ones; that imperative and subjunctive clauses possess denotations (dubbed outcomes by Ginzburg & Sag 2000) distinct in semantic type from declarative and interrogative ones; and that facts are distinct from true propositions; for detailed empirical evidence for these distinctions, see Vendler (1972), Asher (1993), Peterson (1997) and Ginzburg & Sag (2000); see also article 66 (Krifka) Questions and article 67 (Han) Imperatives. The main challenge in developing an ontology which distinguishes the diverse menagerie of abstract entities including propositions, questions, outcomes and facts is characterizing the structure of these entities, indeed figuring out how the distinct entities relate to each other. As pointed out by Ginzburg & Sag (2000), quantified NPs and certain adverbs are possible in declarative, interrogative and imperative semantic environments. Hence, the ontology must provide a semantic unit which constitutes the input/output of such adverbial modifiers and of NP quantification. To make this concrete – the assumption that the denotation of imperatives is of a type distinct from t (however cashed out) is difficult to square with (a simplistic implementation of) the received wisdom that NPs such as 'everyone' are of type <<e,t>, t>. If the latter were the case, composing 'everyone' with 'vacate the building' in (14c) would yield a denotation of type t:
(14) a. Everyone vacated the building.
b. Did everyone vacate the building?
c. Everyone vacate the building!
d. Kim always wins.
e. Does Kim always win?
f. Always wear white!
As we will see subsequently, good candidates for this role are situation types. These, as we observed in section 3., are not conflated with propositions in the situation theoretic universe. Ginzburg & Sag (2000) set out to construct an ontology that appropriately distinguishes these entities and yet retains the features of the ST ontology discussed earlier. The ontology, dubbed a Situational Universe with Abstract Entities (SU+AE), was developed in line with Barwise and Perry's strategy in (1). This was implemented on two levels, one within a universe of typed feature structures (Carpenter 1992). This universe underpinned grammatical analysis, using Head Driven Phrase Structure Grammar (HPSG). A denotational semantics was also developed in the Anti-Foundation Axiom (AFA)-based set theoretic framework of Seligman & Moss (1997). In what follows, I restrict attention to the latter. A semantic universe is identified with a relational structure S of the form [A, S1, ..., Sm; R1, ..., Rm]. Here A – sometimes notated also as |S| – is the universe of the structure. From the class of relations we single out the S1, ..., Sm, which are called the structural relations, as they are to capture the structure of certain elements in the domain. Each Si can be thought of as providing a condition that defines a single structured object in terms of a list of n objects x1, ..., xn. Situations and situation types serve as the 'basic building blocks' from which the requisite abstract entities of the ontology are constructed:
– Propositions are structurally determined by a situation and a situation type. (See discussion in section 3.)
– Intuitively, each outcome is a specification of a situation which is futurate relative to some other given situation. Given this, outcomes are structurally determined by a situation and a situation type abstract whose temporal argument is abstracted away, thereby allowing specification of fulfilledness conditions.
– Possibilities, a subclass of which constitutes the universe's facts, are structurally determined by a proposition. This reflects the tight link between propositions and possibilities. As Ginzburg & Sag (2000) explain, there is no obvious priority between possibilities and propositions: one could develop an ontology where propositions are built out of possibilities.
An additional assumption made is that the semantic universe is closed under simultaneous abstraction. Simultaneous abstraction, originally defined by Aczel & Lunnon (1991), is a semantic operation akin to λ-abstraction with one significant extension: abstraction is over sets of elements, including the empty set. Moreover, abstraction (including over the empty set) is potent – the body out of which abstraction occurs is distinct from the abstract. The assumption about closure under simultaneous abstraction is akin to the common type theoretic assumption about closure under functional type formation.
Putting this together, and simplifying somewhat, an SU+AE is an extensional relational structure of the following kind:

(15) [A, Possibility, Proposition, Outcome, Fact, True, Fulfill, →prop]

Let me gloss the key notions involved here: A is a λ-situation structure (λ-SITSTR). That is, a situation structure closed under simultaneous abstraction. A situation structure (SITSTR) is a universe which supports a basic set theoretic structure. It contains among its entities a class of spatio-temporally located situations and a class of situation types. Proposition, Possibility, and Outcome are sorts whose elements represent, respectively, the propositions, possibilities, and outcomes of the universe. Those possibilities that are factual, as determined by the predicate Fact, will constitute the facts of the universe. Analogously, there will be properties True and Fulfill, which capture the notions of truth and fulfilledness for propositions and outcomes; →prop is a notion of entailment defined for propositions. What about questions? Their existence follows without further stipulation, once one adopts Ginzburg and Sag's assumption that they are propositional abstracts: the universe contains propositions, it is closed under simultaneous abstraction, hence it contains questions. Assuming the identification of questions with propositional abstracts is descriptively adequate, this is an instance of an explanatorily satisfying piece of ontological engineering. On the other hand, one would hope that the existing explication of facts within SU+AEs could be improved on, for instance by uncovering additional internal structure such entities possess. To conclude this section, I point to two examples (from Ginzburg & Sag 2000) of linguistic phenomena whose explication relies strongly on properties of SU+AEs. The first concerns the distribution of in situ wh-phrases. In declarative clause-types, which in the absence of a wh-phrase denote propositions, the occurrence of such phrases leads to an ambiguity between two readings, as exemplified in (16a–c): a 'canonical' use which expresses a direct query and a use as a reprise query to request clarification of a preceding utterance. In all other clause types, ones which denote outcomes, (16d), questions, (16e), or facts, (16f) – Ginzburg & Sag (2000) argue that exclamative clauses denote facts – the ambiguity does not arise, only a reprise reading is available; a priori one might expect (16d), for instance, to have a reading as a direct question paraphrasable as who should I give the book to? if one could simply abstract over the wh-parameter within an 'open outcome':

(16) a. The bagels, you gave to who? (can be used to make a non-reprise query)
b. You gave the bagels to who? (can be used to make a non-reprise query)
c. Who talked to who? (can be used to make a non-reprise query)
d. Give who the book? (can be used only to make a reprise query)
e. Do I like who? (can be used only to make a reprise query)
f. What a winner who is? (can be used only to make a reprise query)
(Ginzburg & Sag 2000: 282, example (72))
Given the assumption that questions are exclusively propositional abstracts, it follows without further stipulation which clause type non-reprise in situ interrogatives can be constructed from, namely one with a propositional denotation. Reprise clauses,
in contrast, can be built from antecedents of any clause type – the antecedent provides an illocutionary proposition whose main relation is the illocutionary force associated with the given clause type. The second phenomenon concerns the interaction of negation and interrogation: the fact that propositions are constructed from situations and situation types has the consequence that, in contrast to approaches where questions are characterized in terms of exhaustive answerhood conditions (see Groenendijk & Stokhof 1997), positive and negative polar interrogatives are assigned distinct denotations. For instance, (17a) and (17b), due to Hoepelmann (1983), would be assigned the 0-ary abstracts in (17c) and (17d) respectively:

(17) a. Is 2 an even number?
b. Isn't 2 an even number?
c. ↦ λ{ }prop(s, 〈〈EvenNumber, 2〉〉)
d. ↦ λ{ }prop(s, 〈〈EvenNumber, 2〉〉‾)
This means that the ontology can explicate the distinct presuppositional backgrounds associated with positive and negative polar interrogatives. For instance, Hoepelmann, in arguing for this distinction, suggests that a question like (17a) is likely to be asked by a person recently introduced to the odd/even distinction, whereas (17b) is appropriate in a context where, say, the opaque remarks of a mathematician cast doubt on the previously well-established belief that two is even. The latter can be tied to the factuality conditions of negative situation types. As I mentioned in section 2., one axiom associated with negative situation types is the following: if a situation s supports the dual of σ, then s also supports information which defeasibly entails that σ is the case. Hence, wondering about λ{ }prop(s, σ̄) involves wondering about whether s has the characteristics that typically involve σ being the case, but which – nonetheless, in this case – fail to bring about σ. These contextual differences give rise in some languages, including French and Georgian, to distinct words to affirm a positive polar question (oui, xo) and a negative polar question (si, diax). Nonetheless, given the definitions of answerhood available in this system, positive and negative interrogatives specify identical answerhood relations. Hence, the identity of truth conditions of sentences like (18) can be captured:

(18) a. Kim knows whether Bo left.
b. Kim knows whether Bo did not leave.
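This last point can be rendered in the earlier toy notation (again a sketch under assumed encodings): the 0-ary abstracts in (17c) and (17d) are distinct objects, yet they determine one and the same answerhood relation, as (18) requires.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SitType:                     # as in the earlier sketch
    rel: str
    args: tuple
    pos: bool = True
    def dual(self):
        return SitType(self.rel, self.args, not self.pos)

@dataclass(frozen=True)
class PolarQuestion:
    """A 0-ary propositional abstract, lambda{ } prop(s, sigma) (cf. (17c,d))."""
    s: frozenset
    sigma: SitType

even2 = SitType("EvenNumber", (2,))
q_pos = PolarQuestion(frozenset(), even2)         # denotation of (17a)
q_neg = PolarQuestion(frozenset(), even2.dual())  # denotation of (17b)
print(q_pos == q_neg)   # False: distinct denotations, distinct presuppositions

def resolves(question, fact):
    """Answerhood ignores the question's polarity: any fact settling the
    <rel, args> issue resolves both questions alike, capturing (18)."""
    return (fact.rel, fact.args) == (question.sigma.rel, question.sigma.args)

for q in (q_pos, q_neg):
    print(resolves(q, even2), resolves(q, even2.dual()))   # True True (twice)
```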
5. A type theoretic ontology for interaction

In previous sections we have observed the gradual evolution of the situation theoretic ontology: from a theory of situations, through a theory of situations and Austinian propositions, to an SU+AE, which includes a variety of abstract entities and is closed under abstraction. This ontology has, as we saw, a wide range of linguistic applications, including perception and attitude complements, definite reference, the Liar, and a rudimentary theory of interaction (for the latter, see Ginzburg 1996). However, as a new millennium dawned the theory was hamstrung by a number of foundational problems. The logical underpinnings for the theory in terms of non-wellfounded set theory, originating in Barwise & Etchemendy (1987), extensively discussed
in Barwise (1989c), and comprehensively developed in Seligman & Moss (1997), were rather complex. Concretely, simultaneous λ-abstraction with restrictions is a tool with a variety of uses, including quantification, questions, and the specification of attitudinal states and meanings (for the latter see article 36 (Ginzburg) Situation Semantics). Its complex set theoretic characterization made it difficult to use. Concomitantly, the theory in this form required an auxiliary coding into a distinct formalism (e.g. typed feature structures) for grammatical and computational applications. Neither of these versions of the theory provides an adequate notion of role-dependency of the kind that has become standard in recent treatments of anaphora and quantification, in which much semantic work has been invested in frameworks such as Discourse Representation Theory and Dynamic Semantics; see Gawron & Peters (1990) for a detailed theory of anaphora and quantification in situation semantics, though one that is not dynamic. Motivated to a large extent by such concerns, the situation theoretic outlook has been redeveloped using tools from Type Theory with Records (TTR), a framework initiated by Robin Cooper. Ever since Sundholm and Ranta's pioneering work (Sundholm 1986; Ranta 1994), there has been interest in using constructive type theory (often referred to as Martin-Löf Type Theory) as a framework for semantics (see e.g. Fernando 2001 and Krahmer & Piwek 1999). TTR is a model theoretic outgrowth of constructive type theory. Its provision of entities at both the level of tokens and the level of types allows one to combine aspects of the typed feature structures world and the set theoretic world, enabling its use as a computational grammatical formalism. As we will see, TTR provides the semanticist with a formalism that satisfies the desiderata I mentioned in section 1. Cooper (2006a) has shown that the lion's share of situation theoretic results can be recast in TTR – the main exception being those results that depend explicitly on the existence of a non-wellfounded universe, for instance Barwise and Etchemendy's account of the Liar; the type theoretic universe is well founded. But one could, according to Cooper (p.c.), recreate non-well-foundedness at the level where witnessing of types occurs. In addition, TTR allows for a DRT-oriented or Montagovian treatment of anaphora and quantification. For a computational implementation of TTR, see Cooper (2008); for a closely related framework, the Grammatical Framework, see Ranta (2004). The move to TTR is, however, not primarily a means of capturing and perhaps mildly refining past results, but crucially underpins a theory of conversational interaction on both illocutionary and metacommunicative levels. A side effect of this is, via a theory of generation, an account of attitude reports. In the remaining space, I will briefly exposit the basics of TTR, show its ability to underpin SU+AEs and briefly sketch how this can be used to define basic information states in dialogue. One linguistic application will be provided, one that ties up situations, information states, and meaning: a specification of the meaning of a discourse-bound pronoun.
5.1. Generalizing the situation/situation type relation

The most fundamental notion of TTR is the typing judgement a : T, classifying an object a as being of type T. This can be seen as a generalization of the situation semantics judgement s : σ – a generalization in that not only situations can figure as subjects of typing judgements. Note that the theory provides the objects and the types, but this form of judgement, like the other forms, is metatheoretical. Examples are given in
(19). (19a–c) are typing judgements that presuppose the existence of types SIT(uation), IND(ividual), REL(ation), whose identity can be amplified. (19d) is the direct analogue of the situation semantics statement s : 〈〈RUN; b, t〉〉; here run(b, t) is a proof type, about which more below; ‘proof’ can be equally glossed as ‘observation’ or even ‘situation’, as explained by Ranta (1994); the source of the ‘proof-based’ terminology is constructive type theory’s original use as a foundation for mathematics.

(19) a. s : SIT
     b. b : IND
     c. run : REL
     d. s : run(b, t)
A useful innovation TTR introduces relative to earlier versions of type theory is the notion of records and record types. A record is an ordered tuple of the form (20), where crucially each successive field can depend on the values of the preceding fields:

(20) [ li = ki,
      li+1 = ki+1(li),
      ...,
      li+j = ki+j(li, ..., li+j−1) ]

Together with records come record types. Technically, a record type is simply an ordered tuple of the form (21), where again each successive type can depend on its predecessor types within the record:

(21) [ li : Ti,
      li+1 : Ti+1(li),
      ...,
      li+j : Ti+j(li, ..., li+j−1) ]

Record types allow us to place constraints on records: the basic typing mechanism assumed is that a record r is of type RT if all the typing constraints imposed by RT are satisfied by r. More precisely:

(22) The record
     [ l1 = a1, l2 = a2, ..., ln = an ]
     is of type
     [ l1 : T1, l2 : T2(l1), ..., ln : Tn(l1, l2, ..., ln−1) ]
     iff a1 : T1, a2 : T2(a1), ..., an : Tn(a1, a2, ..., an−1)
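The typing condition in (22) is mechanical enough to sketch in code. The following is a minimal illustration in Python, not Cooper's TTR implementation; all helper names are invented for exposition. Records are modelled as dicts, basic types as predicates, and a dependent record type maps each label to a function from the preceding fields to a predicate.

```python
# A minimal sketch of the record typing judgement in (22).
# Records are dicts; a (possibly dependent) record type is a list of
# (label, dependent-type) pairs, where the dependent type takes the
# context of earlier fields and returns a predicate over values.
# This is an illustration only, not Cooper's TTR implementation.

IND = lambda ctx: lambda v: isinstance(v, str)    # toy basic type
TIME = lambda ctx: lambda v: isinstance(v, int)   # toy basic type

def of_type(record, record_type):
    """Check r : RT as in (22): each field a_i must satisfy T_i,
    which may depend on the earlier fields of r."""
    ctx = {}
    for label, dep_type in record_type:
        if label not in record:
            return False
        if not dep_type(ctx)(record[label]):
            return False
        ctx[label] = record[label]   # later types may depend on this
    return True

# A dependent proof type: c1 must witness that x is named 'jo'.
named_jo = lambda ctx: lambda v: v == ('named', ctx['x'], 'jo')

RT = [('x', IND), ('t', TIME), ('c1', named_jo)]
r = {'x': 'jo', 't': 3, 'c1': ('named', 'jo', 'jo')}
assert of_type(r, RT)
```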
5.2. Recreating SU+AEs in TTR

Ginzburg (2005b) shows how to recreate SU+AEs within the type theoretic universe constructed in Cooper (2006a). As with SU+AEs, one can recognize here the sitsemian
strategy Barwise and Perry allude to in (1). The universe is connected to the real world via a model which assigns witnesses to the basic types and sets of witnesses to the proof types, depending on their arity. From these beginnings, structured objects arise via type construction, which allows for a recursive building up of the type theoretic universe. Ranta (1994) and Cooper (2006a) list a dozen such constructors. Here, apart from the aforementioned record typing construction, I will list only the small number that are necessary for the tasks to be performed here:

(23) a. Function types: if T1 and T2 are types, then so is (T1 → T2), the type of functions from elements of type T1 to elements of type T2.
b. The type of lists: if T is a type, then [T], the type of lists each of whose members is of type T, is also a type. [a1, ..., an] : [T] iff for all i, ai : T.
c. The unique type: if T is a type and x : T, then Tx is a type. a : Tx iff a = x.
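The constructors in (23) can be given the same toy treatment as (22), continuing the conventions of the previous sketch. Function types (23a) are represented here simply by Python functions, for which membership is not decidable, so only (23b,c) get checkers; again, this is an invented illustration, not part of any actual TTR library.

```python
# Toy versions of the constructors in (23b,c); types are predicates,
# as in the previous sketch. Invented names, for illustration only.

def list_type(T):
    """[T] as in (23b): every member of the list must be of type T."""
    return lambda xs: isinstance(xs, list) and all(T(x) for x in xs)

def singleton_type(T, x):
    """T_x, the 'unique type' of (23c): a witness must both be of
    type T and be identical to x."""
    return lambda a: T(a) and a == x

IND = lambda v: isinstance(v, str)

assert list_type(IND)(['jo', 'bo'])
assert singleton_type(IND, 'jo')('jo')
assert not singleton_type(IND, 'jo')('bo')
```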
5.2.1. Abstraction

Function types allow one to model abstraction. As Cooper points out, although abstraction in TTR works in a deceptively familiar ‘type theoretic’ way, the existence of record typing yields a rich notion of abstraction. It is simultaneous and restricted, i.e. it allows for multiple entities to be abstracted over simultaneously while encoding restrictions, and it allows for vacuous abstraction. As an illustration of abstraction in TTR, consider a mental state that Pierre can be assumed to possess (see section 3. and Cooper 2006a, where this example is discussed in detail). (24a), a function mapping records into record types, represents the internal type, whereas (24b) represents a possible external setting for this type. The internal type is a perfectly consistent type; external incoherence is captured by the fact that applying the internal type to the setting yields a contradiction.

(24) a. (r : [ x : Ind,
              c1 : Named(x, ‘Londres’),
              y : Ind,
              c2 : Named(y, ‘London’) ])
        [ c3 : pretty(r.x),
          c4 : ¬pretty(r.x) ]

     b. [ x = london,
          c1 = sNamed(london, ‘Londres’),
          y = london,
          c2 = sNamed(london, ‘London’) ]
See also article 36 (Ginzburg) Situation Semantics for the use of this sort of abstraction in the specification of the meaning/content relationship.
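The Pierre example can also be mimicked procedurally: the internal type is a function from a setting record to a record type, and external incoherence shows up only when the function is applied to a setting in which both labels denote London. A toy rendering, with invented names, in the spirit of the earlier sketches:

```python
# Toy rendering of (24): the internal type is a function from a
# setting record r to a record type. All names are invented.

def internal_type(r):
    """Applying the internal type yields the constraints
    c3 : pretty(r.x) and c4 : not-pretty(r.x)."""
    return [('c3', ('pretty', r['x'])),
            ('c4', ('not-pretty', r['x']))]

setting = {'x': 'london', 'c1': 'sNamed(london, Londres)',
           'y': 'london', 'c2': 'sNamed(london, London)'}

constraints = internal_type(setting)
# With x = y = london the resulting type requires both
# pretty(london) and not-pretty(london): no record can witness it,
# capturing Pierre's external incoherence while the internal type
# itself remains consistent.
print(constraints)
```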
5.2.2. Situations

Cooper (2006a) proposes that situations (in the sense of Situation Theory) be modelled as records. Situation types are then directly accommodated as record types. The type of a situation with a woman riding a bicycle would then be the one in (25a). A record of this type (a witness for this type) would be as in (25b), where the required corresponding typing judgements are given in (25c):
(25) a. [ x : IND,
          c1 : woman(x),
          y : IND,
          c2 : bicycle(y),
          time : TIME,
          loc : LOC,
          c3 : ride(x, y, time, loc) ]

     b. [ ...,
          x = a,
          c1 = p1,
          y = b,
          c2 = p2,
          time = t0,
          loc = l0,
          c3 = p3,
          ... ]
     c. a : IND; p1 : woman(a); b : IND; p2 : bicycle(b); t0 : TIME; l0 : LOC; p3 : ride(a, b, t0, l0)

In particular, given an identification of utterances with speech events, this enables us to have simultaneous access to utterances and utterance types (or signs). These are important ingredients for a theory of metacommunicative interaction, as discussed in article 36 (Ginzburg) Situation Semantics. In a series of recent papers (e.g. Fernando 2007a, 2007b), Tim Fernando has provided a type theoretic account of the internal make-up of situations. Events and situations are represented by strings of temporally ordered observations, on the basis of which the events and situations are recognized. This allows a number of important temporal constructions to be derived, including Allen's (1983) basic interval relations and Kamp's (1979) event structures. Observations are generalized to temporal propositions, leading to event-types that classify event-instances.
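Fernando's proposal can be illustrated with a small sketch: events are strings (sequences) of observation sets, and Allen-style relations fall out of how two strings can be superposed. The encoding below is my own simplification for illustration, not Fernando's formalism.

```python
# A toy version of 'events as strings of observations': an event is
# a tuple of frozensets, each holding what is observed at that
# moment. This simplification is mine, not Fernando's formalism.

rain = (frozenset({'rain'}),) * 3                    # rain over 3 moments
leave = (frozenset(), frozenset({'leave'}), frozenset())

def superpose(e1, e2):
    """Union the observations moment by moment (strings of equal
    length), yielding one string that records both events."""
    assert len(e1) == len(e2)
    return tuple(a | b for a, b in zip(e1, e2))

combined = superpose(rain, leave)
# 'leave' is observed strictly inside the run of 'rain': an Allen
# 'during'-style configuration, read directly off the string.
print(combined)
```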
5.2.3. Propositions

There are two obvious ways to develop an account of propositions in TTR, implicitly Austinian or explicitly so. Cooper (2006a) develops the former, in which a proposition p is taken to be a record type. A witness for this type is a situation as e.g. (25b). On this strategy, a witness is not directly included in the semantic representation. Ginzburg (2005b) develops an explicitly Austinian approach. The type of propositions is the record type (26a). The correspondence with the situation semantics conception is quite direct. We can define truth conditions as in (26b).

(26) a. Prop =def [ sit : Record, sit-type : RecType ]
     b. A proposition p = [ sit = s0, sit-type = ST0 ] is true iff s0 : ST0

TTR actually provides very fine-grained entities and so does not run into the problems that beset traditional semantic approaches with respect to logical omniscience and various other puzzles. In fact, as Cooper (2006a) discusses, this can be too much of a good thing, given that record types distinct only in their labelling are distinguished. Cooper
goes on to offer a criterion of type individuation of record types using Σ-types, where the corresponding ‘labels’ function as bound variables. Ginzburg (2005a) shows how to formulate a theory of questions as propositional abstracts in TTR, while using the standard TTR notion of abstraction. In this way, a possible criticism of the approach of Ginzburg & Sag (2000) – that they use an ad hoc and complex notion of abstraction – can be circumvented. Similarly, Ginzburg (2005b) shows how to explicate outcomes within TTR.
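The Austinian proposition type in (26) translates directly into the toy setting used above; the truth definition of (26b) becomes a one-line witness check. A self-contained sketch, with the same invented conventions as before:

```python
# Austinian propositions as in (26): a pair of a situation (record)
# and a situation type (record type); truth is the witness check of
# (26b). Toy encoding, following the earlier sketches.

from dataclasses import dataclass

@dataclass
class Prop:
    sit: dict          # the situation the proposition is about
    sit_type: list     # the type the situation is claimed to be of

def of_type(record, record_type):
    """r : RT as in (22), with dependent types as before."""
    ctx = {}
    for label, dep in record_type:
        if label not in record or not dep(ctx)(record[label]):
            return False
        ctx[label] = record[label]
    return True

def true(p):
    """(26b): p is true iff p.sit : p.sit-type."""
    return of_type(p.sit, p.sit_type)

IND = lambda ctx: lambda v: isinstance(v, str)
woman = lambda ctx: lambda v: v == ('woman', ctx['x'])

p = Prop(sit={'x': 'a', 'c1': ('woman', 'a')},
         sit_type=[('x', IND), ('c1', woman)])
assert true(p)
```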
5.3. Ontology in interaction

The most active area in the application of TTR to the description of NL is dialogue. Larsson (2002) and Cooper (2006b) showed how to decompose interaction protocols, such as those specified situation theoretically in Ginzburg (1996), by using TTR to describe update rules on the information states of dialogue participants. This was extended by Ginzburg (2011) to cover a variety of illocutionary moves, metacommunicative interaction (see article 36 (Ginzburg) Situation Semantics for some discussion), and conversational genres. Fernández (2006) uses TTR to develop a wide-coverage account of the range of non-sentential utterances that occur in conversation. In these works, information states are assumed to consist of a public and an unpublicized part. For current purposes we restrict attention to the public part, also known as each participant's dialogue gameboard (DGB). Each DGB is a record of the type given in (27) – the spkr, addr fields allow one to track turn ownership, Facts represents conversationally shared assumptions, Pending and Moves represent respectively moves that are in the process of being grounded and moves that have been grounded, and QUD tracks the questions currently under discussion:

(27) [ spkr : Ind,
      addr : Ind,
      c-utt : addressing(spkr, addr),
      Facts : Set(Prop),
      Pending : list(LocProp),
      Moves : list(LocProp),
      QUD : poset(Question) ]
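(27) is essentially a typed data structure, so a direct rendering is possible. The sketch below fixes arbitrary Python container types for the fields; it is an illustration of the shape of a DGB, not the implementation found in, say, Larsson's or Cooper's dialogue systems.

```python
# A direct, toy rendering of the dialogue gameboard type in (27).
# Field types are approximated with plain Python containers; the
# QUD poset is flattened to a priority-ordered list.

from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class DGB:
    spkr: str                                        # current speaker
    addr: str                                        # current addressee
    facts: List[Any] = field(default_factory=list)   # Facts : Set(Prop)
    pending: List[Any] = field(default_factory=list) # moves being grounded
    moves: List[Any] = field(default_factory=list)   # grounded moves
    qud: List[Any] = field(default_factory=list)     # poset(Question)

# c-utt, the constraint addressing(spkr, addr), would be a proof
# object in TTR; here it is left implicit.
```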
We call a mapping that indicates how one DGB can be modified by conversationally related action a conversational rule, and the types specifying its domain and its range respectively the preconditions and the effects. Here I exemplify the use of TTR to give a partial characterization of the meaning of pronouns in dialogue, a task that links assertion acceptance, situations, and meaning. The main challenge for a theory of meaning for pronouns is of course how to characterize their antecedency conditions; here I restrict attention to intersentential cases (see Ginzburg 2011 for an extension of this account to intra-sentential cases). Dialogue takes us away quite quickly from certain received ideas on this score: antecedents can arise from queries (28a) and from partially understood or even disfluent utterances ((28b,c) respectively). Moreover, as (28d) illustrates, where ‘he’ cannot refer to ‘Jake’, the shelf life of an antecedent is potentially quite short. Although the data are subtle, a plausible assumption is that for non-referential NPs anaphora is not generally possible from
within a query (polar or wh) (Groenendijk 1998), or from an assertion that has been rejected (e.g. (28e,f)).

(28) a. A: Did John phone? B: He's out of contact in Daghestan.
b. A: Did John phone? B: Is he someone with a booming bass voice?
c. Peter was, well he was fired.
d. A: Jake hit Bill. / B: No, he patted him on the back. / A: Ah. Is Bill going to the party tomorrow? / B: No. / A: Is #he/Jake?
e. A: Do you own an apartment? B: Yes. A: Where is it located?
f. A: Do you own an apartment? B: No. A: #Where might you buy it?
This means, naturally enough, that witnesses to non-referential NPs can only emerge in a context where the corresponding assertion has been accepted. A natural move to make in light of this is to postulate a witnessing process as a side effect of assertion acceptance, a consequence of which will be the emergence of referents for non-referential NPs. For uniformity's sake, we can assume that these witnesses get incorporated into the contextual parameters (c-params) of that utterance, which in any case include (witnesses for) the referential NPs. This means that c-params serves uniformly as the locus for witnesses of ‘discourse anaphora’. The rule incorporating non-referential witnesses in c-params is actually a minor add-on to the rule that underwrites assertion acceptance (see Ginzburg 2011, chapter 4), the rule underpinning the utterance of acceptances. It can be viewed as providing a witness for situation/event anaphora, since this is what gets directly introduced into c-params. In cases where the witness is a record (essentially when the proposition is positive), NP witnesses will emerge. In (29) the preconditions involve the fact that the speaker's latest move is an assertion of a proposition whose type is T1. The effects switch the speaker/addressee roles (since the asserted-to party becomes the accepter) and add a record w, including a witness for T1, to the contextual parameters.

(29) Accept move:
     preconds : [ spkr : Ind,
                  addr : Ind,
                  p = [ sit = sit1, sit-type = T1 ] : Prop,
                  LatestMove.content = Assert(spkr, addr, p) : IllocProp ]
     effects : [ spkr = preconds.addr : Ind,
                 addr = preconds.spkr : Ind,
                 w = preconds.LatestMove.c-params ∪ [ sit = sit1 ] : Rec,
                 Moves = m1 ⊕ preconds.Moves : list(LocProp),
                 m1.content = Accept(spkr, addr, p) : LocProp,
                 m0.c-params = w : Rec ]
We can now state the meaning of a singular pronoun somewhat schematically as follows: it is a word whose contextual parameters include an antecedent, which is to be sought from among the constituents of an active move; the pronoun is identical in reference to
this antecedent and agrees with it. Space precludes a careful characterization of what it means to be active, but given the data we saw above it is a composite property determined by QUD – essentially being specific to an element of QUD – and Pending. See article 36 (Ginzburg) Situation Semantics for the justification for including reference to an utterance's constituents in grammatical representation. In (30), I provide a lexical entry in the style of HPSG that captures this specification: here m represents the active move and a the antecedent; the final condition on c-params requires that among m's contextual parameters is one whose index is identical to that of a's content:

(30) [ PHON : she,
      c-params : [ m : LocProp,
                   a : Sign,
                   c1 : member(a, m.constits),
                   c2 : ActiveMove(m),
                   m.sit.c-params : [ a.c-params.index = a.cont.index : Ind ] ],
      cat = [ head : N,
              ana : +,
              agr = c-params.m.cat.agr : [ num = sg : Number,
                                           gen = fem : Gender,
                                           pers = third : Person ] ] : syncat,
      cont : [ index = a.cont.index : Ind ] ]
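The procedural content of (30) — search the constituents of an active move for a sign whose agreement features match the pronoun's, and identify the pronoun's content index with the antecedent's — can be glossed in a few lines. The structure below is invented for illustration; it is not the HPSG/TTR entry itself.

```python
# A procedural gloss of (30): resolve 'she' by searching an active
# move's constituents for an antecedent sign with matching
# agreement features. Invented encoding, for illustration.

def resolve_pronoun(active_move, agr=None):
    """Return the content index of the first constituent whose
    agreement features match the pronoun's (the c1, c2 and
    agreement conditions of (30)); None if no antecedent is found."""
    agr = agr or {'num': 'sg', 'gen': 'fem', 'pers': 'third'}
    for sign in active_move['constits']:
        if all(sign.get('agr', {}).get(k) == v for k, v in agr.items()):
            return sign['index']      # cont.index = a.cont.index
    return None

move = {'constits': [
    {'phon': 'jo', 'index': 'j0',
     'agr': {'num': 'sg', 'gen': 'fem', 'pers': 'third'}},
    {'phon': 'left', 'index': None, 'agr': {}}]}
assert resolve_pronoun(move) == 'j0'
```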
6. Conclusions

Situation semantics initiated an ontology-oriented approach to semantics: the aim being to develop means of representing the external and internal reality of agents in a cognitively tractable way. The initial emphasis was on situations, based in part on evidence from the naked infinitival construction. Situations, parts of the world, have proved to be of significant importance to a variety of phenomena, ranging from the domains associated with NP use to negation, and, one way or the other, they play a significant role in the very act of asserting. The theory of situations subsequently led to a theory of propositions (Austinian propositions), structured objects constructed from situations and situation types: Austinian propositions are significantly more fine-grained than possible worlds propositions, but coarse-grained enough to pass translation and paraphrase criteria. They also have a potential construal in terms of differently coarse-grained memory traces. The technique of individuating abstract entities as structured objects enables the theory to scale up: by integrating questions, outcomes and facts into the ontology, Situation Theory was able to underpin a rudimentary theory of illocutionary interaction (entities such as questions, propositions and outcomes serve as the descriptive content of queries, assertions and requests) and a theory of complementation for attitudinal predicates.
A recent development has been to recast the theory in type theoretic terms, concretely using the formalism of Type Theory with Records. Type Theory with Records shares many characteristics with situation theory – an ontology-oriented approach, computational tractability, structured objects. This enables most of the results situation theory achieved to be maintained. On the other hand, Type Theory with Records brings with it a more transparent formalism, and the existence of dependent types allows both dynamic semantic and unification grammar techniques to be utilized. Indeed, all these can be combined to construct a theory of illocutionary and metacommunicative interaction, one of the key areas of development for semantics in the early 21st century.
7. References

Aczel, Peter 1988. Non-Well-Founded Sets. Stanford, CA: CSLI Publications.
Aczel, Peter, David Israel, Stanley Peters & Yasuhiro Katagiri (eds.) 1993. Situation Theory and Its Applications, III. Stanford, CA: CSLI Publications.
Aczel, Peter & Rachel Lunnon 1991. Universes and parameters. In: J. Barwise et al. (eds.). Situation Theory and Its Applications, II. Stanford, CA: CSLI Publications, 3–24.
Allen, James 1983. Maintaining knowledge about temporal intervals. Communications of the ACM 26, 832–843.
Asher, Nicholas 1993. Reference to Abstract Objects in English: A Philosophical Semantics for Natural Language Metaphysics. Dordrecht: Kluwer.
Austin, John L. 1970. Truth. In: J. Urmson & G. J. Warnock (eds.). Philosophical Papers. 2nd edn. Oxford: Oxford University Press, 117–133.
Barwise, Jon 1989a. Branch points in Situation Theory. In: J. Barwise. The Situation in Logic. Stanford, CA: CSLI Publications, 255–276.
Barwise, Jon 1989b. Scenes and other situations. In: J. Barwise. The Situation in Logic. Stanford, CA: CSLI Publications, 5–36.
Barwise, Jon 1989c. The Situation in Logic. Stanford, CA: CSLI Publications.
Barwise, Jon & John Etchemendy 1987. The Liar: An Essay on Truth and Circularity. Oxford: Oxford University Press.
Barwise, Jon et al. (eds.) 1991. Situation Theory and Its Applications, II. Stanford, CA: CSLI Publications.
Barwise, Jon & John Perry 1983. Situations and Attitudes. Cambridge, MA: The MIT Press.
Barwise, Jon & John Perry 1985. Shifting situations and shaken attitudes. Linguistics & Philosophy 8, 399–452.
Brown-Schmidt, Sarah & Michael K. Tanenhaus 2008. Real-time investigation of referential domains in unscripted conversation: A targeted language game approach. Cognitive Science: A Multidisciplinary Journal 32, 643–684.
Carpenter, Bob 1992. The Logic of Typed Feature Structures: With Applications to Unification Grammars, Logic Programs, and Constraint Resolution. Cambridge: Cambridge University Press.
Cooper, Robin 1993. Generalized quantifiers and resource situations. In: P. Aczel et al. (eds.). Situation Theory and Its Applications, III. Stanford, CA: CSLI Publications, 191–211.
Cooper, Robin 1996. The role of situations in Generalized Quantifiers. In: S. Lappin (ed.). Handbook of Contemporary Semantic Theory. Oxford: Blackwell, 65–86.
Cooper, Robin 1998. Austinian propositions, Davidsonian events and perception complements. In: J. Ginzburg et al. (eds.). The Tbilisi Symposium on Logic, Language and Computation: Selected Papers. Stanford, CA: CSLI Publications, 19–34.
Cooper, Robin 2006a. Austinian truth in Martin-Löf Type Theory. Research on Language and Computation 3, 333–362.
Cooper, Robin 2006b. A type theoretic approach to information state update in issue based dialogue management. In: L. Moss (ed.). Jon Barwise Memorial Volume. Bloomington, IN: Indiana University Press.
Cooper, Robin 2008. Oz Implementation of Type Theory with Records. http://www.ling.gu.se/~cooper/records/ttr0.zip. December 15, 2010.
Cooper, Robin & Jonathan Ginzburg 1996. A compositional Situation Semantics for attitude reports. In: J. Seligman & D. Westerståhl (eds.). Logic, Language, and Computation. Stanford, CA: CSLI Publications, 151–165.
Cooper, Robin, Kuniaki Mukai & John Perry (eds.) 1990. Situation Theory and Its Applications, I. Stanford, CA: CSLI Publications.
Cooper, Robin & Massimo Poesio 1994. Situation Theory. Fracas Deliverable D8. Edinburgh: Centre for Cognitive Science, The Fracas Consortium.
Crimmins, Mark 1993. Talk about Beliefs. Cambridge, MA: The MIT Press.
Fernández, Raquel 2006. Non-Sentential Utterances in Dialogue: Classification, Resolution and Use. Ph.D. dissertation. King's College, London.
Fernando, Tim 2001. Conservative generalized quantifiers and presupposition. In: R. Hastings, B. Jackson & Z. Zvolenszky (eds.). Proceedings of Semantics and Linguistic Theory (= SALT) XI. Ithaca, NY: Cornell University, 172–191.
Fernando, Tim 2007a. Observing events and situations in time. Linguistics & Philosophy 30, 527–550.
Fernando, Tim 2007b. Situations from events to proofs. In: K. Korta & J. Garmendia (eds.). Meaning, Intentions, and Argumentation. Stanford, CA: CSLI Publications, 113–129.
Fletcher, Charles 1994. Levels of representation in memory for discourse. In: M. A. Gernsbacher (ed.). Handbook of Psycholinguistics. San Diego, CA: Academic Press, 589–607.
Gawron, Mark & Stanley Peters 1990. Anaphora and Quantification in Situation Semantics. Stanford, CA: CSLI Publications.
Ginzburg, Jonathan 1995. Resolving questions, I. Linguistics & Philosophy 18, 459–527.
Ginzburg, Jonathan 1996. Interrogatives: Questions, facts, and dialogue. In: S. Lappin (ed.). Handbook of Contemporary Semantic Theory. Oxford: Blackwell, 359–423.
Ginzburg, Jonathan 2005a. Abstraction and ontology: Questions as propositional abstracts in constructive type theory. Journal of Logic and Computation, 113–130.
Ginzburg, Jonathan 2005b. Situation Semantics: The ontological balance sheet. Research on Language and Computation 3, 363–389.
Ginzburg, Jonathan 2010. The Interactive Stance: Meaning for Conversation. Oxford: Oxford University Press.
Ginzburg, Jonathan & Ivan A. Sag 2000. Interrogative Investigations: The Form, Meaning and Use of English Interrogatives. Stanford, CA: CSLI Publications.
Glasbey, Sheila 1998. A situation theoretic interpretation of bare plurals. In: J. Ginzburg et al. (eds.). The Tbilisi Symposium on Logic, Language and Computation: Selected Papers. Stanford, CA: CSLI Publications, 35–54.
Groenendijk, Jeroen 1998. Questions in update semantics. In: J. Hulstijn & A. Nijholt (eds.). Proceedings of TwenDial 98, 13th Twente Workshop on Language Technology. Twente: Twente University, 125–137.
Groenendijk, Jeroen & Martin Stokhof 1997. Questions. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Linguistics. Amsterdam: North-Holland, 1055–1124.
Higginbotham, James 1983. The logic of perceptual reports: An extensional alternative to Situation Semantics. Journal of Philosophy 80, 100–127.
Higginbotham, James 1994. The Semantics and Syntax of Event Reference. ESSLLI Course Notes. Copenhagen: Copenhagen Business School.
Hoepelmann, Jacob 1983. On questions. In: F. Kiefer (ed.). Questions and Answers. Dordrecht: Reidel, 191–227.
Kamp, Hans 1979. Events, instants and temporal reference. In: R. Bäuerle, U. Egli & A. von Stechow (eds.). Semantics from Different Points of View. Berlin: Springer, 376–417.
Kim, Yookyung 1998. Information articulation and truth conditions of existential sentences. Language and Information 1, 67–105.
Krahmer, Emiel & Paul Piwek 1999. Presupposition projection as proof construction. In: H. Bunt & R. Muskens (eds.). Computing Meaning: Current Issues in Computational Semantics. Dordrecht: Kluwer, 281–300.
Kratzer, Angelika 1989. An investigation of the lumps of thought. Linguistics & Philosophy 12, 607–653.
Kratzer, Angelika 2008. Situations in natural language semantics. In: E. N. Zalta (ed.). The Stanford Encyclopedia of Philosophy (Fall 2008 Edition). http://plato.stanford.edu/entries/situations-semantics/. December 15, 2010.
Kripke, Saul 1975. Outline of a theory of truth. The Journal of Philosophy 72, 690–716.
Kripke, Saul 1979. A puzzle about belief. In: A. Margalit (ed.). Meaning and Use. Dordrecht: Reidel, 239–283.
Larsson, Staffan 2002. Issue-based Dialogue Management. Doctoral dissertation. University of Gothenburg.
Lewis, David K. 1979. Scorekeeping in a language game. In: R. Bäuerle, U. Egli & A. von Stechow (eds.). Semantics from Different Points of View. Berlin: Springer, 172–187.
McCawley, James D. 1979. Presupposition and discourse structure. In: C.-K. Oh & D. Dinneen (eds.). Presupposition. New York: Academic Press, 371–388.
McGee, Vann 1991. Review of J. Barwise & J. Etchemendy, The Liar (Oxford, 1987). Philosophical Review 100, 472–474.
Moss, Lawrence 1989. Review of J. Barwise & J. Etchemendy, The Liar (Oxford, 1987). Bulletin of the American Mathematical Society 20, 216–225.
Muskens, Reinhard 1989. Meaning and Partiality. Doctoral dissertation. University of Amsterdam. Reprinted: Stanford, CA: CSLI Publications, 1995.
Neale, Stephen 1988. Events and logical form. Linguistics & Philosophy 11, 303–321.
Peterson, Philip 1997. Fact, Proposition, Event. Dordrecht: Kluwer.
Poesio, Massimo 1993. A situation-theoretic formalization of definite description interpretation in plan elaboration dialogues. In: P. Aczel et al. (eds.). Situation Theory and Its Applications, III. Stanford, CA: CSLI Publications, 339–374.
Ranta, Aarne 1994. Type Theoretical Grammar. Oxford: Oxford University Press.
Ranta, Aarne 2004. Grammatical framework. Journal of Functional Programming 14, 145–189.
Richard, Mark 1990. Propositional Attitudes: An Essay on Thoughts and How We Ascribe Them. Cambridge, MA: The MIT Press.
Russell, Bertrand 1905. On denoting. Mind 14, 479–493.
Schulz, Stephen 1993. Modal Situation Theory. In: P. Aczel et al. (eds.). Situation Theory and Its Applications, III. Stanford, CA: CSLI Publications, 163–188.
Seligman, Jerry & Larry Moss 1997. Situation Theory. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Linguistics. Amsterdam: North-Holland, 239–309.
Soames, Scott 1985. Lost innocence. Linguistics & Philosophy 8, 59–71.
Sundholm, Göran 1986. Proof theory and meaning. In: D. Gabbay & F. Guenthner (eds.). Handbook of Philosophical Logic. Oxford: Oxford University Press, 471–506.
Vendler, Zeno 1972. Res Cogitans. Ithaca, NY: Cornell University Press.
Vogel, Carl & Jonathan Ginzburg 1999. A situated theory of modality. Paper presented at the 3rd Tbilisi Symposium on Logic, Language, and Computation, Batumi, Republic of Georgia, September 1999.
Jonathan Ginzburg, Paris (France)
VIII. Theories of discourse semantics

36. Situation Semantics: From indexicality to metacommunicative interaction

1. Introduction
2. Desiderata for semantics
3. The Relational Theory of Meaning
4. Meaning, utterances, and dialogue
5. Closing remarks
6. References
Abstract

Situation Semantics emerged in the 1980s with an ambitious program of reform for semantics, both in the domain of semantic ontology and with regard to the integration of context in meaning. This article takes as its starting point the focus on utterance (as opposed to sentence) interpretation. The far-reaching aims Barwise and Perry proposed for semantic theory are spelled out. Barwise and Perry's Relational Theory of Meaning is described, in particular its emphasis on utterance situations and on the reification of information. The final part of the article explains how conceptual apparatus from situation semantics has ultimately come to play an important role in a highly challenging enterprise, modelling dialogue interaction, in particular metacommunicative interaction.
1. Introduction

Situation Semantics emerged in the 1980s with an ambitious program of reform for semantics, both in the domain of semantic ontology and with regard to the integration of context in meaning. In their 1983 book Situations and Attitudes (Barwise & Perry 1983), as well as a host of other publications around that time, collected in Barwise (1989) and Perry (2000), Barwise and Perry argued for the preeminence of a situation-based ontology and took contexts of utterance to be situations, thereby offering the potential for a richer view of context than was available previously. For situation semantics and ontology, see article 35 (Ginzburg) Situation Semantics and NL ontology. This article takes as its starting point the focus on utterance (as opposed to sentence) interpretation. In section 2 I spell out the far-reaching aims Barwise and Perry proposed for semantic theory. In section 3 I sketch Barwise and Perry's Relational Theory of Meaning, in particular its emphasis on utterance situations and on the reification of information. I also point out some of the weaknesses of Barwise and Perry's enterprise, particularly the approach to context. One of these weaknesses, in my view, is that the theory is quite powerful, but it was, largely, applied to dealing with traditional, sentence-level semantics. The final section of this article, section 4, explains how conceptual apparatus from situation semantics has ultimately come to play an important role in a highly challenging enterprise, modelling dialogue interaction.
2. Desiderata for semantics

Barwise and Perry's starting point is model theoretic semantics, as developed in the classical Montague Semantics tradition (see e.g. Montague 1974; Dowty, Wall & Peters 1981; Gamut 1991 and article 33 (Zimmermann) Model-theoretic semantics): a natural language is likened to a formal language (first order logic, intensional logic, etc.). On this approach, providing a semantics for such a language involves primarily assigning interpretations (or denotations) to the words of the language and rules that allow phrases to be interpreted in a compositional manner. This allows both the productivity of NL meaning and the potential for various kinds of ambiguity to be explicated. Contexts, on this view, are identified with indices – tuples consisting of a small and fixed number of dimensions, prototypically providing values for speaker, addressee, time, and location. Interpretations of words/phrases are then all taken to be relative to contexts, thereby yielding two essential semantic entities: characters/meanings, which involve abstracting away indices from contents/interpretations. These – supplemented by lexical meaning postulates – can be used to explicate logically valid inference.

Barwise and Perry view this picture of semantics as significantly too restrictive. The basic perspective they adopt is one in which linguistic understanding is assimilated to the extraction of information by resource bounded agents in their natural environment (inspired in part by the work of Gibson, e.g. Gibson 1979). This drives their emphasis on a number of unorthodox seeming fundamental desiderata for semantic theory, desiderata which, as we will see, find considerable resonance in the desiderata for a theory of meaning for conversational interaction. The first class of desiderata are metatheoretical in nature and can be summed up as follows:

Desideratum 1: The priority of information

Language has external significance, as model theoretic semantics has always emphasized, but, as cognitive scientists of various stripes emphasize, it also has mental significance, yielding information about agents' internal states. What is needed is a way of capturing the commonality between the external and the mental, the flow of information – the chain from fact to thought in one participant's mind to utterance to thought in another participant's mind, graphically exemplified in Fig. 36.1.
Fig. 36.1: The Flow of Information. (Barwise & Perry 1983: 17)
An important component in fulfilling this desideratum, according to Barwise and Perry, is a theory by means of which external (and internal) reality can be represented – an ontology of some kind. This is what developed into situation theory and type theory with records (see article 35 (Ginzburg) Situation Semantics and NL ontology). A key ingredient in such a theory is some notion of constraint – a way of capturing necessary, natural, or conventional linkages between situations (e.g. smoke means fire, an image being such and such means the leg is broken, etc.) – along with a theory of how agents in a situation extract information using constraints. The other crucial component is the naturalization of linguistic meanings – their reduction to concepts from the physical world – in terms of constraints.

The other two pivotal desiderata put forward by Barwise and Perry are more directly aimed at repositioning the semantic fulcrum, from interpretation towards context.

Desideratum 2: Information content is underdetermined by interpretation

We might provide news about the Argentinean elections using any of the sentences in (1). All three sentences uttered in these circumstances intuitively have the same external significance – we would wish to identify their content and, on some accounts, their meaning as well. Nonetheless, different information can be acquired from each: for instance, (1b) allows one to infer that Kirchner is a woman, whereas Lavagna is a man.

(1) a. Kirchner beat Lavagna.
b. Señora Kirchner defeated Señor Lavagna.
c. Cristina's losing opponent was Lavagna.

Desideratum 3: Language is an efficient medium

Barwise and Perry emphasize that the flip side of productivity gets less attention as a fundamental characteristic of NL: the possibility of reusing the same expression to say different things. Examples of the phenomena Barwise and Perry had in mind are in (2), which even in 2009 are tricky. By ‘tricky’ I don't mean we lack a diagnosis; I mean there is no single formal and/or implemented semantic/pragmatic theory that picks them all off with ease, interfacing along the way with inter alia theories of gesture, gaze, and visual access.

(2) a. A: I'm right, you're wrong. B: I'm right, you're wrong.
b. I want you, you, and you to stand here and I want you, you, and you to stand here. (based on examples in Levinson 1983; Pollard & Sag 1994)
c. A: John is irritating John no end. B: He can't be annoying him so badly.
d. In last week's FoLLI dissertation prize meeting sadly the linguist voted for the linguist, whereas the logician voted for the logician. (based on an example in Cooper 1996)
3. The Relational Theory of Meaning

At the heart of situation semantics is the Relational Theory of Meaning. There are two fundamentally innovative aspects underlying this theory, which are of significant importance to current semantic theory in the wider sense:
(3) a. Meaning Reification: the reification of meanings as entities on which humans reason (rather than as metatheoretical entities, as is standard in logic).
b. Speech Events as Semantic Entities: recognition of speech events (including speakers, addressees, the speech token) as fundamental semantic units; sentences are viewed as derivative: type-like entities that emerge from utterances, or, as Barwise and Perry put it, uniformities over utterances.

To get a feel for the theory, consider a simple example. (4b), taken to be the meaning of (4a), is a crude representation of an early version of the Relational Theory of Meaning: a (declarative) meaning relates all utterance events u in which there exist a speaker a, an addressee b, spatiotemporal locations l, t, and referents j, m (for the names ‘Jacky’ and ‘Molly’ respectively) to described events e in which j bites m at t. This relation is exemplified graphically in Fig. 36.2, which emphasizes the reality of the utterance situation. I have purposely used quasi-Davidsonian notation (see article 34 (Maienborn) Event semantics) to indicate that the central insight there is independent of the various more and particularly less standard formalisms in which the Relational Theory of Meaning has been couched. As we will soon see, there are various ways, which differ significantly, to cash out the characterization of u, e and their interrelation.

(4) a. Jacky bit Molly.
b. { u,e | ∃a,b,l,j,m,t [uttering(a, ‘Jacky is biting Molly’, u) ∧ addressee(u,b) ∧ In(u,l) ∧ referring(a,j, ‘Jacky’) ∧ Named(j, ‘Jacky’) ∧ referring(a,m, ‘Molly’) ∧ Named(m, ‘Molly’) ∧ coincident(l,t) ∧ describing(a,e) ∧ bite(e,j,m,t)] }
Fig. 36.2: The meaning of ‘Jacky is biting Molly’ as a relation between situations in which this construction is uttered and events in which a Jacky bites a Molly. (Barwise & Perry 1983: 122)
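The relation in (4b) can be rendered as a checkable predicate over pairs of an utterance event and a described event. A toy sketch follows; the event encodings are invented for illustration and do not follow any particular implementation.

```python
# The Relational Theory of Meaning of (4b) as a toy predicate over
# pairs (u, e): u an utterance event, e a described event. The
# event encodings are invented for illustration.

def jacky_bit_molly_meaning(u, e):
    """Holds iff u is an uttering of the sentence with referents
    j, m for 'Jacky' and 'Molly', and e is a biting of m by j at
    a time coincident with the utterance location."""
    return (u['sentence'] == 'Jacky is biting Molly'
            and e['rel'] == 'bite'
            and e['agent'] == u['refs']['Jacky']
            and e['patient'] == u['refs']['Molly']
            and e['time'] == u['time'])

u = {'sentence': 'Jacky is biting Molly', 'speaker': 'a',
     'refs': {'Jacky': 'j', 'Molly': 'm'}, 'time': 't1'}
e = {'rel': 'bite', 'agent': 'j', 'patient': 'm', 'time': 't1'}
assert jacky_bit_molly_meaning(u, e)
```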
Of the two assumptions, Speech Events as Semantic Entities was introduced by Barwise and Perry in a stronger form than (3), graphically exemplified in Fig. 36.2 – not only do they make reference to speech events, but Barwise and Perry actually posit a compositional aspect to speech events:

(5) a. If α is a phrase with sub-constituents X, Y, then uttering(a, α, u) entails the existence of two subevents u1, u2 of u such that
b. u1 ≺ u2 (u1 temporally precedes u2)
c. uttering(a, X, u1)
d. uttering(a, Y, u2)

This formulation raises a variety of issues concerning syntax, presupposing essentially a strongly surfacey and linearized approach. For obvious reasons of space I cannot enter into these, but they constitute an important backdrop. Speech Events as Semantic Entities underlay a number of grammar fragments subsequent to Barwise & Perry (1983) (e.g. Gawron & Peters 1990; Cooper & Poesio 1994), but on the whole was not the focus of much interest until Poesio realized its significance for conversational processing, as we discuss in section 4.2. In contrast, issues concerning Meaning Reification drove much research in the heyday of Situation Semantics. The relation exemplified in (4b) is certainly a relation with relevance to the semantics of utterances of ‘Jacky is biting Molly’: it relates events in which the speaker mouths a particular linguistic form while referring to a Jacky and a Molly with an event the speaker is describing in which that Jacky bit that Molly. Barwise and Perry view attunement – the awareness of similarities between situations and of relationships that obtain between such similar situations – to the constraint in (4) as being what underlies our competence to use and understand such utterances. Nonetheless, there are two aspects which the formulation above abstracts away from: contextual parameter instantiation and truth evaluation. (4) does not make explicit the fact that understanding such an utterance involves finding appropriate referents for the two NP sub-utterances, and indeed in certain circumstances – e.g. for an overhearer who cannot see the speech participants or hears a recording – for the speaker and the time. In fact, in the original formulation of the Relational Theory of Meaning Barwise and Perry made a point of not packaging all of context in one event/situation, but distinguished three components of context: (a) the discourse situation, comprising the public aspects of an utterance (including all the standard indexical parameters), (b) the speaker connections, comprising information pertaining to a speaker's referential intentions, and (c) resource situations, events/situations distinct from the described situation, used to serve as referential/quantificational domains. Although the discourse situation/speaker connection dichotomy does not seem to have survived – examples such as (2b) illustrate the importance of speaker intention even with ‘simple indexicals’ – the ultimate insight to be drawn here, it seems, is the unbounded nature of contextual dependence. Resource situations are one of the important contributions of situation semantics (see particularly Cooper 1996), and are further discussed in article 35 (Ginzburg) Situation Semantics and NL ontology.

Returning to (4), the formulation of the Relational Theory of Meaning as a relation between contextual situations (the discourse situation, speaker connections, zero or more resource situations) and described situations is problematic. It means that the latter cannot serve as the denotations of declarative utterances (since they are not truth bearers), nor does it generalize to non-declarative meaning. This reflects the fact that in Situations and Attitudes Barwise and Perry attempt to stick to an avowedly “concrete” ontology, one which eschews abstract entities such as propositions, leading them into various foundational problems.
This stance was abandoned soon after – various notions of propositions emerged as situation theory developed. Hence, in works such as Gawron & Peters (1990) and Cooper & Poesio (1994), works from a more mature version of situation semantics, (declarative)
sentential meanings came to be formulated as relating utterance situations – from whence values for contextual parameters would be drawn – and propositions; meanings for sub-sentential constituents would analogously relate an utterance situation for that constituent with an associated described object (referent [NP], property [VP] etc.). As an example of the Relational Theory of Meaning in a current formalism that fixes both problematic aspects discussed above, consider (6), which uses the formalism of Type Theory with Records (see Cooper 2006), discussed in more detail in article 35 (Ginzburg) Situation Semantics and NL ontology. (6a) corresponds to an utterance type (utterance type in the sense of sign, as in constraint-based grammars like Head Driven Phrase Structure Grammar, or similar notions in Type Logical Grammar). A witness for the type (6a) is given in (6b) – it includes a phonetic token, distinguished here from its associated phonological type in terms of spelling; contextual parameters – a situation sit0, a speaker spkr0, an addressee addr0, a time time0, an utterance time time1, an individual j0 named Jo, and situations c10, c20, c30 grounding the truth of the addressing, precedence, and naming conditions; and the Austinian propositional entity [ sit = sit0, sit-type = Leave(j0, time0) ]. c-params represents the type of entities needed to instantiate a meaning:

(6) a. [ phon : jo left,
        cat = [+fin]V : syncat,
        c-params : [ s0 : SIT,
                     s : IND,
                     a : IND,
                     t0 : TIME,
                     t1 : TIME,
                     c1 : addressing(s, a, t1),
                     c2 : Precedes(t0, t1),
                     j : IND,
                     c3 : Named(j, ‘jo’) ],
        cont = [ sit = s0, sit-type = Leave(j, t0) ] : Prop ]

     b. [ phon = jo lef’,
        cat = [+fin]V,
        c-params = [ s0 = sit0,
                     s = spkr0,
                     a = addr0,
                     t0 = time0,
                     t1 = time1,
                     c1 = c10,
                     c2 = c20,
                     j = j0,
                     c3 = c30 ],
        cont = [ sit = s0, sit-type = Leave(j, t0) ] ]
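The c-params/content split in (6) amounts to this: a sign pairs a record type of contextual parameters with a content template that mentions those parameter labels, and grounding an utterance supplies witnesses for some or all of them. A minimal sketch of that idea, with an invented encoding:

```python
# Toy version of the c-params/content split in (6): a meaning is a
# content template over contextual-parameter labels; grounding
# supplies a record of witnesses, possibly only partially.

def instantiate(template, witnesses):
    """Replace each contextual-parameter label in the template by
    its witness if one has been found; labels without witnesses
    stay put, yielding a partially instantiated content."""
    return {field: witnesses.get(label, label)
            for field, label in template.items()}

# Content template for 'jo left': Leave(j, t0) in situation s0.
template = {'sit': 's0', 'agent': 'j', 'time': 't0'}

partial = instantiate(template, {'s0': 'sit0', 't0': 'time0'})
full = instantiate(partial, {'j': 'j0'})
print(partial)   # {'sit': 'sit0', 'agent': 'j', 'time': 'time0'}
print(full)      # {'sit': 'sit0', 'agent': 'j0', 'time': 'time0'}
```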
Reconstructing the meaning/content relationship in terms of two fields, c-params and content, originating in HPSG, allows in the current setting for the possibility of partially instantiating a content and maintaining this as the semantic representation of an utterance until a more detailed instantiation is available – an important possibility in conversational interaction, as we discuss in section 4.3. Situation Semantics is one of the harbingers of dynamic semantics: the relational theory of meaning can be straightforwardly reconstrued as a specification of input/output contexts associated with uttering a given sentence. Indeed the paper Barwise (1987) was one of the first to spell out a dynamic semantics for NPs, though (in common with most other works in the dynamic semantics tradition) it does not spell out how to interface with
the discourse/utterance situation in the above sense. This ties in with a number of weaknesses which Barwise and Perry's conception of context exhibits:
– No dynamics of indexicality is worked out (e.g. the interaction between turn taking and the structure of context) to deal with cases like (2a,b).
– It ignores metacommunication (the focus of sections 4.2. and 4.3.).
– In common with traditional speech act theory, it ignores conversational structure: to take two simple examples, the interpretation of the second ‘hi’ in (7a) as a counter greeting derives from its position following an initial greeting. Similarly, the resolution of ‘No’ in (7b) picks up in some way on the adjacent assertion:

(7) a. A: Hi. B: Hi.
b. A: I'm right, you're wrong. B: No. I'm right, you're wrong.

– Due to the lack of a calculus of constraints, it is not easy to use the Relational Theory of Meaning as a logic which could allow an explicit account of which information can be derived from an utterance.

The utterance-based formulation of semantic theory pioneered by Situation Semantics was criticized as misguided by Kaplan (1989) and Partee (1985); for a subsequent argument contra, along with a good review of Kaplan's and related approaches, see Israel & Perry (1996). Indeed the utterance-based formulation has until recently had relatively little impact. Why? Putting aside sociological explanations, one might say that although the theory was intended for conversational language, the methodology and setting were those of the traditional isolated sentence, for which the payoffs do not seem sufficiently significant given the apparent theoretical investment. When these tools are applied to a dialogue setting, significant payoffs for this perspective emerge.
4. Meaning, utterances, and dialogue

4.1. Phenomena from spoken language

There has been growing interest in recent years in developing notions of context that can be used to semantically analyze linguistic phenomena characteristic of conversational language and to model dialogue interaction (see Ginzburg 1996b; Poesio & Traum 1997; Larsson 2002; Asher & Lascarides 2003; Ginzburg 2011). The efficiency of language, in the sense discussed above, and the concomitant importance of context become yet more urgent an issue given how pervasive non-sentential utterances are in conversational settings – one-word utterances are estimated to constitute between 30 and 40% of all utterances, and 25% of these are propositional or interrogative, and hence involve significant contextual resolution (see e.g. Fernández 2006). In the remainder of this article I will focus on a number of semantic phenomena that occur in conversational interaction, whose analysis builds on the conceptual apparatus brought into prominence by situation semantics, in particular, the reification of utterances as real world events and the view of meanings as first class citizens of the ontology, not metatheoretical entities. As it turns out, this apparatus provides powerful tools that also offer solutions to old linguistic problems, viz. how to integrate into context the non-semantic parallelism conditions characteristic of ellipsis constructions, and grammatical gender agreement in anaphora.
The phenomena I consider here revolve around metacommunicative acts, which are rare in texts but pervasive in dialogue. There are two main types of metacommunicative interaction – acknowledgements of understanding and clarification requests. An addressee can acknowledge a speaker's utterance, either once the utterance is completed, as in (8a), or concurrently with the utterance, as in (8b):

(8) a. Tommy: So Dalmally I should safely say was my first schooling. Even though I was about eight and a half. Anon 1: Mm. Now your father was the the stocker at Tormore is that right? (British National Corpus (BNC), K7D)
b. A: Move the train ... B: Aha A: ... from Avon ... B: Right A: ... to Danville. (Adapted from the Trains corpus)

Concomitantly with an addressee's acknowledging her understanding of an utterance, a variety of facts about the utterance potentially enter into the common ground. This is evinced, here for (9a), by the possibility of embedding them under a factive-presupposition predicate such as ‘interesting’. (9) exemplifies two classes of facts about the utterance that become presupposable: facts about the content of sub-utterances (9b–d) and also facts that concern solely the phonology and word order of the utterance (9e).

(9) a. A: Did Mark send you a love letter?
b. B: No, though it's interesting that you refer to Mark/my brother/our friend.
c. B: No, though it's interesting that you bring up the sending of love letters.
d. B: No, though it's interesting that you ask about Mark's epistolary habits.
e. B: No, though it's interesting that the final two words you just uttered start with ‘l’.
A recurring theme since the Russell/Strawson dispute over definites has been the notion of presupposition failure (see article 41 (Heim) Definiteness and indefiniteness and article 91 (Beaver & Geurts) Presupposition). However, in interaction there is rarely failure as such. Rather, conversationalists' mismatches lead to a clarification request (CR) – a query about an unclear aspect of a previous utterance – being posed. Natural Language allows fine-grained potential for CRs, using both sentential and non-sentential means. (10) illustrates a form-based taxonomy of CRs that covers virtually all of the CRs occurring in the BNC:

(10) a. A: Did Bo leave?
b. Wot: B: Eh? / What? / Pardon?
c. Explicit: B: What did you say? / Did you say ‘Bo’? / What do you mean ‘leave’?
d. Literal reprise: B: Did BO leave? / Did Bo LEAVE?
e. Wh-substituted reprise (sub): B: Did WHO leave? / Did Bo WHAT?
f. Reprise sluice (slu): B: Who? / What? / Where?
g. Reprise fragments (rf): B: Bo? / Leave?
h. Gap: B: Did Bo ...?
i. Filler: A: Did Bo ... B: Win?
(Table I from Purver 2006)
In this taxonomy, four classes of contents were identified; they can be exemplified in the form of Explicit CRs:

(11) a. Repetition: What did you say? Did you say ‘Bo’?
b. Clausal Confirmation: Are you asking if Bo left? You're asking if who left?
c. Intended Content: What do you mean ()? Who is ‘Bo’?
d. Correction: Did you mean to say ‘Bro’?
In practice, though, most CRs are not of the Explicit category. Many CR utterances are multiply ambiguous. The most extreme case is reprise fragments, which seem able to exhibit all four readings, though in practice 99% of the cases found in the corpus study of Purver, Ginzburg & Healey (2001) were either Clausal Confirmation or Intended Content. Ginzburg & Cooper (2004) and Ginzburg (2011) demonstrate that reprise fragments display parallelism on a syntactic and phonological level with their source. Clausal confirmation readings, on the one hand, and intended content and repetition readings, on the other, involve distinct parallelism conditions, suggesting that different linguistic mechanisms underlie the distinct understandings. Clausal Confirmation readings do not require phonological identity between target and source, as shown in (12a,b). Nonetheless, as (12c–f) show, they require partial syntactic parallelism: an XP used to clarify an antecedent sub-utterance u1 must match u1 categorially:

(12) a. A: Did Bo leave? B: My cousin? (Are you asking if BO, my cousin, left?)
b. A: Did she annoy Bo? B: Sue? (Are you asking if SUE annoyed Bo?)
c. A: I phoned him. B: him? / #he?
d. A: Did he phone you? B: he? / #him?
e. A: Did he adore the book? B: adore? / #adored?
f. A: Were you cycling yesterday? B: Cycling? / biking? / #biked?
That repetition readings of RF involve (segmental) phonological identity with their source follows from their very nature (‘Did you say ...’). And this requirement also applies to intended content readings of RF:

(13) A: Did Bo leave? B: Max? (cannot have the intended content reading: Who are you referring to? or Who do you mean?)

The existence of syntactic and phonological parallelism in CRs across utterances is further evidence, in addition to that provided above in (9), that the notion of context we need is one that tracks non-semantic information associated with utterances, not merely content, presuppositions and the like. I will show that one way to capture this requirement is by defining contextual updates in terms of locutionary propositions, propositions constructed from utterances and the types that classify them. This idea has antecedents in the Relational Theory of Meaning and in the Austinian conception of propositions, discussed in detail in article 35 (Ginzburg) Situation Semantics and NL ontology.

It should be emphasized just how central a phenomenon metacommunicative interaction is: a rough idea of the frequency of acknowledgements can be gleaned from the word counts for ‘yeah’ and ‘mmh’ in the demographic part of the BNC: ‘yeah’ occurs 58,810 times (rank: 10; 10–15% of turns), whereas ‘mmh’ occurs 21,907 times
(rank: 30; 5% of turns). Clarification Requests (CRs) constitute approximately 4–5% of all utterances (see e.g. Purver 2004; Rodriguez & Schlangen 2004). Moreover, there is suggestive evidence from artificial life simulation studies that the existence of CRs is not an incidental feature of interaction but a key component in the long-term viability of a language. Macura & Ginzburg (2006) and Macura (2007) show that when repair acts are a part of a linguistic interaction system, a stable language can be maintained over generations, whereas in a community endowed with a language that lacks CRification – as I refer to the interaction brought about by a CR – the emergent divergence among language users is so high that the language eventually dies out. Ignoring metacommunicative interaction, as has been the case for just about the entire tradition of formal semantics, means missing out on one of the basic building blocks of linguistic interaction. Situation Semantics was itself complicit in this. However, the view of language it provides, with its reference to speech events as part of the semantic domain and its reification of meanings, provides important building blocks for a theory of metacommunicative interaction.

How then to integrate metacommunicative aspects into the semantic process? Such phenomena have been studied extensively by psycholinguists and conversational analysts in terms of notions such as grounding and feedback (in the sense of Clark 1996 and Allwood 1995, respectively) and of repair (in the sense of Schegloff 1987). The main claim, which originates with Clark & Schaefer (1989), is that any dialogue move m1 made by A must be grounded (viz. acknowledged as understood) by the other conversational participant B before it enters the common ground; failing this, CRification must ensue. While Clark and Schaefer's assumption about grounding is somewhat too strong, as Allwood argues, it provides a starting point, indicating the need to interleave the potential for grounding/CRification incrementally, the size of the increments being an important empirical issue.

From a semantic theory, we might expect the ability to generate concrete predictions about the forms/meanings of metacommunicative utterances in context. Such a characterization needs to cover both the range of possibilities associated with successful communication (grounding) and those associated with imperfect communication – indeed it has been argued that miscommunication is the more general case (see e.g. Healey 2008). Thus, we can suggest that the adequacy of a semantic theory involves the ability to characterize, for any utterance type, the update that emerges in the aftermath of successful grounding and the full range of possible CRs otherwise. This is, arguably, the early 21st century analogue of truth conditions. The update component of this criterion builds on earlier adequacy criteria that emerged from dynamic semantics frameworks (see article 38 (Dekker) Dynamic semantics). Nonetheless, these frameworks have abstracted away from metacommunication. I now consider two general approaches that strive to develop semantic theories capable of delivering grounding conditions/CRification potential.
The first approach, an extension of Discourse Representation Theory (DRT) (see article 37 (Kamp & Reyle) Discourse Representation Theory), aims at explicating inter alia the potential for acknowledgements and utterance-oriented presuppositions; the second approach, constructed from the start as a theory of dialogue per se, shows how to characterize CRification potential. A crucial assumption both approaches bear in common, one that distinguishes them from other dynamic semantic work (e.g. Roberts 1996; Groenendijk 1998; Dekker 2004; Asher & Lascarides 2003), but one that seems inescapable if metacommunicative
interaction is to be tackled, is the need for semantic distributivity: given that a single (public) input can lead to distinct outputs for each conversationalist, the effect of semantic operations can no longer be defined on a common ground simpliciter, but needs in one way or another to be relativized across the conversational participants. This is exemplified in Turn Taking Puzzles (Ginzburg 1997) such as (14) and (15), where resolution possibilities for ellipsis vary depending on who gets the turn:

(14) a. A: Who does Bo admire? B: Bo?
b. Reading 1 (short answer): Does Bo admire Bo?
c. Reading 2 (clausal confirmation): Are you asking who BO (of all people) admires?
d. Reading 3 (intended content clarification): Who do you mean 'Bo'?

(15) a. A: Who does Bo admire? Bo?
b. Reading 1 (short answer): Does Bo admire Bo?
c. Reading 2 (self-correction): Did I say 'Bo'?

The relativization of context is what enables an account of the contrast between (14) and (15), sketched in section 4.3: the semantic material necessary for ellipsis resolution in cases like (14c,d) can only emerge once a clarification request has been introduced by the addressee.
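To make the distributivity requirement concrete, the following sketch (in Python; all names are mine and belong to none of the frameworks discussed) contrasts per-participant information states with a single common ground: the same public event can yield diverging contexts depending on what each participant can resolve.

```python
from dataclasses import dataclass, field

@dataclass
class InfoState:
    """One conversationalist's view of the dialogue (simplified)."""
    facts: set = field(default_factory=set)
    pending_issues: list = field(default_factory=list)

def update(state: InfoState, event: str, can_resolve: bool) -> InfoState:
    # The same public event yields different outputs depending on whether
    # this participant can resolve its content (e.g. knows who 'Bo'
    # refers to) -- so updates are defined per participant, not on a
    # single common ground.
    if can_resolve:
        state.facts.add(event)
    else:
        state.pending_issues.append(f"clarify: {event}")
    return state

# One public utterance, two participants, two diverging contexts:
a = update(InfoState(), "Bo left", can_resolve=True)
b = update(InfoState(), "Bo left", can_resolve=False)
assert a.facts != b.facts
```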
4.2. Acknowledgements, grounding, and micro-conversational events

Massimo Poesio and David Traum and collaborators (e.g. Poesio & Traum 1997; Matheson, Poesio & Traum 2000; Poesio & Rieser 2009) have developed a framework known as PTT (not an acronym), which integrates a dynamic semantic framework (a version of DRT, Kamp & Reyle 1993) with a framework for representing conversational interaction inspired by speech act theory. One of the starting points of PTT is the assumption Speech Events as Semantic Entities (see (3b) above). On the basis of this, they assimilate the treatment of speech acts to the treatment of other events in DRT. Thus, conversational events can serve as the antecedents of anaphoric expressions, just like normal events. The standard DRT construction algorithm would assign to the text in (16a) an interpretation along the lines of (16b) (using the syntax of Poesio & Muskens (1997) for Discourse Representation Structures (DRSs)) – a single DRS containing the merged propositional content of both assertions. In contrast, Poesio and Traum hypothesize that upon hearing the assertions in (16a), the common ground in the conversation would be roughly as in (16c):

(16) a. A: There is an engine at Avon. B: It is hooked to a boxcar.
b. [x,w,y,z,s,s'| engine(x), Avon(w), s: at(x,w), boxcar(y), s': hooked-to(z,y), z is x]
c. [ce1,ce2| ce1: assert(A,B,[x,w,s| engine(x), Avon(w), s: at(x,w)]), ce2: assert(B,A,[y,z,s'| boxcar(y), s': hooked-to(z,y), z is x])]

(16c) records the occurrence of two conversational events, ce1 and ce2, both of type assert, whose propositional contents are separate DRSs specifying the interpretation of
the two utterances in (16a). The discourse entities ce1 and ce2 can serve as antecedents both of implicit anaphoric references, e.g. in the case of 'backward' acts like answers to questions, and of explicit ones. Consider (17): B's utterance may be viewed as performing at least two functions here: implicitly accepting the option proposed in ce1, and performing a query. Indeed, backward-looking acts such as accept (for the backward-/forward-looking dialogue act dichotomy see Core & Allen 1997) are all implicitly anaphoric to a previous conversational event (ce1 in this case); hence the assumption that conversational events introduce discourse markers just like normal events do.

(17) a. A: We should send an engine to Avon. B: Shall we use engine E3?
b. [ce1,ce2,ce3| ce1: open-option(A,B,[x,w,e| engine(x), Avon(w), e: send(A,B,x,w)]), ce2: accept(B,ce1), ce3: ask(B,A,[y,e'| engine(y), E3(y), e': use(A,B,y)])]

In fact, as mentioned earlier, Poesio and Traum develop their theory on the basis of a strong and dynamicized version of Speech Events as Semantic Entities: an utterance is taken to be a sequence of micro-conversational events (MCEs). On this view, the discourse situation is updated not just when a complete sentence has been observed, but whenever a new event is observed. Psychological research suggests that such updates can take place every few milliseconds (Tanenhaus & Trueswell 1995), so that observing the utterance of a phoneme would be sufficient to cause an update; in practice, however, PTT typically assumes that updates take place after every word. The incremental update hypothesis is motivated not just by psychological findings about incremental interpretation in sentential utterances, but also by the fact that in dialogue many types of conversational acts are hardly, if ever, performed with full sentences. A class of non-sentential utterances that quite clearly lead to immediate updates of the discourse situation are those used to perform dialogue control acts – take-turn, keep-turn and release-turn actions whose function is to synchronize the two participants in the conversation as to who is holding the floor (Traum & Hinkelmann 1992) – and acknowledgements. These conversational actions are sometimes performed by sentential utterances that also generate a core speech act (e.g., the second utterance in (17a)), but more commonly they are effected by single-word discourse markers like 'mmh', 'okay', 'well', 'now'. In PTT, lexicon and grammar are formulated as defeasible rules characterizing the update potential of locutionary acts. The motivations for defeasibility include psycholinguistic results about lexical access, e.g. work such as Onifer & Swinney (1981) demonstrating that conversationalists simultaneously access all meanings of ambiguous words. Lexical entries and syntactic rules link a precondition, stated in terms of the phonological/syntactic characteristics of a micro-conversational event, with a possible effect, stated in terms of the possible meaning of that event. In particular, syntactic rules enable the construction of compound locutionary events, whose atomic constituents are the MCEs corresponding to utterances of individual words. Each locutionary act la1 sets up the potential for a subsequent illocutionary act il1, one of whose effects is to constitute an acknowledgement of la1. This provides the basis for a treatment of grounding and dialogue control particles.
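The incremental core of this picture can be caricatured as an update loop in which each word-sized event immediately updates the discourse situation. The sketch below is illustrative only, with names invented rather than drawn from any PTT implementation:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MCE:
    """A micro-conversational event: here, one word uttered by a speaker."""
    speaker: str
    word: str

def update_discourse_situation(situation: list, mce: MCE) -> list:
    # The discourse situation is updated whenever a new event is
    # observed -- in practice after every word, not only once a full
    # sentence is in.
    return situation + [mce]

situation: list = []
for word in ["an", "engine", "at", "Avon"]:
    situation = update_discourse_situation(situation, MCE("A", word))

# Dialogue control acts (acknowledgements, turn-keeping 'mmh', ...) can
# now target any prefix of the utterance, since every word-sized event
# is already on record.
```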
I illustrate this treatment for 'okay' in its use as an acknowledgement particle. PTT assumes that locutionary acts generate core speech acts – 'generate' here in the causal sense introduced by Goldman (1970). The lexical entry could be specified, roughly, as in (18), where u represents a locutionary act and ce an illocutionary act respectively:
(18) lexical entry for 'OK':
[u,ce| u: utter(A,'okay'), ce: acknowledge(A,ce), generate(u,ce)]

(A highly simplified view of) the conversational score resulting from such an acknowledgement to an (ongoing) utterance by A in (19a) would be roughly as in (19b). This gives a schematic illustration of the emergence of utterance-related presuppositions. There are four micro-conversational events, each characterized in terms of its phonological, syntactic, and semantic characteristics: the events mce1 and mce2 of uttering 'an' and 'engine' respectively, the compound event mce3 of uttering 'an engine', and the event mce4 of uttering 'OK'; mce4 generates a core speech act, the acknowledgement of mce3:

(19) a. ... A: an engine B: OK ...
b. [mce1,mce2,mce3,mce4,ce4|
mce1: utter(A,"an"), cat(mce1) = det, mce1 ↦ λP,Q. [x]; P(x); Q(x),
mce2: utter(A,"engine"), cat(mce2) = N, mce2 ↦ λx. engine(x),
mce1 ≺ mce2, Dtrs({mce1, mce2}, mce3), cat(mce3) = NP,
mce3 ↦ λQ. [x]; engine(x); Q(x), generate(mce3,ce3),
mce4: utter(B,'okay'), cat(mce4) = intj,
ce4: acknowledge(B,mce3), generate(mce4,ce4)]
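Rendered as plain data, (19b) amounts to a set of events, each carrying a phonological, syntactic and (here string-valued) semantic characterization, plus a generate relation linking the 'okay' event to the acknowledgement it effects. This is merely a toy transcription of (19b); the field names are invented:

```python
from dataclasses import dataclass

@dataclass
class Event:
    """A conversational event with the three kinds of characterization
    used in (19b): phonological, syntactic, semantic (stringified)."""
    speaker: str
    phon: str
    cat: str
    sem: str
    daughters: tuple = ()

mce1 = Event("A", "an", "det", "lambda P,Q. [x]; P(x); Q(x)")
mce2 = Event("A", "engine", "N", "lambda x. engine(x)")
mce3 = Event("A", "an engine", "NP", "lambda Q. [x]; engine(x); Q(x)",
             daughters=(mce1, mce2))
mce4 = Event("B", "okay", "intj", "acknowledge(B, mce3)")

# mce4 (a locutionary act) generates the illocutionary act ce4, the
# acknowledgement of the compound event mce3:
generate = [(mce4, "ce4: acknowledge(B, mce3)")]
```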
4.3. CRification and Meaning Reification

The ability to both process and generate clarification questions is vital in all areas of human-computer interaction, ranging from web search to expert systems. This is one reason why integrating CRification into the semantic process has attracted significant interest in computational semantic work (see Schlangen 2004; Purver 2006; DeVault et al. 2005). Above and beyond this, developing a theory which can predict the clarification potential of utterances – the possible forms and contents available for their clarification – is an important theoretical challenge: it represents one of the fundamental aspects of interactivity. To date, the main attempts in this direction have been made within the KoS framework (not an acronym) (Ginzburg & Cooper 2004; Purver 2004; Purver 2006; Ginzburg 2011), where a detailed treatment of the phenomena discussed in this section can be found. KoS is formalized in Type Theory with Records. What is crucial for current purposes about this formalism, which takes situation semantics as one of its inspirations, is that it provides access to both types and tokens at the object level. Concretely, this enables simultaneous reference to both utterances and utterance types, a key desideratum for modelling metacommunicative interaction. This distinguishes Type Theory with Records from Discourse Representation Theory, for instance, where the witnesses are at a model-theoretic level, distinct from the level of discourse representations. On the view developed in KoS, there is actually no single context, for the reasons explained previously; instead, analysis is formulated at the level of information states, one per conversational participant. The type of such information states is given in (20a). I leave the structure of the private part unanalyzed here; for details, see Larsson (2002). The dialogue gameboard represents information that arises from publicized interactions. Its structure is given in the type specified in (20b):
(20) a. TotalInformationState (TIS):
[ dialoguegameboard : DGB
  private : Private ]

b. DGB =
[ spkr : Ind
  addr : Ind
  c-utt : addressing(spkr, addr)
  Facts : set(Proposition)
  Pending : list(locutionary Proposition)
  Moves : list(locutionary Proposition)
  QUD : poset(Question) ]

In this view of context:
– The speaker/addressee roles serve to keep track of turn ownership.
– Facts represents the shared knowledge conversationalists utilize during a conversation. More operationally, this amounts to information that a conversationalist can use embedded under presuppositional operators.
– Pending represents information about utterances that are as yet un-grounded. Each element of Pending is a locutionary proposition, a proposition individuated by an utterance event and a grammatical type that classifies that event. The motivation for this crucial modelling decision, which concerns the input to grounding and CRification processes and which carries over to the Moves repository, is discussed below.
– Moves represents information about utterances that have been grounded. The main motivation is to segregate, from the entire repository of presuppositions, information on the basis of which coherent reactions to the latest conversational move can be computed. For various purposes (e.g. characterizing the preparatory conditions of moves such as greeting and parting) it is actually important to keep track of the entire repository of moves.
– QUD (mnemonic for Questions Under Discussion): questions that constitute a "live issue", that is, questions that have been introduced for discussion at a given point in the conversation and have not yet been downdated. The role of questions in structuring context has been recognized in a variety of works, including Hamblin (1970), Carlson (1983), van Kuppevelt (1995), Ginzburg (1994), Ginzburg (1996a), Roberts (1996), Larsson (2002). There are additional ways for questions to get added to QUD, the most prominent of which arises during metacommunicative interaction, as we will see shortly. Being maximal in QUD (max-qud) corresponds to being the current 'discourse topic' and is a key component of the theory.

The dialogue gameboard, then, constitutes the publicized context in KoS – taking into account that conversationalists' DGBs need not be identical throughout. Work in KoS (e.g. Fernández & Ginzburg 2002; Fernández 2006; Ginzburg 2011) has shown that virtually all types of non-sentential utterance, ranging from short answers, propositional
lexemes (e.g. 'yes', 'no'), through reprise fragments, can be analyzed as indexical expressions relative to the DGB. Context change is specified in terms of conversational rules: rules that specify the effects applicable to a DGB that satisfies certain preconditions. This allows illocutionary effects (preconditions for and effects of greeting, querying, assertion, parting, etc.) to be modelled, interleaved with locutionary effects, our focus here. In the immediate aftermath of the speech event u, Pending gets updated with a record of the form [sit = u, sit-type = Tu] (of type LocProp, for locutionary proposition). Here Tu is a grammatical type that emerges during the process of parsing u, as already exemplified above in (6). The relationship between u and Tu – describable in terms of the Austinian proposition pu = [sit = u, sit-type = Tu] (see (6) and article 35 (Ginzburg) Situation Semantics and NL ontology) – can be utilized in providing an analysis of grounding/CRification conditions:

(21) a. Grounding: pu is true – the utterance type fully classifies the utterance token.
b. CRification: Tu is weak (e.g. incomplete word recognition) or u is incompletely specified (e.g. incomplete contextual resolution).

Thus, pending utterances are the locus off which grounding/CR conditions are read. Without saying much more, we can formulate a lexical entry for CR particles like 'eh?' (Purver 2004). Given a context that supplies a speaker, an addressee and a pending utterance, the content expressed is a question querying the intended content of that utterance:

(22)
[ phon : eh
  cat = interjection : syncat
  c-params : [ spkr : IND
               addr : IND
               pending : utt
               c2 : address(addr, spkr, pending) ]
  cont = Ask(c-params.spkr, c-params.addr, λx.Mean(c-params.addr, c-params.pending, x)) : IllocProp ]

(22) is straightforward apart from one point: what is the type utt? This is actually a fundamental semantic issue, one which, as we will see, responds to the underdetermination of information by interpretation desideratum raised in section 1: what is the semantic type of pending? In other words, what information needs to be associated with pending to enable the formulation of grounding conditions/CR potential? The requisite information needs to be such that it enables the original speaker to interpret and recognize the coherence of the range of possible clarification queries that the original addressee might make.
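Read operationally, (20b) and (21) can be glossed as in the following sketch, in which a locutionary proposition pairs an utterance token with the grammatical type assigned to it, and grounding amounts to the truth of that proposition. The type-checking predicate is only a stand-in for the genuine TTR judgement u : Tu, and all field names are rough approximations of (20b):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class LocProp:
    """Austinian locutionary proposition: sit = u, sit-type = Tu."""
    sit: dict                          # the utterance token, as a record
    sit_type: Callable[[dict], bool]   # toy stand-in for the type Tu

    def is_true(self) -> bool:
        # Grounding condition (21a): Tu fully classifies u.
        return self.sit_type(self.sit)

@dataclass
class DGB:
    """Dialogue gameboard, loosely after (20b)."""
    spkr: str
    addr: str
    facts: set = field(default_factory=set)
    pending: list = field(default_factory=list)  # ungrounded LocProps
    moves: list = field(default_factory=list)    # grounded LocProps
    qud: list = field(default_factory=list)      # questions, maximal first

# A pending utterance is grounded just in case its type classifies it:
u = {"phon": "did bo leave", "spkr": "A", "bo-referent": "b"}
Tu = lambda rec: "bo-referent" in rec            # toy stand-in for u : Tu
pu = LocProp(sit=u, sit_type=Tu)
assert pu.is_true()                              # (21a) holds
```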
Meanings – in the sense I discussed earlier of functions from contexts, which provide values for certain parameters (the contextual parameters), to contents – provide a useful notion for conceptualizing grounding/clarification potential (and were exploited for this purpose in Ginzburg 1996b). This is because the range of contextual parameters offers a possible characterization of the contextually variable, and hence potentially problematic, constituents of utterance content. Note, though, that if we conceive of meanings as entities which characterize potential sources of misunderstanding, the contextual parameters will need to include all open-class sub-utterances of a given utterance type (i.e. including verb, common noun, and adjective sub-utterances). This is a far cry from the four-place indices of Montague and Kaplan, from the meanings envisaged by Barwise and Perry, and even from dynamicized meanings in dynamic semantics. (For experimental evidence about which lexical categories are viewed as clarifiable see Purver 2004.) Ginzburg & Cooper (2004) argue that, nonetheless, even radically context-dependent meanings of this kind are not quite sufficient to characterize CR potential. One problem is the familiar one of grain. In terms of the concept or property that they represent, one would be hard pressed to distinguish the meanings of words such as attorney and lawyer. And yet, since knowledge of language is not uniform, it is clear that the clarification potential of the sentences in (23) is not identical – which word was used initially makes a difference to how the clarification can be formulated:

(23) a. Ariadne: Jo is a lawyer. Bora: A lawyer? / What do you mean a lawyer? / #What do you mean an advocate? / #What do you mean an attorney?
b. Ariadne: Jo is an advocate. Bora: #What do you mean a lawyer? / An advocate? / What do you mean an advocate? / #What do you mean an attorney?

Other arguments derive from the syntactic and phonological parallelism exhibited by non-sentential CRs (exemplified by (10f,g)) to their antecedent sub-utterance, and from the existence of CRs whose function is to request repetition of (parts of) an utterance. Such CRs can, in principle, arise from any sub-utterance and are specified in terms of the utterance's phonological type. Indeed, the fact that any sub-utterance can, in principle, give rise to clarification motivates one relatively minor enhancement to the standard grammatical representation. Instead of keeping track solely of immediate constituents, as is done in formalisms such as HPSG via the feature dtrs, we enhance the representation so that it keeps track of all constituents. This is done by positing an additional, set-valued field in the type definition of signs, dubbed constit(uent)s, illustrated below in Fig. 36.3. In Ginzburg (2011) it is shown that this enhancement plays a key role in capturing cross-utterance parallelism, agreement, and scopal and anaphoric antecedency, though here I will only hint at the role it plays in formulating rules that regulate grounding and CRification. The arguments provided hitherto point to the fact that pending must incorporate the utterance type associated by the grammar with the clarification target. This would have independent utility, since it would be the basis for an account of the various utterance presuppositions whose source can only derive from the utterance type (see example (9)).
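The move from dtrs to constits is geometrically small but consequential: a sign records not just its immediate daughters but all its constituents, so any sub-utterance is directly available as a clarification target. A toy rendering (Python; field names are mine, and the recursive property stands in for the set-valued constits field):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sign:
    phon: str
    dtrs: tuple = ()   # immediate constituents only, HPSG-style

    @property
    def constits(self) -> set:
        # The set-valued constit(uent)s field: all constituents,
        # recursively, so every sub-utterance is a potential
        # clarification target.
        out = {self}
        for d in self.dtrs:
            out |= d.constits
        return out

bo = Sign("bo")
leave = Sign("leave")
s = Sign("did bo leave", dtrs=(Sign("did"), bo, leave))  # flattened
assert bo in s.constits   # so the reprise fragment 'Bo?' is licensed
```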
In fact, we encounter here evidence for the assumption Speech Events as Semantic Entities: CRs typically involve utterance anaphoricity. In (24a,b) the issue is not what do you
mean by leaving or who is Bo in general, but what do you mean by leaving or who is Bo in this particular sub-utterance:

(24) a. A: Max is leaving. B: leaving?
b. A: Did Bo leave? B: Who is Bo?

Taken together with the obvious need for pending to include values for the contextual parameters specified by the utterance type, this leads Ginzburg (2011) to argue that the type of pending combines tokens of the utterance, of its parts, and of the constituents of the content with the utterance type associated with the utterance. An entity that fits this specification is the locutionary proposition defined by the utterance, as introduced above in (21). With this in hand, I formulate in (25) a highly simplified utterance processing protocol, which interleaves illocutionary and metacommunicative interaction:

(25) Utterance processing protocol
For an agent A with DGB DGB0: if a locutionary proposition pu = [sit = u, sit-type = Tu] is maximal in Pending:
(a) if pu is true, update Moves with pu;
(b) otherwise: introduce a clarification issue derivable from pu as the maximal element of QUD, and use this context to formulate a clarification request.

There are a small number of schemas that specify the possible clarification issues derivable from a given locutionary proposition pu. These include the issues 'What did A mean by u1?' and 'What did A utter in u1?', where A is the speaker provided in the contextual assignment represented in pu and u1 is a sub-utterance of u. The hypothesis that the context has been incremented with such an issue is taken to be the explanation for how non-sentential CRs such as (10b,f,g) and (12) are interpretable domain-independently. To conclude, Fig. 36.3 offers a schematic illustration of how a single utterance – here of 'Did Bo leave?' – can lead to distinct updates among distinct participants at the 'public level' of context. In this case this arises due to differential ability to anchor the contextual parameters. The utterance u0 has three sub-utterances, u1, u2, u3, given in Fig. 36.3 with their approximate pronunciations. A can ground her own utterance, since she knows the values of the contextual parameters, which I assume here for simplicity include the speaker and the referent of the sub-utterance 'Bo'. This means that the locutionary proposition associated with u0 – the proposition whose situational value is a record that arises by unioning u0 with the witnesses for the contextual parameters and whose type is given in Fig. 36.3 – is true. This enables the 'canonical' illocutionary update to be performed: the issue 'whether b left' becomes the maximal element of QUD. In contrast, let us assume that B lacks a witness for the referent of 'Bo'. As a result, the locutionary proposition associated with u0 which B can construct is not true. Given this, B increments QUD with the issue 'who was meant by A as the referent of sub-utterance u2', and the locutionary proposition associated with u0 which B has constructed remains in Pending.
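The protocol in (25) is essentially executable. In the sketch below (continuing the toy classes from the earlier sketch; clarification_issues is a hypothetical stand-in for the schemas just mentioned), a true locutionary proposition is transferred from Pending to Moves, while an untrue one pushes a clarification issue onto QUD and remains pending:

```python
def process_max_pending(dgb, clarification_issues):
    """One step of the utterance processing protocol (25), schematically.

    `dgb` has pending/moves/qud lists, maximal element first;
    `clarification_issues(pu)` returns issues such as
    'What did A mean by u1?' for sub-utterances u1 of pu's utterance.
    """
    pu = dgb.pending[0]
    if pu.is_true():
        # (25a): grounding -- pu is transferred to Moves, enabling the
        # canonical illocutionary update.
        dgb.moves.insert(0, dgb.pending.pop(0))
    else:
        # (25b): CRification -- a clarification issue becomes max-QUD;
        # pu stays in Pending until that issue is resolved.
        dgb.qud.insert(0, clarification_issues(pu)[0])
    return dgb
```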
Fig. 36.3: A single utterance gives rise to distinct updates of the DGB for distinct participants
5. Closing remarks

One of the innovative contributions of situation semantics has been the Relational Theory of Meaning, an utterance-oriented approach to semantics which naturalizes meanings as first-class entities. The origins of this theory were somewhat philosophical, rooted in a desire for an ecologically realistic semantics – a semantics that takes seriously the resource-bounded nature of situated agents. The tools developed in the wake of this stance have emerged in recent years as technically significant for the semantic analysis of actual conversational speech, specifically the analysis of metacommunicative interaction, one of the constitutive features of conversation.

I would like to thank Robin Cooper and the Editors for many helpful comments on an earlier draft and Noor van Leusen for much helpful advice on finalizing this document.
6. References

Allwood, Jens 1995. An activity based approach to pragmatics. Gothenburg Papers in Theoretical Linguistics 76. Reprinted in: H. Bunt & W. Black (eds.). Abduction, Belief and Context in Dialogue: Studies in Computational Pragmatics. Amsterdam: Benjamins, 2000, 47–80.
Asher, Nicholas & Alex Lascarides 2003. Logics of Conversation. Cambridge: Cambridge University Press.
Barwise, Jon 1987. Noun phrases, generalized quantifiers, and anaphora. In: P. Gärdenfors (ed.). Generalized Quantifiers: Linguistic and Logical Approaches. Dordrecht: Reidel, 1–30.
Barwise, Jon 1989. The Situation in Logic. Stanford, CA: CSLI Publications.
Barwise, Jon & John Perry 1983. Situations and Attitudes. Cambridge, MA: The MIT Press.
Carlson, Lauri 1983. Dialogue Games: An Approach to Discourse Analysis. Dordrecht: Reidel.
Clark, Herbert 1996. Using Language. Cambridge: Cambridge University Press.
Clark, Herbert & Edward Schaefer 1989. Contributing to discourse. Cognitive Science 13, 259–294.
Cooper, Robin 1996. The role of situations in Generalized Quantifiers. In: S. Lappin (ed.). Handbook of Contemporary Semantic Theory. Oxford: Blackwell, 65–86.
Cooper, Robin 2006. Austinian truth in Martin-Löf Type Theory. Research on Language and Computation 3, 333–362.
Cooper, Robin & Massimo Poesio 1994. Situation Theory. Fracas Deliverable D8, Centre for Cognitive Science, Edinburgh: The Fracas Consortium.
Core, Mark & James Allen 1997. Coding dialogs with the DAMSL annotation scheme. In: D. Traum (ed.). Working Notes of the AAAI Fall Symposium on Communicative Action in Humans and Machines. Cambridge, MA: MIT, 28–35.
Dekker, Paul 2004. The pragmatic dimension of indefinites. Research on Language and Computation 2, 365–399.
DeVault, David, Natalia Kariaeva, Anubha Kothari, Iris Oved & Matthew Stone 2005. An information-state approach to collaborative reference. In: M. Nagata & T. Pedersen (eds.). Proceedings of the ACL 2005 Interactive Poster and Demonstration Sessions. Morristown, NJ: Association for Computational Linguistics, 1–4.
Dowty, David, Robert Wall & Stanley Peters 1981. Introduction to Montague Semantics. Dordrecht: Reidel.
Fernández, Raquel 2006. Non-Sentential Utterances in Dialogue: Classification, Resolution and Use. Ph.D. dissertation. King's College, London.
Fernández, Raquel & Jonathan Ginzburg 2002. Non-sentential utterances: A corpus study. Traitement automatique des langues. Dialogue 43, 13–42.
Gamut, Louis 1991. Logic, Language, and Meaning, vol. 2: Intensional Logic and Logical Grammar. Chicago, IL: The University of Chicago Press.
Gawron, Mark & Stanley Peters 1990. Anaphora and Quantification in Situation Semantics. Stanford, CA: CSLI Publications.
Gibson, James 1979. The Ecological Approach to Visual Perception. Mahwah, NJ: Lawrence Erlbaum Associates.
Ginzburg, Jonathan 1994. An update semantics for dialogue. In: H. Bunt, R. Muskens & G. Rentier (eds.). Proceedings of the 1st International Workshop on Computational Semantics. Tilburg: ITK, Tilburg University.
Ginzburg, Jonathan 1996a. Dynamics and the semantics of dialogue. In: J. Seligman & D. Westerståhl (eds.). Logic, Language, and Computation. Stanford, CA: CSLI Publications, 221–237.
Ginzburg, Jonathan 1996b. Interrogatives: Questions, facts, and dialogue. In: S. Lappin (ed.). Handbook of Contemporary Semantic Theory. Oxford: Blackwell, 359–423.
Ginzburg, Jonathan 1997. On some semantic consequences of turn taking. In: P. Dekker, M. Stokhof & Y. Venema (eds.). Proceedings of the 11th Amsterdam Colloquium. Amsterdam: ILLC, 145–150.
Ginzburg, Jonathan 2011. The Interactive Stance: Meaning for Conversation. Oxford: Oxford University Press.
Ginzburg, Jonathan & Robin Cooper 2004. Clarification, ellipsis, and the nature of contextual updates. Linguistics & Philosophy 27, 297–366.
Goldman, Alvin I. 1970. A Theory of Human Action. Englewood Cliffs, NJ: Prentice Hall.
Groenendijk, Jeroen 1998. Questions in update semantics. In: J. Hulstijn & A. Nijholt (eds.). Proceedings of TwenDial 98, 13th Twente Workshop on Language Technology. Twente: Twente University, 125–137.
Hamblin, Charles L. 1970. Fallacies. London: Methuen.
Healey, Patrick 2008. Interactive misalignment: The role of repair in the development of group sub-languages. In: R. Cooper & R. Kempson (eds.). Language in Flux: Dialogue Coordination, Language Variation, Change and Evolution. London: College Publications.
Israel, David & John Perry 1996. Where monsters dwell. In: J. Seligman & D. Westerståhl (eds.). Logic, Language, and Computation. Stanford, CA: CSLI Publications, 303–316.
Kamp, Hans & Uwe Reyle 1993. From Discourse to Logic. Dordrecht: Kluwer.
Kaplan, David 1989. Demonstratives. In: J. Almog, J. Perry & H. Wettstein (eds.). Themes from Kaplan. New York: Oxford University Press, 481–614.
van Kuppevelt, Jan 1995. Discourse structure, topicality and questioning. Journal of Linguistics 32, 109–147.
Larsson, Staffan 2002. Issue-based Dialogue Management. Doctoral dissertation. University of Gothenburg.
Levinson, Stephen 1983. Pragmatics. Cambridge: Cambridge University Press.
Macura, Zoran 2007. Metacommunication and Lexical Acquisition in a Primitive Foraging Environment. Ph.D. dissertation. King's College, London.
Macura, Zoran & Jonathan Ginzburg 2006. Lexicon convergence in a population with and without metacommunication. In: P. Vogt (ed.). Proceedings of EELC 2006. Heidelberg: Springer, 100–112.
Matheson, Colin, Massimo Poesio & David Traum 2000. Modeling grounding and discourse obligations using update rules. In: Proceedings of the 1st Annual Meeting of the North American Chapter of the ACL. San Francisco, CA: Morgan Kaufmann, 1–8.
Montague, Richard 1974. The proper treatment of quantification in ordinary English. In: R. Thomason (ed.). Formal Philosophy. Selected Papers of Richard Montague. New Haven, CT: Yale University Press, 247–270.
Onifer, William & David A. Swinney 1981. Accessing lexical ambiguities during sentence comprehension: Effects of frequency of meaning and contextual bias. Memory and Cognition 9, 225–236.
Partee, Barbara 1985. Situations, worlds, and contexts. Linguistics & Philosophy 8, 53–58.
Perry, John 2000. The Problem of the Essential Indexical, Enlarged Edition. Stanford, CA: CSLI Publications.
Poesio, Massimo & Reinhard Muskens 1997. The dynamics of discourse situations. In: P. Dekker, M. Stokhof & Y. Venema (eds.). Proceedings of the 11th Amsterdam Colloquium. Amsterdam: ILLC, 247–252.
Poesio, Massimo & Hannes Rieser 2009. (Prolegomena to a theory of) Completions, continuations, and coordination in dialogue. Ms. Colchester/Bielefeld, University of Essex/University of Bielefeld.
Poesio, Massimo & David Traum 1997. Conversational actions and discourse situations. Computational Intelligence 13, 309–347.
Pollard, Carl & Ivan A. Sag 1994. Head Driven Phrase Structure Grammar. Chicago, IL: The University of Chicago Press.
Purver, Matthew 2004. The Theory and Use of Clarification in Dialogue. Ph.D. dissertation. King's College, London.
Purver, Matthew 2006. CLARIE: Handling clarification requests in a dialogue system. Research on Language & Computation 4, 259–288.
Purver, Matthew, Jonathan Ginzburg & Patrick Healey 2001. On the means for clarification in dialogue. In: J. van Kuppevelt & R. Smith (eds.). Current and New Directions in Discourse and Dialogue. Dordrecht: Kluwer, 235–256.
Roberts, Craige 1996. Information structure: Towards an integrated formal theory of pragmatics. In: J.-H. Yoon & A. Kathol (eds.). OSU Working Papers in Linguistics, vol. 49: Papers in Semantics. Columbus, OH: The Ohio State University, 91–136.
Rodriguez, Kepa & David Schlangen 2004. Form, intonation and function of clarification requests in German task-oriented spoken dialogues. In: J. Ginzburg & E. Vallduvi (eds.). Proceedings of Catalog'04, the 8th Workshop on the Semantics and Pragmatics of Dialogue. Barcelona: Universitat Pompeu Fabra, 101–108.
Schegloff, Emanuel 1987. Some sources of misunderstanding in talk-in-interaction. Linguistics 25, 201–218.
Schlangen, David 2004. Causes and strategies for requesting clarification in dialogue. In: M. Strube & C. Sidner (eds.). Proceedings of the 5th SIGdial Workshop on Discourse and Dialogue. Stroudsburg, PA: Association for Computational Linguistics, 136–143.
Tanenhaus, Michael & John Trueswell 1995. Sentence comprehension. In: J. Miller & P. Eimas (eds.). Handbook of Perception and Cognition, vol. 11: Speech, Language and Communication. New York: Academic Press, 217–262.
Traum, David & Elizabeth Hinkelmann 1992. Conversation acts in task-oriented spoken dialogue. Computational Intelligence 8, 575–599.

Jonathan Ginzburg, Paris (France)
37. Discourse Representation Theory

1. Introduction
2. DRT at work
3. Presupposition and binding
4. Binding in DRT
5. Lexicon and inference
6. Extensions
7. Direct reference and anchors
8. Coverage, extensions of the framework, implementations
9. References
Abstract

Discourse Representation Theory (DRT) originated from the desire to account for aspects of linguistic meaning that have to do with the connections between sentences in a discourse or text (as opposed to the meanings that individual sentences have in isolation). The general framework it proposes is dynamic: the semantic contribution that a sentence makes to a discourse or text is analysed as its contribution to the semantic representation – Discourse Representation Structure or DRS – that has already been constructed for the sentences preceding it. Interpretation is thus described as a transformation process which turns DRSs into other (as a rule more informative) DRSs, and meaning is explicated in terms of the canons that govern the construction of DRSs. DRT's emphasis on semantic representations distinguishes it from other dynamic frameworks (such as the Dynamic Predicate Logic and Dynamic Montague Grammar developed by Groenendijk and Stokhof, and numerous variants of those). DRT is – both in its conception and in the details of its implementation – a theory of semantic representation, or logical form. The selection of topics for this survey reflects our view of what are the most important contributions of DRT to natural language semantics (as opposed to philosophy or artificial intelligence).
1. Introduction

1.1. Origins

The origins of Discourse Representation Theory (DRT) had to do with the semantic connection between adjacent sentences in discourse. The starting point was the analysis of tense,
and more specifically the question of how to define the different roles of the Imperfect (Imp) and the Simple Past (PS) in French. The semantic effects these tenses produce are often visible through the links they establish between the sentences in which they occur and the sentences preceding them. A telling example, which has become something of a prototype for the sentence-linking role of tenses, is the following. (1) is the original example, in French; (2), its translation into English, establishes the same point.

(1) Quand Alain ouvrit (PS) les yeux, il vit (PS) sa femme qui était (Imp) debout près de son lit.
a. Elle lui sourit. (PS)
b. Elle lui souriait. (Imp)

(2) When Alain opened his eyes he saw his wife who was standing by his bed.
a. She smiled.
b. She was smiling.

The difference between (1a) and (1b) is striking: the PS-sentence in (1a) is understood as describing the reaction of Alain's wife to his waking up, the Imp-sentence in (1b) as describing a state of affairs that already holds at the time when Alain opens his eyes and sees her: the very first thing he sees is his smiling wife. In the late seventies the study of the tenses of French and other languages led to the conviction that their discourse-linking properties are an essential aspect of their meaning, and an effort got under way to formulate interpretation rules for different tenses that make their linking roles explicit. In 1980 came the awareness that the mechanisms which account for the inter-sentential connections established by tenses can also be invoked to explain the inter- and intra-sentential links between pronouns and their anaphoric antecedents. (An analogy in the spirit of Partee 1973, but within the realm of anaphora rather than deixis.) DRT was the result of working out the details for a small fragment dealing with sentence-internal and -external anaphora. This first fragment (Kamp 1981a) dealt only with pronominal anaphora, but a treatment of temporal anaphora, which offered an analysis of, among others, the anaphoric properties of PS and Imp, followed in the same year (Kamp 1981b). The theory presented in Kamp (1981a) proved to be equivalent to the independently developed File Change Semantics (FCS) of Heim, which became available to a general audience at roughly the same time (Heim 1982). However, DRT and FCS were inspired by different intentions from the start, a difference that became more pronounced with the advent of Groenendijk and Stokhof's Dynamic Semantics (Groenendijk & Stokhof 1991; Groenendijk & Stokhof 1990). Dynamic Semantics in the spirit of Groenendijk and Stokhof followed the lead of FCS, not DRT (Barwise & Perry 1983; Rooth 1987). From the very beginning one of the strong motivations of DRT was the desire to capture certain features of the way in which interpretations of sentences, texts and discourses are represented in the mind of the interpreter, including features that cannot be recaptured from the truth conditions that the chosen interpretation determines. This representational aspect of DRT was at first seen by some as a drawback, viz. as an unwelcome deviation from the emphatically anti-psychologistic methods and philosophy of Montague Grammar (Montague 1973; Montague 1970a; Groenendijk & Stokhof 1990);
but with time this resistance appears to have lessened, largely because of the growing trend to see linguistics as a branch of cognitive science. (How good the representations of DRT are from a cognitive perspective, i.e. how much they tell us about the way in which humans represent information – or at least how they represent the information that is conveyed to them through language – is another matter, and one about which the last word has not been said.) In the course of the 1980s the scope of DRT was extended to the full paradigm of tense forms in French and English, as well as to a range of temporal adverbials, to anaphoric plural pronouns and other plural NPs, to the representation of propositional attitudes and attitude reports, i.e. sentences and bits of text that describe the propositional attitudes of an agent or agents (Kamp 1990; Asher 1986; Kamp 2003), and to ellipsis (Asher 1993; Lerner & Pinkal 1995; Hardt 1992). The nineties saw, besides extension and consolidation of the applications mentioned, a theory of lexical meaning compatible with the general principles of DRT (Kamp & Roßdeutscher 1992; Kamp & Roßdeutscher 1994a), and an account of presupposition (van der Sandt 1992; Beaver 1992; Beaver 1997; Beaver 2004; Geurts 1994; Geurts 1999; Geurts & van der Sandt 1999; Kamp 2001a; Kamp 2001b; van Genabith, Kamp & Reyle 2010). The nineties also saw the beginnings of two important extensions of DRT that have become theories in their own right, with their own names: U(nderspecified)DRT (Reyle 1993) and S(egmented)DRT (Asher 1993; Lascarides & Asher 1993; Asher & Lascarides 2003). SDRT would require a chapter of its own and we will only say a very few words about it here; UDRT will be discussed (all too briefly) in Section 8.2. In the first decade of the present century DRT was extended to cover focus-background structure (Kamp 2004; Riester 2008; Riester & Kamp 2010) and the treatment of various types of indefinites (Bende-Farkas & Kamp 2001; Farkas & de Swart 2003).
2. DRT at work

In this section we show in some detail how DRT deals with one of the examples that motivated its development.
2.1. Tense in texts As noted in Section 1, the starting point for DRT was an attempt in the late seventies to come to grips with certain problems in the theory of tense and aspect. In the sixties and early seventies formal research into the ways in which natural languages express temporal information had been dominated by temporal logics of the kind that had been developed from the fifties onwards, starting with the work of Prior and others (Prior 1967; Kamp 1968; Vlach 1973). It became increasingly clear, however, that there were aspects to the way in which temporal information is handled in natural languages which neither the original Priorean logics nor later extensions of them could handle. One of the challenges that tenses present to semantic theory is to determine how they combine temporal and aspectual information and how those two kinds of information interact in the links that tenses establish between their own sentences and the ones preceding them. (1) and (2) are striking examples of this challenge. Here we will look at a pair of slightly simplified discourses which illustrate the same point.
(3) Alain woke up.
a. His wife smiled.
b. His wife was smiling.

We will assume that (3a) and (3b) are related to the first sentence of (3) in the same way as the second sentences of (1) and (2) are related to the first sentences there: in the case of (3b) Alain's wife was smiling when Alain opened his eyes, in (3a) she smiled as a reaction to that. These respective interpretations may not be as compelling as they are in the case of (2) or (1), but they are there, and it is these readings for which we are now going to construct semantic representations. We assume that the first sentence has the syntactic structure given in (4):

(4) [S(=CP) [TP [DP [NP Alain]] [T' [T PAST] [VP [V wake_up]]]]]
Note that this structure embodies the assumption, familiar from syntactic theories based on the work of Chomsky, that the interpretation provided by tense is located at a node T high up in the sentence tree, above the one containing the verb. We will see presently what implications this has for the construction of a semantic representation. We assume that the construction proceeds bottom-up (unlike in the first explicit formulations of DRT (Kamp 1981a; Kamp & Reyle 1993), where it proceeds top-down; those formulations were for many years treated as a kind of standard in DRT). Before we describe the construction procedure, we show the resulting DRS in (5), so that the reader will have an idea of what we are working towards. We will then describe the procedure in more detail.

(5)
t e x t