LINGUISTICS & PHILOSOPHY
Edited by Günther Grewendorf • Wolfram Hinzen • Hans Kamp • Helmut Weiss
Band 1 / Volume 1
Markus Werning • Edouard Machery • Gerhard Schurz (Eds.)

The Compositionality of Meaning and Content
Volume I: Foundational Issues

ontos verlag
Frankfurt • Paris • Ebikon • Lancaster • New Brunswick
Bibliographic information published by Die Deutsche Bibliothek
Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliographie; detailed bibliographic data is available on the Internet at http://dnb.ddb.de
North and South America: Transaction Books, Rutgers University, Piscataway, NJ 08854-8042, [email protected]

United Kingdom, Ireland, Iceland, Turkey, Malta, Portugal: Gazelle Books Services Limited, White Cross Mills, Hightown, Lancaster LA1 4XS, [email protected]
© 2005 ontos verlag
P.O. Box 15 41, D-63133 Heusenstamm
www.ontosverlag.com
ISBN 3-937202-52-8
© 2005. No part of this book may be reproduced, stored in retrieval systems or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use of the purchaser of the work.

Printed on acid-free paper, ISO-Norm 9706, FSC-certified (Forest Stewardship Council). This hardcover binding meets the International Library standard.

Printed in Germany by buch bücher dd ag
Contents

Preface 7
Is Compositionality an A Priori Principle? (Daniel Cohnitz) 23
Fodor's Inexplicitness Argument (Reinaldo Elugardo) 59
Compositionality Inductively, Co-inductively and Contextually (Tim Fernando) 87
Confirmation and Compositionality (Ken Gemes) 97
Levels of Perceptual Content and Visual Images: Conceptual, Compositional, or Not? (Verena Gottschling) 111
Recognitional Concepts and Conceptual Combination (Pierre Jacob) 135
How Similarities Compose (Hannes Leitgeb) 147
The Structure of Thoughts (Menno Lievers) 169
Intensional Epistemic Wholes: A Study in the Ontology of Collectivity (Alda Mari) 189
Impossible Primitives (Jaume Mateu) 213
Is Compositionality an Empirical Matter? (Jaroslav Peregrin) 231
The Compositionality of Concepts and Peirce's Pragmatic Logic (Ahti-Veikko Pietarinen) 247
Semantic Holism and (Non-)Compositionality in Scientific Theories (Gerhard Schurz) 271
Right and Wrong Reasons for Compositionality (Markus Werning) 285
Preface

The two volumes published by Ontos – The Compositionality of Meaning and Content: Foundational Issues and The Compositionality of Meaning and Content: Applications to Linguistics, Psychology and Neuroscience – bring together scientists from the disciplines that presently constitute cognitive science: psychology, neuropsychology, philosophy, linguistics, logic, and computational modelling. The purpose was to cast some light on a shared topic of interest, compositionality. To our knowledge, this is the first time that researchers from almost the whole spectrum of cognitive science have united their efforts to understand this important phenomenon.

The contributions to these volumes originated in presentations to two conferences organized by the editors of these volumes. The first conference, CoCoCo – Compositionality, Concepts, & Cognition – was organized in March 2004 at the Heinrich-Heine University of Düsseldorf, Germany. The second, NAC 2004 – New Aspects of Compositionality – was organized in June 2004 at the Ecole Normale Supérieure, Paris, France. A dynamic group of cognitive scientists, including philosophers, psychologists, linguists, computer scientists and neuroscientists, took part in these two lively conferences and shared their insights concerning compositionality with the other conference participants. Both conferences were very well attended and gave rise to discussions, exchanges, and controversies that were, we are convinced, fruitful for all participants. We doubt anybody left these conferences without having improved her grasp on the issues connected to compositionality.

These two conferences resulted from a frustration shared by the editors of these two volumes a couple of years ago. Cognitive science is essentially an interdisciplinary enterprise. And compositionality is a salient issue that is common to most disciplines in cognitive science, as is amply illustrated by these two volumes. In computational modelling and in artificial intelligence, compositionality has been an important issue at least since Fodor and Pylyshyn's (1988) well-known criticism of connectionist models. This article sparked intensive research on whether and how non-classical models of cognition, primarily connectionist models, could accommodate compositionality (e.g., Smolensky, 1988, 1991; van Gelder, 1990; Levy & Gayler, 2004; Werning & Maye, 2004; Werning, 2005). Among linguists, the methodological status of compositionality in semantics has been intensively scrutinized (e.g., Partee, 1984; Janssen, 1986, 1997). In the
philosophy of psychology, Fodor has based his case against philosophical and psychological theories of concepts on compositionality (Fodor & Lepore, 1992, 1996; Fodor, 1998a). In philosophy of science since Quine (1951), Carnap (1956) and Kuhn (1962), it has often been argued that the meaning of theoretical concepts is dependent on the background theory (Schurz, 2005). Kuhn and many other philosophers of science have therefrom inferred a holistic theory-dependence of meaning, to the effect that the meaning of theoretical concepts seems to be non-compositional (Fodor, 1987).

There have also been some recent major advances in issues concerned with compositionality in several disciplines. The formal treatment of compositionality has been considerably advanced by the work of Wilfrid Hodges (1998, 2001; see also Zadrozny, 1994; Westerståhl, 1998; Werning, 2004). In psychology, new models of concept composition were proposed in the nineties (e.g., Wisniewski, 1996; Costello & Keane, 2000). Finally, both experimental and modelling issues connected to compositionality have been important in the vibrant field of cognitive neuroscience (Shastri & Ajjanagadde, 1993; Werning, 2003, 2005; Werning & Maye, 2004; van der Velde & de Kamps, forthcoming).

In spite of all this work on issues connected to compositionality, there had been little contact across the disciplines that constitute cognitive science. To remedy this interdisciplinary deafness and to further the exchange of views across disciplines in cognitive science, we decided to gather a group of cognitive scientists actively working on compositionality. Several contributions to the resulting conferences in Düsseldorf and Paris have been put together in the two volumes.

We are not under the illusion that these two volumes cover all the issues that are connected to compositionality. To illustrate, there is no contribution from classical computer science, although compositionality is also a significant issue in this discipline. In linguistics, the recent debates concerning direct compositionality – the idea, roughly, that semantics works in tandem with the syntax, i.e., that each expression is directly assigned an interpretation as it is syntactically constructed (Jacobson, 1999) – are not represented here. Nonetheless, we think that the contributions gathered in these two volumes cover a substantial fraction of the issues connected to compositionality in cognitive science. We hope that cognitive scientists will find these two volumes helpful and inspiring, and that they will further the development of an interdisciplinary approach to issues around compositionality.

The two conferences on compositionality would not have been possible without the support of various institutions. Compositionality, Concepts, & Cognition in Düsseldorf was generously funded by the Thyssen Foundation and the Heinrich-Heine University, Düsseldorf, while New Aspects of Compositionality in Paris was financially supported by the RESCIF (Réseau des Sciences
Cognitives en Ile-de-France), the department of cognitive studies at the Ecole Normale Supérieure in Paris, the University of Paris-Sorbonne, and the Institut Jean-Nicod. We, the editors, would like to thank these institutions for their support.

We are also pleased to express our gratitude to those who have trusted us and who have supported our efforts. We would particularly like to thank Daniel Andler, director of the department of cognitive studies at the Ecole Normale Supérieure, who backed this project from the outset, as well as Pascal Engel (University of Paris-Sorbonne, philosophy department) and Pierre Jacob (Institut Jean-Nicod, Paris). We would like to thank the prestigious scientific board for the two conferences in Paris and Düsseldorf, which consisted of the following distinguished scholars: Daniel Andler (department of cognitive studies, ENS, Paris), Peter Carruthers (department of philosophy, University of Maryland), James Hampton (department of psychology, City University London), Douglas Medin (department of psychology, Northwestern University), Jesse Prinz (department of philosophy, University of North Carolina, Chapel Hill), François Recanati (Institut Jean-Nicod, Paris), Philippe Schlenker (department of linguistics, UCLA), and Dag Westerståhl (department of philosophy, University of Gothenburg).

We would also like to thank the referees for the two volumes: Claire Beyssade, Daniel Cohnitz, Fabrice Correia, David Danks, Jérôme Dokic, Chris Eliasmith, Manuel García-Carpintero, James Hampton, Heidi Harley, Paul Horwich, Theo Janssen, Kent Johnson, Ingrid Kaufman, Marcus Kracht, Hannes Leitgeb, Pascal Ludwig, Alexander Maye, Thomas Müller, Reiner Osswald, Elisabeth Pacherie, Peter Pagin, Jérôme Pelletier, Josh Schechter, Benjamin Spector, Su Xiaoqin, and Dan Weiskopf.

Finally, in the making of the Düsseldorf conference a number of helpful hands were involved: in particular, those of Myung-Hee Theuer, Marc Breuer, Jens Fleischhauer, Hakan Beseoglu, Markus Stalf, Eckhart Arnold, Nicole Altvater, Marco Lagger, Sven Sommerfeld, and Celia Spoden. The Parisian conference would not have been possible without the help of Evelyne Delmer at the department of cognitive studies of the Ecole Normale Supérieure. To all those we cordially express our thanks.

Compositionality – a plurality of ideas

It is time to say a few words about what compositionality is. In what follows, we briefly review the issues connected to compositionality in the disciplines represented in the two volumes. Before going any further, we would like to emphasize an important caveat. Although "compositionality" has a rather precise
definition, particularly in logic and linguistics, it is used in various ways in philosophy, psychology, neuroscience, etc. It is thus always worthwhile wondering what an author means when she uses this notion.

In spite of this caveat, "compositionality" is usually taken to refer to a property of some representational systems, primarily of languages – be they natural languages like French or English, artificial languages like mathematical or logical languages, or hypothetical languages like the language of thought. As a first approximation, an interpreted representational system R is compositional if and only if, for every complex representation r of R, the meaning of r is determined by the structure of r and the meanings of the constituents of r. It is well known that there are many, more or less precise variants of this principle (Hodges, 1998), but, for present purposes, this formulation will do.

The principle of compositionality applies to a representational system R that contains simple representations and complex representations. Complex representations are made out of other representations – their constituents – according to rules of composition. These rules of composition – the syntax in the case of languages – determine the structure of the complex representations. Simple and complex representations of R are interpreted. R is compositional if and only if the interpretations of the simple and complex representations of R obey the principle of compositionality proposed above (or one of its variants).
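To see what this definition amounts to in the simplest possible case, consider a toy interpreted representational system, sketched below in Haskell (the system and all names are illustrative only): its simple representations are numerals, two rules of composition build complex representations, and the meaning of every complex representation is computed from the meanings of its constituents and the rule combining them.

```haskell
-- A toy representational system: numerals combined by two rules of composition.
data Expr = Lit Int          -- simple (lexical) representations
          | Add Expr Expr    -- complex representation built by rule "Add"
          | Mul Expr Expr    -- complex representation built by rule "Mul"

-- The meaning assignment, defined by recursion on syntactic structure.
-- The meaning of a complex representation is a function (here (+) or (*))
-- of the meanings of its constituents; nothing else is consulted.
meaning :: Expr -> Int
meaning (Lit n)   = n
meaning (Add l r) = meaning l + meaning r
meaning (Mul l r) = meaning l * meaning r

main :: IO ()
main = print (meaning (Mul (Add (Lit 2) (Lit 3)) (Lit 4)))  -- prints 20
```

Natural languages are vastly more complicated, but compositionality, as just defined, is exactly this property: the interpretation never needs to consult anything beyond structure and constituent meanings.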
Logic

In logic, there has been intense work on the formal properties of compositional languages (for an introduction, see, particularly, the Journal of Logic, Language and Information, 10(1), edited by P. Pagin and D. Westerståhl in 2001). Logicians have focused primarily on the following issues. Precise formalisms have been developed to rigorously formulate the principle of compositionality (as well as its variants). In this area, the work of Wilfrid Hodges is remarkable (see, particularly, Hodges, 2001). Following Montague (1970/1974), he has developed an algebraic framework, which allows him to specify various notions of compositionality. On this basis, Hodges (2001) has shown that any syntactically specified language can be provided with a compositional semantics. This result falsifies Hintikka's claim that Independence-Friendly Logic is non-compositional (Hodges, 1997; Sandu & Hintikka, 2001). Nonetheless, it remains that when the meaning of the primitive elements of the vocabulary of the logical language is constrained, not every syntactically characterized language has a compositional semantics. Hodges' results also raise the question of the putative triviality of the principle of compositionality (Zadrozny, 1994; Kazmi & Pelletier, 1998; Westerståhl, 1998). Two important areas of research are also worth pointing to: the relation between compositionality and context and, from a historical point of view, the place of compositionality in Frege's logic and philosophy (e.g., Janssen, 2001).

In our two volumes, Tim Fernando, Ahti-Veikko Pietarinen, Oleg Prosorov, and Gerhard Schurz focus on logical issues connected with compositionality. Interestingly, although the first three papers are committed to three different formal frameworks, they all focus on the relation between compositionality and context. Fernando's article follows and develops Hodges' and Westerståhl's approach. He shows that compositionality can be approached in two different ways, which he calls "inductive" and "co-inductive." Pietarinen and Prosorov explore new territories. Pietarinen's article centers on the relation between compositionality and Peirce's contributions to logic and semantics. Of particular interest is his discussion of how Peirce's logic integrates context-sensitivity and compositionality. Prosorov develops a new formalism to deal with compositionality. Like Fernando and Pietarinen, he discusses in detail the relation between context-sensitivity and compositionality. Finally, Schurz critically discusses Hodges' theorem, which proves that under certain natural conditions a compositional semantics always exists. Schurz argues that this does not imply that humans actually compute the meanings of terms along a compositional function. Rather, one should distinguish between logical and epistemic compositionality. He argues that the Ramsey/Lewis account of theories provides a semantics for theoretical terms that is compositional in the logical, but not in the epistemic, sense.

Linguistics

Compositionality has fuelled many debates in linguistics. The principle of compositionality was an important element of Davidson's and Montague's approaches to semantics (Davidson, 1965/2001; Montague, 1970/1974). However, its methodological status was left unclear. Since then, the question of the methodological status of compositionality has been a constant bone of contention in semantics (Partee, 1984). One can wonder, for instance, whether it is an empirical or a methodological (heuristic) principle. One can also question the utility of this principle, if it is viewed as a methodological principle. This methodological debate is often articulated as an empirical debate: Are natural languages compositional? It has indeed been occasionally claimed that compositionality is more or less frequently violated in natural languages (Higginbotham, 1986; Pelletier, 1994; Fodor, 2001), sparking a debate about the correct linguistic analysis of the putative counter-examples to compositionality.

Recently, three issues connected to compositionality have attracted linguists' attention: direct compositionality, impossible words, and the evolution of compositional languages. As noticed above, the direct compositionality approach is,
roughly, the idea that every syntactic operation is interpreted (Jacobson, 1999). The impossible words debate can be put as follows (Hale & Keyser, 1993, 1999; Fodor & Lepore, 1999, 2005; Johnson, 2004). According to many linguists in lexical semantics, many monomorphemic lexical items like "boy" or "table" have a linguistic structure. For instance, to use a classical example, the linguistic structure of "kill" could be "cause to die." There is no consensus on the correct lexical decomposition, but these linguists agree that when we understand monomorphemic lexical items, we decompose them into simpler elements. Fodor and Lepore (1999) disagree. They argue that, by and large, lexical items like "dog" are primitive: they do not decompose. The impossible words argument is supposed to support the decomposition approach of lexical linguists. The idea is, roughly, that the decomposition approach explains why, as Fodor and Lepore recently put it (2005, p. 354), there is no verb in English "like blik such that 'The desk bliked Mary' means Mary broke the desk (i.e. why there aren't transitive verbs whose subjects are their thematic patients)." Finally, there is a growing body of research on the evolution of compositional natural languages. In recent years, many models have been proposed to account for the appearance of compositional natural languages (e.g., Brighton, 2002; Smith, Brighton, & Kirby, 2003).

In our two volumes, several articles bear on the issues connected with compositionality in linguistics. The compositionality of natural languages is closely scrutinized by Gayral, Kaiser and Lévy. They argue that natural languages are not compositional and conclude that traditional linguistic frameworks are inadequate for explaining linguistic understanding. Taking an opposite stance, Reinaldo Elugardo critically focuses on Fodor's (2001) argument that natural languages are non-compositional and thus that linguistic meaning is derived from original, compositional mental content. Jaume Mateu's article engages in the controversy about impossible words in lexical semantics, siding with Hale and Keyser. Jaroslav Peregrin's and Daniel Cohnitz's articles focus on the status of the principle of compositionality. Peregrin rejects the idea that it is an empirical generalization concerning natural languages and proposes instead that compositionality and meaning are conceptually linked. On the contrary, Cohnitz rebuts the arguments to the effect that compositionality is a priori true, and argues for a substantial notion of compositionality. Shelley Ching-yu Hsieh, Chinfa Lien and Sebastian Meier's and Olav Müller-Reichau's articles are more empirical. Hsieh et al. focus on the meaning of expressions for plants, vegetables and trees (e.g., "garlic", "cabbage") in Mandarin Chinese and German. Müller-Reichau focuses on the relation between semantic composition and the type/token distinction. Henry Brighton's article focuses on the evolution of compositional natural languages. Finally, Filip Buekens is concerned with the linguistic processing of aberrant sentences.
Philosophy

The topic of compositionality has given rise to various debates in philosophy. We here review two, to some extent related, issues: compositionality and the nature of meaning, and compositionality and the nature of concepts.

The nature of meaning, first. Since compositionality is a property of interpreted representational systems, one can wonder whether (or to what extent) compositionality constrains what meaning is in general, or what meaning is for a given language or representational system. This is a very controversial area. Particularly, Fodor and Lepore have argued that compositionality is inconsistent with most views concerning what meaning is or what meaning supervenes upon (Fodor & Lepore, 1992, 2002). For instance, Fodor and Lepore argue that the meaning of a symbol cannot be identified with, nor supervene upon, the use of this symbol (or its functional role if it is a mental symbol). Others have taken diametrically opposed views. Particularly, Horwich (1997) has argued that compositionality does not constrain the nature of meaning (for a different reply to Fodor and Lepore, see Pagin, 1997). The results in logic about compositionality bear on this debate. Debates between Zadrozny, Hintikka, Hodges and Westerståhl have shown that compositionality does not by itself constrain the meaning of logical languages (see also Fodor & Lepore, 2001).

Compositionality has also been used in the debate around the nature of concepts. Fodor has argued that compositionality is inconsistent with most psychological and philosophical theories of concepts (Fodor, 1998a, 1998b, 1998c; Fodor & Lepore, 1996). As he puts it (1998a, p. 104): "compositionality is a sharp sword that cutteth many knots." It is now common to distinguish two arguments from compositionality. The first argument says, as a first approximation, that save for a finite number of exceptions, whoever possesses the concept of x (or understands ⌜x⌝) and the concept of y (or understands ⌜y⌝) is able to entertain the concept of x that is y (or understands ⌜x that is y⌝). To use Fodor's (in)famous example, whoever possesses the concept of pet and the concept of fish is able to entertain the concept of pet fish. This is supposed to be inconsistent with how concepts are individuated according to current psychological and philosophical theories of concepts. For instance, if concepts are recognitional capacities (or, mutatis mutandis, if concepts are prototypes, or are individuated by means of their functional role, etc.), one should be able to recognize pet fish when one possesses the concept of pet and the concept of fish, which is not the case, or so Fodor argues. It is often replied that complex concepts do not have to be similar to simple concepts. For instance, complex concepts do not have to be recognitional, even if simple concepts are.
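Schematically, and in invented notation (Fodor states the point informally), the first argument's key premise can be rendered as a constraint on concept possession:

```latex
% First argument from compositionality (schematic rendering with
% invented predicates): for (almost) every thinker S and concepts X, Y,
\mathit{Possesses}(S, X) \wedge \mathit{Possesses}(S, Y)
  \;\Rightarrow\; \mathit{CanEntertain}(S,\; X\ \text{that is}\ Y).
```

The complaint against recognitional (or prototype, or functional-role) theories is that, allegedly, they fail to validate this conditional.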
The second argument, sometimes called "the argument from reverse compositionality," is supposed to rebut this reply. Fodor argues that, necessarily, whoever possesses a complex concept X THAT IS Y (or understands the complex expression ⌜x that is y⌝) possesses the concepts X and Y (or understands ⌜x⌝ and understands ⌜y⌝). For instance, necessarily, whoever is able to entertain the concept of pet fish possesses the concept of pet and the concept of fish. But, if a complex concept does not have to be recognitional even when its constituents are recognitional, one could entertain a complex concept without entertaining its constituents, which is impossible, or so Fodor claims. Many papers have critically engaged with these two arguments (e.g., Robbins, 2001, 2002; Prinz, 2002, chapter 12; Recanati, 2002; Werning, 2002; Peacocke, 2004). Notice, importantly, that in this context "compositionality" is taken in a slightly different sense: instead of bearing on meaning, it concerns the properties that are constitutive of concepts or of the possession of concepts.

Several articles in our two volumes focus on these issues. Markus Werning's paper critically reviews the three most commonly cited reasons for compositionality – productivity, systematicity and inferentiality – and looks for alternative justifications. Menno Lievers critically discusses the arguments for Evans' Generality Constraint. Denis Bonnay relies on Hodges' formal approach to compositionality (Hodges, 2001) to argue for molecularism, roughly, the idea that the meaning of an expression is determined by its use in some meaning-fixing sentences. Pierre Jacob provides a detailed criticism of Fodor's compositionality argument.

Interestingly, several articles deal with other philosophical topics connected to compositionality. Alda Mari's paper, at the intersection of formal ontology, metaphysics and linguistics, focuses on the nature of groups. Hannes Leitgeb examines formally how similarities could compose, which is of interest both for philosophers of science and for cognitive scientists. Kenneth Gemes' article belongs to the philosophy of science. Gemes shows how the content of a theory can be compositional, meaning that it can be broken up into natural content parts. These content parts are of special importance for the empirical confirmation of theories by empirical evidence. Finally, Verena Gottschling's article and Pierre Poirier and Benoit Hardy-Vallée's article belong to the philosophy of psychology. Gottschling focuses on the nature of the content of perceptual representations. Poirier and Hardy-Vallée focus on what they call the spatial-motor view of cognition, in brief the idea that we are thinking with analogical representations.

Psychology

"Compositionality" is used in a slightly different way in psychology. Since the ground-breaking article by Osherson and Smith (1981), there has been a lot
of experimental and modelling work on how people create complex concepts (for reviews, see Hampton, 1997; Murphy, 2002, chapter 12). Roughly, the idea is that people have in long-term memory some bodies of information, i.e., concepts, about categories like dogs, tables, and so on. However, people do not have in long-term memory any concepts for categories like small dogs, square tables, Harvard graduate students who are carpenters, and so on. Rather, we are able to create on the fly a concept for, say, Harvard graduate students who are carpenters out of our concepts of Harvard graduate students and of carpenters, which themselves are stored in long-term memory. Research on this topic in psychology is indifferently called "concept composition" or "concept combination." In this context, "compositionality" is usually used to refer to the fact that a complex concept XY, for instance the concept of graduate students who are carpenters, is created exclusively on the basis of the concepts X and Y, for instance the concept of graduate students and the concept of carpenters. When this is not the case, for instance because we rely on our background knowledge to create a complex concept, psychologists often speak of compositionality violations.

In the eighties, Smith and colleagues' selective modification model (Smith, Osherson, Rips, & Keane, 1988) attracted a lot of critical attention (Murphy, 1988). Hampton's work on the relation between prototypes and concept combination is also noticeable (Hampton, 1987, 1988). In the nineties, new models were developed. Of particular interest were Wisniewski's model and the model developed by Costello and Keane (Wisniewski, 1996; Costello & Keane, 2000; see also Gagné & Shoben, 1997). To a large extent, this psychological work is continuous with the work in linguistics concerning how people understand complex expressions, particularly noun-noun compounds like "dog newspaper" (e.g., Levi, 1978).

Three articles in our two volumes focus on the psychological issues connected to compositionality. Nick Braisby investigates how his own view of concepts, the Relational View of Concepts, deals with the composition of concepts. He argues that his model, partly inspired by the so-called classical view of concepts, deals well with some problematic cases of concept composition. Building on their model (2000), Costello and Keane examine why the understanding of noun-noun compounds (such as "cow spoon", "grass bird" or "rose gun") seems to be non-compositional. They insist on the importance of pragmatic factors in linguistic understanding. In his article, George Dunbar comes to a similar conclusion. He provides some empirical evidence that the comprehension of noun-noun compounds depends centrally on pragmatic factors.
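To convey the flavor of such models, here is a crude sketch of the idea behind the selective modification model mentioned above (the data structure and the numbers are invented for illustration; the actual model of Smith et al., 1988, is considerably richer): a prototype is represented as a set of dimensions with weighted values, and a modifier like "red" shifts the votes on the relevant dimension.

```haskell
import qualified Data.Map as Map

-- A prototype as a map from dimensions to weighted value distributions
-- (invented toy numbers; illustration only).
type Dimension = String
type Value     = String
type Prototype = Map.Map Dimension [(Value, Double)]

apple :: Prototype
apple = Map.fromList
  [ ("colour", [("red", 0.6), ("green", 0.3), ("brown", 0.1)])
  , ("shape",  [("round", 0.9), ("square", 0.1)]) ]

-- Combining "red" with "apple": shift all the votes on the colour
-- dimension to "red" (selective modification), leaving the rest intact.
modify :: Dimension -> Value -> Prototype -> Prototype
modify dim val = Map.adjust (const [(val, 1.0)]) dim

redApple :: Prototype
redApple = modify "colour" "red" apple
```

A combination handled this way is compositional in the psychologists' sense: the complex concept is computed from the constituent concepts alone. Cases where background knowledge intrudes are exactly the cases such simple mechanisms miss.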
Brain and cognitive modelling

Cognitive science, including neuropsychology, is by and large representationalist: it assumes that cognition consists in manipulating representations. Of course, there is room for disagreement concerning the nature of these representations. Famously, Fodor and Pylyshyn (1988) have argued (1) that compositionality is a necessary property of our mental representational systems and (2) that in connectionist models, representations do not possess this property. In this context, "compositionality" refers to a host of properties – primarily to the capacity to produce structured complex representations out of simpler representations. This article fuelled intensive research on how to model representational systems, particularly among connectionists. This work has progressively merged with the modelling and experimental efforts in computational neuroscience.

Models in cognitive neuroscience have particularly focused on what is known in this field as the binding problem (von der Malsburg, 1981; see the issue of Neuron, 24(1), 1999). Roughly, the binding problem is the following: how is the co-instantiation of properties to be represented? To use a simple example, the binding problem consists in explaining how the brain represents the fact that a unique object is both red and square, by contrast to the fact that one object is red and another is square. To deal with the binding problem, experimental and modelling work has focused particularly, but not exclusively, on neural oscillations and on synchronous signals.

Two contributions in our two volumes bear on these issues. Ralf Garionis analyzes the properties of generative models in the context of unsupervised learning. Frank van der Velde describes the main properties of the model he has been developing with Marc de Kamps – what he calls "a neural 'blackboard' architecture of compositional sentence structure" (van der Velde and de Kamps, forthcoming). This model particularly addresses the binding problem.
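As a toy illustration of the synchrony idea (our sketch, not a summary of any model in these volumes; the names and the discretized phase encoding are invented): if each feature-coding population is tagged with the phase of its oscillation, then features count as bound into one object exactly when they fire in phase.

```haskell
import Data.List (groupBy, sortOn)
import Data.Function (on)

-- A feature together with the (discretized) phase of the neural
-- population coding it.
type Feature = String
type Phase   = Int

-- "A red square and a blue circle": red/square fire in phase with each
-- other, and so do blue/circle, but the two pairs are out of phase.
scene :: [(Feature, Phase)]
scene = [("red", 0), ("square", 0), ("blue", 1), ("circle", 1)]

-- Binding by synchrony: features are co-instantiated (belong to one
-- object) iff their phases coincide.
bound :: [(Feature, Phase)] -> [[Feature]]
bound = map (map fst) . groupBy ((==) `on` snd) . sortOn snd

main :: IO ()
main = print (bound scene)   -- [["red","square"],["blue","circle"]]
```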
Because of their truly interdisciplinary nature, many papers gathered in our two volumes cannot be clearly identified with a specific academic discipline. Most of these papers are at the intersection of two and sometimes three disciplines. To illustrate, several articles are at the border of linguistics and psychology. Braisby's, Costello and Keane's and Dunbar's papers bear equally on how people create complex concepts and how people understand noun-noun compounds. Indeed, they are explicitly inspired by the literature in both fields. Hodges', Westerståhl's and others' results are increasingly used outside logic. Bonnay uses their approach to bear on the philosophical issues about the nature of meaning (see also Peregrin's and Cohnitz's articles). Mari's and Mateu's articles are at the intersection of linguistics and philosophy. Gottschling's and Poirier and Hardy-Vallée's articles are at the intersection of philosophy and psychology. And so on.
It is obvious that "compositionality" is not always used with the same meaning in all these disciplines and that there is room for misunderstanding between, and sometimes within, disciplines. We attempted above to pin down some of these differences in the use of "compositionality." It remains that, across disciplines, there is plenty of convergence on some issues connected to compositionality. It is our hope that these two volumes will foster new exchanges across the disciplines of the cognitive sciences.

Düsseldorf and Pittsburgh, July 2005
E.M., G.S., & M.W.

References

Brighton, H. (2002). Compositional syntax from cultural transmission. Artificial Life, 8(1), 23–54.
Carnap, R. (1956). The methodological character of theoretical concepts. In H. Feigl & M. Scriven (Eds.), Minnesota studies in the philosophy of science (Vol. I, pp. 38–76). Minneapolis: University of Minnesota Press.
Costello, F. J., & Keane, M. T. (2000). Efficient creativity: Constraint guided conceptual combination. Cognitive Science, 24(2), 299–349.
Davidson, D. (1965/2001). Theories of meaning and learnable languages. In Inquiries into truth and interpretation (pp. 3–16). Oxford: Clarendon Press.
Fodor, J. A. (1987). Psychosemantics. Cambridge, MA: MIT Press.
Fodor, J. A. (1998a). Concepts: Where cognitive science went wrong. Cambridge, MA: MIT Press.
Fodor, J. A. (1998b). There are no recognitional concepts – not even RED. In In critical condition: Polemical essays on cognitive science and the philosophy of mind (pp. 35–48). Cambridge, MA: MIT Press.
Fodor, J. A. (1998c). There are no recognitional concepts – not even RED; part 2: The plot thickens. In In critical condition: Polemical essays on cognitive science and the philosophy of mind (pp. 49–62). Cambridge, MA: MIT Press.
Fodor, J. A. (2001). Language, thought and compositionality. Mind & Language, 16, 1–15.
Fodor, J. A., & Lepore, E. (1992). Holism: A shopper's guide. Oxford: Blackwell.
Fodor, J. A., & Lepore, E. (1996). The pet fish and the red herring: Why concepts aren't prototypes. Cognition, 58, 243–276.
Fodor, J. A., & Lepore, E. (1999). Impossible words. Linguistic Inquiry, 30, 445–453.
Fodor, J. A., & Lepore, E. (2001). Why compositionality won't go away: Reflections on Horwich's 'deflationary' theory. Ratio, 14(4), 350–368.
Fodor, J. A., & Lepore, E. (2002). The compositionality papers. Oxford: Clarendon Press.
Fodor, J. A., & Lepore, E. (2005). Impossible words: A reply to Kent Johnson. Mind & Language, 20(3), 353–356.
Fodor, J. A., & Pylyshyn, Z. (1988). Connectionism and cognitive architecture: A critique. Cognition, 28, 3–71.
Gagné, C. L., & Shoben, E. J. (1997). Influence of thematic relations on the comprehension of modifier-noun combinations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 71–87.
Hale, K., & Keyser, S. J. (1993). On argument structure and the lexical expression of syntactic relations. In K. Hale & S. J. Keyser (Eds.), The view from Building 20: Essays on linguistics in honor of Sylvain Bromberger (pp. 53–109). Cambridge, MA: MIT Press.
Hale, K., & Keyser, S. J. (1999). A response to Fodor and Lepore: Impossible words? Linguistic Inquiry, 30(3), 453–466.
Hampton, J. A. (1987). Inheritance of attributes in natural concept conjunctions. Memory & Cognition, 15, 55–71.
Hampton, J. A. (1988). Overextension of conjunctive concepts: Evidence for a unitary model of concept typicality and class inclusion. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 12–32.
Hampton, J. A. (1997). Conceptual combination. In K. Lamberts & D. Shanks (Eds.), Knowledge, concepts, and categories (pp. 135–162). Hove: Psychology Press.
Higginbotham, J. (1986). Linguistic theory and Davidson's program in semantics. In E. Lepore (Ed.), Truth and interpretation: Perspectives on the philosophy of Donald Davidson (pp. 29–48). Oxford: Blackwell.
Hodges, W. (1997). Compositional semantics for a language of imperfect information. Journal of the Interest Group in Pure and Applied Logics, 5, 539–563.
Hodges, W. (1998). Compositionality is not the problem. Logic and Logical Philosophy, 6, 7–33.
Hodges, W. (2001). Formal features of compositionality. Journal of Logic, Language, and Information, 10, 7–28.
Horwich, P. (1997). The composition of meanings. Philosophical Review, 106, 503–533.
Jacobson, P. (1999). Towards a variable-free semantics. Linguistics and Philosophy, 22, 117–184.
Janssen, T. (1986). Foundations and applications of Montague grammar. Part 1: Philosophy, framework, computer science. Amsterdam: Centrum voor Wiskunde en Informatica.
Janssen, T. (1997). Compositionality. In J. van Benthem & A. ter Meulen (Eds.), Handbook of logic and language (pp. 417–473). Amsterdam: Elsevier.
Janssen, T. M. V. (2001). Frege, contextuality and compositionality. Journal of Logic, Language, and Information, 10, 115–136.
Johnson, K. (2004). From impossible words to conceptual structure: The role of structure and processes in the lexicon. Mind & Language, 19(3), 334–358.
Kazmi, A., & Pelletier, F. J. (1998). Is compositionality formally vacuous? Linguistics and Philosophy, 21, 629–633.
Kuhn, T. S. (1962). The structure of scientific revolutions. Chicago: University of Chicago Press.
Levi, J. N. (1978). The syntax and semantics of complex nominals. London: Academic Press.
Levy, S. D., & Gayler, R. (Eds.). (2004). Compositional connectionism in cognitive science. Menlo Park: AAAI Press.
Maye, A., & Werning, M. (2004). Temporal binding of non-uniform objects. Neurocomputing, 58–60, 941–948.
Montague, R. (1970/1974). Universal grammar. In R. Thomason (Ed.), Formal philosophy (pp. 222–246). New Haven: Yale University Press.
Murphy, G. L. (1988). Comprehending complex concepts. Cognitive Science, 12, 529–562.
Murphy, G. L. (2002). The big book of concepts. Cambridge, MA: MIT Press.
Osherson, D. N., & Smith, E. E. (1981). On the adequacy of prototype theory as a theory of concepts. Cognition, 9, 35–58.
Pagin, P. (1997). Is compositionality compatible with holism? Mind & Language, 12, 11–33.
Partee, B. (1984). Compositionality. In F. Landman & F. Veltman (Eds.), Varieties of formal semantics (pp. 281–312). Dordrecht: Foris.
Peacocke, C. (2004). Interrelations: Concepts, knowledge, reference and structure. Mind & Language, 19(1), 85–98.
Pelletier, F. (1994). The principle of semantic compositionality. Topoi, 13, 11–24.
Prinz, J. J. (2002). Furnishing the mind. Cambridge, MA: MIT Press.
Quine, W. V. O. (1951). Two dogmas of empiricism. Philosophical Review, 60, 20–43.
Recanati, F. (2002). The Fodorian fallacy. Analysis, 62(4), 285–289.
Robbins, P. (2001). What compositionality still can do. Philosophical Quarterly, 51, 328–336.
Robbins, P. (2002). How to blunt the sword of compositionality. Noûs, 36(2), 313–334.
Sandu, G., & Hintikka, J. (2001). Aspects of compositionality. Journal of Logic, Language and Information, 10, 49–61.
Schurz, G. (2005). Theoretical commensurability by correspondence relations. In D. Kolak & J. Symons (Eds.), Quantifiers, questions, and quantum physics (pp. 101–126). New York: Springer.
Shastri, L., & Ajjanagadde, V. (1993). From simple associations to systematic reasoning: A connectionist representation of rules, variables, and dynamic bindings using temporal synchrony. Behavioral and Brain Sciences, 16, 417–494.
Smith, E. E., Osherson, D. N., Rips, L. J., & Keane, M. (1988). Combining prototypes: A selective modification model. Cognitive Science, 12, 485–527.
Smith, K., Brighton, H., & Kirby, S. (2003). Complex systems in language evolution: The cultural emergence of compositional structure. Advances in Complex Systems, 6(4), 537–558.
Smolensky, P. (1988). On the proper treatment of connectionism. Behavioral and Brain Sciences, 11, 1–23.
Smolensky, P. (1991). Connectionism, constituency, and the language of thought. In B. Loewer & G. Rey (Eds.), Meaning in mind: Fodor and his critics (pp. 201–227). Oxford: Basil Blackwell.
van der Velde, F., & de Kamps, M. (forthcoming). Neural blackboard architectures of combinatorial structures in cognition. Behavioral and Brain Sciences.
van Gelder, T. (1990). Compositionality: A connectionist variation on a classical theme. Cognitive Science, 14(3), 355–384.
von der Malsburg, C. (1981). The correlation theory of brain function (Internal Report No. 81-2). MPI Biophysical Chemistry.
Werning, M. (2002). How to compose content. Psyche, 8.
Werning, M. (2003). Synchrony and composition: Toward a cognitive architecture between classicism and connectionism. In B. Loewer, W. Malzkorn, & T. Raesch (Eds.), Applications of mathematical logic in philosophy and linguistics (pp. 261–278). Dordrecht: Kluwer.
Werning, M. (2004). Compositionality, context, categories and the indeterminacy of translation. Erkenntnis, 60, 145–178.
Werning, M. (2005). The temporal dimension of thought: Cortical foundations of predicative representation. Synthese, 146(1/2), 203–224.
Werning, M., & Maye, A. (2004). Implementing the (de-)composition of concepts: Oscillatory networks, coherency chains and hierarchical binding. In S. D. Levy & R. Gayler (Eds.), Compositional connectionism in cognitive science (pp. 76–81). Menlo Park: AAAI Press.
Westerståhl, D. (1998). On mathematical proofs of the vacuity of compositionality. Linguistics and Philosophy, 21, 635–643.
Wisniewski, E. J. (1996). Construal and similarity in conceptual combination. Journal of Memory and Language, 35, 434–453.
Zadrozny, W. (1994). From compositional to systematic semantics. Linguistics and Philosophy, 17, 329–342.
Is Compositionality an A Priori Principle?

Daniel Cohnitz

When reasons are given for compositionality, the arguments usually purport to establish compositionality in an almost a priori manner. I will rehearse the arguments why one could think that compositionality is a priori true, or almost a priori true, and will find all of them inconclusive. This, in itself, is no reason against compositionality, but a reason to try to establish or defend the principle on other than quasi-a priori grounds. I want to argue in this paper that there is a substantial (non-vacuous) notion of compositionality that seems to be of interest for semanticists. There is also an argument for this kind of compositionality that superficially looks like one from the standard battery of arguments for compositionality, but that is not quasi-a priori in the same way as the others are. Instead it rests on an empirical hypothesis of which we do not know on quasi-a priori considerations alone whether it is true or false.

Address for correspondence: Department of Philosophy, Heinrich-Heine University Düsseldorf, D-40225 Düsseldorf, Germany. E-mail: [email protected].

1 Introduction

A superficial look at the literature on the principle of compositionality (henceforth 'The Principle') could suggest that the discussion is as confused as a discussion can be. This starts already with the question of the proper historical origin of The Principle. Although it is often called 'Frege's Principle', it is controversial whether Gottlob Frege subscribed to The Principle throughout his work, or even at certain stages of his intellectual development. The main reason for that controversy is that Frege is also famous for another principle, the so-called 'Context Principle', or 'Principle of Contextuality', which is prima facie in tension with the idea of compositionality. Whereas compositionality seems to explain the meaning of linguistic expressions bottom-up, by saying (in one way or other) that the meaning of a complex expression is a function of the meaning of its subexpressions, the Principle of Contextuality explains the meaning of expressions top-down, by saying that it is the meaning of the whole in virtue
of which the subexpressions have a determinate meaning. The subtleties of Frege exegesis are not our concern here, which is why figure 1 might suffice as a guide through the different positions that were defended with respect to this controversy.1

Figure 1: A guided tour to the different interpretations of Frege's two principles.

1 For a guide to this guide and for page references see Pelletier (2001), where this classification was invented. The references indicate places where one can either find the interpretation in question or a reconstruction of it. Some authors consider alternative possible interpretations, which is why they might occur in positions in the tree which are actually inconsistent.

However, a quick look at figure 1 already highlights a second difficulty of any discussion of The Principle: all parties that do not disagree with respect to the first question disagree in their interpretation of at least one of the principles. They, for example, disagree on whether the principles concern reference or meaning or both, or on whether they are meant in an epistemological sense or rather ontologically. This suggests that the prima facie understanding of The Principle and the Principle of Contextuality is misleading. Although the principles seem to be talking about the same thing and seem to be inconsistent, some argue that they are not really talking about the same thing or are not really inconsistent. Obviously there are different ways to understand The Principle, ways that are not only relevant for an adequate interpretation of Frege, but also ways that are relevant for a proper assessment of the epistemic status of The Principle.

This leads us to the second aspect which is unclear about The Principle: what is it about and what exactly does it say? Whereas we will exclusively consider The Principle as intended to apply to (spoken) natural languages, the principle of compositionality is often intended to be about the mental representations correlated with elements of natural languages, and concerns their syntax and their semantics (if mental representations have these). Such mental representations might be – for example – prototype-structured and then might or might not be compositional themselves (see Machery, 2001; Fodor, 1998; Fodor & Lepore, 2002). We will just note that the question of whether the 'language' of thought (LOT) is compositional might be different from the question of whether natural language is. For example, natural language might be non-compositional because it contains synonymous sentences and the possibility to embed them in belief-contexts (Pelletier, 1994), whereas it seems arguable, on the other hand, that LOT would not have two representations for the same meaning.2 A falsification of the compositionality of natural language (e.g., if based on the existence of synonymies) is therefore no straightforward falsification of the compositionality of LOT. However, the arguments we will consider that speak in favor of compositionality can sometimes be reformulated to speak in favor of the compositionality of LOT.

2 Note, however, that this reply loses plausibility if beliefs are tokened sentences of LOT in a belief-box, the way 'Harvey' is modeled in part 3. In this case all synonymous expressions of the spoken language would correspond to one and the same sentence of LOT (that would be synonymy), and therefore embedding them into belief-contexts could not violate compositionality anymore (always the same sentence of LOT would be tokened by synonymies, and therefore substitution of synonymies in belief-contexts in the natural language would preserve truth (and meaning)).

But even if we want to understand compositionality as applied to natural language, it is still unclear what The Principle actually says. Consider the standard3 formulation of The Principle:

PoC 1 (Principle of Compositionality). The meaning of an expression is a function of4 the meanings of its parts and of the way they are syntactically combined. (Partee, 2004, p. 153)

3 ...if there is any such thing as 'the' standard formulation. Zoltán Szabó (2000a) found six formulations of compositionality in one and the same logic textbook, only one of which he considered a stylistic variant of the principle of compositionality.

4 As Pelletier has emphasized, the principle should rather read '... is a function of, and only of, ...'. Pelletier's point is that on the reading given above it seems allowed that the meanings of the parts could be mere dummy variables of the meaning-determining function that play no actual role in determining the meaning of the whole (which is instead determined by the 'real' arguments of the meaning-determining function). That is of course not intended by the friends of compositionality. What they intend to say is that the meanings of the parts are the real arguments determining the meaning of the whole; hence the 'only of'-clause. See Pelletier (1994, p. 11).

As Barbara Partee (2004, p. 154) has emphasized already, this principle can only be made precise in the context of a theory. According to Partee, The Principle is theory-dependent in at least these respects:

i. What are the 'meanings'? Are they considered model-theoretic objects?, linguistic representations?, intensions?, functions from contexts to intensions?, etc.

ii. What is assumed about the syntax? Is it independently motivated? Is it constrained by compositionality? What kinds of abstractness and invisibilia are allowed?, etc.

iii. How is the 'is a function of' relation to be understood? Are there any constraints on what kinds of functions interpret what kinds of syntactic combinations? Is compositionality necessarily purely bottom-up? Must the functions be single-valued? Does functionality preclude non-dispensable intermediate levels of representation?, etc.
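Anticipating the formal treatment in the next section, question iii is often answered rule by rule (this gloss is ours, not Partee's): for each syntactic rule there is a single semantic operation computing the meaning of the output from the meanings of the inputs, so that

```latex
% Rule-by-rule reading of PoC 1: for every n-ary syntactic rule \sigma
% there is a semantic operation r_\sigma such that, whenever
% \sigma(e_1, \dots, e_n) is defined,
m\bigl(\sigma(e_1, \dots, e_n)\bigr) = r_\sigma\bigl(m(e_1), \dots, m(e_n)\bigr).
```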
Given that The Principle is not very precise if these questions are not answered beforehand, it is understandable that The Principle is often regarded to be an obvious, almost self-evident truth. As we will see, rather plausible constraints on the possible answers to the questions above will weaken the standard arguments for compositionality considerably. But let's consider why one could think that The Principle is true a priori.

2 Formal Vacuity
If it could be shown that compositionality is formally vacuous, compositionality would of course be an a priori principle, but at the same time an absolutely uninteresting one.5

5 Which is not very surprising for a good empiricist: all a priori truths (if there are any) are not going to be informative.

Just as

PoC 2. Human languages have a compositional semantics or human languages don't have a compositional semantics.

does not tell us much about the semantics of human languages, compositionality equally would not tell us much. Some mathematicians/logicians have claimed that compositionality is vacuous in just this way. One example is Zadrozny:

We prove a theorem stating that any semantics can be encoded as a compositional semantics, which means that, essentially, the standard definition of compositionality is formally vacuous. (Zadrozny, 1994, p. 329)

Another example is van Benthem:

[Frege's principle of compositionality] has been investigated thoroughly in an algebraic setting in Janssen, 1983. The general outcome may be stated roughly as 'anything goes' – even though adherence to the principle often makes for elegance and uniformity of presentation. [...] Thus, by itself, compositionality provides no significant constraint upon semantic theory. (van Benthem, 1986, p. 200)

As we have said already, The Principle is taken to be the thesis that the meaning of a whole is a function of the meaning of its parts. If we state the principle that loosely, compositionality may not be very substantial. The results obtained by Janssen (1983), van Benthem (1986) and Zadrozny (1994) are sometimes interpreted as having shown that for every language the semantics for that language can be represented as a compositional semantics. What did they do?

Janssen was the first to prove that for any language and any meaning, the meaning can be assigned to the language in a compositional way. To get to this result, Janssen exploits ideas from Montague (1970) and the theory of universal
algebra. The key idea that led to a formal treatment of compositionality is that compositionality requires the existence of a homomorphism between the expressions of a language (with their structure) and the meanings of those expressions (with their structure). The relata of this homomorphism are formally represented as algebras.

D 1 (Algebra). An algebra $\mathbf{A}$ consists of a set $A$, called the carrier of the algebra, and a set $F$ of functions (operators) on that set yielding values in that set. So $\mathbf{A} = \langle A, F \rangle$. The elements of the carrier are called the elements of the algebra. If an operator (function) is not defined on the whole carrier, it is called a partial operator. If $E = F(E_1, E_2, \ldots, E_n)$, then $E_1, E_2, \ldots,$ and $E_n$ are called parts of $E$. If an operator takes $n$ arguments, it is called an $n$-ary operator.

Homomorphisms, on the other hand, are defined as follows:

D 2 (Homomorphism). Let $\mathbf{E} = \langle A, F \rangle$ and $\mathbf{B} = \langle B, G \rangle$ be algebras. A mapping $h\colon \mathbf{E} \to \mathbf{B}$ is called a homomorphism if there is a mapping $h'\colon F \to G$ such that for all $f \in F$ and all $a_1, \ldots, a_n \in A$ it holds that $h(f(a_1, \ldots, a_n)) = h'(f)(h(a_1), \ldots, h(a_n))$.

Given these definitions, we first have a formal account of what syntax does. It consists of rules that take certain inputs and deliver certain outputs, the outputs being complex expressions, the inputs being 'parts' of these. The rules are then represented as operators on the set of syntactic subexpressions. If we think of a syntactic algebra as a set of expressions of a language upon which a number of operators (syntactic rules) are defined, requiring that these operators always apply to a fixed number of expressions and yield a single expression, and allowing that the operators (syntactic rules) may be undefined for certain expressions, we get the following partial algebra: $\mathbf{E} = \langle E, (F_\gamma)_{\gamma \in \Gamma} \rangle$. Here, $E$ is the set of complex and simple expressions and every $F_\gamma$ is a partial operator on $E$ with a fixed arity.

Let us now turn to semantics. A meaning assignment is defined on such a syntactic algebra as a function $m$ from $E$ to $M$, the set of possible meanings for the expressions of $E$. Compositionality is then a property of $m$. Given one of the $F_\gamma$ of $\mathbf{E}$, say $F_k$ (a $k$-ary operator on $E$), $m$ is compositional with respect to this syntactic rule (or '$F_k$-compositional') only if there is a $k$-ary partial operator $G_k$ on $M$ (the set of possible meanings) such that whenever $F_k(e_1, \ldots, e_k)$ is defined, $m(F_k(e_1, \ldots, e_k)) = G_k(m(e_1), \ldots, m(e_k))$. We will say that $m$ is compositional simpliciter only if $m$ is $F_\gamma$-compositional for each $F_\gamma$ of $\mathbf{E}$. Whenever it is compositional simpliciter, $m$ induces the semantic algebra $\mathbf{M} = \langle M, (G_\gamma)_{\gamma \in \Gamma} \rangle$ on $M$, and it is a homomorphism between $\mathbf{E}$ and $\mathbf{M}$.
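Here is a minimal executable instance of D 1 and D 2 in Haskell (the carrier sets, operators, and names are invented for illustration): a syntactic algebra with one binary operator, a semantic algebra on the integers, and a meaning function h satisfying the homomorphism condition, which is therefore compositional in the sense just defined.

```haskell
-- A concrete instance of D 1 and D 2 (illustrative names and algebras).
-- Syntactic algebra E = <Term, {fComb}>: Leaf terms are the simple
-- expressions; fComb is the single (binary) syntactic operator.
data Term = Leaf Int | Node Term Term

fComb :: Term -> Term -> Term
fComb = Node

-- Semantic algebra M = <Int, {gComb}>: gComb is the operator h'(fComb).
gComb :: Int -> Int -> Int
gComb = (+)

-- The meaning assignment h from terms to integers.
h :: Term -> Int
h (Leaf n)   = n
h (Node s t) = gComb (h s) (h t)

-- D 2 demands h (fComb s t) == gComb (h s) (h t) for all s, t; here this
-- holds by the very definition of h, so h is a homomorphism and hence an
-- fComb-compositional meaning assignment.
check :: Term -> Term -> Bool
check s t = h (fComb s t) == gComb (h s) (h t)

main :: IO ()
main = print (check (Leaf 1) (Node (Leaf 2) (Leaf 3)))  -- True
```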
syntactic route. He proved that if a language is recursively enumerable (there is an algorithm that can generate it), and m: E → M a computable function of the expressions of E into M, then there are algebras E and M for E and M such that m is a homomorphism. This approach was syntactic, for it assumed that no syntactic structure of the language is given beforehand but that we are virtually free to construct the syntactic algebra of the language as we please. Van Benthem (1984, 1986) strengthened this result somewhat, taking a semantic approach. He assumed that we start with a (possibly partial) term algebra (generated from the set of lexical items by the syntactic operators), that m is defined for the lexical items and that the semantic operations in M which are to correspond to the syntactic operations are arbitrarily fixed as well. But then it is provable in universal algebra that m can be uniquely extended to a homomorphism from our syntactic term algebra to the corresponding semantic algebra (here only the meanings of the lexical items are fixed in advance on the semantic side, we are free to fix the meanings of the complex expressions).6 Zadrozny (1994) arrived at his vacuity result also from a semantic perspective. If we begin with a set S of strings generated from an arbitrary alphabet via concatenation (‘s.t’ being the concatenation of strings s and t) and consider a meaning function m which assigns the members of another arbitrary set M of meanings to the members of S, he could prove that we can always obtain a new meaning function µ such that for all s, t ∈ S: µ(s.t) = µ(s)(µ(t)) and µ(s)(s) = m(s). This time the assumption was that we are free to chose new meanings (but restrict the syntax considerably and want to be able to retrieve the ‘old’ meanings from the new ones).7 How could these proofs fail? In what sense could they be unconvincing? I think the vacuity proofs can fail to establish the vacuity of the compositionality principle in two main respects. First of all, all these claims are existence claims. That is they do not by themselves tell us what the compositional meaning function for a language looks like, but only that there is one. Does this establish the vacuity of compositionality? Maybe not necessarily. Here is an argument by Janssen: The challenge of compositional semantics is not to prove the existence of such a semantics, but to obtain one. The formal results do not help in this respect because the proofs of the theorems assume that some meaning assignment is already given and then turn it into a compositional one. Compositionality is not vacuous, because we have no recipe to obtain a compositional meaning assignment, and because several 6
6 For a discussion see Westerståhl (1998).
7 For a more detailed discussion see Dever (1999), Westerståhl (1998) and Hodges (2001).
proposals are ruled out by the principle. (Janssen, 1997, p. 457)

Janssen then goes on to argue that because the theorems establish the existence of a compositional semantics, the compositionality principle is nevertheless no empirical claim. It was still trivial that there is a compositional semantics, but the non-trivial part was to construct one. Compositionality was thus a methodological principle that guides our choices between alternative proposals:

The challenge of semantics is to design a function that assigns meanings, and the present paper argues that the best method is to do so in a compositional way. Compositionality is not an empirical principle, but a methodological one. (Janssen, 1997, p. 457)

I would like to note three things about Janssen's claim. First of all, there is a recipe for obtaining a compositional semantics. This is the recipe behind Zadrozny's result. As noted by Kazmi and Pelletier (1998) and Gendler Szabó (2004), we could simply map every syntax onto itself and thereby obtain an isomorphism (and hence a homomorphism) from syntax to semantics. In case we want the old meanings to be retrievable from the new ones, as in Zadrozny's construction, we could also follow the recipe by Westerståhl: Suppose A is a partial algebra and m: A → M. Now we generate a new set of meanings M′ = A × M. Then m′: A → M′ defined by

D 3. m′(a) = ⟨a, m(a)⟩

is compositional, as can easily be seen from the fact that m′ is a one-one mapping (from expressions to ordered pairs of expressions and their 'old' meaning). In that case we have again an isomorphism between syntax and semantics and thereby, of course, also a homomorphism, as is required by the principle of compositionality (a code sketch of this pairing construction follows below). Thus the reason Janssen gives for why compositionality is not absolutely vacuous seems partly misleading.

Second, and more importantly, the challenge of finding a compositional meaning assignment is not as unrestricted as assumed in these proofs, and it seems therefore that the principle is not vacuous. As Westerståhl (1998, p. 641) has argued, none of the mathematical claims discussed so far seem to be of much help for the semanticist.8 What semanticists usually are confronted with is data about the meaning of lexical items as well as complex items, data concerning the structure of the language, plus the connections between some syntactic operations and some semantic operations being fixed.
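Westerståhl's pairing recipe can be written out in a few lines of Python. The toy meaning function m below is invented for illustration; the point is only that m′ is one-one, so any syntactic operation F automatically acquires a semantic counterpart G on the new meanings:

```python
# A sketch of the recipe behind D 3: given an arbitrary meaning function m,
# set m_prime(a) = (a, m(a)). Because m_prime is injective, the old
# expression can be read off the new meaning, and every syntactic
# operation F gets a trivial semantic counterpart G.

def m(expr):
    # an arbitrary, possibly non-compositional meaning function (invented)
    return len(expr) % 3

def m_prime(expr):
    # the new meanings: ordered pairs <a, m(a)>
    return (expr, m(expr))

def F(e1, e2):
    # some syntactic operation
    return e1 + " " + e2

def G(p1, p2):
    # semantic counterpart of F: recover the expressions, reapply m_prime
    (e1, _), (e2, _) = p1, p2
    return m_prime(F(e1, e2))

assert m_prime(F("very", "old")) == G(m_prime("very"), m_prime("old"))
assert m_prime("very")[1] == m("very")   # the 'old' meaning is retrievable
```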
8 This doesn't mean that the mathematical framework introduced would be useless. A number of very interesting results have been achieved with the help of it, and we will come back to it in the next section. See Pagin (2003a), Westerståhl (1999), Hodges (2001), and Hendriks (2001).
Given that, the conditions for the proofs of Janssen and Zadrozny to be applicable are never satisfied for the problems of compositionality a semanticist is interested in. For these mathematical proofs we are free in choosing either a semantics or a syntax. But in general we have a (partial) structure on the syntax level, we have intuitions about complex and simple meanings, and we have conflicts with compositionality. To show that the language under scrutiny (even if it is human language in general) is compositional is not achieved by throwing the semantic or syntactic data overboard (as we would if we took the syntax itself as the semantics for a language), nor by some ad hoc fix by which we would do nothing but construct a new meaning function that has the counterinstances to compositionality built in without motivation. We are not looking for just any meaning function if we evaluate the claim that human languages have or don't have a compositional semantics. What we want is that the meaning function obeys some additional constraints. The vacuity claims can therefore fail in two respects:
(1) They might fail if they are formulated for just some syntactic structure of the language or just some assignment of meanings. We are interested in compositionality given a structure of a language and given certain data about the meanings of simple as well as complex expressions of the language and certain fixed relations between them.
(2) They might fail if they do not deliver a recipe for obtaining a satisfying compositional semantics.
We will not go any deeper into the question of whether the principle of compositionality is vacuous. I have pointed out two reasons why one should think that the mathematical arguments given so far are unconvincing, and refer to Westerståhl (1998) and Kazmi and Pelletier (1998) for the further substantiation of these reasons. Of course, there is a trivial sense in which every meaning function for a language can be made compositional. This is the sense in which we can hold on to every theory we like come what may. However, that is not a special formal feature of compositionality, but of theories in general. We can always choose between giving up a theory, giving up the conflicting data, and even giving up logic.9 The question that we will discuss in this paper is whether there is any good (a priori) reason not to abandon compositionality in cases of conflicting evidence. The mathematical results obviously do not help here.

The third point I would like to note about Janssen's claim concerns the issue of methodology. If you are a classical philosopher of science10, you might think that methodological principles are a priori. Maybe that is the way Theo
9 I will take it for granted that Quine and Duhem convincingly argued for this point.
10 A 'classical philosopher of science' is a philosopher of science who thinks that there is an a priori demarcation between the context of discovery and the context of justification. Methodology is then only concerned with the latter (and a priori), whereas all contingent, a posteriori facts concerning scientific progress belong to the context of discovery.
Janssen thinks about compositionality. In this case we should construct the semantics of the language under consideration compositionally, not because the language under investigation is compositional in some objective sense, but because constructing semantics compositionally serves some other purpose. That the semantics of the language is compositional will then also be a priori, given the methodology employed when designing the semantics.

Sometimes the sole reason for employing compositionality as a methodological principle seems to be that doing so seems always possible. Given that the compositionality constraint is almost always satisfiable, we should satisfy it. This by itself is certainly not a reason to employ a method. If it were trivial that for each physical theory T there is a physical theory T′ with feature F, and if it were nonetheless non-trivial to construct T′ from T, there would be no reason at all to construct a theory with feature F on purpose just because of that fact. What is needed instead is a demonstration that having feature F is a good thing, in our case, that compositionality serves some other purpose. It might be sufficient for employing compositionality as a methodological principle if obeying it has in the past led to improvements in the semantics (that seems to be the argument in Janssen, 1997). If that is so, but it cannot be demonstrated how these improvements are systematically11 connected to compositionality, this is clearly not an a priori argument for compositionality (and then not our concern in this paper).12

What, then, could such systematic improvements look like? Sometimes it is claimed that a compositional semantics reduces complexity. This is true in general, but true because compositionality is a form of systematicity and it is systematicity that reduces complexity (we will come back to that in some detail in the next section). Compositionality has the additional advantage that the semantic formation rules mirror the syntactic formation rules, which is also a kind of complexity reduction. However, there seems to be at least a trade-off between the advantage that the semantic value of a complex expression can be read off, in some intuitive sense of 'read off', from its syntactic structure, and the fact that making a non-compositional language compositional might lead to an increase in complexity elsewhere, as seems to be the case when the syntactic structure or formation rules are changed to render a language compositional, or when the semantic evaluation makes detours when construed as a function of the meaning of

11 Note, however, that it is not necessary for a good methodological rule to be justified that there is an obvious systematic connection with whatever positive thing the employment of the rule causes. The relevant question is whether the rule causes this reliably.
12 Whether Janssen succeeded in showing at all that turning a non-compositional proposal into a compositional one improved the proposal is doubted by Hodges (1998).
the parts of an expression and the ways they are combined (this seems to be the point of the Sandu/Hintikka response (2001, p. 60) to Hodges' compositional semantics for IF-languages, in which sets of sets of sequences replace the mere sets of sequences of the non-compositional semantics; the latter construction, moreover, parallels the semantics of ordinary first-order languages). The question is whether any much better a priori reason can be offered for obeying compositionality as a methodological principle. Here is what can be found in Janssen:

The most valuable arguments [for the compositionality of semantics defined in algebraic terms] are, in my opinion, those concerning the elegance and power of the framework, its heuristic value, and the lack of a mathematically well defined alternative. (Janssen, 1986, p. 38)

The heuristic value is something that sounds interesting. In his earlier publication (1986) Janssen considers only the heuristic benefits of the algebraic framework, but later he generalized this to a claim about the heuristic benefits of compositionality as such. This is a quote from Janssen:

Compositionality is not a formal restriction on what can be achieved, but a methodology on how to proceed. [...] It helps to find weak spots in non-compositional proposals; such proposals have a risk of being defective. Cases where an initially non-compositional proposal was turned into a compositional one, the analysis improved considerably. (Janssen, 1997, p. 461)

Thus far, of course, no argument is given. Every proposal has 'a risk of being defective', and the claim about the improvement of analyses is uninteresting as long as there is no systematic connection between compositionality and improvements of a certain kind. Janssen thinks that the connection lies in the fact that compositionality forces you to think about your basic semantic units. Again Janssen:

Compositionality requires a decision on what in a given approach the basic semantic units are: if one has to build meanings from them, it has to be decided what these units are. Compositionality also requires a decision on what the basic units in syntax are and how they are combined. If a proposal is not compositional, it is an indication that the fundamental question what the basic units are, is not answered satisfactorily. If such an answer is provided, the situation under discussion is better understood. So the main reason to follow this methodology, is that compositionality guides research in the right direction. (Janssen, 1997, p. 461)
I am not convinced by this argument. If the methodological guideline were a non-trivial anti-compositionality principle (stated as an imperative), it would equally force the semanticist to make decisions about the basic units of syntax and semantics. It seems that even the advice

PoC 3. Think about your basic syntactic and semantic units and the way they are combined, respectively.

is absolutely sufficient to serve the same purpose.

Another argument for the methodological character of the compositionality of meaning departs from a conceptual analysis of meaning. Compositionality might be constitutive of meaning, given that, intuitively, meanings are singled out via the principle of compositionality (in the spirit of David Lewis, 1970: 'In order to say what a meaning is, first ask what a meaning does, and then find something that does that.'). Some, then, seem to be convinced that whatever meanings do, they do it compositionally, an intuition that was maybe even shared by Frege: Bedeutungen alone cannot play the role of meanings, for they happen to violate compositionality in intensional contexts. Thus we add Sinne to accompany Bedeutungen in our semantics and thereby save compositionality. This is – roughly – a story one might tell to reconstruct Frege's theoretical choices.

If that is true, there is no empirical question of whether or not all possible human languages are compositional. They could not fail to be. In fact, no language, natural or artificial, could fail to be compositional, because compositionality is a claim about the relation of syntax and semantics; but if compositionality is constitutive for semantics by being a necessary condition for proper meaning functions, there can't be a non-compositional meaning function. Does meaning imply compositionality? Hintikka and Sandu (2001) would say it does not, as would Pelletier (1994). Note that, on this view, everyone who ever denied that a natural language or in fact any language (natural or technical) was compositional fell prey to a conceptual confusion by uttering what is in fact a contradiction in terms. I do not think that this is the correct analysis (by the principle of charity).13
3 Is Compositionality a Synthetic A Priori Truth?
If compositionality is not a vacuous principle, there is still a way in which it could be considered a priori, given what some philosophers hold about apriority.
13 It seems that the majority of linguists and semanticists would agree that non-compositional meaning functions are conceivable; some – as we know – even think that they are actual phenomena. A theory of concepts that allows that only a minority of an otherwise homogeneous group of competent experts possesses a concept properly seems dubious, to say the least.
If you are a Kantian, for example, you might still think that compositionality can be a priori although it is not vacuous, for you could think that it is a synthetic a priori truth. Good empiricists might respond that we can deal with this view rather briefly: there are no synthetic a priori truths; thus if compositionality is not vacuous, it is not a priori. I will not argue for anything else. What will be the subject of this part of my paper are 'quasi-a priori' considerations for the compositionality of human languages that are reminiscent of transcendental deductions. They are mere quasi-a priori arguments, for they do not purport to prove that it is a priori that natural language is compositional. What they try to prove is that natural language must be compositional given empirical facts that are so hard to deny that they are almost as convincing as a priori arguments. What is problematic about transcendental arguments is the status of their premises. Since we are neither interested in Kant nor in the general question of whether transcendental arguments should be regimented this way rather than another, I will take the characterization of transcendental arguments by Roderick Chisholm, show how the standard quasi-a priori arguments for compositionality can be reconstructed as transcendental arguments so characterized, and criticize all of them as unconvincing. I will then try to show that if compositionality is understood as a thesis that quantifies over all possible human languages, it becomes clearer how reasons for or against compositionality could be established by substantial empirical investigation (rather than by quasi-a priori arguments).

Transcendental arguments characterized

According to Roderick Chisholm (1978) there are three central features of transcendental procedures, the results of which are reported in transcendental arguments. A brief characterization is given in figure 2. Whether this characterization meets Kant's own standards for transcendental deductions seems dubious. Kant would probably not have considered every necessary condition for every subject matter in such an argument. He would rather have considered only subject matters which are given to us in special ways, and the necessary conditions considered would have been necessary conditions for our knowing of the subject matter in this special way. But that is a side issue for our point. We will only be dealing with arguments that satisfy the characterization given by Chisholm and with the conditions under which we want to say that these arguments are justified. As you can see from the figure, two types of premises are involved in transcendental arguments. First there is the preanalytic data. In cases of alleged transcendental deductions this simply might be an empirical claim and is therefore in need of justification. As we will see in a minute, the source of this 'knowledge' might well be relevant for the assessment of a quasi-transcendental
Figure 2: Kant’s transcendental procedure (according to Chisholm).
argument. The second kind of premise is formed by the transcendental principles that state the logically necessary preconditions for the truth of the preanalytic data. Thus, if P is the conjunction of all statements of the first kind in the argument, the transcendental principles are of the form P → Q, and express truths of logic, broadly conceived (including analytic truths).

Reconstruction of quasi-transcendental arguments

Consider the following (pretty bad) arguments for the compositionality of natural language:

Argument 1
(A1-1) We understand complex expressions by understanding their parts and the way they are combined.
(A1-2) It is necessary for understanding complex expressions in this way that the language is compositional.
(A1-3) Therefore our language is compositional.

Argument 2
(A2-1) We understand complex expressions.
(A2-2) It is necessary for understanding complex expressions that the language is compositional.
(A2-3) Therefore our language is compositional.

Let us begin with argument 2. This argument is obviously unconvincing. It might easily be dismissed by a holistically motivated reply, for it seems conceivable (and might even be an actual phenomenon) that someone manages to learn the meaning of complex expressions without acquiring an understanding of their parts beforehand. Thus we might throw doubt on the alleged necessity of the second premise, the transcendental principle. Consider this argument by Zoltán Gendler Szabó (2000a, pp. 67–68): Arthur might have learned English up to a certain degree and understands the sentences

S 1. It is raining.

and

S 2. This apple is red.

He might also have noticed that adults use S1 and

S 3. Rain is falling.
interchangeably. Thus Arthur knows that he can use S1 and S3 under the exact same circumstances to make correct assertions (and he knows which circumstances they are). That seems pretty sufficient for Arthur to have understood S1 as well as S3 (if 'understanding' is knowing the truth conditions). This story does not preclude the possibility that Arthur nevertheless does not understand the sentence

S 4. This apple is falling.

But in this case S4 is composed of the elements, and by the grammatical rules, that also compose S1–S3. If that is so, it seems possible that someone comes to an understanding of (at least some) complex expressions without any detour through an understanding of the parts; thus premise (A2-2) seems doubtful.14

Let us instead turn to argument 1. This only argues that in fact we do understand sentences by understanding their parts and the way they are combined. Doing this, quite obviously, requires that the language allows it, that the language is compositional.15 This argument is clearly unconvincing because premise (A1-1) smuggles in what is supposed to be established by the argument: it is part of the preanalytic data that the language is compositional if the preanalytic data states that there is a compositional procedure by which we manage to generate the meaning of complex expressions. The transcendental principles are not doing much work in the argument, because compositionality is already guaranteed by the truth of the first premise. Of course, if the transcendental procedure is successful, the truth of the preanalytic data will entail the compositionality of natural language (via the transcendental principles), but the preanalytic data should not be outright identical with a statement claiming the compositionality of language, nor (more importantly) should our reason for believing the preanalytic data to be true be based on the conviction that natural language is compositional. For the argument to
14 The example simply falsifies the strong compositionality assumption (A2-2). One might think that in order to do so, we would need an alternative explanation. This is not so. The flight of the bumble-bee might well falsify parts of aerodynamics simply because it is an instance of an intended application and does not behave the way the theory predicts.
15 One might think that this is only true because (A1-2) is true: that we process a language in a certain way tells us about the way the language is only because there is a necessary connection between the two ways. But this is beside the point of the example. Premise (A1-1) is supposed to presuppose the conclusion of the argument in a rather blatant way. In other words, the argument does not try to convince someone who doubts that language really is compositional, even granted that we process it that way, but someone who already believes that the question of whether language is compositional is the question of whether we process it that way.
be convincing, we need an independent reason to believe the preanalytic data to be true. That is a lesson from the first argument. Concerning the transcendental principles we observe that their weak spot is their alleged necessity. If a transcendental argument is supposed to be successful, the compositionality of natural language must really be a precondition for the preanalytic data to be true. That is a lesson from the second argument. With these two considerations in mind, we can now turn to the quasi-a priori arguments for compositionality. Here is such an argument by Donald Davidson16:

When we regard the meaning of each sentence as a function of a finite number of features of the sentence, we have an insight not only into what there is to be learned; we also understand how an infinite aptitude can be encompassed by finite accomplishments. For suppose that a language lacks this feature; then no matter how many sentences a would-be speaker learns to produce and understand, there will remain others whose meanings are not given by the rules already mastered. It is natural to say that such a language is unlearnable. (Davidson, 1984, pp. 8–9)

The argument that this is true for us relies on our limitations, all of which we can easily add to the preanalytic data (that man is mortal, that man has finite storage capacity17). Obviously, compositionality is here established by an alleged transcendental argument. We are finite beings but understand infinitely many sentences. The only way this is possible is that our language is compositional. This argument is sometimes called the argument from learnability.

Argument 3 (The Argument from Learnability)
(A3-1) We are able to master infinitely many sentences with different meanings.
(A3-2) We are finite beings.
(A3-3) A non-compositional language with that many sentences is unlearnable for finite beings.
(A3-4) Our language is compositional.
16 Other such arguments were put forward by Noam Chomsky (1980, pp. 76–78) or Jerry Fodor (1987, pp. 147–153).
17 Davidson was aware of this, of course. The quote continues: 'This argument depends, of course, on a number of empirical assumptions: for example, that we do not at some point suddenly acquire an ability to intuit the meanings of sentences on no rule at all; that each new item of vocabulary, or new grammatical rule, takes some finite time to be learned; that man is mortal.' (Davidson, 1967, p. 9)
Some have remarked that it is not clear that we really master infinitely many sentences, but only potentially infinitely many. But that premise is maybe not even necessary for the argument to go through.18 It might be sufficient to establish that we understand far more sentences than we could have learned. To see this, consider the following clearly limited case (taken from Grandy, 1990), for which we only consider two noun phrases flanking a transitive verb, as in

S 5. An iguana frightened a tiger.

If we only consider a number of, say, 200 nouns and 50 transitive verbs, we get two million sentences of that form, without even variations in the articles or the tense of the verb. Learning another noun would add 20,000 more sentences. Thus it seems not too relevant for the argument from learnability that we actually cannot master infinitely many sentences.

So far this seems to be a good argument for the compositionality of natural language.19 However, if we remember our lesson from argument 1, we might ask ourselves what reason we have to believe that premise one of argument 3 is true. That is, what reasons do we have to believe that our language has an infinite or extremely large finite number of meaningful sentences (with different meanings)? As Peter Pagin has argued (1999, 2002), whatever justification we have to believe that natural language is 'very rich' (has a large finite or infinite number of sentences) will undermine the soundness of the argument:

If we are allowed to assume that natural languages are infinitely rich, then we do have a good argument for compositionality. [...] The problem is that we cannot just make the assumption. The claim that natural languages are infinitely rich is a strong claim about natural languages. It would be question begging to simply assume that it is correct. And it is not something we get directly from observations of natural language speakers. It needs a more theoretical justification. (Pagin, 2002, p. 164)

But what could such a theoretical justification look like? It might, of course, be the case that we think that natural language is very or even infinitely rich because it is compositionally structured20 – that would be the question-begging horn of the dilemma. The other horn would be that we believe on the basis of some other feature of our language that it has infinitely many sentences. But in this case, the fact that the language is very rich can no longer support the claim that the language is compositional, for in this case we have a reason to believe that some
18 See also Pagin (2002).
19 ... at least for a part of it. Of course the language might still have some non-compositional elements. But this is not at issue here.
20 ... as we just did in the example taken from Grandy.
other mechanism (other than compositionality) is responsible for the richness of our language (generates infinitely many sentences):

Now clearly, if we could justify the assumption that speakers speak an infinitely rich language without compositional structure, then we would know in advance that the learnability argument is flawed, because then, if speakers learn such a language from each other we know that it cannot be correctly explained by means of compositionality. On the other hand, if we can justify directly the assumption that speakers do speak infinitely rich compositional languages, then we already have an argument for compositionality, and need not add any extra consideration about learnability. (Pagin, 2002, p. 164)

Very similar considerations apply to a second standard argument, the argument from new sentences (or 'the argument from understanding'). One often finds this argument attributed to Frege (for example in Pagin, 2002, p. 166):

It is astonishing what language can do. With a few syllables it can express an incalculable number of thoughts, so that even a thought grasped by a terrestrial being for the very first time can be put into a form of words which will be understood by someone to whom the thought is entirely new. This would be impossible, were we not able to distinguish parts in the thought corresponding to the parts of the sentence, so that the structure of the sentence serves as an image of the structure of the thought. (Frege, 1977, p. 55)

This argument by Frege can be reconstructed in two different ways. If we reconstruct it as an argument from new sentences, it will be possible to give the same reply as we did in response to the argument from learnability:

Argument 4 (The Argument from New Sentences)
(A4-1) Our language has very many sentences with a predetermined meaning.
(A4-2) When we encounter a new sentence of our language that we have never encountered before, we are nonetheless able to understand it.
(A4-3) If our language were non-compositional, that could not be explained.
(A4-4) Our language is compositional.

Again it could be asked on what basis we believe premise (A4-1). Why should we think that our language has very many sentences with a predetermined meaning? If our reason for believing this is that our language is compositional, why do we need the argument? If, on the other hand, we know of a
mechanism that determines the meaning of very many sentences without being compositional, the argument is flawed. But, as we've said already, this is only one way to understand Frege's little argument given above. An alternative interpretation is the Argument from Communication, which is championed by Peter Pagin (1999, 2002, 2003a). The argument is basically an inference to the best explanation (or to the 'only reasonable' explanation).21 The preanalytic data it departs from is the observation that communication very often succeeds. Assume that I happen to be in the city centre of Düsseldorf and for some reason or other do not feel like visiting shoe stores for the rest of the afternoon. To that effect I want to ask my girlfriend (who feels like visiting shoe stores for the rest of the afternoon) to meet me at the town hall half an hour after the shops are closed. I utter the German sentence

S 6. Wir treffen uns'ne halbe Stunde nach Geschäftsschluss am Rathaus.

Half an hour after the shops are closed on the same day I am standing in front of the town hall; incidentally, my girlfriend happens to be there too. A good explanation for this coincidence is that my girlfriend grasped the thought that I wanted to express with S6. We can assume that she had never heard exactly the same sentence before and that I had never expressed anything with S6 until then; we are both without any previous experience with S6, and no one has explained to us that uttering S6 would be a good way to secure meeting each other in front of a town hall half an hour after the shops are closed.

How come then, that by using [S6] I managed to convey my thought? Could it be hints in the context, or charity of interpretation, or empathy? That any of them, or any combination, could provide the solution, save under special circumstances, is wildly implausible. The only workable explanation is the compositionality explanation, or, more cautiously, an explanation which involves compositionality. (Pagin, 2002, p. 167)

This time the reconstruction of the argument would look somewhat like this:

Argument 5 (The Argument from Communication)
(A5-1) Very often, speakers manage to communicate thoughts by way of uttering sentences that the other party in the communication situation had never heard before.
(A5-2) The only feasible explanation for the success of communication in very many of these cases is that the language is compositional.
21 In more recent versions of his argument Pagin makes a weaker claim. He now argues that a compositional semantics offers (in relevant cases) a less complex interpretation method and that it is therefore preferable.
(A5-3) Therefore the language is compositional.

The preanalytic data stated in (A5-1) is not a premise we would need a theoretical reason for: (A5-1) is an empirical fact that we can (and do) observe. It does not involve any kind of extrapolation or a tacitly built-in assumption of compositionality. At least with respect to the first premise, this argument looks much better than the ones we considered so far. However, this time the trouble is with the transcendental principle, (A5-2). The problem we shall discuss now does, of course, also obtain with (A3-3) and (A4-3) (thus, even if the Pagin response to arguments 3 and 4 did not convince you, the following might). All these premises assume that the only possible way to learn, produce, or understand very (or infinitely) many sentences with different and novel meanings is by way of a compositional semantics. Again, this is very clearly not true.

Consider an argument championed by Markus Werning (2004). If we add a rule for holophrastic quotation to a compositional fragment of English and assume that this fragment of English has synonymous expressions, the language will remain productive, of course, but will cease to be compositional if the semantics for holophrastic quotation assigns the expression itself as the semantic value of a quoted expression. To this end, we simply add a syntactic rule for quotation that puts quotation marks around every expression of the language:

D 4. q: T → T, s ↦ 's'

The semantic evaluation is then rather simple; we only have to take the expression itself as the meaning of the expression with quotation marks:

D 5. µ(q(s)) = s

Let us assume that 'Lou and Lee are brothers' is synonymous with 'Lee and Lou are brothers' in our compositional fragment of English. Clearly, the extended fragment will be productive, simply because quotation can be iterated (thus quotation even adds to the productivity), and the meaning function obviously is computable. For the proof of the non-compositionality of the extended fragment
we will use the corner quotes '⌜' and '⌝' for meta-linguistic quotation:

(1) µ(⌜Lou and Lee are brothers.⌝) = µ(⌜Lee and Lou are brothers.⌝) [ass.]
(2) Lou and Lee are brothers. ≠ Lee and Lou are brothers. [ass.]
(3) µ(⌜'Lou and Lee are brothers.'⌝) = µ(q(⌜Lou and Lee are brothers.⌝)) = ⌜Lou and Lee are brothers.⌝ [D 5]
(4) µ(⌜'Lee and Lou are brothers.'⌝) = µ(q(⌜Lee and Lou are brothers.⌝)) = ⌜Lee and Lou are brothers.⌝ [D 5]
(5) µ(q(⌜Lou and Lee are brothers.⌝)) ≠ µ(q(⌜Lee and Lou are brothers.⌝)) [2, 3, 4]

If we now assume that the language in question is compositional, there clearly should be a semantic counterpart function µ_q for the syntactic operation q:

(6) µ(q(⌜Lou and Lee are brothers.⌝)) = µ_q(µ(⌜Lou and Lee are brothers.⌝)) [comp.]

But then, substitutivity of identicals and another application of compositionality directly lead to an inconsistency:

(7) µ(q(⌜Lou and Lee are brothers.⌝)) = µ_q(µ(⌜Lee and Lou are brothers.⌝)) [subst., 1]
(8) µ(q(⌜Lou and Lee are brothers.⌝)) = µ(q(⌜Lee and Lou are brothers.⌝)) [comp.]
(9) ⊥ [5, 8]
Therefore, the extended fragment of English is not compositional.22
22 Of course, there are other ways to analyse quotation, some of which are indeed compositional. See also Markus Werning's paper in this volume.
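The counterexample can be replayed mechanically. In the Python sketch below, a toy two-sentence fragment in which the shared meaning of the two synonymous sentences is an arbitrary stand-in, the sentences receive one and the same meaning while their quotations do not; so no function defined on meanings alone can serve as the semantic counterpart of q:

```python
# A sketch of D 4 and D 5 for a two-sentence toy fragment. The unquoted
# sentences are assigned one shared stand-in meaning (they are synonymous);
# a quoted expression means the quoted expression itself.

s1 = "Lou and Lee are brothers."
s2 = "Lee and Lou are brothers."

def q(s):
    # D 4: holophrastic quotation
    return f"'{s}'"

def mu(expr):
    # meaning function for the extended fragment
    if expr.startswith("'") and expr.endswith("'"):
        return expr[1:-1]          # D 5: a quotation means the quoted expression
    return frozenset({"brothers", "Lou", "Lee"})   # invented shared meaning

assert mu(s1) == mu(s2)        # premise (1): the sentences are synonymous
assert mu(q(s1)) != mu(q(s2))  # step (5): their quotations differ in meaning
# Hence no mu_q with mu(q(s)) == mu_q(mu(s)) can exist: it would have to map
# the single value mu(s1) (= mu(s2)) to two different values.
```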
But of course, this language is learnable in some way (you just did), certainly productive, and obviously not compositional.

Explaining mutual understanding of novel sentences

As a more detailed counterexample to the necessity of compositionality for the explanation of communication, understanding, and productivity, consider Stephen Schiffer's 'Harvey' (1987, pp. 192–207). Schiffer designed his example to show that an explanation of mutual understanding does not even involve reference to semantics. As we will see, this isn't quite correct. However, Schiffer's claim that such an explanation does not need to refer to compositionality can be vindicated. Harvey is an information processor whose ability to understand novel sentences uttered in (a fragment23 of) German can be explained without the further assumption that German is compositional. Harvey's beliefs are represented in his belief-box B in a lingua mentis M (we will take reverse English to represent M for the moment), of which, again, nothing is assumed with respect to compositionality. For every possible belief of Harvey there is exactly one sentence s of M such that Harvey has the belief iff s is in B. What belief is in B is determined by a belief-forming mechanism which takes sensory inputs and the present contents of B to produce new beliefs in B. The unique set of sensory inputs and contents of B sufficient for an inner sentence of M to be in B is that sentence's conceptual role. Given this construction, the question of whether Harvey's understanding of novel German utterances can be explained without recourse to compositionality is then a question of whether the belief-forming mechanism can operate given the sensory input of an utterance in German without relying on the compositionality of German at any step. In more detail, the question is whether Harvey can arrive from a sentence of M embedding a representation of the sounds of the utterance, like

S 7. Leinad derettu 'Wir treffen uns'ne halbe Stunde nach Geschäftsschluss am Rathaus.'

via

S 8. Leinad dias taht ew teem flah na ruoh retfa eht spohs era desolc ta eht nwot llah.

at

23 We will here only consider German without indexicals and without ambiguities. For the extended argument with a fragment of English including these features, see Schiffer (1987, pp. 200–205).
Figure 3: Schiffer’s ‘Harvey’ while processing the sentence I uttered.
S 9. Leinad's ecnarettu si eurt ffi ew teem flah na ruoh retfa eht spohs era desolc ta eht nwot llah.

To make this work, it has to be shown that the conceptual roles of 'dias taht' and 'eurt' do not necessarily presuppose a compositional semantics of either German or M. Schiffer shows that this is the case if there is a recursive function f from structural descriptions of sound sequences that are well-formed formulae of German to sentences of M, such that (i) f is definable in terms of formal features of the expressions in its domain and range, without reference to any semantic features of any expressions in either M or German; and (ii) if the referent of the structural description δ can be used to say that p, then f(δ) would token the belief that p. It seems there can well be functions that operate purely syntactically and need not assume the compositionality of M or German. In fact, it seems that if M represented the structural descriptions of well-formed formulae of German as quoted German sentences and Harvey's language of thought were German (instead of reverse English), the relevant conceptual roles would be trivial and M clearly could be non-compositional. Again using '⌜' and '⌝' for meta-linguistic quotation, we can stipulate the conceptual roles for 'hat gesagt' and 'wahr' easily (where 'σ' ranges over sentences of inner and outer (fragmentary) German, and 'Σ' is a structural description of the sentence σ obtained by (some sort of) quotation):

D 6 (Conceptual role of 'hat gesagt'). If the sentence ⌜α äußerte Σ⌝ ('α uttered Σ') is in Harvey's B-box, then, ceteris paribus, so is ⌜α hat gesagt, dass σ⌝ ('α said that σ').

D 7 (Conceptual role of 'wahr'). If the sentence ⌜α hat gesagt, dass σ⌝ is in Harvey's B-box, then so is ⌜Was α gesagt hat (nämlich dass σ) und damit αs Äußerung ist wahr gdw. σ⌝ ('What α said (namely that σ), and with it α's utterance, is true iff σ').

Is this a sufficient explanation of the explanandum? That depends on what we take the explanandum to be. If mutual understanding is what is to be explained, the explanation is not yet satisfying. It does explain why Harvey arrives at some interpretation of the uttered sentence, but it does not yet explain why he arrives at the correct interpretation, let alone why he reliably arrives at the correct interpretation. As Pagin (2003b) has emphasized, when we seek an explanation of mutual understanding, we are interested in the latter: an explanation of why the hearer arrives at the correct interpretation and why he does so reliably. But this is not explained by Schiffer, who so far has only explained why Harvey arrives at a certain sentence in his B-box, given a certain input. This is still compatible with complete miscommunication or with a merely accidentally correct interpretation of the utterance.
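The purely syntactic character of the function f that Schiffer needs can be illustrated with a small Python sketch. Everything here, including the word-by-word table that stands in for the recursive translation step, is invented for illustration and is not Schiffer's own formulation:

```python
# A sketch, in the spirit of Schiffer's Harvey, of a recursive function f
# that maps structural descriptions of German utterances to sentences of M
# (here: reverse English). It consults only the form of its input: a finite
# word-by-word table and string reversal, never any meanings.

TABLE = {"wir": "we", "treffen": "meet", "uns": "us",
         "am": "at the", "Rathaus": "town hall"}   # invented mini-lexicon

def rev(text):
    # M is reverse English: each word is spelled backwards
    return " ".join(w[::-1] for w in text.split())

def f(speaker, quoted_german):
    # purely formal: table lookup plus reversal, as in the step from S7 to S8
    english = " ".join(TABLE[w] for w in quoted_german.split())
    return f"{rev(speaker)} dias taht {rev(english)}"

print(f("Daniel", "wir treffen uns am Rathaus"))
# -> leinaD dias taht ew teem su ta eht nwot llah
```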
Note that it wouldn't even be sufficient if we had an explanation of successful communication in terms of (a) a recursive procedure by which the hearer comes to his belief about the content of the utterance, (b) a recursive procedure by which the utterance was composed in accordance with a belief of the speaker, and (c) some connecting fact between the interpretation of the hearer and the intention of the speaker for this utterance token. What we are really looking for, when we ask for an explanation of successful communication, is a deductive-nomological or rather deductive-statistical explanation of a lawlike regularity, viz. that utterance tokens normally get interpreted correctly.24 To provide such an explanation, it is necessary to complete Schiffer's explanation with what Pagin calls a 'Content theory', a theory that assigns content to neural sentences on the basis of their syntactic properties in the neural language of interpreter and speaker, respectively. The Content theory would have to be systematic in the following sense:

it contains one or more general lawlike statements that relate the syntactic properties of neural sentences with their semantic properties. What the Content theory assigns as semantic properties, like truth conditions, to a particular neural sentence then follows from those general lawlike statements together with statements about the particular syntactic properties of the sentence, in conformity with the deductive-nomological or deductive-statistical models of scientific explanation. (Pagin, 2003b, p. 44)

Given such a Content theory, we could explain why a certain thought of the speaker was represented with certain syntactic properties in his neural language, how this led to an utterance of a sentence in the public language, how hearing it caused Harvey to have a neural representation of this sentence with certain syntactic properties, and how this sentence of his neural language eventually represented a thought that matched the thought of the speaker such that it counts as a correct interpretation of the utterance. A theory that allows for explanations like these is a theory about the meaning of the public language that speaker and hearer use to communicate their thoughts. Part of the theory would explain how syntactic properties of public-language sentences connect with thought contents of speakers and interpreters. The lawlikeness of the theory, together with the contingent fact that speaker and interpreter are finite beings, assures that such a theory would induce a systematic assignment of meanings, determined by the syntactic structure of public-language sentences. However, in this way we can only establish that such a theory must induce a systematic meaning assignment if there is to be a complete explanation of communicative success. This does not establish that this meaning assignment must be compositional, as we will
24 For the type of explanation involved, see Cohnitz (2002). Note that the explanations asked for are explanations of the robustness of a certain phenomenon, rather than a mere explanation of a phenomenon token.
show in the next paragraph; nor does this argument establish that there is such a theory. It might well be that we are interested in the explanation of (apparent) phenomena although there simply is no (complete) explanation available.25

If the difference between mere systematicity and compositionality is well defined, it seems that the transcendental arguments that rest on quasi-a priori considerations are all26 doomed to fail. What they might establish is merely some kind of 'grounded recursiveness' (as Pelletier, 1994, would call it27) or systematicity (as defined below). It is sufficient for explaining novelty or productivity if there is some recursive procedure by which we get to the meaning of new expressions. A recursive procedure does not imply that the language is compositional; it is enough if the language is systematic.

Compositionality and mere systematicity

Consider the following example (which is borrowed from Peter Pagin): Suppose we have a language L1 that consists of expressions generated from the two atomic expressions α, β and the operator σ. α and β are grammatical terms, and so, for any grammatical term t, is σ(t). On the set of grammatical terms we define
25 For a discussion, see Cohnitz (2002). In such cases the 'phenomenon' is left unexplained, at least in the way it was described. It might then turn out that we must revise the statement describing the phenomenon (turning an apparent non-statistical law into a statistical law, for example). Sometimes, however, there just might be no explanation, simply because we have been mistaken about the nature or even the existence of the explanandum. In the example here, the trouble might be that there is no complete explanation to be had, because what we described as the Content theory might presuppose that we can reduce the mental to the physical, and it might turn out that we cannot do so (Pagin, 2003b). But even in this case, a systematic meaning assignment for the public language could explain how it is at all possible that finite beings often have communicative success (Pagin, 2003b). This would not suffice as a complete explanation, but as a 'how-possibly' explanation. For the latter notion see Schurz (1995), Hempel (1965).
26 The only argument I know of that is not discussed here but might also be counted as quasi-a priori is the 'argument from systematicity'. The trouble with that argument is that it is quite unclear what it is supposed to prove and what exactly its premises are. For a discussion see Gendler Szabó (2004) and the paper by Markus Werning in this volume.
27 The difference between Pelletier's notion of 'groundedness' and our notion of 'systematicity' defined below is basically that groundedness allows that (aspects of) the context of an utterance may also be part of the recursive procedure by which the meaning of a complex expression is determined, whereas systematicity is defined without mentioning context. We will not go into the question of exactly what is contributed by the context of an utterance and of exactly how that could be reconciled with our notion of systematicity, but leave it for some other occasion.
an enumeration with α as its initial term and the successor operation S:

D 8 (Successor operation S).
1. S(α) = β,
2. S(β) = σ(α),
3. S(σ(t)) = σ(S(t)).

On the side of the semantics for L1, we have a domain of meanings M which is inductively defined in the following way: there is one basic concept, lisa, as well as two primitive functional concepts, Mother(x) and Father(x), which are the concepts of the mother or – in the second case – the father of what x is a concept of. The following inductively defines the rank of a concept:

D 9 (Rank of a concept).
1. rank(c_i) = 0 if c_i is lisa,
2. rank(c_i) = 1 + rank(c_1) if c_i is Father(c_1) or Mother(c_1).

Given the rank of concepts, we can define a total ordering on the set of meanings M:

D 10 (Ordering of meanings).
1. c_1 < c_2 if rank(c_1) < rank(c_2),
2. c_1 < c_2 if rank(c_1) = rank(c_2), c_1 = Mother(c_3) and c_2 = Father(c_4), for some c_3 and c_4,
3. c_1 < c_2 if rank(c_1) = rank(c_2), c_1 = X(c_3), c_2 = X(c_4) and c_3 < c_4, for some c_3 and c_4, where X = Father, Mother.

On top of that we can now define a successor operation O:

D 11 (Successor operation O). O(c_1) = c_2 iff c_1 < c_2 and there is no c_i ∈ M with c_1 < c_i < c_2.

and, finally, a meaning function µ:

D 12 (Meaning function).
1. µ(α) = lisa,
2. µ(β) = lisa,
3. µ(S(t)) = O(µ(t)), for t ≠ α.
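Definitions 8 to 12 can be executed directly. The following Python sketch (with 'a', 'b' and 's' standing in for α, β and σ) reproduces the computation discussed below: µ(α) = µ(β) = lisa, and yet µ(σ(α)) = Mother(lisa) while µ(σ(β)) = Father(lisa):

```python
# A sketch of Pagin's merely systematic language L1 (D 8 - D 12), with the
# ASCII names 'a', 'b', 's' standing in for the symbols alpha, beta, sigma.

def S(t):
    # D 8: successor operation on grammatical terms
    if t == "a": return "b"
    if t == "b": return "s(a)"
    return f"s({S(t[2:-1])})"          # clause 3: S(s(t)) = s(S(t))

def pred(t):
    # the unique term u with S(u) = t (inverse of S; needed to apply D 12)
    if t == "b": return "a"
    if t == "s(a)": return "b"
    return f"s({pred(t[2:-1])})"

def concepts_of_rank(n):
    # the concepts of rank n (D 9), listed in the order defined by D 10
    if n == 0: return ["lisa"]
    prev = concepts_of_rank(n - 1)
    return [f"Mother({c})" for c in prev] + [f"Father({c})" for c in prev]

def O(c):
    # D 11: the successor concept in the total ordering
    n = c.count("(")                   # rank = nesting depth
    ordered = concepts_of_rank(n) + concepts_of_rank(n + 1)
    return ordered[ordered.index(c) + 1]

def mu(t):
    # D 12: mu(a) = mu(b) = lisa; mu(S(u)) = O(mu(u)) otherwise
    if t in ("a", "b"): return "lisa"
    return O(mu(pred(t)))

assert S("b") == "s(a)" and S("s(a)") == "s(b)"
print(mu("a"), mu("b"))                # lisa lisa
print(mu("s(a)"), mu("s(b)"))          # Mother(lisa) Father(lisa)
```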
The meaning function for L1 is obviously well-defined, and µ(t) can be computed for an arbitrary grammatical term t on the basis of its composition. However, µ is not defined by means of recursion over syntax in the normal sense (t does not need to be an immediate constituent of S(t), e.g. in S(β) = σ(α)), and µ is clearly not compositional. Although µ(α) = µ(β), µ(σ(α)) ≠ µ(σ(β)), since µ(σ(α)) = µ(S(β)) = O(µ(β)) = O(lisa) = Mother(lisa), whereas µ(σ(β)) = µ(S(σ(α))) = µ(S(S(β))) = O(µ(S(β))) = O(O(µ(β))) = O(O(lisa)) = O(Mother(lisa)) = Father(lisa). µ is a merely systematic meaning function.

The notions of systematicity, compositional systematicity and mere systematicity (as properties of meaning functions) can be defined in the algebraic framework we used in (2.) above, by introducing a slight modification.28 We will now consider a language algebra S_L of a language L as a triple ⟨E, Σ, A⟩_L, where E is the set of expressions of L, Σ a set of functions, each σ ∈ Σ a (usually partial) function E^k → E for some k, and A a subset of E such that Σ(A) = E. Σ(A) is the set of expressions generated from A by means of Σ. The term algebra ST_L is also a triple ⟨T, Σ*, A⟩_L, where T is the set of terms of S_L denoting elements of E, Σ* a set of operators associated with Σ, each σ* ∈ Σ* a (usually partial) function T^k → T for some k. The elements of Σ and Σ* are mapped one-one by *. The set of terms T is then defined inductively together with the valuation function val:

D 13 (Set of terms T).
1. every expression e ∈ A is in T and is an atomic term; val(e) = e,
2. if t_1, ..., t_n are in T and σ_i ∈ Σ is defined for val(t_1), ..., val(t_n), then σ_i*(t_1, ..., t_n) is in T, and val(σ_i*(t_1, ..., t_n)) = σ_i(val(t_1), ..., val(t_n)),
3. if σ_i*(t_1, ..., t_n) and σ_j*(u_1, ..., u_m) are both in T and σ_i*(t_1, ..., t_n) = σ_j*(u_1, ..., u_m), then n = m, σ_i* = σ_j* and t_i = u_i, 1 ≤ i ≤ n.

In the example, µ was computable in principle by a finite being, although the set of expressions of L1 was infinite. If a meaning function allows for that, we will call it systematic. To define this property, we will first have to define systematicity for language and term algebras.

D 14. A language algebra S_L generates the set E of expressions of L from a subset A of E. S_L (and its associated term algebra ST_L) will be called systematic iff Σ_L and A_L are both finite and E_L is infinite.

In the example above, we had an algebra S_L1 = ⟨E, {S}, {α, β, σ}⟩_L1. Consider the alternative grammatical algebra for L1, G_L1 = ⟨E, {σ}, {α, β}⟩_L1.
28 The following is also in large part due to Peter Pagin.
Both algebras are systematic in the sense defined in (D 14). Grammatical algebras are a special case of such systematic algebras and are constrained by the grammaticality restrictions of the language. Term algebras which correspond to grammatical language algebras we will call grammatical term algebras (see Hodges, 2001). Given these notions, we can define systematicity, compositional systematicity and mere systematicity of meaning functions in the following way:

D 15 (Systematicity). A meaning function µ is systematic iff µ is a homomorphism from a systematic term algebra ST_L = ⟨T, Σ*, A⟩_L into a meaning algebra M of some kind.

D 16 (Compositional systematicity). A meaning function µ is compositionally systematic iff µ is a systematic meaning function from a grammatical term algebra GT_L = ⟨T, Σ*, A⟩_L.

D 17 (Mere systematicity). A meaning function µ is merely systematic iff µ is a systematic meaning function but is not compositional.

Compositionality, as a property of meaning functions, requires that the recursive procedure by which utterances are produced and interpreted is specific: it always looks only at the semantic values of the syntactic parts of an expression and the way they are combined syntactically, and computes the semantic value of the complex expression from them. Mere systematicity, by contrast, would allow, for example, for a recursive procedure that requires looking at other simple or complex expressions instead, computing the meaning of the original expression from the syntactic parts of some other sentence and the way they are combined. Therefore compositional systematicity is not established by the argument from communication, whereas systematicity might well be.

Compositionality a posteriori

If compositionality is a substantial claim that is not a priori, we should have some idea how to find evidence in support of compositionality or against it. Compositionality is a thesis about human languages. It is not a thesis about all languages, not even about all languages that have certain formal features. I agree with Gendler Szabó (2000a) that compositionality is intended to hold for all possible human languages, where 'possible' is not meant as it was above, i.e. 'possible' in the sense of 'conceivable', but 'possible' in the sense that we human beings could have developed that language to communicate, given our expressive needs and the structure of our brains. Consider the following necessary conditions for a possible human language:

D 18 (Possible human language). A possible human language must be at least (i) a language suitable for the expression and communication of a wide range of
thoughts, and (ii) a language that can be learned by human beings under normal social conditions as a first language.

For an illustration of these necessary conditions, Gendler Szabó explains that the language of traffic signs and the language of pure set theory are not possible human languages. Both languages violate (i) in that you can't say in set theory or in the language of traffic signs that you have a headache. On the other hand, a language with only two phonemes and a language in which each expression is at least a hundred phonemes long can't be possible human languages, because the former's expressions are too easily confused with one another, whereas the expressions of the latter would be too hard to keep in mind; thus both would violate (ii). What the detailed necessary and sufficient conditions for possible human languages are is otherwise29 open to scientific investigation. We do not yet know which languages are possible in the sense given. Compositionality might then be understood as quantifying over all possible human languages (for more details, see Gendler Szabó, 2000a, 2000b):

PoC 4. For every possible human language L and for every complex expression e in L, the meaning of e in L is determined30 by the meanings of the constituents of e in L and by the structure of e in L.

Clearly, positive support for compositionality depends on whether it is necessary, for us to be able to use and learn such a language, that this language has certain formal features.

Argument 6 (Advanced Argument from Communication)
(A6-1) Very often, speakers manage to communicate thoughts by way of uttering sentences that the other party in the communication situation had never heard before.
(A6-2) Given the architecture of the human mind, this can only be explained if the native language of those speakers is compositional.
(A6-3) Natural languages (that can serve as a first language) are compositional.

This time the argument does not rest on alleged quasi-a priori assumptions but on outright empirical claims which are far from being self-evident. This holds
29 Natural languages will have evolved gradually, so, presumably, there will have been early stages in which you couldn't say that you have a headache in a language that we might nevertheless wish to count as a 'possible human language'. Which criteria we will eventually use for the demarcation is not our concern in this paper.
30 Where 'determined' should be understood in accordance with our notion of compositional systematicity.
in particular for (A6-2). I do not know whether it is true or false. What we would have to investigate to assess its truth value is whether a language with holophrastic quotation as defined above is a language that can be learned by us as a first language, and is not a language that we can only learn parasitically upon our mastery of English.31 Compositionality is then a significant but extremely general claim about human beings.
4 Concluding Remarks
If compositionality is a very general empirical claim, in the way that Gendler Szabó suggests, we can accommodate the following findings:

• We can explain why compositionality is believed to be a general phenomenon rather than a phenomenon of only single languages, like English or German. It is general because the compositionality principle applies to all possible human languages.

• We can explain why not all languages seem to us to be compositional. Some artificial languages are not compositional, and some technical extensions of human languages are not compositional.

• We can explain why no a priori argument suffices to establish the truth of the compositionality principle.

One might think that we cannot explain why the principle of semantic compositionality seemed to us to be true, to some of us even to be conceptually true, if the principle is so substantial and the positive reasons for the principle so far away from our actual epistemic situation. Here is Gendler Szabó:

If compositionality turns out to be true, it will seem even more puzzling: why did we have the inclination to believe in it before the real evidence came in? How is it that, even though we are exceedingly uncertain what meaning is, we are convinced that, whatever it is, the meaning of a complex expression supervenes on the meaning of its parts and of its structure? (Gendler Szabó, 2000a, p. 150)

What is called for is a psychological explanation, but I don't see why it should be hard to find one. Why, for example, shouldn't the fact that we understand sentences by understanding their parts and the way they are combined, together with a
sufficient portion of human egocentrism, be enough to account for that, if compositionality is true?32

32 I'm not saying that there is another argument to the effect that the best explanation for our intuition that The Principle is true is its truth. I'm only saying that if it is true, then that might be a good starting point for explaining our intuition, and that explanation might then be the best explanation for our intuition because the explanans is true.

Acknowledgements

I would like to thank Wilfrid Hodges, Theo Janssen, Jim Kilbury, Jeff Pelletier, Christoph Rumpf, Markus Werning, and Dag Westerståhl for very helpful discussions of the topics involved in this paper. Special thanks go to Peter Pagin for his comments on earlier versions and his help with part (3.) of this paper.

References

Baker, G., & Hacker, P. (1980). Wittgenstein: Understanding and meaning. Oxford: Blackwell.
Baker, G., & Hacker, P. (1984a). Frege: Logical excavations. Oxford: OUP.
Baker, G., & Hacker, P. (1984b). Language, sense & nonsense. Oxford: Blackwell.
Benthem, J. van. (1984). The logic of semantics. In F. Landman & F. Veltman (Eds.), Varieties of formal semantics (pp. 55–80). Dordrecht: Foris.
Benthem, J. van. (1986). Essays in logical semantics (Vol. 29). Dordrecht: Reidel.
Burge, T. (1986). Frege on truth. In L. Haaparanta & J. Hintikka (Eds.), Frege synthesized (pp. 97–154). Dordrecht: Kluwer.
Chisholm, R. (1978). What is a transcendental argument? Neue Hefte für Philosophie, 14, 19–22.
Chomsky, N. (1980). Rules and representations. Oxford: Basil Blackwell.
Cohnitz, D. (2002). Explanations are like salted peanuts: On why you can't cut the route towards further reduction. In C. Nimtz & A. Beckermann (Eds.), Argument und Analyse: Proceedings of GAP4 (pp. 22–36). Paderborn: Mentis.
Currie, G. (1982). Frege: An introduction to his philosophy. Totowa: Barnes & Noble.
Davidson, D. (1984). Inquiries into truth and interpretation. Oxford: Clarendon Press.
Dever, J. (1999). Compositionality as methodology. Linguistics and Philosophy, 22, 311–326.
Dummett, M. (1981a). Frege: Philosophy of language. Cambridge (Mass.): Harvard UP.
Dummett, M. (1981b). The interpretation of Frege's philosophy. London: Duckworth.
Fodor, J. (1987). Psychosemantics. Cambridge (Mass.): MIT.
Fodor, J. (1998). Concepts: Where cognitive science went wrong. Oxford: Clarendon Press.
Fodor, J., & Lepore, E. (2002). The compositionality papers. Oxford: Clarendon Press.
Frege, G. (1977). Logical investigations. Oxford: Blackwell.
Gendler Szabó, Z. (2000a). Compositionality as supervenience. Linguistics and Philosophy, 23, 475–505.
Gendler Szabó, Z. (2000b). Problems of compositionality. New York: Garland.
Gendler Szabó, Z. (Fall 2004). Compositionality. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy. Stanford: The Metaphysics Research Lab, CSLI, Stanford.
Grandy, R. E. (1990). Understanding and the principle of compositionality. Philosophical Perspectives, 4, 557–572.
Haaparanta, L. (1985a). Frege's context principle. Communication and Cognition, 18, 81–94.
Haaparanta, L. (1985b). Frege's doctrine of being. Acta Philosophica Fennica, 39.
Hempel, C. G. (1965). Aspects of scientific explanation. London: The Free Press.
Hendriks, H. (2001). Compositionality and model-theoretic interpretation. Journal of Logic, Language, and Information, 10, 29–84.
Hodges, W. (1998). Compositionality is not the problem. Logic and Logical Philosophy, 6, 7–33.
Hodges, W. (2001). Formal features of compositionality. Journal of Logic, Language, and Information, 10, 7–28.
Janssen, T. M. V. (1986). Foundations and applications of Montague grammar, part 1: Philosophy, framework, computer science (Vol. 1). Amsterdam: CWI.
Janssen, T. M. V. (1997). Compositionality. In J. van Benthem & A. ter Meulen (Eds.), Handbook of logic and language (pp. 417–473). Amsterdam: Elsevier.
Janssen, T. M. V. (2001). Frege, contextuality and compositionality. Journal of Logic, Language and Information, 10, 115–136.
Kazmi, A., & Pelletier, F. J. (1998). Is compositionality formally vacuous? Linguistics and Philosophy, 21, 629–633.
Kluge, E.-H. (1980). The metaphysics of Gottlob Frege. Nijhoff.
Machery, E. (2001). Compositionnalité, composition conceptuelle et combinaison prototypique. In Cognito, Revue francophone internationale en sciences cognitives, 22, 5–26.
Montague, R. (1970). Universal grammar. Theoria, 36, 373–398.
Pagin, P. (1999). Radical interpretation and compositional structure. In U. Zeglen (Ed.) (pp. 59–74). London: Routledge.
Pagin, P. (2002). Rule-following, compositionality and the normativity of meaning. In D. Prawitz (Ed.), Meaning and interpretation (Vol. 55, pp. 153–181). Stockholm: Kungl. Vitterhets Historie och Antikvitetsakademien.
Pagin, P. (2003a). Communication and strong compositionality. Journal of Philosophical Logic, 32, 287–322.
Pagin, P. (2003b). Schiffer on communication. Facta Philosophica, 5, 25–48.
Partee, B. (2004). Compositionality in formal semantics. London: Blackwell.
Pelletier, F. J. (1994). The principle of semantic compositionality. Topoi, 13, 11–24.
Pelletier, F. J. (2001). Did Frege believe Frege's principle? Journal of Logic, Language, and Information, 10, 87–114.
Resnik, M. (1967). The context principle in Frege's philosophy. Philosophy and Phenomenological Research, 27, 356–365.
Resnik, M. (1976). Frege's context principle revisited. In M. Schirn (Ed.), Studien zu Frege III: Logik und Semantik (pp. 35–49). Stuttgart: Frommann-Holzboog.
Sandu, G., & Hintikka, J. (2001). Aspects of compositionality. Journal of Logic, Language, and Information, 10, 49–61.
Schiffer, S. (1987). Remnants of meaning. Cambridge (Mass.): MIT.
Schurz, G. (1995). Wissenschaftliche Erklärung: Ansätze zu einer logisch-pragmatischen Wissenschaftstheorie. Graz: dbv-Verlag.
Shwayder, D. (1976). On the determination of reference by sense. In M. Schirn (Ed.), Studien zu Frege III: Logik und Semantik (pp. 85–95). Stuttgart: Frommann-Holzboog.
Skorupski, J. (1984). Dummett's Frege. In C. Wright (Ed.), Frege: Tradition & influence (pp. 227–243). Oxford: Blackwell.
Sluga, H. (1977). Frege's alleged realism. Inquiry, 20, 227–242.
Sluga, H. (1980). Gottlob Frege. London: Kegan Paul.
Sluga, H. (1987). Frege against the Booleans. Notre Dame Journal of Formal Logic, 28, 80–98.
Werning, M. (2004). The compositional brain: A unification of conceptual and neuronal perspectives. Unpublished doctoral dissertation, Philosophical Faculty, Heinrich-Heine University Düsseldorf, Düsseldorf.
Westerståhl, D. (1998). On mathematical proofs of the vacuity of compositionality. Linguistics and Philosophy, 21, 635–643.
Westerståhl, D. (1999). Idioms and compositionality. In J. Gerbrandy, M. Marx, M. de Rijke, & Y. Venema (Eds.), JFAK: Essays dedicated to Johan van Benthem on the occasion of his 50th birthday. Amsterdam: Amsterdam University Press.
Wright, C. (1983). Frege's conception of numbers as objects. Aberdeen: Aberdeen UP.
Zadrozny, W. (1994). From compositionality to systematic semantics. Linguistics and Philosophy, 17, 329–342.
Fodor's Inexplicitness Argument

Reinaldo Elugardo
1 Introduction
Jerry Fodor has long held that natural language has no semantic content of its own apart from what it derives from thought.1 He argues that this asymmetry holds, in part, because thought is semantically compositional but language is not (Fodor, 2001).2 That last claim represents a major shift in his thinking, given that he once proclaimed: "We [= Fodor and Lepore] take the doctrine that natural languages are compositional to be, as one says in Britain, nonnegotiable" (Fodor & Lepore, 2001; bracketed expression added). The change in view is made all the more puzzling by Fodor's contention that both language and thought are productive and systematic and that these features are best explained in terms of a system's being compositional. So, in spite of the argument presented in Fodor (2001), it would appear that natural language is compositional. The challenge is to explain this apparent discrepancy.

The tension arises from two sources. The first is Fodor's belief that, since sentences express thoughts, compositionality requires that a sentence be explicit about the structure of the thought it expresses (Fodor, 2001, pp. 11–12).
Address for correspondence: University of Oklahoma, Department of Philosophy, Norman, Oklahoma, 73019–2006, USA. E-mail: [email protected].

1 See especially Fodor (1998) and Fodor (2001). One might wonder whether Fodor intends his thesis to apply to every natural language sentence or only to some. I suspect the latter, given the kinds of examples he gives. But, as I shall argue below, if his argument works for the sentences he discusses, then it will also work for many other kinds of natural language sentences, if not all.

2 Others have argued that natural language isn't semantically compositional; cf. Pelletier (1994a), Pelletier (1994b), and Schiffer (1986). Their arguments differ from Fodor's in that theirs involve premises that connect semantic meaning with truth-conditions. Fodor's argument does not depend on that at all. Grandy (1990) is a criticism of Schiffer's argument.
On the other hand, insofar as language is a productive and systematic system of representation, compositionality does not require linguistic explicitness – according to Fodor, it is not a requirement on linguistic productivity or systematicity that language be compositional; it is enough that language be related in some (albeit unsystematic) way to a compositional representational system, namely, thought. The second source of the tension is Fodor's Gricean belief that the semantic content of a sentence is the thought that a speaker expresses by her utterance of the sentence, which in turn is the utterance's content (Fodor, 2001, p. 11).

In this chapter, I want to reconsider the shift in his belief about linguistic compositionality and offer some reasons for rejecting it. I want to defend the following three claims. First, independent of whatever roles sentences serve in the communication of thoughts, compositionality does not require that complex symbols be explicit about their semantic contents. Second, ellipses do not constitute compelling evidence for the (alleged) non-compositionality of natural language, since the content of an utterance need not be the semantic content of the sentence uttered. Third, even if semantic content were the same as utterance content, ellipses might still not constitute compelling evidence for linguistic non-compositionality – that will depend on what the correct syntactic theory of ellipsis is.

This chapter consists of four sections. In the first section, I present Fodor's principle of compositionality and, in the second, I present his argument against linguistic compositionality. In the third section, I argue that compositionality imposes no explicitness-of-content constraint on complex symbols. The role of ellipses in Fodor's argument for the non-compositionality of language, and his view that the semantic content of a sentence is a communicated thought, are also examined in the third section. In the final section, I consider and respond to two objections to my arguments.
2 Fodor on Compositionality
Fodor states his version of the principle of compositionality in the following two passages:3

[A] Compositionality is the property that a system of representation has when (i) it contains both primitive symbols and symbols that are syntactically and semantically complex; and (ii) the latter inherit their syntactic/semantic properties from the former. (Fodor & Lepore, 2002, p. 1; italics added; cf. Fodor, 2001, pp. 6–7)
3 It is well known that the so-called "Principle of Compositionality" has several non-equivalent formulations. For an excellent survey of the different versions of the Principle of Compositionality, see Pelletier (1994a) and Szabó (2004).
[B] Compositionality . . . says . . . that the semantic value of a thought (/sentence) is inherited from the semantic values of its constituents, together with their arrangement. (Fodor, 2001, p. 6)

By "inherit", I take Fodor to mean built from – the idea is that the whole meaning of a complex symbol is built exclusively from the meanings of its constituents and their syntactic arrangement. Evidence for this interpretation can be found in his remark that "compositionality requires that the content of a thought contain all of the contents of its constituents" (Fodor, 2001, p. 11). It requires that "nothing belongs to a host concept except what it receives from its constituents", which Fodor says is the opposite of "reverse compositionality" ("nothing belongs to a constituent concept except what it contributes to its hosts") (Fodor, 2000, p. 371). Thus, Fodor holds a notion of compositionality that is stronger than the standard functional version ("the meaning of a complex symbol is a function only of the meanings of its constituent symbols and its syntactic structure"). Fodor's version is stronger because it entails – but the standard functional account does not – that the semantic content of a complex symbol is a structured whole. For present purposes, I will formulate Fodor's principle of compositionality thus:

Compositionality. The semantic content of each [non-idiomatic] complex symbol of a system of representation is built entirely from, and includes only, the contributed semantic values of its constituent symbols and is fixed by the way those constituents are arranged.

It follows that, on Fodor's view, the component meanings and the constituent structure of a complex symbol jointly determine its meaning and that nothing else does. So, if we have evidence that the component meanings and constituent structure of a complex linguistic symbol are insufficient for determining the symbol's meaning, or that some third factor is essential to determining its meaning, then we have evidence of non-compositionality – a point that will figure importantly in Fodor's argument.

Before I discuss Fodor's position on the compositionality of natural language, I want to make two additional important points. First, as stated, Compositionality is neutral on what semantic rules can operate on what parts of a sentence's form, and it is also neutral on what counts as a syntactic constituent, a syntactic structure, etc. Nor does it require that every meaningful constituent of a meaningful complex symbol have a semantic value. It requires only that whatever content a complex symbol has, it gets exclusively from the semantic values of its constituents plus syntax. It is entirely possible, then, for a complex form-content pairing to be compositionally fixed even though the semantics of the system of which it is a symbol doesn't assign a semantic value to one of its
constituent symbols. I will explore this idea more fully in the third section.

Second, Compositionality is a thesis about the complex symbols of a system of representation. So, if natural language is compositional, it is so qua system-of-symbols. We are assuming, then, an expression (type)-centered picture of language rather than a speech-act-centered notion of language. That is an important distinction, since the principle says nothing about the meaning or content of a speaker's utterance or, in the case of thought, the content of a thinker's internal Mentalese tokens. Nor does the principle guarantee that, in fixing the constituent structure of a complex linguistic symbol and its component meanings, one will have thereby fixed what a speaker would have conventionally stated or expressed in using the symbol to convey some thought, even if the component meanings are concepts. Thus, one can't legitimately infer the non-compositionality of language qua representational system-of-symbols from a premise about the non-compositionality of that language qua system-of-symbols-in-use.

This takes us to our next question: does Fodor think that English (qua system-of-symbols) is semantically compositional? Unfortunately, the question isn't easy to answer based on the available evidence. On the one hand, there are many places where Fodor clearly thinks that English is compositional. Here are just a few:

[C] . . . it is part and parcel of the compositionality of English that the symbol 'dogs' is complex, that it has the symbols 'dog' and 's' as constituents, and that its meaning (viz. dogs) is inherited from the meanings of 'dog' (viz. dog) and 's' (viz. plural). (Fodor & Lepore, 2002, p. 1)

[D] So non-negotiable is compositionality that I'm not even going to tell you what it is. Suffice it that the things that the expression (mutatis mutandis the concept) 'brown cow' applies to are exactly the things to which expressions 'brown' and 'cow' apply. Likewise, the things that 'brown cow in New Jersey' applies to are exactly the ones that 'brown', 'cow', and 'in New Jersey' apply to. Compositionality is the name of whatever exactly it is that requires this kind of thing to be true in the general case. (Fodor, 2001, p. 6)

[E] . . . it is part and parcel of the compositionality of English that the symbol 'John jumps' is complex, that it has among its constituents the symbols 'John' and 'jumps,' and that its meaning (viz. John jumps) is inherited from the meanings of these subsentential parts. (Fodor & Lepore, 2002, p. 1)
Furthermore, it is well-known that Fodor defends the compositionality of language and thought on abductive grounds: "human thought and human language are, invariably, productive and systematic: and the only way they could be is by being compositional" (Fodor, 2001, p. 6). With respect to English, he is even more explicit:

[F] For present purposes, we can collapse systematicity and productivity together, and make the point like this: There are . . . indefinitely many things that English allows you to say about pigeons and the weather in Manhattan . . . English being compositional is what explains why so many of the sentences that you can use to say things about pigeons and the weather in Manhattan, share some or all of their vocabulary. (Fodor, 2001, p. 7)

It would appear, then, that Fodor holds that English is semantically compositional. Appearances can be misleading, however. For, on the other hand, there are other places where Fodor clearly thinks English isn't compositional.4 For example, in Concepts, he says that "English has no semantics", and he draws that conclusion from his thesis that "English inherits its semantics from the contents of the beliefs, desires, intentions, and so forth that it's used to express, as per Grice and his followers" (Fodor, 1998, p. 9). Clearly, if English has no semantics, then it has no compositional semantics either. Of course, Fodor may just be being polemical when he says that "English has no semantics". But, by the same token, he also says that "if language is compositional then how a sentence is put together must be very explicit about how the corresponding thought is put together" (Fodor, 2001, p. 12). He then goes on to argue that natural language is pervasively inexplicit about content and draws the proper inference that "[a]s a matter of empirical fact, language is pretty clearly not compositional" (Fodor, 2001, p. 11).

Which of these two conflicting views about language, then, does Fodor actually hold? I take him to hold the second, and I believe this for two reasons.
4 In private correspondence, Fodor states that he never argued in Fodor (2001) that English was not compositional, but only that it might not be compositional given certain empirical data, and also that we should not prejudge the issue one way or the other. However, as I shall argue, Fodor's modest take on his own position is not well supported by the relevant texts. Moreover, Fodor's moderate reading of his thesis is too weak to sustain his argument for his claim that thought rather than language has content in the first instance, which he presents in Fodor (2001).
First, the claim that natural language is not compositional serves as a premise in his argument for the priority of thought-content (Fodor, 2001). Very briefly, his argument is this: at least thought or language must be compositional; if at most one of them is compositional, then that's the one that has content in the first instance; language is not compositional but thought is; ergo, thought has content in the first instance. One can only presume that Fodor believes what his premises say. Second, the claim that language is not compositional follows from his Gricean-inspired view that natural language sentences get their meanings from Mentalese, which he says is compositional and is distinct from any learnable natural language. According to him, Mentalese expressions can be mapped into English expressions without there being any systematic relation between words and sentences of the sort that natural language semanticists describe. In that case, thoughts can't be mapped into language compositionally (Fodor, 2001, p. 11). If English gets its semantics from thought but thoughts can't be compositionally assigned to English sentences, then it follows that English isn't semantically compositional. Fodor's argument for that claim is important and deserving of careful scrutiny. I will now turn to it.
3 Fodor's Inexplicitness Argument
Fodor presents his argument for the non-compositionality of language in the following passage: [G] . . . it couldn’t be true that language is strikingly elliptical and inexplicit about the thoughts that it expresses if language were compositional in anything like strict detail. For, if it were (and assuming that the content of a sentence is, or is the same as, the content of the corresponding thought) the structure of a sentence would indeed have to be explicit about the structure of the thought it expresses; in particular, the constituents of the sentence would have to correspond in a straightforward way to the thought’s constituents. For, if there are constituents of your thought that don’t correspond to constituents of the sentence you utter, then since compositionality requires that the content of a thought contain all of the content of its constituents, it must be that there was something in the thought that the sentence left out. So you’ve said less than you intended. And, likewise, if there’s some constituent of the sentence that doesn’t correspond to a constituent of the thought, then it must be that there’s something in the content of the sentence that isn’t in the content of the thought. So you’ve said more than you intended. (Fodor, 2001, pp. 11–12; italics added.) Call Fodor’s argument, “The Inexplicitness Argument”. Notice, first, that Fodor gives two necessary conditions on semantic explicitness:
Constraint [1]. Each syntactic constituent of a meaningful (non-idiomatic) complex symbol corresponds to a constituent of the symbol's semantic content.

Constraint [2]. Each constituent of the semantic content of a meaningful (non-idiomatic) complex symbol corresponds to some syntactic constituent of the symbol.

The rest of the argument can now be formulated as follows:

[P1] If English has a compositional semantics, then Constraints [1] and [2] hold for each complex (non-idiomatic) English symbol that has semantic content.

[P2] If Constraints [1] and [2] hold for each complex (non-idiomatic) English symbol that has semantic content, and if the content of a sentence is the thought that it is used to express, then all (non-idiomatic) English sentences (after disambiguation and reference-fixing) are explicit about the thoughts that they are used to express.

[P3] Few, if any, non-idiomatic English sentences are explicit about the thoughts they express (even after disambiguation and reference-fixing) – indeed, many if not most are inexplicit.5

Therefore,

[C1] Either English doesn't have a compositional semantics or the content of a sentence is not the thought it is used to express. (from [P1], [P2], and [P3])

[P4] The content of a sentence is the thought that it is used to express.

Therefore,

[C2] English doesn't have a compositional semantics. (from [C1] and [P4])
5 In other words, many form-content pairings of English have unarticulated constituents. Premise [P3] is supported by the examples given below only if one assumes that a condition on P's being the content semantically expressed by a sentence/utterance is that P be a complete proposition (as opposed to a propositional radical) with fully determinate, context-invariant truth-conditions. Many linguists and philosophers who think language is highly context-sensitive will reject that assumption.
Fodor justifies premise [P1] indirectly, on the grounds that "compositionality requires that the content of a thought contain all of the content of its constituents" (Fodor, 2001, p. 11). By parity, if English is compositional, then the content of an English sentence must contain all of the content of its constituents; in which case, Constraints [1] and [2] supposedly follow on that assumption. Premise [P2] is uncontroversial, while [P3] is well supported. (I will discuss Fodor's examples for [P3] in more detail later.) Finally, [P4] is said to be true because "the function of sentences is to primarily express thoughts" (Fodor, 2001, p. 11). [P4] also figures in Fodor's account of our generative capacity to understand and produce novel sentences, assuming that thoughts are semantically compositional and have concepts as their constituents, which in turn are the meanings of the words we learn (cf. Fodor, 1990a).

In my view, premises [P1] and [P4] are false. In the next section, I will try to explain why Compositionality does not require Constraints [1] or [2]. I will also try to show that the content that a sentence semantically expresses (if any) is not always the thought that a speaker communicates in uttering the sentence.
4 Objections
Objections to premise [1]

Consider sentence [1]:

[1] Some boy fell

According to Fodor, if the meaning-form pairing of [1] is compositionally determined, then the semantic content of [1] is built solely from the semantic values of its constituent symbols ('some', 'boy', and 'fell') and the way those syntactic parts are put together. We may represent its structure as [1a]:

[1a] [S [NP [DET some] [N boy]] [VP [V fell]]]

According to [P1], if [1] is semantically compositional, then the way its parts are put together must mirror the way the semantic constituents of [1]'s content are put together. Given [1a], the quantificational noun phrase (QNP) 'some boy' and the verb 'fell' are the immediate constituents of [1]. So, [1] inherits its content exclusively from the semantic values of those parts plus its syntactic structure, if [1] is compositional. Here, then, is a first pass at the semantic structure of [1]. We begin by treating the nominal boy and the verb fell as predicates and by interpreting them (relative to a domain of discourse) in terms of [S1] and [S2], respectively:

[S1] [[FELL]] = {x : x fell}

[S2] [[BOY]] = {x : x is a boy}
To mimic the structure of the NP some boy as represented in [1a], we want the quantificational determiner some to combine with a predicate, not with two predicates at a time.6 On the standard generalized quantifier view, quantificational determiners are expressions that combine with predicates to yield predicates of predicates. Accordingly, the semantic rule for the NP some boy is [S3]:

[S3] [[SOME(BOY)]] = {X : the intersection of X and [[BOY]] is non-empty}7

The semantic value of [1], which will be a truth-value, is fixed by reaching down to the semantic values of its immediate constituents, the NP some boy and the VP fell, and by applying the following semantic clause:

[S4] [[(SOME(BOY))(FELL)]] = the True iff the intersection of [[SOME(BOY)]] and [[FELL]] is non-empty; otherwise, [[(SOME(BOY))(FELL)]] = the False

Using [1a] as a guide, we can explain how the semantic value of some boy combines with the semantic value of fell to yield the semantic content of sentence [1]. Suppose that the relevant domain of discourse includes Marc, who is a boy who once fell. Marc is, then, a member of the set of things (from the domain in question) that fell and is also a member of the set of boys (from the same domain). Each set is the semantic value of FELL and BOY, respectively. Thus, by substitution of equals for equals in [S1] and [S2], Marc is a member of [[FELL]] and of [[BOY]]. [S3] says, in effect, that SOME(BOY) denotes a set of predicates, namely, the set of predicates that at least one boy or other falls under. So, by [S3], and by substitution of equals for equals, Marc is in the extension of SOME(BOY) by virtue of being a member of a set that intersects with [[BOY]] of which he is also a member, namely, [[FELL]]. Since Marc is a member of both [[SOME(BOY)]] and [[FELL]], the intersection of the two extensions is non-empty. Thus, by [S4], the semantic value of sentence [1] is the True.
6 This will not work for QNPs when they occur as the grammatical object of a sentence. For, in those cases, there are no one-place predicates for them to take as arguments from their base positions, e.g., 'Mary kissed some boy'. And yet 'some boy' has the same denotational meaning, if it has any, in the sentences 'Mary kissed some boy' and 'Some boy fell'. Similarly, the standard function-application interpretation of QNPs fails to account for scopal ambiguities in sentences that contain QNPs as grammatical subjects and objects, e.g., 'Some boy kissed every girl'. In Fox (forthcoming), Danny Fox argues that, in such cases, one can derive an argument for a QNP provided that the syntax licenses its covert movement from its base position to one in which it has a sister of the appropriate semantic type. For purposes of this paper, I shall assume that these problems can be solved in a way that is consistent with a compositional semantics for English.

7 To avoid any possible misunderstanding, let me just note that [S3] does not treat 'some boy' as a primitive. [S3] is instead an instance of a general semantic rule for interpreting all expressions of the form 'Some N', for any English nominal N. Such a general rule is needed since recursion can occur within common noun phrases, e.g., 'boy who likes some girl who . . . ', but we cannot have an infinite number of semantic rules for interpreting infinitely many such constructions, given that English is learnable. I don't mean to suggest, by my use of [S3], that each N gets its own semantic rule for forming an indefinite NP.
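For concreteness, this first-pass, set-theoretic treatment can be prototyped in a few lines of Haskell. This is a hedged sketch: the toy domain and the extensions assigned to 'boy' and 'fell' are invented, and the clauses implement what [S3] and [S4] jointly require, namely that the sentence comes out true just in case the extensions of 'boy' and 'fell' intersect:

```haskell
import Data.List (intersect)

type Entity = String

-- Invented extensions, playing the role of [S1] and [S2].
boy, fell :: [Entity]
boy  = ["marc", "tim"]
fell = ["marc", "sue"]

-- [S3]: SOME(BOY) denotes the family of sets whose intersection
-- with [[BOY]] is non-empty; we represent it by its membership test.
someBoy :: [Entity] -> Bool
someBoy x = not (null (x `intersect` boy))

-- [S4]: sentence [1] is true iff SOME(BOY) holds of [[FELL]].
sentence1 :: Bool
sentence1 = someBoy fell

main :: IO ()
main = print sentence1  -- True, since Marc is a boy who fell
```

Note that the determiner contributes only through the rule that builds someBoy; no value is assigned to it on its own, which is exactly the feature discussed next.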
Although our example is very simple, it does have an interesting feature: no semantic value is directly assigned to the determiner SOME by [S1], [S2], or [S3], apart from its combinatorial role with a predicate. This is a case, then, of a sentence's having a primitive syntactic part that doesn't correspond on its own to any semantic constituent relative to a semantic interpretation. And yet, relative to [S1], [S2], and [S3], [1]'s syntactic form, as represented in [1a], is paired with a compositionally determined meaning, given the meanings of its immediate constituents and its syntactic structure. Thus, a sentence can be semantically compositional in Fodor's sense, relative to a semantic interpretation of its immediate constituents, even though it fails to meet Constraint [1]. Thus, premise [P1] is false.

One could reply that compositionality requires that the semantic content of a complex symbol be exclusively determined by the semantic values of all of its primitive constituents and its syntactic structure. Semantic clauses [S1], [S2], [S3], and [S4] are insufficient, then, for establishing the compositionality of [1], since none of them assigns a semantic value directly to the determiner alone in [1], which is a primitive constituent of [1]. Consequently, so goes this objection, the above example fails to show that a natural language sentence could be compositional in Fodor's sense and yet fail to meet Constraint [1].

Here, then, is our second pass at the semantic structure of [1]. Suppose we think of BOY and FELL as denoting functions (rather than sets) from objects to truth-values, that is, as being of the semantic type ⟨e,t⟩. In particular, [[BOY]] is the function that returns the truth-value true for a given object-input if and only if the object is a boy; otherwise, it returns the truth-value false.8 Similarly, [[FELL]] is the function that returns the truth-value true for a given object-input if and only if the object fell; otherwise, it returns the truth-value false. We can think of SOME as denoting a function that takes a function from objects to truth-values and returns a function from functions from objects to truth-values to truth-values. In other words, SOME is of the semantic type ⟨⟨e,t⟩,⟨⟨e,t⟩,t⟩⟩. Then SOME(BOY) is of the type ⟨⟨e,t⟩,t⟩: it denotes the function that yields the True for a function-input from objects to truth-values just in case that input returns the True for at least one object that is a boy; otherwise, [[SOME(BOY)]] returns the False. Given these functional interpretations of the constituents of [1], it is straightforward that sentence [1] has the semantic type ⟨t⟩: [[SOME(BOY)]]([[FELL]]) = true just in case at least one boy fell; otherwise, [[(SOME(BOY))(FELL)]] = false. So, on this reading, [1] is semantically compositional in Fodor's sense, as determined by the denotations of its primitive constituents and their syntactic arrangement.
8 '[[x]]' denotes the semantic value of expression x.
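The second pass can be transcribed almost directly into a typed functional language. In the following hedged Haskell sketch (domain and lexicon again invented), Haskell types stand in for the semantic types ⟨e,t⟩, ⟨⟨e,t⟩,⟨⟨e,t⟩,t⟩⟩ and ⟨t⟩, and the determiner, unlike on the first pass, receives a denotation of its own:

```haskell
type E  = String     -- type e: entities
type T  = Bool       -- type t: truth-values
type ET = E -> T     -- type <e,t>: characteristic functions

-- A finite toy domain (invented) over which SOME quantifies.
domain :: [E]
domain = ["marc", "tim", "sue"]

-- [[BOY]] and [[FELL]] as functions from objects to truth-values.
boy, fell :: ET
boy  x = x `elem` ["marc", "tim"]
fell x = x `elem` ["marc", "sue"]

-- [[SOME]], of type <<e,t>, <<e,t>,t>>: given a restrictor, it
-- returns the generalized quantifier that checks whether some
-- entity satisfies both restrictor and scope.
some :: ET -> (ET -> T)
some restrictor scope = any (\x -> restrictor x && scope x) domain

-- Sentence [1]: function application mirrors the structure in [1a].
sentence1 :: T
sentence1 = some boy fell

main :: IO ()
main = print sentence1  -- True: at least one boy fell
```

On this pass, every primitive constituent of [1], the determiner included, has a semantic value of its own, and the value of the whole is obtained purely by function application along the syntactic tree.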
Things get a little complicated for Fodor's view about explicitness, however, when we consider QNPs whose constituent nominal is an intersective ADJ+N combination, as in [2]:

[2] Some tall boy fell

Since we want our semantic interpretation of SOME to remain the same, it is standard to treat intersective adjectives in their attributive occurrences as being of the semantic type ⟨⟨e,t⟩,⟨e,t⟩⟩, and to treat intersective ADJ+N compounds as nominals. The idea would be that TALL (in attributive position) is of the type ⟨⟨e,t⟩,⟨e,t⟩⟩ and BOY (like FELL) is of the type ⟨e,t⟩; in which case, TALL BOY is also of the type ⟨e,t⟩. On this picture, SOME(TALL BOY) has the same semantic type as SOME(BOY), namely ⟨⟨e,t⟩,t⟩. Now compare [2] with [3]:

[3] Some boy who fell is tall

[2] and [3] are semantically equivalent, since they entail each other. Notice, though, that TALL undergoes a type-shift in [3], since it occurs there as a predicate rather than as a predicate modifier, as in [2]. In [3], it is not of the type ⟨⟨e,t⟩,⟨e,t⟩⟩; rather, it is of the type ⟨e,t⟩, given that it is the argument for the QNP SOME(BOY WHO FELL), whose semantic type is ⟨⟨e,t⟩,t⟩. (For similar reasons, FELL undergoes a type-shift in [2] and [3].) Now all of this is well known, but the point I wish to make here is that evidence of semantic type-shifting undermines Fodor's claim that "if language is compositional then how a sentence is put together must be very explicit about how the corresponding thought is put together" (Fodor, 2001, p. 12; my emphasis). The syntactic operations defined over the constituents of sentences [2] and [3] are inexplicit about the semantic structure of these sentences. On the one hand, their syntactic structure permits semantic type-shifting in the case of intersective adjectives. On the other hand, their syntax is also compatible with the idea that TALL denotes a function from objects to truth-values in both its attributive and predicative occurrences. On that reading, if one is to semantically interpret TALL BOY on the basis of the semantic interpretations of TALL and BOY, respectively, one will have to add a semantic rule that takes in, as an
argument, a pair of objects-to-truth-values functions, and that outputs an object-to-truth-value function. So, the syntax of sentences [2] and [3] does not settle the semantic issue, at least not in any explicit way. But, even so, both sentences are semantically compositional in Fodor's sense either way. One can simply apply the corresponding function-application rule at each syntactic level, from the terminal nodes of a sentence's syntactic tree to its trunk, to arrive at the sentence's semantic content.

Furthermore, the way we "read off" a sentence's content from its parts may be very different from the way those parts are put together. For example, suppose that the way we understand sentences like [1] is by implicitly employing a metalinguistic rule like [R]:

[R] If a sentence is of the form [S [NP [DET some] [N F]] [VP [V G]]], then it means that there is at least one thing which satisfies both 'F' and 'G'.

Although [R] assigns no semantic values to the constituents 'some', 'some boy', and 'fell', they all affect the meaning of [1]. By [R], we get: 'Some boy fell' means that there is at least one individual of whom 'boy' and 'fell' are true. We can then derive the content of [1], namely, that at least one boy fell, by using the standard semantic base clauses, [S1] and [S2], for 'fell' and 'boy', respectively. Nothing more is needed for us to fix the semantic content of [1]. Maybe this is the way we understand sentences like [1]. But if it is, it probably doesn't correspond to the way the parts of [1] are put together. Moreover, a language that incorporated a rule like [R] for sentences like [1] could be semantically compositional, even though Constraint [1] fails on the metalinguistic reading of indefinites.9
9 My argument doesn't depend on there being a correct theory about indefinites. Constraints [1] and [2] are supposed to hold necessarily for any possible compositional system of representation. If so, then my argument needs only the plausible assumption that there could be a correct compositional semantics for sentences like [1] that treated certain constituents as syncategoremata at the level of logical form. For an interesting discussion of Russell's view on this matter, see Barber (2005) and Botterell (2005).

10 I am grateful to the anonymous referee who raised this criticism.
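A rule in the spirit of [R] can likewise be prototyped. In this hedged Haskell sketch (toy syntax and lexicon invented), the clause for indefinite sentences matches the whole form [S [NP [DET some] [N F]] [VP [V G]]] at once; no semantic value is assigned to the determiner or to the NP and VP as such, yet the output is fixed by the parts and their arrangement:

```haskell
-- Toy surface syntax for sentences of the form [Det N V].
data Sentence = Simple String String String  -- determiner, noun, verb

type Entity = String

domain :: [Entity]
domain = ["marc", "tim", "sue"]

-- Satisfaction for simple predicates, as in the base clauses
-- [S1] and [S2]; the lexicon is invented.
satisfies :: Entity -> String -> Bool
satisfies x "boy"  = x `elem` ["marc", "tim"]
satisfies x "fell" = x `elem` ["marc", "sue"]
satisfies _ _      = False

-- A metalinguistic rule in the spirit of [R]: the whole form is
-- interpreted in one step, syncategorematically; neither 'some'
-- nor the NP and VP receive semantic values of their own.
interpret :: Sentence -> Bool
interpret (Simple "some" n v) =
  any (\x -> satisfies x n && satisfies x v) domain
interpret _ = False

main :: IO ()
main = print (interpret (Simple "some" "boy" "fell"))  -- True
```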
One might object that, even though one could always construct a phonological representation of a sentence in which some bit of the sentence is treated syncategorematically, it hardly follows that the represented item is not a constituent of the sentence and thus is not open to semantic compositional processing. Any syncategorematic treatment merely masks semantic constituency.10 Although this objection is correct, it misses the mark. On a strong reading of Constraint [1], syncategorematic elements pose a problem for Fodor's view insofar as we recognize them as syntactic constituents of surface form. The existence of a possible syncategorematic semantics for QNPs in English undermines his strong claims about the necessity of compositionality in any productive and systematic representational system. Of course, one could always weaken Constraint [1] so that only contentful syntactic constituents must correspond to semantic constituents. However, such a proposal runs contrary to Fodor's view, since he proposes full explicitness conditions in the interest of showing precisely that English doesn't satisfy them.

In fact, many complex English expressions violate both Constraints [1] and [2] even though they are arguably compositional.11 For example, expletive uses of 'it', as in 'It is true that Bush was elected to office in 2000', and 'there', as in 'There appears to be a cat hiding behind the couch', are syncategorematic, since they don't designate anything from any semantic category. In both cases, nothing in the content of the thoughts expressed by those sentences (relative to a context) corresponds to 'it' and 'there', respectively. Prepositions are another example. For instance, the 'to' in 'Mary wants to leave' is purely a marker of infinitival form and doesn't correspond to any semantic constituent in the content of the thought expressed by the sentence (relative to a context). The 'of' of 'Hugh's knowledge of Greek history' is only there because nouns don't take NP complements in English. And yet the content expressed by 'Hugh's knowledge of Greek history' is inherited from the contents of its immediate constituents, 'Hugh' and 'knowledge of Greek history', under the genitive mode of combination. It cannot be, then, an a priori condition on any possible compositional system of representation that it lack syncategorematic elements.12 If that is right, then we have another good reason for rejecting [P1].

Suppose, however, that the elements just mentioned are not constituents of the sentence at the level of representation relevant to semantic interpretation, e.g., LF. Perhaps the expletive 'there', infinitival prepositions, the genitive 'of', etc., are only inserted in the process of deriving surface forms. The LF structures would be ones in which these elements don't exist; in which case, they cannot pose a problem for compositionality. Thus, even if there are semantically empty but phonologically and syntactically realized elements, as long as
11 I want to thank Barbara Abbott and Heidi Harley for the examples used in this paragraph.

12 For all we know, Mentalese representations might have syncategorematic elements. For instance, I could think the thought IT IS BETTER TO GIVE THAN TO RECEIVE. If that thought has any syncategorematic elements, then it is not compositional in "strict detail" by Fodor's standards. So, either Fodor's "explicitness-of-content" constraint on compositional systems of representation is too restrictive or else it must be unpacked in a way that does not advert to Constraint [1] and Constraint [2]. Either way, The Inexplicitness Argument fails if some thoughts are Mentalese tokens that have syncategorematic elements and a semantically compositional structure.
there is some level of linguistic representation at which a sentence has a compositional semantics, that may be sufficient for Fodor's explicitness constraint. I will grant the objection. Still, Fodor has the burden of specifying the level at which explicitness must obtain. Although he does not say it in so many words, his view seems to be that explicitness must obtain at the sentence's surface form. According to him, natural language is not semantically compositional precisely because many of its sentences have semantic constituents to which none of their surface-form constituents – lexical, phrasal, or clausal – corresponds, and vice versa. If that really is his view, then Fodor would have to say (mistakenly) that the complex NP 'Hugh's knowledge of Greek history' is not compositional because it contains a semantically empty primitive constituent in its surface form. On the other hand, if Fodor specifies LF as the level of representation at which explicitness must obtain, then (contrary to what he argues) his examples of allegedly inexplicit semantic content may meet his explicitness constraints after all. Suppose that the only way that context can affect the truth-conditions of an utterance of a sentence is through the sentence's rich covert syntactic structure. Then, even though no locational variable is marked in the surface form of a sentence like 'It is 3 o'clock', that doesn't mean that no free locational variable is assigned to it at LF, one whose semantic value is contextually supplied. If the syntax of English does assign covert free variables to sentences at LF, then Fodor hasn't shown that his examples of inexplicit content are cases of non-explicitness (at the level of LF) – that will be the main point of the next section.

In presenting The Inexplicitness Argument, Fodor moves from "evidence of ellipsis" to "evidence of an absence of isomorphism" and then to "evidence of non-compositionality". If my arguments in this section are successful, then they block the second inference. In the next section, I will argue that Fodor's first inference can also be blocked.13

Ellipses are inconclusive evidence for non-compositionality

In his defense of premise [P3], Fodor gives the following example of linguistic inexplicitness:

[H] . . . you ask me what's the time, and I say 'it's three o'clock'. This reply is inexplicit in two different ways. For one thing, the sentence is syntactically inexplicit; presumably what I've uttered is an abbreviated form of something like 'it's three o'clock here and now'.
13 As passages [G] and [H] show, Fodor assumes that a speaker's communicative intentions are part of the context in the sense relevant to semantics, namely, they determine the proposition or truth-conditional content that is semantically expressed by an utterance of a sentence in a given context. This assumption is controversial. For a criticism of it, see Bach (1994b).
Second, although the thought I intended to convey is that it's three o'clock in the afternoon, Grice requires me to leave some of that out. That's because, even though you don't know exactly what time it is, you do know what time it is plus or minus twelve hours; and I know that you know that; and you know that I know you know that; and so on . . . . What's obvious in the shared intentional context is generally not something that one bothers to say, even if it is part of what one intends to communicate. (Fodor, 2001, p. 12)14

In other words, some sentences abbreviate the contents they express, either because they are syntactic ellipses or because the background assumptions that speakers share with their audiences about the context of utterance make it unnecessary for speakers to be explicit about the propositions they mean to communicate. In either case, Fodor thinks that the form-meaning pairing has unarticulated constituents. Evidence of ellipsis, whether syntactic or pragmatic, is then evidence of a lack of isomorphism between form and content and thus, he thinks, it is also evidence of non-compositionality.

Assume for the sake of argument that, in Fodor's example, the sentence he uttered – 'It is three o'clock' – in response to the question 'What time is it?' is a syntactic ellipsis. What should we conclude from that? Well, we should conclude that the surface form of the sentence used is inexplicit about the complete content of the thought that Fodor expressed. But we should not infer that the sentence lacks a form-content isomorphism. To see why, consider the view of ellipsis developed in Morgan (1973). According to Morgan's theory, Fodor's sentence literally contains 'here' and 'now', phonological appearances notwithstanding, whose semantic values are fixed relative to the context of utterance. (In fact, Fodor mentions this view in presenting his example.) The syntactic structure of 'It is three o'clock' is just the structure of 'It is three o'clock here and now'. Some constituents of 'It is three o'clock [here and now]' are therefore pronounced while others are not. So, on this view, all of the constituents of the sentence 'It is three o'clock' (ignoring the expletive 'it') correspond to the semantic constituents of the thought that Fodor expresses, namely, the thought that it is three o'clock in the afternoon (relative to the time and place of Fodor's utterance). Conversely, on Morgan's view of ellipsis, all the semantic constituents of Fodor's expressed thought correspond, in the context in question, to constituents (pronounced or unpronounced) of the sentence uttered.
14 Linguists generally take the present tense to be a syntactic constituent of the sentence 'It is 3 o'clock'. The meaning of 'now' is thought to be encoded in the semantics of the present tense and to compose with the meaning of the verb phrase. On this view, then, 'It is 3 o'clock' needn't be seen as a case of a form-meaning pairing that has an unarticulated constituent.
Thus, on Morgan's theory of syntactic ellipsis, Fodor's example isn't an example of a sentence that fails to meet Constraints [1] and [2].

Consider now the theory of syntactic ellipsis developed in Williams (1977). On Williams's account, syntactic ellipses do not contain any unpronounced ordinary linguistic expressions, such as 'here' and 'now'; rather, they contain special covert anaphors whose semantic values are supplied by prior linguistic material and by context; cf. Barton (1990). No ordinary "elided" linguistic material is present in the syntax of the elliptical sentence, but there is extra structure there.15 If we apply Williams's theory to Fodor's example, then 'It is three o'clock' has empty anaphors in its constituent structure. Their semantic values are contextually fixed, in part, by some prior linguistic material provided by the context of utterance – in this case, an utterance of the interrogative sentence 'What is the time?'. On Williams's view, both sentences contain two implicit null anaphors – one to mark the time of utterance and the other to mark the location of the utterance (or not even null, in the case of the temporal anaphor). Context supplies the values for both in each case. In the case of Fodor's utterance, it supplies the property of being three o'clock at the moment Fodor uttered 'It is three o'clock' in response to his interrogator's question. It also supplies the property of being three o'clock at the place where Fodor was located when he uttered 'It is three o'clock' in response to that (same) question. The main point is that, on Williams's "special syntax" theory of syntactic ellipsis, Fodor's sentence (or his utterance of the sentence) is semantically compositional, since its content is completely fixed, given its structure, once the contents of the covert syntactic elements are contextually fixed.16

Here is another example of the same idea. In the appropriate context, I can utter 'a boy fell' to convey the domain-restricted quantificational proposition that, among the children who played tag in my local school playground yesterday, one was a boy who fell. Fodor is right to point out that the sentence-type 'a boy fell' does not explicitly encode the structure of the more restrictive propositions that speakers conventionally express by their utterances of the sentence.
15 In Sag and Hankamer (1977), Sag and Hankamer argue that VP-ellipses result from the deletion of an ordinary VP under identity with some prior linguistic material. Hearers are able to reconstruct the elided material of a VP-ellipsis, given the prior linguistic material, without much difficulty. There must, then, be something in the syntax of a VP-ellipsis that calls out for completion under syntactic processing. If so, then Fodor's Constraint [2] need not be violated on the Sag–Hankamer account, since the semantic constituents of a VP-ellipsis will correspond to the elided syntactic constituents that are recovered under syntactic processing. This analysis applies also to gapping, sluicing, and other kinds of recovered null elements, e.g., 'Which book did John shelve before reading?'

16 See Szabó (2001).
After all, no one would say that it is a feature of English that the sentence I used means in English the proposition I meant. But, once again, that doesn't show that the sentence used is not semantically compositional once it is relativized to a context for semantic interpretation. One could adopt the variable-rich semantic view, advocated by Jason Stanley and Zoltán Szabó, according to which the nominal 'boy' has its usual denotation at the lexical level (the set of boys) but has a covert contextual variable when it occurs in sentences; cf. Stanley and Szabó (2000).17 The idea is that 'boy' in 'a boy fell' has (at LF) a covert variable, f(i), where the value of i is some object provided by the context in which the sentence is used, and the value of f is a contextually supplied function that maps a set of objects into quantifier domains. In my example, the f-function maps the set of boys into the domain of children who played tag at my local school playground yesterday, since that is the contextually relevant group I meant when I used 'a boy fell'. The function will yield, for that argument, the intersection of those two sets. The semantic value of 'a boy' is the result of applying the f-function, as supplied by the context, to some individual i or other, as supplied by the context, who is a member of the intersection set. The restrictive proposition expressed by the sentence relative to that context and function assignment will be true (relative to that context and assignment) if the boy in question fell, and will be false if no member of that intersection set fell. Notice that, on the Stanley–Szabó view, the proposition that the sentence semantically expresses relative to the context is fixed solely by the semantic values of its primitive parts (as supplied by the context) and by its rich covert structure.18 Different utterances of the same sentence will express different propositions, from context to context. But the propositions expressed share the same form and compositional structure.

The general moral is this: absent an empirical theory of syntactic ellipsis, we cannot infer that a sentential form-content pairing must have unarticulated constituents from the fact that the form is a syntactic ellipsis – the cogency of that inference will depend on what the true story of ellipsis is.19
17 Bach, Cappelen, and Lepore deny that nominals have a rich constituent structure, such as a covert null variable that ranges over a domain-restricted class of entities. Jason Stanley argues that they do; cf. Stanley (2000), Stanley (2002a), and Stanley (2002b). For replies, see Bach (2000), Cappelen and Lepore (2005), Elugardo and Stainton (2004), and Neale (2000). As I argue in the third section of this paper, even if Stanley is right, that won't help Fodor's case.

18 An alternative approach is to supply the contextual information to the QNP rather than to its head nominal; cf. Pelletier (2003).

19 In Dalrymple (2005), Mary Dalrymple argues that syntactic ellipses have no hidden "special" anaphors or ordinary elided linguistic material. If she is right, then all syntactic ellipses have unarticulated constituents or else there is no such thing as a syntactic ellipsis.
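To see how little the compositional machinery changes on the Stanley–Szabó picture, consider the following hedged Haskell sketch; the domain, the lexical extensions, and the modelling of the contextual parameters f and i as a bare restriction set are invented stand-ins for the real proposal:

```haskell
type Entity = String

-- Context is modelled, crudely, as the set of entities that the
-- contextually supplied f(i) carves out (here: yesterday's tag
-- players in the playground example).
newtype Context = Context { restriction :: [Entity] }

-- Invented lexical extensions.
boys, fallers :: [Entity]
boys    = ["marc", "tim", "ben"]
fallers = ["marc", "sue"]

-- 'boy' relative to a context: the lexical extension intersected
-- with the value of its covert domain variable.
boyIn :: Context -> [Entity]
boyIn c = [x | x <- boys, x `elem` restriction c]

-- 'A boy fell', interpreted by the same compositional rule in
-- every context; only the covert variable's value differs.
aBoyFell :: Context -> Bool
aBoyFell c = any (`elem` fallers) (boyIn c)

main :: IO ()
main = print (aBoyFell (Context ["marc", "tim", "sue"]))  -- True
```

Different contexts yield different propositions, but the form and the compositional rule stay fixed, which is the point made in the text.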
Objections to premise [4]

I now want to discuss Fodor's claim that the semantic content of a sentence is the thought that it is used to express. But, first, I will introduce some important technical notions that will help facilitate the discussion. First, let us call what a speaker asserts/claims/states in the utterance of a sentence, relative to a context of use, the (literal) assertion-content of the utterance. Second, let us distinguish the assertion-content of an utterance from the semantic content of the sentence used. The latter is the product of two factors: the type-meaning that a sentence encodes in the language independently of any context, and the referential values that are assigned to the context-sensitive elements of the sentence (if any) after disambiguation. I shall assume that the assertion-content of an utterance is determined, in part, by the semantic content of the sentence used and by the pragmatic features of the utterance.

Now, according to Fodor, a sentence may be semantically inexplicit about what a speaker asserts, and he takes that to show, to some degree, the non-compositionality of language. He has in mind cases in which semantic content underdetermines assertion-content.20 This doesn't show, however, that the semantic content of the sentence used is not compositionally fixed. On the contrary, the examples defeat his claim that the semantic content of a sentence is the thought that it conventionally expresses. The reason is that the semantic content of a sentence and the assertion-content of an utterance can come apart. If this is right, then premise [P4] is false.

Consider an example from Bach (1999). Suppose, for the sake of argument, that sentence [4] contains no hidden variables whose values are filled in by context:

[4] I am ready. [ready for what? ready to do what? ready to go where?]

A speaker – call him 'Jones' – utters [4] as he puts on his coat. Suppose that the thought that he then expressed – the assertion-content of his utterance – is [5]:

[5] I (Jones) am ready to go to the university campus

In this case, Jones said more than what is explicitly represented by the semantics and syntax of the sentence he used. We may suppose that Jones's audience are in the know and are aware of his communicative intentions. They "recover" or "reconstruct" [5] as the thing that Jones asserted by his utterance of [4].

20 See Bach (1994), Bach (1999), Carston (2002), Clapp (2003), Elugardo and Stainton (2001), Elugardo and Stainton (2004), Recanati (2002), Stainton (1994), and Stainton (1995).
they manage to do so on the basis of linguistic information about [4], contextual information about Jones’s utterance, and shared common knowledge and background beliefs. The underlined portion of [5] represents a constituent of Jones’s expressed thought that is not a semantic value of any syntactic constituent of the sentence Jones uttered. On the contrary, it is one that the hearer interprets Jones as meaning in context after the hearer fixes the reference of Jones’s utterance of ‘I’ in [4] and disambiguates by assigning an adverbial clause.

The Bach example counts as evidence against linguistic compositionality only if one assumes that the semantic content of a sentence is simply the assertion-content of an utterance of the sentence. However, that assumption is dubious if certain “pragmatic determinants” accounts of assertion-content are correct. For instance, on Bach’s version, the type-meanings of the constituents of [4], together with its structure, compositionally yield (after reference-assignment and disambiguation) a proposition radical – a semantic “blueprint” for constructing a complete proposition – as the encoded semantic content of Jones’s sentence relative to Jones’s context of utterance.21 The semantic content of [4], relative to Jones’s context, lacks truth-conditions since it is an incomplete proposition. According to Bach, the encoded content is presented as input for pragmatic processing: the hearer “completes” what Jones said, as it were, in order to arrive at Jones’s expressed thought as Jones so intended, which is [5]. By contrast, [5] is a complete proposition that has truth-conditions (but see Cappelen and Lepore (2005) for some skeptical arguments to that claim). So, if Bach’s pragmatic account of assertion-content is correct, then premise [P4] is false – the semantic content of a sentence, relative to a context of use, is not always the thought expressed (the assertion-content) since the latter but not the former has non-semantic pragmatic determinants. Bach’s account of the determinants of assertion-content presupposes that natural language sentences are semantically compositional. Thus, if Bach is right, then cases in which a sentence is semantically inexplicit about the assertion-content of the speaker’s utterance are compatible with the view that natural language is semantically compositional.
21 Cappelen and Lepore (2005) argue that sentences like [4] have no covert, rich, syntactic structure. So they agree with Bach on that point. However, they reject Bach’s claim that [4] does not express, relative to a context, a complete proposition. On their view, the semantic content of [4] is simply the proposition that, necessarily, is true relative to a context C just in case the referent of the occurrence of ‘I’ in [4], relative to C, has the property of being ready (if there is such a property) in C. According to them, the semantic content of [4] is compositionally fixed, although what a speaker may have said in uttering [4] in C (and she may have said many things by her utterance in that context) is not compositionally fixed.
Let’s switch gears and assume that no non-truth-conditional account of the determinants of assertion-content can be correct. Suppose also that the assertion-content of a speaker’s utterance – the thought that she expresses – and the semantic content of the sentence uttered are one and the same (relative to the context of use). How, then, can we account for the fact that [4] is semantically unambiguous and yet the same speaker could use [4] to assert different propositions about herself in different contexts that are not truth-conditionally equivalent? One way – and it is one defended in Borg (2005) – is to endow [4] with a structure that contains an implicit, existentially quantified, purpose-clause that serves as an argument for ‘ready’. Context supplies the required arguments of the predicate and their semantic values. Relative to Jones’s context of utterance, [4] semantically encodes a complete proposition that is the content of Jones’s expressed thought, namely, that he is ready to go to the university campus. Said in another context, Jones expresses – and [4] semantically encodes relative to that context – the thought that he is ready to eat dinner. On this alternative view, the context-independent proposition that [4] expresses is represented in [6a] and [6b]:

[6a] I am ready to go somewhere (close/interesting/exciting . . . )
[6b] I am ready to do something (fun/interesting/worthwhile . . . )

So, in any context in which Jones uttered [4], he asserted something that is of one of those forms – context supplies the contextually relevant grammatical expression for the implicit clause. The key point is that, on this view, [4]’s semantic content is compositionally determined, relative to a context, by its constituent meanings and its rich syntax. Once context has determined the relevant semantic values of the context-sensitive primitives of [4], both covert and overt, the content of [4] is completely fixed given its structure, even though it semantically underdetermines what a speaker may conversationally implicate by her utterance of [4]. So, if Borg’s view is correct, such cases do not defeat the claim that sentences like ‘I am ready’ are semantically compositional even if there is no distinction between semantic content and assertion-content.

Finally, premise [P4] is subject to the following general objection.22 First, if a sentence that is uttered in context expresses a thought relative to that context, then it linguistically represents that thought (relative to that context). Given [P4], it then follows that the semantic content of a sentence, as fixed relative to a context of utterance, trivially matches the thought that the sentence linguistically represents relative to that context – I say “trivially matches” because, if
22 I owe this objection to the anonymous referee. Thomas Vinci also raised a similar objection in an earlier discussion.
[P4] is true, then it is one and the same content that is linguistically represented in a given context. Now it is a truism that one thing can represent another without having to represent every aspect of the latter – highway maps, for example, don’t represent potholes. Thus, it should be the default case that if a sentence expresses a thought (relative to a context of utterance), then the semantic content of the sentence will not match the thought that it represents – and so, Fodor is right when it comes to sentences qua symbols-in-use. We end up, then, with the following disjunction: either a sentence used in a context fails to linguistically represent any thought that it expresses in that context (if it expresses any), or it does not express any thoughts relative to the context, or its content is not the thought that it expresses relative to the context of utterance. The first two disjuncts are unacceptable for anyone like Fodor who thinks that language represents thought rather than the other way around. In that case, we should accept the third disjunct, which is tantamount to rejecting [P4].

To sum up: speakers often mean things that outstrip the meanings of the sentences they use, but that doesn’t show that language is non-compositional. On the one hand, these examples are ones in which the thought expressed or conveyed by the speaker is not the semantic content of the sentence because the former has pragmatic determinants whereas the latter does not; in which case, premise [P4] is false. On the other hand, if [P4] is true, then it is open for one to argue, on independent grounds, that these are cases in which the sentence used has a rich, covert, syntax, and thus the form-content pairing is compositionally fixed relative to a context of use. I will conclude my discussion of Fodor’s argument by considering two objections.
5 Rejoinders and Replies
The first objection is that, once we admit that the semantic values of the primitive constituents of a sentence – whether covert or overt – are context-dependent, then compositionality must be given up. According to Fodor, “context independence is a necessary condition for compositionality” (Fodor, 2000, p. 355). The idea is that if, in determining the content of a complex symbol, you have to appeal to a third factor – such as context – in addition to the meanings of a complex symbol’s primitive constituents and its syntactic structure, then you have violated both the spirit and the letter of Compositionality. However, that means that the semantic base clauses of a compositional semantics for a language, which define the meanings of its primitive expressions, cannot place any contextually relevant constraints on any meaning-assignments to any primitive symbol. By that line of reasoning, languages containing deictic constructions cannot be compositional, which is implausible.
A more powerful objection is this one:

[I] My point is that a perfectly unelliptical, unmetaphorical, undeictic sentence that is being used to express exactly the thought that it is conventionally used to express, often doesn’t express the thought that it would if the sentence were compositional. Either (the typical case; see just above [‘It is three o’clock’ – RE]) it vastly underdetermines the right thought; or the thought it determines when compositionally construed isn’t, in fact, the one that it conventionally expresses. (Fodor, 2001, p. 12; bracketed material added.)

In other words, contents can’t be compositionally assigned to sentences since what a compositional semantics for English would assign as the semantic contents of its sentences won’t be the thoughts that English speakers conventionally express by means of them. For example, English speakers do not express, by their conventional use of ‘The book is on the table’, the false Russellian general proposition that exactly one book and one table exist and that the former is on the latter. And yet, that is precisely what the sentence would mean if it were given a compositional semantics (assuming that Russell’s semantic analysis of sentences containing descriptions is correct). The standard Russellian compositional reading of the sentence will not deliver the right semantic content (where “the right semantic content” is supposed to be, for Fodor, the thought that sentences of that form are used to conventionally express). That last point is not unique to definite descriptions – it applies to all sorts of grammatical constructions:

[J] Now, definite descriptions aren’t freaks; to the contrary, the point they illustrate is really ubiquitous. If you read a sentence as though it were compositional, then the thought that it ought to be conventionally used to express turns out not to be the one that it is conventionally used to express. (Fodor, 2001, p. 13)

Fodor’s point may be put this way. Suppose that, as a matter of compositional semantics, S means in English that P, where S is any non-deictic, non-elliptical, non-metaphorical sentence. Then, P is the thought that S ought to be used to express in English since P is compositionally assigned to S solely in terms of S’s component meanings and S’s structure. Presumably, the thought that an English sentence ought to express is simply the thought that English speakers in fact conventionally use the sentence to express. However, since P is, by hypothesis, the content that a compositional semantics of English would assign to S, then, chances are that P won’t be the thought that speakers conventionally express by S – more than likely, S will conventionally express some other thought, one that isn’t wholly fixed by S’s syntax and S’s constituent meanings (and the rules for
combining the constituent meanings in order to derive the meaning of S). Since English speakers would violate no semantic norm if they were to routinely use S to express a thought other than P, S’s semantic meaning is not compositionally fixed. Therefore, if P is the content that a compositional semantics of English would assign to S, then more than likely P is not what S means in English.

Fodor assumes, then, that the following principle is true: S means that P in L if and only if it is a conventional norm of L that speakers of L use or should use S to say/express that P. But, when read from right to left, the principle is false. As H. P. Grice noted, we should distinguish what a sentence says in a language (given its component lexical meanings and syntactic structure) from what a speaker uses a sentence to say, cf. Grice (1975). Suppose that, as a matter of convention, speakers generally utter, ‘I broke a fingernail today’, in situations in which they intend to report that they broke their own fingernails (on the day of the utterance). So, when I utter ‘I broke a fingernail today’, I said or conveyed by my utterance that I broke my own fingernail. Still, the sentence does not mean that its speaker broke his or her own fingernail. After all, the sentence would be true rather than false if a speaker had used it to report truthfully that he had broken someone else’s fingernail on the day in question.

Fodor’s principle is also false when read from left to right. Consider The Liar Sentence, ‘This sentence is false’. It means (in English) what it says of itself (in English), namely, that it is false. However, it is not a conventional norm of English that English speakers use, or should use, The Liar Sentence to say what that sentence says of itself. For one thing, such a norm would sanction the speech-act of making self-contradictory assertions, which we as truth-tellers should try to avoid doing at all costs if we can. For another, we don’t have to use The Liar Sentence to say of it, or to express, what it says of itself. One can use another sentence to say what The Liar Sentence says of itself – in fact, I did just that a few sentences ago.

Another, less controversial, example also undermines Fodor’s normativity principle (when read from left to right), one that involves a misused word. The sentence, ‘While Mary teaches in the daytime, Mary exercises at night’ literally means that Mary teaches in the daytime and, during the same time, Mary exercises at night. English speakers are not required, as a matter of convention, to use the mentioned sentence to say that! At best, the sentence is conventionally used to say that, although Mary teaches in the daytime, she exercises at night. These examples show that Fodor has not established an incompatibility between linguistic compositionality and linguistic normativity.
6 Conclusion
I have argued that Fodor’s Inexplicitness Argument is unsound for two reasons. First, compositionality does not require explicitness-of-structure-of-content.
Second, his appeal to pragmatic and syntactic ellipses doesn’t settle the issue about the compositionality of language (as opposed to language use). Examples of pragmatic ellipses can’t settle it unless one conflates assertional content with the semantic content of the expression uttered. Examples of syntactic ellipses can’t settle it unless one assumes a particular empirical theory of ellipsis that entails that syntactic ellipses have unarticulated constituents. I noted at least three theories – Morgan (1973), Williams (1977), and Stanley and Szabo (2000) – that are compatible with the claim that syntactic ellipses are semantically compositional. The question that still remains is why Fodor thinks that compositionality requires something as strong as an isomorphism between semantic constituents and syntactic constituents. That is another important topic that will have to be addressed at another time.

Acknowledgements

Thanks to Fred Adams, Kent Bach, Monte Cook, Jerry Fodor, Heidi Harley, Claire Horisk, David Hunter, Adam Morton, and Jay Newhard for their valuable comments on an earlier draft. I am especially indebted to Barbara Abbott, Lenny Clapp, Robert J. Stainton, and Daniel Weiskopf for providing numerous detailed comments on several drafts, which helped improve the paper significantly. I am also grateful to the anonymous referee for his or her useful comments. A version of this paper was presented to the Conference on Compositionality, Concepts, and Cognition, at the University of Düsseldorf, in March 2004. I am grateful for the audience comments that I received, especially from Peter Pagin and Jesse Prinz. A later draft was presented to the Philosophy Department at Dalhousie University in April 2004. I benefited from the comments that I received there too.

References

Bach, K. (1994a). Conversational impliciture. Mind & Language, 9, 124–162.
Bach, K. (1994b). Thought and reference. Oxford: Oxford University Press.
Bach, K. (1999). The semantics-pragmatics distinction: What it is and why it matters. In K. Turner (Ed.), The semantics-pragmatics interface from different points of view (pp. 65–84). Oxford: Elsevier.
Bach, K. (2000). Quantification, qualification and context: A reply to Stanley and Szabo. Mind & Language, 15, 262–283.
Barber, A. (2005). Co-extensive theories and unembedded definite descriptions. In R. Elugardo & R. Stainton (Eds.), Ellipsis and non-sentential speech (pp. 185–202). Dordrecht: Kluwer Academic Publishers.
Barton, E. (1990). Nonsentential constituents: A theory of grammatical structure and pragmatic interpretation. Amsterdam: John Benjamins.
Borg, E. (2005). Saying what you mean: Unarticulated constituents and pragmatic interpretation. In R. Elugardo & R. Stainton (Eds.), Ellipsis and non-sentential speech (pp. 237–268). Dordrecht: Kluwer Academic Publishers.
Botterell, A. (2005). Knowledge by acquaintance and meaning in isolation. In R. Elugardo & R. Stainton (Eds.), Ellipsis and non-sentential speech (pp. 165–184). Dordrecht: Kluwer Academic Publishers.
Cappelen, H., & Lepore, E. (2005). Insensitive semantics. Oxford: Blackwell.
Carston, R. (2002). Thoughts and utterances. Oxford: Blackwell.
Clapp, L. (2003). What unarticulated constituents could not be. In J. Campbell, M. O’Rourke, & D. Shier (Eds.), Meaning and truth (pp. 231–256). New York: Seven Bridges Press.
Dalrymple, M. (2005). Against reconstruction in ellipsis. In R. Elugardo & R. Stainton (Eds.), Ellipsis and non-sentential speech (pp. 31–56). Dordrecht: Kluwer Academic Publishers.
Elugardo, R., & Stainton, R. (2001). Logical form and the vernacular. Mind & Language, 16, 393–424.
Elugardo, R., & Stainton, R. (2004). Shorthand, syntactic ellipsis, and the pragmatic determinants of what is said. Mind & Language, 19, 442–471.
Fodor, J. (1990). A theory of content and other essays. Cambridge, Massachusetts: MIT Press.
Fodor, J. (1998). Concepts: Where cognitive science went wrong. Cambridge, Massachusetts: MIT Press.
Fodor, J. (2000). Replies to critics. Mind & Language, 15, 350–374.
Fodor, J. (2001). Language, thought and compositionality. Mind & Language, 16, 1–15.
Fodor, J., & Lepore, E. (1991). Why meaning (probably) isn’t conceptual role. Mind & Language, 6, 328–343.
Fodor, J., & Lepore, E. (1992). Holism: A shopper’s guide. Oxford: Blackwell.
Fodor, J., & Lepore, E. (2002). The compositionality papers. Oxford: Clarendon Press.
Fox, D. (forthcoming). On logical form. In R. Hendrick (Ed.), Minimalist syntax. Oxford: Blackwell.
Frege, G. (1892/1980). On sense and meaning. In P. Geach & M. Black (Eds.), Translations from the philosophical writings of Gottlob Frege (3rd ed., pp. 56–78). Totowa: Rowman & Littlefield.
Grandy, R. (1990). Understanding and the principle of compositionality. Philosophical Perspectives, 4, 557–572.
Morgan, J. (1973). Sentence fragments and the notion of ‘sentence’. In B. Kachru (Ed.), Issues in linguistics (pp. 719–751). University of Illinois Press.
Neale, S. (2000). On being explicit: Comments on Stanley and Szabo, and on Bach. Mind & Language, 15, 284–294.
Pelletier, F. (1994a). The principle of semantic compositionality. Topoi, 13, 11–24.
Pelletier, F. (1994b). Semantic compositionality: The argument from synonymy. In R. Casati, B. Smith, & G. White (Eds.), Philosophy and the cognitive sciences (pp. 208–214). Vienna: Hölder-Pichler-Tempsky.
Pelletier, F. (2003). Context dependence and compositionality. Mind & Language, 18, 148–161.
Recanati, F. (2002). Unarticulated constituents. Linguistics and Philosophy, 25, 299–345.
Russell, B. (1905). On denoting. Mind, 14, 479–493.
Sag, I., & Hankamer, J. (1977). Syntactically vs. pragmatically controlled anaphora. In R. Fasold & R. Shuy (Eds.), Studies in language variation. Washington D.C.: Georgetown University Press.
Schiffer, S. (1986). Compositional semantics and language understanding. In R. Grandy & R. Warner (Eds.), Philosophical grounds of rationality (pp. 175–207). Oxford: Clarendon Press.
Stainton, R. (1994). Using non-sentences: An application of relevance theory. Pragmatics and Cognition, 2, 269–284.
Stainton, R. (1995). Non-sentential assertions and semantic ellipses. Linguistics and Philosophy, 18, 281–296.
Stanley, J. (2000). Context and logical form. Linguistics and Philosophy, 23, 391–434.
Stanley, J. (2002a). Nominal restriction. In G. Preyer & G. Peter (Eds.), Logical form and language (pp. 365–388). Oxford: Oxford University Press.
Stanley, J. (2002b). Making it articulated. Mind & Language, 17, 149–168.
Stanley, J., & Szabo, Z. (2000). On quantifier domain restriction. Mind & Language, 15, 219–261.
Szabo, Z. (2001). Adjectives in context. In R. Harnish & I. Kenesei (Eds.), Perspectives on semantics, pragmatics and discourse: A festschrift for Ferenc Kiefer (pp. 119–146). Amsterdam: John Benjamins.
Szabo, Z. (2004). Compositionality. The Stanford Encyclopedia of Philosophy. (http://plato.stanford.edu/archives/fall2004/entries/compositionality/)
Williams, E. (1977). Discourse and logical form. Linguistic Inquiry, 8, 101–139.
Compositionality Inductively, Co-inductively and Contextually

Tim Fernando

Address for correspondence: Computer Science Department, Trinity College, Dublin 2, Ireland. E-mail: [email protected].

To say that the meaning [[a]] of a term a is given by the meanings of a’s parts and how these parts are combined is to state an equality

\[ [[a]] \;=\; \ldots [[b]] \ldots \qquad \text{for } b \text{ a part of } a \tag{1} \]
with the meaning function [[·]] appearing on both sides. (1) is commonly construed as a prescription for computing the meaning of a based on the parts of a and their mode of combination. As equality is symmetric, however, we can also read (1) from right to left, as a constraint on the meaning [[b]] of a term b that brings in the wider context where b may occur, in accordance with what Dag Westerståhl has recently described as “one version of Frege’s famous Context Principle”:

the meaning of a term is the contribution it makes to the meanings of complex terms of which it is a constituent. (Westerståhl, 2004, p. 3)

That is, if reading (1) left-to-right suggests breaking a term apart (and delving inside it), then reading (1) right-to-left suggests merging it with other terms (and exploring its surroundings). These complementary perspectives on (1) underlie inductive and co-inductive aspects of compositionality (respectively), contrasted below by (i) reviewing the co-inductive approach to the Fregean covers of Hodges (2001) anticipated in Fernando (1997) and by (ii) inductively deriving a more recent theorem of Westerståhl (2004) on the extensibility of compositional semantics closed under subterms.

Choosing between inductive and co-inductive approaches to (1) does not, by itself, determine the meaning function [[·]]. The ellipsis in (1) points to a broader
notion of context capturing background assumptions that shape [[·]]. To square (1) with “dynamic” conceptions of meaning as context change (e.g. Heim, 1983), we shall inject a certain notion of context c inside meanings, and not simply hang them outside [[·]] as subscripts, [[·]] = [[·]]_c. We proceed below as follows. Section 1 provides some basic background for sections 2 and 3, where the aforementioned inductive and co-inductive applications to compositionality are then described. Section 4 turns to context change, before section 5 concludes.

1 Background: Congruences and Extensions
The present section records useful definitions and facts reducing meaning functions [[·]] to synonymy relations. We begin by assuming that every element a of some fixed set T is assigned a meaning [[a]], before relaxing this assumption and considering the possibility of extending meaning assignments compositionally.

Given an n-ary function f : T^n → T on T, a function [[·]] : T → M is f-compositional if there is a function [[f]] : M^n → M allowing us to push [[·]] inward so that

\[ [[f(a_1, \ldots, a_n)]] = [[f]]([[a_1]], \ldots, [[a_n]]) \]

for all a_1, …, a_n ∈ T. An f-congruence is an equivalence relation ≡ on T such that f(a_1, …, a_n) ≡ f(b_1, …, b_n) whenever a_i ≡ b_i for 1 ≤ i ≤ n – that is,

\[ \frac{a_1 \equiv b_1 \quad \cdots \quad a_n \equiv b_n}{f(a_1, \ldots, a_n) \equiv f(b_1, \ldots, b_n)} \]

for all a_1, …, a_n, b_1, …, b_n ∈ T. Given a family F of multi-ary functions (i.e. functions of various arities) on T, an F-congruence is a binary relation on T that is an f-congruence for every f ∈ F. Similarly, a function [[·]] : T → M is F-compositional if [[·]] is f-compositional for every f ∈ F. The kernel of [[·]] is the set

\[ \kappa_{[[\cdot]]} = \{(a, b) \in T \times T : [[a]] = [[b]]\} \]

of [[·]]-synonymous pairs from T. It is well-known that

Fact 1. κ_{[[·]]} is an equivalence relation on T, and moreover, [[·]] is F-compositional iff κ_{[[·]]} is an F-congruence.

That is, the compositionality of a function [[·]] : T → M reduces to testing that κ_{[[·]]} is a congruence. Indeed, we may assume that meanings are simply subsets
of T insofar as any binary relation ≡ on T induces the “term model”1 ·^≡ : T → Pow(T) mapping a ∈ T to its ≡-equivalence class

\[ a^{\equiv} = \{b \in T : a \equiv b\}, \]

from which it follows that κ_{[[·]]} = κ_{(·^{κ_{[[·]]}})}. We will make do in sections 2 and 3 with equivalences on T, returning to meanings in section 4.

1 It is tempting to equate T with the set of terms generated by F, although we will not need the assumption that the elements of T are terms before section 2.

But first, let us partialize the preceding notions as follows. Fix a partial n-ary map α : T^n ⇀ T. A partial map [·] : T ⇀ M is α-compositional if there is a function [α] : M^n → M such that for all (a_1, …, a_n) ∈ domain([·])^n ∩ domain(α) for which α(a_1, …, a_n) ∈ domain([·]),

\[ [\alpha(a_1, \ldots, a_n)] = [\alpha]([a_1], \ldots, [a_n]). \]

Given a subset X ⊆ T and an n-tuple ā ∈ T^n, let d^X_α(ā) iff ā ∈ X^n ∩ domain(α) and α(ā) ∈ X. An (α, X)-congruence is an equivalence relation ≡ on X such that

\[ \frac{a_1 \equiv b_1 \quad \cdots \quad a_n \equiv b_n}{\alpha(a_1, \ldots, a_n) \equiv \alpha(b_1, \ldots, b_n)} \quad d^X_\alpha(\bar{a}),\ d^X_\alpha(\bar{b}) \]

for all a_1, …, a_n, b_1, …, b_n ∈ T, where ā = (a_1, …, a_n) and b̄ = (b_1, …, b_n). Given a set Σ of partial multi-ary maps on T, a (Σ, X)-congruence is an (α, X)-congruence for every α ∈ Σ; and [·] is Σ-compositional if it is α-compositional for every α ∈ Σ. Fact 1 generalizes to

Fact 2. [·] is Σ-compositional iff κ_{[·]} is a (Σ, domain[·])-congruence.

Henceforth, we write Σ-congruence for (Σ, T)-congruence, and d_α for d^T_α. Next, we introduce some terminology for comparing binary relations ≡ and ≡′ on T. We say ≡ refines ≡′ if ≡ ⊆ ≡′, as the contrapositive – a ≢ b whenever a ≢′ b – states that ≡ respects all the distinctions ≡′ makes, so that ≡ is at least as fine as ≡′, and ≡′ at least as coarse as ≡. For the term model ·^≡ of ≡ to be a restriction of
the term model ·^{≡′} of ≡′, we need to strengthen the inclusion ≡ ⊆ ≡′ a bit. Let us say ≡ extends to ≡′ if for all a, b ∈ domain(≡),

\[ a \equiv' b \ \text{ iff } \ a \equiv b, \]

in which case we call ≡′ an extension of ≡. Clearly,

\[ \equiv \text{ extends to } \equiv' \quad \text{iff} \quad \cdot^{\equiv} \subseteq \cdot^{\equiv'}. \]

Given X ⊆ T, let us call a (Σ, X)-congruence T-extensible if it extends to a Σ-congruence. In the next section, we consider the question: when is a (Σ, X)-congruence T-extensible?
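On finite examples, Facts 1 and 2 are directly machine-checkable. Below is a minimal brute-force sketch of the test they license; the term set, the (partial) concatenation operation and the candidate meaning functions are toy assumptions of mine, not anything from the text.

```python
# Check whether a meaning function is compositional for a partial binary
# operation by testing that its kernel is a congruence (Facts 1 and 2).
from itertools import product

T = ["a", "b", "aa", "ab", "ba", "bb"]

def f(x, y):                              # partial concatenation on T
    return x + y if x + y in T else None

def kernel_is_f_congruence(m):
    for (x1, y1), (x2, y2) in product(product(T, T), repeat=2):
        if m[x1] == m[x2] and m[y1] == m[y2]:
            s, t = f(x1, y1), f(x2, y2)
            if s is not None and t is not None and m[s] != m[t]:
                return False              # synonymous parts, non-synonymous wholes
    return True

length = {t: len(t) for t in T}           # [t] = length of t
palin  = {t: t == t[::-1] for t in T}     # [t] = whether t is a palindrome
print(kernel_is_f_congruence(length))     # True:  length is f-compositional
print(kernel_is_f_congruence(palin))      # False: palindromehood is not
```

Whether ‘aa’ is a palindrome is not recoverable from the palindromehood of its parts, so the kernel test fails for that meaning function, exactly as Fact 1 predicts.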
2 Finest Extensions and Subterm Extensibility Inductively
Read from left to right, equation (1) in the introduction above suggests a subterm property that very roughly says: to decide if a and a′ are synonymous (i.e., they have the same meaning), it suffices to consider subterms of a and a′, and how they combine to yield a and a′, respectively. The present section makes this suggestion precise, fixing, as in the previous section, a family Σ of partial multi-ary functions α on T. Given a binary relation ≡ on T, let ≡^Σ be the set of all pairs (a, b) ∈ T × T such that a =̇ b is derivable2 from any finite number of applications of

(i) the ≡-rule

\[ \frac{a \equiv b}{a \mathrel{\dot{=}} b} \]

guaranteeing that ≡^Σ contains ≡,

(ii) the (†)-rule

\[ \frac{a \mathrel{\dot{=}} b \quad b \mathrel{\dot{=}} c}{a \mathrel{\dot{=}} c} \]

making ≡^Σ transitive, and

(iii) the α-rules

\[ \frac{a_1 \mathrel{\dot{=}} b_1 \quad \cdots \quad a_n \mathrel{\dot{=}} b_n}{\alpha(a_1, \ldots, a_n) \mathrel{\dot{=}} \alpha(b_1, \ldots, b_n)} \quad d_\alpha(\bar{a}),\ d_\alpha(\bar{b}) \]

for n-ary α ∈ Σ (n ≥ 0), formalizing the closure condition turning an equivalence relation into a Σ-congruence.
2 We are borrowing here the dot notation used by Feferman to distinguish syntactic relations from the arithmetic relations they denote.
It is not difficult to see that

Lemma 3. If ≡ is an equivalence relation on T, then ≡^Σ is the finest Σ-congruence refined by ≡ (that is, the ⊆-least Σ-congruence containing ≡).

While ≡ refines ≡^Σ, we cannot assume ≡ extends to ≡^Σ. Nevertheless, the construction of ≡^Σ from ≡ leads to a natural approach to answering the question: when is a (Σ, X)-congruence T-extensible? By Lemma 3, a (Σ, X)-congruence ≈ extends to some Σ-congruence iff ≈ extends to ≡^Σ, where ≡ is the union ≈ ∪ {(a, a) : a ∈ T} of ≈ with identity on T. But the question remains: when does ≈ extend to ≡^Σ? Additional assumptions on T and Σ will prove useful. We assume a distinct symbol α̇ can be associated with each α ∈ Σ such that T is the set of {α̇ : α ∈ Σ}-terms3 and each α is a restriction of the map (t_1, …, t_n) ↦ ‘α̇(t_1, …, t_n)’, allowing us to confuse α(t_1, …, t_n) with ‘α̇(t_1, …, t_n)’ whenever (t_1, …, t_n) ∈ domain(α). The main result of (Westerståhl, 2004) is

Theorem W. A (Σ, X)-congruence is T-extensible if X is closed under subterms (that is, t_i ∈ X for 1 ≤ i ≤ n, whenever α(t_1, …, t_n) ∈ X).

For the remainder of this section, let us assume X is closed under subterms, ≈ is a (Σ, X)-congruence, and ≡ is ≈ ∪ {(a, a) : a ∈ T}. Westerståhl (2004) extends ≈ to a Σ-congruence different from ≡^Σ. In view of Lemma 3, however, Theorem W says no more and no less than: for all a, b ∈ X,
\[ a \approx b \quad \text{iff} \quad a \equiv^{\Sigma} b \tag{2} \]
(under the aforementioned assumptions on ≈ and X). (2) formulates Theorem W as a conservative extension claim about the formal system defining ≡^Σ above. Observe that the transitivity rule (†) is the only rule in the system whose premises may include terms which are subterms of neither term in the conclusion. That is, a (†)-free derivation of a =̇ b can only employ subterms of a or of b. Eliminating (†) is the key to (2), just as eliminating Cut is to many conservative extension arguments in proof theory.

The left-to-right direction ⇒ of (2) is an immediate consequence of the ≡-rule and the inclusion ≈ ⊆ ≡. To establish the converse, ⇐, let us define a =̇_k b iff ‘a =̇ b’ can be derived in ≤ k steps.

3 That is, T is generated inductively from Σ by the rule: for n-ary α ∈ Σ and t_1, …, t_n ∈ T, the term ‘α̇(t_1, …, t_n)’ belongs to T (beginning with n = 0, treating atoms as 0-ary maps in Σ).
The plan is to derive a contradiction from a k-minimal counter-example to ⇐. Accordingly, fix a k-length derivation D of a =̇ b with a, b ∈ X, a ≉ b, and for all a′, b′ ∈ X and k′ < k,

\[ a' \mathrel{\dot{=}}_{k'} b' \ \text{ implies } \ a' \approx b'. \]

By the minimality of k, the last step of D must be (†) – say, a =̇_{k−1} x and x =̇_{k−1} b. Expanding out uses of (†) within D, we can convert the sequence a, x, b to a sequence t_1 … t_l of terms occurring in D such that t_1 = a, t_l = b and, for 1 ≤ j < l, D contains a derivation of t_j =̇ t_{j+1} ending with an instance of the ≡-rule or of an α-rule (for some α ∈ Σ). We can rule out the ≡-rule, appealing to k’s minimality. As T consists of {α̇ : α ∈ Σ}-terms, it follows that a = α(a_1, …, a_n) and b = α(b_1, …, b_n) for the same α ∈ Σ and for some a_1, …, a_n, b_1, …, b_n ∈ T. But X is closed under subterms, so by k’s minimality (again), a_i ≈ b_i. We then obtain the contradiction α(a_1, …, a_n) ≈ α(b_1, …, b_n) from the assumption that ≈ is an (α, X)-congruence.
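On a finite term set, the inductive construction of ≡^Σ amounts to a fixpoint computation, closing an initial relation under reflexivity, symmetry, transitivity and the α-rules until nothing changes. The term set, the single unary operation and the initial synonymy below are toy assumptions of mine.

```python
# Brute-force closure of a relation into the finest congruence containing it,
# mirroring rules (i)-(iii): start from the given pairs, then repeatedly
# apply transitivity and the alpha-rule until a fixpoint is reached.
T = ["a", "b", "f(a)", "f(b)", "f(f(a))", "f(f(b))"]

def f(t):
    u = f"f({t})"
    return u if u in T else None          # partial: undefined outside T

def congruence_closure(pairs):
    rel = {(t, t) for t in T} | set(pairs) | {(y, x) for x, y in pairs}
    while True:
        new = set(rel)
        new |= {(x, z) for (x, y1) in rel for (y2, z) in rel if y1 == y2}
        new |= {(f(x), f(y)) for (x, y) in rel
                if f(x) is not None and f(y) is not None}
        if new == rel:
            return rel
        rel = new

closure = congruence_closure({("a", "b")})
print(("f(f(a))", "f(f(b))") in closure)  # True: identifying a and b forces
                                          # f(f(a)) and f(f(b)) to be identified
```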
3 Coarsest Refinements and Fregean Covers Co-inductively
Lemma 3 is the dual of (the proof of) Theorem 6 in (Fernando, 1997, pp. 592–594), which we briefly sketch below as Lemma 4. This will take us from the subterm property made precise by Theorem W to what Westerståhl (2004) calls “the Contribution Principle” (arguably “one version of Frege’s famous Context Principle”). For orientation, let us tabulate the dualities to be fleshed out presently.

⋃   subterm property          ≡^Σ is least ⊇ ≡      derivation   bottom-up
⋂   contribution principle    ≡_F is greatest ⊆ ≡   constraint   top-down

Given a family F of multi-ary functions on T and a binary relation ≡ on T, we will define a binary relation ≡_F on T satisfying

Lemma 4. If ≡ is an equivalence relation on T, then ≡_F is the coarsest F-congruence refining ≡ (that is, the ⊆-largest F-congruence contained in ≡).

To define ≡_F, a bit of notation is handy. Given a function g : T → T on T and a binary relation R ⊆ T × T on T, let R^g be the subset

\[ R^g = \{(a, b) \in R : g(a) \mathrel{R} g(b)\} \]

of R preserved by g. Notice that if ≡ is an equivalence relation,

\[ \equiv \text{ is a } g\text{-congruence iff } \equiv \subseteq \equiv^g \]
and the intersection

\[ \equiv \cap \equiv^g \cap (\equiv^g)^g \cap ((\equiv^g)^g)^g \cap \cdots \]

is the coarsest g-congruence refining ≡. But what do we do if instead of a unary function g, we have an (n+1)-ary function f : T^{n+1} → T with n > 0? In that case, we form f’s unary projections: given 1 ≤ i ≤ n+1 and ā ∈ T^n, let f_{i,ā} : T → T map a ∈ T to

\[ f_{i,\bar{a}}(a) = f((a, \bar{a})_i) \]

where (a, ā)_i is ā with a inserted at the i-th position. (For example, (a, b)_1 = (a, b) and (a, (b, c))_2 = (b, a, c).) Let us collect f’s unary projections in

\[ U(f) = \{ f_{i,\bar{a}} : 1 \leq i \leq n+1 \text{ and } \bar{a} \in T^n \}. \]

Now, to satisfy Lemma 4, set ≡_F = ⋂_{k≥0} ≡_F^k, where ≡_F^0 is ≡ and, for k ≥ 0,

\[ \equiv_F^{k+1} \;=\; \bigcap_{f \in F} \ \bigcap_{g \in U(f)} (\equiv_F^k)^g. \]

Whereas ≡^Σ (from the previous section) ⋃-collects the conclusions of derivations from a system of rules, ≡_F ⋂-filters ≡ through constraints given by F. (We can say ≡ is g-constrained if ≡ ⊆ ≡^g.) In practice, we will want to apply Lemma 4 to an equivalence relation ≈ on a subset X of T. To do so, we let ≡ be the union ≈ ∪ ((T − X) × (T − X)) of ≈ not with identity on T (as in the previous section) but with (T − X) × (T − X).4 As it turns out, ≡_F exemplifies what Hodges (2001) calls a Fregean cover of ≡. More precisely, let us write t(a|x) with the understanding that t is an (F ∪ {x})-term, a ∈ T, and t(a|x) ∈ T is t with x replaced by a.

Definition (Hodges). Given equivalence relations ≈ and ≈′ on subsets of T, ≈′ is a Fregean cover of ≈ if conditions F(a)–F(c) below hold for X = domain(≈).

F(a): if a ≈′ b and t(a|x) ∈ X then t(b|x) ∈ X
4 If ≈ is the kernel of [·] : X → M, this union is the kernel of the 1-point extension [·]_⊥ : T → M ∪ {⊥} of [·] mapping a ∈ T to

\[ [a]_\bot = \begin{cases} [a] & \text{if } a \in X \\ \bot & \text{for } a \in T - X \end{cases} \]

for a fixed object ⊥ ∉ M.
F(b): if a ≈′ b and t(a|x), t(b|x) ∈ X then t(a|x) ≈ t(b|x)

F(c): if a ≉′ b then for some t, t(a|x) ≉ t(b|x).

As an analogue to Theorem W, we have

Theorem 5. Given an equivalence relation ≈ on a subset X of T, ≡_F is a Fregean cover of ≈, where ≡ is ≈ ∪ ((T − X) × (T − X)). Moreover, every Fregean cover of ≈ extends to ≡_F.

To prove Theorem 5, let ≈, X and ≡ be as given in the theorem. First, we verify that F(a)–F(c) hold for ≈′ equal to ≡_F. Indeed, we can show by induction on the number of occurrences of F-symbols in t that

(i) if a ≡_F b then t(a|x) ≡_F t(b|x)

and by induction on k, that

(ii) if a ≢_F^k b then for some t, t(a|x) ≢ t(b|x).

(The inductions bring out the encoding of t(a|x) by iterations of ·^g for g ∈ ⋃_{f∈F} U(f).) Then, given a Fregean cover ≈′ of ≈, we deduce for all a, b ∈ domain(≈′),
\[ a \not\approx' b \ \text{ implies not } \ a \equiv_F b \tag{3} \]
from F(c) and (i), and derive the converse of (3) from F(a), F(b) and (ii).
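Dually to the closure sketch in section 2, the iterates ≡_F^k can be computed on a finite example by repeatedly filtering out pairs not preserved by the unary projections. Everything in the sketch (the term set, the single unary g, and the starting equivalence) is a toy assumption of mine.

```python
# Compute the coarsest g-congruence refining an equivalence by iterating
# the filter rel |-> rel^g = {(x, y) in rel : (g(x), g(y)) in rel}.
T = ["a", "b", "c", "g(a)", "g(b)", "g(c)"]

def g(t):
    u = f"g({t})"
    return u if u in T else t            # total for simplicity: fixes g-terms

blocks = [{"a", "b", "c"}, {"g(a)", "g(b)"}, {"g(c)"}]   # initial equivalence:
equiv = {(x, y) for B in blocks for x in B for y in B}   # g(c) already split off

def refine(rel):
    return {(x, y) for (x, y) in rel if (g(x), g(y)) in rel}

current = equiv
while refine(current) != current:
    current = refine(current)

print(("a", "b") in current)   # True:  g(a) and g(b) are still identified
print(("a", "c") in current)   # False: c is split off, since g(c) differs
```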
4 Changing the Context
The extensions in Theorems W and 5 of synonymies ≡ to an arbitrary term a ∈ T fall short of determining the meaning [[a]] of a. Term models a^≡ fail to connect language with the reality it describes. Talk of the meaning of a presupposes a notion of context, such as, in model-theoretic semantics, that given by a model M, underlying the meaning [a] = [a]_M of a. For a concrete example, suppose a_1 were a well-formed formula saying Pat’s spouse is lucky, and a_2 were a well-formed formula saying Pat is married. Relative to a model M where a_2 is false, it is tempting to take a_1 to be meaningless – that is, to leave [a_1]_M undefined. Abstracting over M, however, we might build context into a richer meaning

\[ \{[a]\} = \{(M, [a]_M) : a \in \mathrm{domain}([\cdot]_M)\} \]

consisting of pairs (M, [a]_M) such that [a]_M is defined. The step from [a]_M to {[a]} goes beyond the extensions in the previous sections inasmuch as it may involve different meanings [·]_M, [·]_{M′}, … That said, not only would {[a]} always be defined, but following Karttunen (1974), we would have

\[ \mathrm{domain}\{[a]\} = \text{set of contexts satisfying } a\text{'s presuppositions.} \]
A fuller-blooded relational semantics would, as in Heim (1983), formulate the meaning ([a]) of a as a context change potential

\[ c_{\mathrm{in}} \; ([a]) \; c_{\mathrm{out}} \]

between an input context c_in and an output context c_out incorporating c_in. The contexts here can be formulated as in (Martin-Löf, 1984) to implement the presupposition filtering in (a) and (b) below5 (Ranta, 1994).

(a) If Pat is married, then Pat’s spouse is lucky.
(b) Pat is married, and Pat’s spouse is lucky.

Moreover, the type-theoretic approach can be adapted model-theoretically so that the same mechanism for presupposition filtering accounts for the conservativity of generalized quantifiers (Keenan and Stavi, 1986) illustrated by the equivalences in (c) and (d), as well as the binding of the donkey pronoun it in (e).

(c) Every ant bites. Every ant is an ant that bites.
(d) Some ant bites. Some ant is an ant that bites.
(e) Every farmer who owns a donkey beats it.

The interested reader is referred to (Fernando, 2001) for details, the essential point for the present discussion being that the contextual shift from c to c′ in

\[ [f(a, b)]_c = [f]([a]_c, [b]_{c'}) \]

can be formulated compositionally by enriching the notions of meaning and context in [·]_c, [·]_{c′}.

5 That is, neither (a) nor (b) presupposes Pat is married, which is locally presupposed by the constituent Pat’s spouse is lucky.
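The filtering in (a) and (b) can be prototyped by treating a sentence as a pair of a presupposition test and a context update, with contexts as sets of atomic facts. This is only a toy sketch under my own assumptions (the fact names and the particular encoding are mine), not the type-theoretic construction the text refers to.

```python
# Sentences as (presupposition, update) pairs over contexts (sets of facts).
def conj(s1, s2):
    """'s1 and s2': s2's presupposition is checked against the context
    already incremented by s1, so it can be filtered (Karttunen-style)."""
    pre1, upd1 = s1
    pre2, upd2 = s2
    return (lambda c: pre1(c) and pre2(upd1(c)),
            lambda c: upd2(upd1(c)))

# 'Pat is married': presupposes nothing, adds a fact to the context.
married = (lambda c: True, lambda c: c | {"married(pat)"})
# 'Pat's spouse is lucky': presupposes the marriage fact.
spouse_lucky = (lambda c: "married(pat)" in c,
                lambda c: c | {"lucky(spouse(pat))"})

both = conj(married, spouse_lucky)        # sentence (b) in the text
print(both[0](set()))                     # True:  (b) presupposes nothing here
print(spouse_lucky[0](set()))             # False: the conjunct alone does
```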
5 Conclusion
Compositionality can be approached inductively from below (as in section 2) or co-inductively from above (as in section 3). Although meaning may under certain assumptions be preserved by extensions, some applications call for an enrichment of meaning reflecting differences in contexts lying behind different meanings (section 4).
References

Fernando, T. (1997). Ambiguity under changing contexts. Linguistics and Philosophy, 20(6), 575–606.
Fernando, T. (2001). Conservative generalized quantifiers and presupposition. In Proceedings of Semantics and Linguistic Theory XI (pp. 172–191). Cornell Linguistics Circle Publications.
Heim, I. (1983). On the projection problem for presuppositions. In M. Barlow, D. Flickinger, & M. Wescoat (Eds.), Proceedings of the West Coast Conference on Formal Linguistics (Vol. 2, pp. 114–125). Stanford Linguistics Association.
Hodges, W. (2001). Formal features of compositionality. Journal of Logic, Language and Information, 10(1), 7–28.
Karttunen, L. (1974). Presupposition and linguistic context. Theoretical Linguistics, 1, 181–194.
Keenan, E., & Stavi, J. (1986). A semantic characterization of natural language determiners. Linguistics and Philosophy, 9(3), 252–326.
Martin-Löf, P. (1984). Intuitionistic type theory. Napoli: Bibliopolis.
Ranta, A. (1994). Type-theoretical grammar. Oxford University Press.
Westerståhl, D. (2004). On the compositional extension problem. Journal of Philosophical Logic, 33(6), 549–582.
Confirmation and Compositionality

Ken Gemes

Address for correspondence: School of Philosophy, Birkbeck College, Malet Street, London WC1E 7HX, United Kingdom. E-mail: [email protected].

1 Introduction
The aim of this paper is to show that unless we gain a better idea of exactly what the parts of a theory/hypothesis are, we cannot have an adequate account of theory/hypothesis confirmation. We show this by, first, outlining two of the most popular accounts of confirmation and some of their major problems, and then showing how those problems can be overcome by addressing the question of what parts a theory/hypothesis is truly composed of. In particular, it is shown that the failure of both hypothetico-deductivism and Bayesian confirmation theory to adequately address the problem of irrelevant conjunction and to guarantee confirmation across the content of a hypothesis can be remedied by adequately addressing the problem of what exactly counts as part of the content of a hypothesis. A major finding is that not every consequence of a theory/hypothesis should count as a part of the theory/hypothesis.
2 An Outline of Hypothetico-Deductivism and Bayesian Confirmation Theory
Hypothetico-deductivism is the thesis that confirmation comes from verification of the observational parts of a hypothesis/theory. The notion of parts of a hypothesis/theory is invariably glossed in terms of logical consequences. This yields the following canonical formal expression of hypothetico-deductivism:

H-D (observational) e (directly) confirms (non-contradictory) h iff h ⊢ e.

Note, h can be here either a single hypothesis, itself possibly a conjunction of statements, or a theory, that is, a set of statements. Hereafter we will use the
terms ‘claim’ and ‘hypothesis’ ambiguously to denote both single hypotheses and sets of statements. More sophisticated formal versions of hypothetico-deductivism make reference to confirmation of h by e relative to background theory b:

(observational) e (directly) confirms (non-contradictory) h relative to background theory b iff (h & b) ⊢ e and b ⊬ e.

According to Bayesian confirmation theory, e confirms h iff e raises the probability of h above its prior probability:

BCT e confirms h iff P(h/e) > P(h).

More sophisticated versions of Bayesian confirmation theory make reference to confirmation of h by e relative to background theory b:

e confirms h relative to background b iff P(h/e & b) > P(h/b).

For our purposes the reference to background theories can be omitted in our consideration of both hypothetico-deductivism and Bayesian confirmation theory. Similarly, the stress on observational evidence, typical of canonical formulations of hypothetico-deductivism, will be dispensed with, as it is not germane to the present considerations.
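Over a finite propositional fragment, the H-D relation is mechanically checkable by brute force. In the sketch below, two-element quantifications stand in for the quantified hypotheses used later in the paper; the atoms and formulas are toy assumptions of mine.

```python
# Canonical H-D on a propositional toy: e confirms h iff h entails e,
# with entailment decided by enumerating truth-value assignments.
from itertools import product

def entails(premise, conclusion, atoms):
    rows = (dict(zip(atoms, bits))
            for bits in product([False, True], repeat=len(atoms)))
    return all(conclusion(v) for v in rows if premise(v))

atoms = ["Fa", "Fb", "Gb"]
h  = lambda v: v["Fa"] and v["Fb"]     # stand-in for '(x)Fx' over {a, b}
e  = lambda v: v["Fa"]
e2 = lambda v: v["Gb"]

print(entails(h, e, atoms))            # True:  by H-D, e confirms h
print(entails(h, e2, atoms))           # False: e2 does not confirm h
```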
3 H-D Is Both Too Permissive and Too Restrictive and Does Not Capture the Intuitive Import of Confirmation
H-D is too restrictive because there are cases where hypotheses are clearly confirmed by evidence that is not entailed by the hypothesis in question. Consider a typical case of inference to the best explanation. Let evidence e be the conjunctive statement ‘The gun used to murder Jones was owned by Jones’s business partner Smith, and the gun only had Smith’s fingerprints on it, and Smith had ample motive to murder Jones’. Let h be ‘Smith murdered Jones’. Here, prima facie, e confirms h, even though h does not entail e. Also, and perhaps more importantly, at least as regards the confirmation of scientific hypotheses, H-D does not allow for the confirmation of statistical hypotheses by observational evidence, since, typically, in such cases there is no deductive relation between hypothesis and evidence. Thus, let h be ‘The probability of an electron fired from apparatus B having spin up is 50%’ and e be ‘According to accurate measurements, 50% of the many electrons fired from apparatus B between time t − 1 and t had spin up’. Here, presumably, e confirms h even though h does not entail e.
H-D is too permissive because where h entails e, and, hence, according to H-D, e confirms h, e also confirms h′ & h for arbitrary h′ consistent with h. Thus, let e be ‘The first planet of our solar system travels in an elliptical orbit with the sun as one focus of that orbit’. Presumably, this e confirms the claim ‘All the 9 planets of our solar system have elliptical orbits with the sun as one focus of their orbits’. But, presumably, and contra the verdict of H-D, this e does not confirm the conjunctive claim ‘All the 9 planets of our solar system have elliptical orbits with the sun as one focus of their orbits and Sydney has a harbor bridge’. This is often known as the problem of tacking by conjunction, or, alternatively, as the problem of irrelevant conjunctions. It is mainly because H-D is not sufficiently selective in where it finds confirmation that Glymour (1980) brands it “hopeless”. If we allow such tacking by conjunction then we must, for instance, endorse the claim that there is evidence that confirms the claim that “Reagan was president in 1982 and was killed by Clinton in 1983”. Intuitively, there is no such evidence, though there is plenty of evidence that Reagan was president in 1982.

Furthermore, H-D does not capture the import of confirmation because it does not ensure that confirmation is transmittable to untested content of a hypothesis. Thus, according to H-D, ‘The first planet of our solar system has an elliptical orbit with the sun as one focus of that orbit’ confirms ‘All the 9 planets of our solar system have elliptical orbits with the sun as one focus of their orbits’ but not its content part ‘The second planet of our solar system has an elliptical orbit with the sun as one focus of that orbit’. A large part of the reason we care about confirmation is that we take the fact that a theory is confirmed as giving at least some reason for thinking the theory will hold true in the future. Where confirmation does not give any grounds for thinking a theory’s content concerning as yet unobserved events is true, it is simply unclear why we should care about confirmation. This suggests the following Hempel-type adequacy condition for a definition of confirmation:

A1 If e confirms h and h′ is part of the content of h then e confirms h′.1
1 Hempel’s actual adequacy condition demanded that for e to confirm h, e must confirm every consequence of h (Cf. Hempel, 1965, p. 31). Now, presumably, what Hempel had in mind was that if ‘Fa’ is to confirm ‘(x)Fx’, it must confirm such consequences of ‘(x)Fx’ as ‘Fb’, ‘Fc’, etc., but not such consequences as ‘∼Fa ∨ (x)Fx’, etc. This suggests that the adequacy condition is better expressed in terms of the need to confirm all the content of a hypothesis rather than all its consequences, provided we have a notion of content that is more restrictive than the notion of consequence. Such a notion is provided below. Alternatively, one might argue that Hempel had in mind some purely quantitative notion of confirmation – e.g., e confirms h iff the probability of h on e is high. This would allow that where ‘Fa’ confirms ‘(x)Fx’ it also confirms ‘∼Fa ∨ (x)Fx’.
Notoriously, A1 and H-D combined yield the horrendous result that any e confirms any h consistent with e. This follows, since for arbitrary h and e such that the conjunction of e and h is not a contradiction, according to H-D, e confirms ‘h & e’, and, from A1, it follows that if e confirms ‘h & e’, e confirms h. So it seems H-D cannot be combined with A1, and therefore H-D cannot provide an adequate account of confirmation.
4 Bayesian Confirmation Theory Is Not As Restrictive As H-D, But Like H-D, It, In Itself, Is Not Genuinely Inductive
BCT, unlike H-D, allows for confirmation in cases of inference to the best explanation. Furthermore, BCT, unlike H-D, also allows for confirmation of statistical hypotheses by observational evidence. This results simply because BCT weakens the relation demanded for confirmation from entailment to makes-more-likely. Where H-D says that e confirms h iff h entails e, BCT says e confirms h iff h increases the likelihood of e, that is to say, P(e/h) > P(e). The fact that the difference between BCT and H-D is simply a weakening of the relation demanded between theory and evidence has been obscured by the fact that BCT is normally expressed in terms of e increasing the probability of h, rather than in terms of h increasing the likelihood of e. This has suggested to many that BCT involves some kind of inductive support that is missing from H-D. But, in fact, it is provable that P(h/e) > P(h) iff P(e/h) > P(e).

It has long been recognized that BCT entails H-D, in the sense that where the conjunction of h and e is consistent and h ⊢ e then, provided P(h) > 0 and P(e) < 1, BCT, like H-D, yields the result that e confirms h. What has been missed is the corollary that H-D is, in a sense, just a strengthened version of BCT. Neither H-D nor BCT is inherently inductive. The only difference between the two is in the strength of relation demanded between hypothesis and evidence. Where H-D says e confirms h if, given h, we can be certain that e, BCT weakens this to the demand that if e is to confirm h, then, given h, we have at least some reason to believe e.

It is important not to exaggerate the claim being made here. It is not being claimed that BCT is incompatible with inductive confirmation. After all, BCT is compatible with, for instance, the genuinely inductivist claim that ‘The first planet from the sun travels in an elliptical orbit with the sun as one focus of that orbit’ confirms ‘The second planet from the sun travels in an elliptical orbit with the sun as one focus of that orbit’. Indeed later we shall see that hypothetico-deductivism is also compatible with this claim. The point is that BCT in itself contains no commitment to inductivism. An inductive skeptic could adhere to BCT. Indeed an inductive skeptic can even allow that ‘The first planet from
the sun travels in an elliptical orbit with the sun as one focus of that orbit’ raises the probability of, and hence by BCT, confirms, ‘All the 9 planets have elliptical orbits around the sun with the sun as one focus of their orbits’. He might do this simply because he takes that evidence to eliminate one of the potential falsifiers to that claim. Such a deductivist rationale need not admit even a whiff of inductivism.
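The equivalence invoked above, that P(h/e) > P(h) iff P(e/h) > P(e), follows in one line from the definition of conditional probability. A minimal derivation, under the standing assumption that P(h) and P(e) are both positive:

```latex
P(h \mid e) > P(h)
\iff \frac{P(h \wedge e)}{P(e)} > P(h)
\iff P(h \wedge e) > P(h)\,P(e)
\iff \frac{P(h \wedge e)}{P(h)} > P(e)
\iff P(e \mid h) > P(e).
```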
5 BCT Is Too Permissive and Violates A1
If, as we saw above, BCT is simply more permissive than H-D, and H-D is itself too permissive, then we should expect that BCT is also too permissive. In fact, BCT, like H-D, entails that where h ⊢ e, and neither the probability of h nor that of e is extreme (0 or 1), e confirms h & h′ for any arbitrary h′ consistent with h. This follows because where h ⊢ e, h & h′ ⊢ e, and because, by the probability calculus, where h & h′ ⊢ e, then, provided 0 < P(h & h′) and P(e) < 1, P(h & h′/e) > P(h & h′).2

Furthermore, BCT, combined with plausible measure functions, violates the adequacy condition A1, since there are clearly cases where, according to BCT, e confirms h and h′ is part of the content of h yet on no plausible measure function does e confirm h′. For instance, BCT, like H-D, entails that ‘The first planet of our solar system has an elliptical orbit with the sun as one focus of that orbit’ confirms the claim that ‘All the 9 planets of our solar system have elliptical orbits with the sun as one focus of their orbits and Sydney has a harbor bridge’, provided none of these statements have extreme probabilities – a necessity for any plausible measure function. Yet, presumably, no plausible measure function would yield the result that ‘The first planet of our solar system has an elliptical orbit with the sun as one focus of that orbit’ confirms ‘Sydney has a harbor bridge’.
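The probabilistic step behind the tacking result can be made fully explicit. A minimal derivation, under the non-extremeness assumptions just stated:

```latex
(h \wedge h') \vdash e
\;\Longrightarrow\; P(h \wedge h' \wedge e) = P(h \wedge h')
\;\Longrightarrow\; P(h \wedge h' \mid e)
   = \frac{P(h \wedge h')}{P(e)} > P(h \wedge h'),
\quad\text{since } 0 < P(h \wedge h') \text{ and } P(e) < 1.
```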
6 H-D Need Not Be Too Permissive If We Pay Attention To How Theories Are Composed
For H-D to be less permissive it needs to allow for selective confirmation. That is, where h entails e, e should only selectively confirm parts of h. Intuitively, e confirms only those parts of h active in the derivation of e from h. The
2 While I strongly agree with Glymour that an adequate theory of confirmation should not have the result that where e confirms h, for arbitrary h′, e confirms the conjunction of h and h′, recently some authors have rejected this demand on behalf of Bayesian confirmation theory (Cf. Maher, 2004). Fitelson (2002) also contains an interesting discussion of this problem.
problem now is to say what exactly are the parts of h. Thus, let claim h be ‘(x)Fx & (∃x)Gx’, and e be ‘Fa’. What parts of this h does this e confirm? Presumably not the whole, since it does not confirm ‘(∃x)Gx’. Presumably it just confirms some parts of h, in particular, ‘(x)Fx’. Now in an obvious sense one only needs ‘(x)Fx’ in the derivation of ‘Fa’ from ‘(x)Fx & (∃x)Gx’. But now consider the claim h_1, ‘((∃x)Gx → (x)Fx) & (∃x)Gx’, logically equivalent to h. To derive ‘Fa’ from h_1 one needs to use ‘(∃x)Gx’. To accept that logically equivalent sentences stand in different confirmation relations to the same evidence is to violate an adequacy condition of confirmation theory that is near sacrosanct, namely

A2 If e confirms h and h ⊣⊢ h′ then e equally confirms h′.

It is tempting here to say we are being too literal about parts. A part of a claim should not be taken to mean any orthographic part. What, then, are the appropriate candidates for parts? One suggestion is that a part of a claim is just any consequence of the claim. But then the only part of ‘(x)Fx & (∃x)Gx’ needed to derive ‘Fa’ is ‘Fa’, which is itself a consequence of ‘(x)Fx & (∃x)Gx’. An alternative way of thinking of the parts of a claim is to think of the claim as having a natural axiomatization with the parts being simply the axioms of that natural axiomatization. Thus the natural axiomatization of the hypothesis ‘(x)Fx & (∃x)Gx’ is

Ax1: (x)Fx
Ax2: (∃x)Gx

The only axiom we need to derive ‘Fa’ from this axiom set is, of course, Ax1. For this solution to work we need to rule out

Ax1*: (∃x)Gx
Ax2*: (∃x)Gx → (x)Fx

as a natural axiomatization of the hypothesis ‘(x)Fx & (∃x)Gx’. Now note, if axiom Ax2* were to count as part of the content of ‘(x)Fx & (∃x)Gx’ this would yield the result that ‘∼(∃x)Gx’ conclusively confirms part of ‘(x)Fx & (∃x)Gx’ since ‘∼(∃x)Gx’ entails Ax2*. The obvious conclusion here is that Ax2* may be a consequence of ‘(x)Fx & (∃x)Gx’, but it should not count as part of its content. This would involve rejecting the traditional notion of content favored by Popper, Carnap and nearly all philosophers of science, according to which

TNC h′ is part of the content of h iff h ⊢ h′.

Elsewhere I have developed an alternative notion of logical content which better captures our intuitions about content and also allows progress in many canonical
problems in the philosophy of science.3 For the purposes of this paper we can here give a brief version of the new account of logical content:

NCT α is part of the content of β iff α and β are contingent, β ⊢ α, and every relevant model of α has an extension which is a relevant model of β.

A relevant model of an arbitrary wff α is a model of α that assigns values to all and only those atomic wffs relevant to α.4 In the case of quantificational wffs the quantifiers are treated substitutionally in order to determine content parts.5 So, ‘Gb ∨ Fa’ is not part of the content of ‘(x)Fx’ since that relevant model of ‘Gb ∨ Fa’ that assigns ‘Fa’ the value F and ‘Gb’ the value T cannot be extended to a model of ‘(x)Fx’. ‘Fa’ is a content part of ‘(x)Fx’, since the sole relevant model of ‘Fa’, namely that which makes the single assignment of T to ‘Fa’, can clearly be extended to a relevant model of ‘(x)Fx’, by adding the assignment of T to ‘Fb’, ‘Fc’, ‘Fd’, etc. This rules out ‘(∃x)Gx → (x)Fx’ as part of the content of ‘(x)Fx & (∃x)Gx’ since, for instance, none of those many relevant models of ‘(∃x)Gx → (x)Fx’ which assign F to ‘Fa’ can be extended to a relevant model of ‘(x)Fx & (∃x)Gx’. On the other hand, it, plausibly enough, allows, for instance, that ‘(x)Fx’, ‘(∃x)Gx’ and ‘Fa’ all count as parts of the content of ‘(x)Fx & (∃x)Gx’.

In a natural axiomatization, then, every axiom must be a content part of the hypothesis being axiomatized. Furthermore, no axiom, or any part of any axiom, should be redundant. In other words no content part of any axiom should be entailed by the conjunction of the remaining axioms. This condition prevents, for instance, ‘Fa’ from occurring as an axiom in a natural axiomatization of ‘(x)Fx & (∃x)Gx’. Finally, in order to prevent conjunctions of what should naturally be separate axioms from themselves counting as an axiom in a natural axiomatization, we also stipulate that a natural axiomatization must maximize axioms. A natural axiomatization, then, is a finite set S of sentences such that each sentence is a content part of S, no content of any member of S is such that it is entailed by the other members of S, and such that there is no other set S′ logically equivalent to S such that every member of S′ is a content part of S′, no content of any member of S′ is entailed by the other members of S′, and S′ has more members than S.
3 For the account of logical content cf. Gemes (1994a) and (1997). For applications of the new notion of logical content to canonical problems in the philosophy of science see Gemes (1998a), (1998b), (1994b) and (1993).
4 An atomic wff α is relevant to wff β iff there is some model m of β such that where m′ differs from m only in the value it assigns α, m′ is not a model of β.
5 So the atomic wffs relevant to, for instance, '(x)Fx' are 'Fa', 'Fb', 'Fc', etc.
More formally, where '{S − α}' designates the set S less its member α and 'N(S)' designates the number of members in S,

NA A finite set S of sentences is a natural axiomatization iff (i) for any α ∈ S, α is a content part of S, (ii) for any α ∈ S, there is no content part β of α such that {S − α} ⊢ β, and (iii) there is no set of sentences S′ such that S′ ⊣⊢ S, for any α ∈ S′, α is a content part of S′, for any α ∈ S′ there is no content part β of α such that {S′ − α} ⊢ β, and N(S′) > N(S).

Now we are in a position to formulate a version of hypothetico-deductivism that allows for selective confirmation. Where h is any hypothesis, N.A(h) is a natural axiomatization of h, and h1 is a member of N.A(h),

H-D1 e directly confirms axiom h1 of N.A(h) iff (i) N.A(h) ⊢ e and (ii) for any subset S of N.A(h), if S ⊢ e then h1 ∈ S.

The last clause of our definition ensures that h1 is indeed needed in the derivation of e from N.A(h). Now we have a version of hypothetico-deductivism that, unlike H-D, is not too permissive. For instance, according to H-D1, 'Fa' directly confirms '(x)Fx' but it does not directly confirm '(x)Fx & (∃x)Gx'.
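For a finite vocabulary the definitions above are mechanically checkable. The following is a minimal Python sketch (mine, not Gemes's) that tests NCT content-parthood for propositional formulas, reading the quantified examples substitutionally over a two-element domain {a, b}; relevance is computed as in footnote 4, and all names and encodings are illustrative only.

```python
from itertools import product

ATOMS = ['Fa', 'Fb', 'Ga', 'Gb']  # substitutional reading over the domain {a, b}

def valuations():
    for bits in product([True, False], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, bits))

def entails(phi, psi):
    return all(psi(v) for v in valuations() if phi(v))

def contingent(phi):
    truths = [phi(v) for v in valuations()]
    return any(truths) and not all(truths)

def relevant_atoms(phi):
    # Footnote 4: an atom is relevant iff flipping it alone can change phi's truth value.
    rel = set()
    for v in valuations():
        for a in ATOMS:
            w = dict(v); w[a] = not v[a]
            if phi(v) != phi(w):
                rel.add(a)
    return sorted(rel)

def relevant_models(phi):
    # Partial assignments to exactly the relevant atoms under which phi comes out
    # true; phi's truth is settled by its relevant atoms, so one extension suffices.
    rel = relevant_atoms(phi)
    for bits in product([True, False], repeat=len(rel)):
        r = dict(zip(rel, bits))
        if phi({a: r.get(a, True) for a in ATOMS}):
            yield r

def content_part(alpha, beta):
    # NCT: alpha and beta contingent, beta entails alpha, and every relevant
    # model of alpha has an extension which is a relevant model of beta.
    if not (contingent(alpha) and contingent(beta) and entails(beta, alpha)):
        return False
    return all(any(set(r) <= set(s) and all(s[a] == r[a] for a in r)
                   for s in relevant_models(beta))
               for r in relevant_models(alpha))

all_F  = lambda v: v['Fa'] and v['Fb']          # '(x)Fx'
some_G = lambda v: v['Ga'] or v['Gb']           # '(Ex)Gx'
h      = lambda v: all_F(v) and some_G(v)       # '(x)Fx & (Ex)Gx'
ax2    = lambda v: (not some_G(v)) or all_F(v)  # '(Ex)Gx -> (x)Fx', i.e. Ax2*

print(content_part(lambda v: v['Fa'], h))  # True:  'Fa' is a content part
print(content_part(some_G, h))             # True:  so is '(Ex)Gx'
print(content_part(ax2, h))                # False: Ax2* is a consequence but not a content part
```

On this toy encoding the three verdicts come out exactly as the text requires: 'Fa' and '(∃x)Gx' are content parts of the conjunction, while Ax2* is a mere consequence.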
7 H-D Can Be Combined with A1 but Is Still Too Restrictive
H-D1, like H-D, is still too restrictive. For instance, while it allows that 'Fa' confirms '(x)Fx', it does not allow that 'Fa' confirms 'Fb'. Put more simply, H-D1, like H-D, does not allow for genuinely inductive confirmation. Now one response here is to say that hypothetico-deductivism is not meant to endorse any form of inductive confirmation. Then H-D would simply be akin to Popperian corroboration, merely a report of past performance with no suggestion of future compliance. I don't think this is what Hempel and other earlier champions of hypothetico-deductivism had in mind.6 That is why so many of them were sympathetic to the adequacy condition A1. The idea was that, for example, the entailment from '(x)Fx' to 'Fa' would allow 'Fa' to provide direct confirmation of '(x)Fx' and then, through an application of the condition A1, the confirmation would be transmitted to 'Fb', 'Fc', etc. The problem was that A1 combined with H-D had the disastrous consequence that everything confirmed everything. But that was because H-D, unlike H-D1, did not allow for selective confirmation. But now we can combine our revamped version of hypothetico-deductivism, H-D1, and A1 without such disastrous results. This would yield the conclusion that, for instance, 'Fa' directly confirms '(x)Fx & (∃x)Gx''s content part '(x)Fx', but does not directly confirm '(x)Fx & (∃x)Gx' itself, nor does it directly confirm its content part '(∃x)Gx'. Then by A1 we get the desired result that 'Fa' confirms '(x)Fx''s content part 'Fb'. All this seems right to me. However, as a complete theory of confirmation the combination of H-D1 and A1 is still too restrictive. The problem is that it does not allow for confirmation in the absence of deductive relations. Such relations are missing in the case of the confirmation of statistical hypotheses and in many cases of inference to the best explanation, as in our murder case above. Nevertheless, I submit that philosophically reflective scientists will welcome H-D1 as a partial account of confirmation. Firstly, it allows for confirmation through natural expressions of theories; that is, it involves breaking theories into parts (axioms) in a plausible, non-arbitrary way. H-D1 insists that theories have canonical representations that play an essential part in determining confirmational relations for the theory. Secondly, it allows for selective confirmation. That is, it allows that evidence entailed by a theory need not confirm the theory in toto, but may bear positively only on parts of the theory. Thirdly, it accords with much of experimental practice. That is to say, scientists will recognize as good practice the process of (i) teasing out observational consequences from theories; (ii) checking those consequences for truth; and (iii) raising their confidence in the truth of the parts of the theory involved in teasing out those observational consequences if those observational consequences are in fact borne out.

6 This represents a change in view from that presented in Gemes (1993) and (1998a), where hypothetico-deductivism was seen as treating a deductive relationship between h and e as a necessary and sufficient condition for the confirmation of h by e. Conversations with Gerhard Schurz have led me to believe that the advocates of hypothetico-deductivism are better interpreted as seeing such a relationship as merely a sufficient condition for confirmation.
8 BCT Need Not Be Too Permissive and Can Be Rendered Compatible With A1
If we want something more permissive, then BCT seems the place to look. But we saw that BCT seems too permissive. BCT, like H-D, yields the consequence that, for instance, 'The first planet of our solar system has an elliptical orbit around the sun with the sun as one focus of its orbit' confirms the claim that 'All the 9 planets of our solar system have elliptical orbits around the sun with the sun as one focus of their orbits and Sydney has a harbor bridge'. A perhaps worse case is that BCT yields the result that the evidence 'Mary passed her exams or Mary's brother John failed' confirms the claim 'Mary and her brother John both passed'. This is not the confirmation hoped for by anxious parents. But note, we can make BCT more restrictive by demanding that for e to really confirm h, e must not just raise the probability of h, but must raise the probability of every content part of h. More formally,

BCT1 e really confirms h iff for any h′, if h′ is a content part of h then P(h′/e) > P(h′).

BCT1 allows that, for instance, 'Fa' really confirms '(x)Fx'. 'Fa' really confirms '(x)Fx' because it raises the probability of all its content parts, including 'Fa', 'Fb', 'Fc', etc. BCT1 does not allow that 'Fa' really confirms '(x)Fx & (∃x)Gx' because, presumably, it does not raise the probability of '(x)Fx & (∃x)Gx''s content part '(∃x)Gx'. BCT1 does not yield the consequence that 'Mary passed or John failed' really confirms 'Mary and John both passed'. In this case the evidence does not raise the probability of 'Mary and John both passed''s content part 'John passed'. Moreover, BCT1 is perfectly compatible with A1. Indeed, it actually entails A1.
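The Mary/John verdicts can be checked outright with a uniform measure over the four possibilities. The following small Python sketch (mine, not from the paper; the encoding is illustrative) computes the relevant conditional probabilities:

```python
from itertools import product
from fractions import Fraction

worlds = list(product([True, False], repeat=2))  # (Mary passed, John passed), equiprobable

def P(prop, given=lambda w: True):
    favorable = [w for w in worlds if given(w)]
    return Fraction(sum(1 for w in favorable if prop(w)), len(favorable))

e    = lambda w: w[0] or not w[1]   # 'Mary passed or John failed'
h    = lambda w: w[0] and w[1]      # 'Mary and John both passed'
john = lambda w: w[1]               # content part 'John passed'

print(P(h, e), '>', P(h))        # 1/3 > 1/4: BCT counts e as confirming h
print(P(john, e), '<', P(john))  # 1/3 < 1/2: e lowers a content part, so BCT1 denies real confirmation
```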
9 BCT and Irrelevance
If, as argued above, standard Bayesian confirmation theory, by backing BCT, gets confirmation wrong, then one would suspect that it also gets related notions such as irrelevance wrong. The standard Bayesian account of irrelevance is

IR e is irrelevant to h iff P(h/e) = P(h).

Now, given a plausible measure function, where e is 'Sydney has a harbor bridge' and h is 'Die A came up odd and die B came up even', IR renders the sound conclusion that here e is irrelevant to h.7 But let e′ be 'Die A came up 1 and B came up 1, 3, 5 or 6'. Here the evidence, contra IR, is not irrelevant to h. At least, if this counts as irrelevant evidence then irrelevant evidence is not the type of thing we can safely ignore. Indeed, J. M. Keynes himself says that we must regard evidence as relevant, part of which is favorable and part unfavorable, even if, taken as a whole, it leaves the probability unchanged (Keynes, 1929, p. 72).

7 This holds for Carnap's preferred measure function c∗ – cf. Carnap (1962), pp. 562–564. Presumably any measure function which did not produce the result that P(Die A came up odd and die B came up even/Sydney has a harbor bridge) = P(Die A came up odd and die B came up even) would be, prima facie, extremely implausible.
A better account of irrelevance demands that for e to be really irrelevant to h, every content part of e must be probabilistically irrelevant to h. More formally,

RIR e is really irrelevant to h iff for any content part e′ of e, P(h/e′) = P(h).

In this case 'Die A came up 1 and B came up 1, 3, 5 or 6' is not really irrelevant to 'Die A came up odd and die B came up even'. It is not really irrelevant since clearly its content part 'Die A came up 1' is not probabilistically irrelevant to 'Die A came up odd and die B came up even'. On the other hand, 'Sydney has a harbor bridge' is really irrelevant to 'Die A came up odd and die B came up even'.
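The dice example can likewise be computed directly. A small Python sketch (mine, for illustration) over the 36 equiprobable outcomes confirms that IR classifies e′ as irrelevant while e′'s content part raises the probability of h:

```python
from itertools import product
from fractions import Fraction

rolls = list(product(range(1, 7), repeat=2))  # (die A, die B), 36 equiprobable outcomes

def P(prop, given=lambda d: True):
    favorable = [d for d in rolls if given(d)]
    return Fraction(sum(1 for d in favorable if prop(d)), len(favorable))

h      = lambda d: d[0] % 2 == 1 and d[1] % 2 == 0     # 'Die A came up odd and die B came up even'
eprime = lambda d: d[0] == 1 and d[1] in (1, 3, 5, 6)  # e': 'A came up 1 and B came up 1, 3, 5 or 6'
part   = lambda d: d[0] == 1                           # content part of e': 'A came up 1'

print(P(h), P(h, eprime))  # 1/4 and 1/4: IR classifies e' as irrelevant to h
print(P(h, part))          # 1/2: yet e''s content part is plainly relevant, as RIR requires
```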
10 A Coda on Carnap
Both our definition of real confirmation, BCT1, and our definition of real irrelevance, RIR, involve strengthening the Carnapian notions of confirmation, BCT, and irrelevance, IR. Interestingly, Carnap, in his Logical Foundations of Probability, seems aware of the need for stronger notions. Thus, regarding irrelevance, he suggested the following definition:

CIR e is completely irrelevant to h iff for any e′ such that e ⊢ e′, P(h/e′) = P(h). (Carnap, 1962, p. 415, with some inconsequential alterations of symbols and elimination of redundancies)

Consider the difference between CIR and our RIR. For a strong notion of irrelevance RIR demands that every content part of e be probabilistically irrelevant to h. For a strong notion of irrelevance CIR demands that every consequence of e be probabilistically irrelevant to h. Recall that for Carnap, unlike us, every consequence of a sentence counts as part of its content. So Carnap, unlike us, would not even recognize any difference between RIR and CIR. The problem with CIR is that it gives us an empty notion of irrelevance. There is always some consequence of e, namely 'e ∨ h', such that that consequence is probabilistically relevant to h (provided that P(e ∨ h) < 1 and P(h) > 0). Of course, by our notion of content 'e ∨ h' does not count as a content part of e, so on that conception of content CIR and RIR are vastly different. To get an adequate notion of irrelevance we need to be careful about how we decompose statements into parts.
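That 'e ∨ h' is always probabilistically relevant to h can be verified in one line; the following derivation is added here for convenience (it is immediate from the probability axioms and not in the original text). Since h entails e ∨ h, we have P(h & (e ∨ h)) = P(h), hence

$$P(h / e \lor h) \;=\; \frac{P(h \land (e \lor h))}{P(e \lor h)} \;=\; \frac{P(h)}{P(e \lor h)} \;>\; P(h),$$

provided P(e ∨ h) < 1 and P(h) > 0.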
11 Conclusion
Relations of confirmation and irrelevance cannot be properly explicated till we have a good grip on what the genuine parts of theories and hypotheses are. Once we settle the question of the composition of theories and hypotheses we can then give a surprisingly workable, albeit too restrictive, account of hypothetico-deductive confirmation, namely H-D1. This new version of hypothetico-deductive confirmation, unlike its predecessors, allows for selective confirmation and is compatible with the claim that confirmation is transmittable across the contents of a hypothesis. The main problem with this new version is that it, like previous versions, does not allow for confirmation in the absence of a deductive entailment from hypothesis to evidence.8 And as for Bayesian confirmation, when expressed in terms of an adequate account of genuine parts, it wins the prize.

8 Also, though less crucially for cases of confirmation of scientific hypotheses, it does not allow for confirmation where the evidence entails the hypothesis without itself being entailed by the hypothesis.

References

Carnap, R. (1962). Logical foundations of probability (2nd ed.). Chicago: The University of Chicago Press.
Fitelson, B. (2002). Putting the irrelevance back into the problem of irrelevant conjunction. Philosophy of Science, 69, 611–622.
Gemes, K. (1993). Hypothetico-deductivism, content, and the natural axiomatization of theories. Philosophy of Science, 60, 477–487.
Gemes, K. (1994a). A new theory of content I: Basic content. Journal of Philosophical Logic, 23, 596–620.
Gemes, K. (1994b). Explanation, unification, & content. Noûs, 28, 225–240.
Gemes, K. (1997). A new theory of content II: Model theory and some alternatives. Journal of Philosophical Logic, 26, 449–476.
Gemes, K. (1998a). Hypothetico-deductivism: The current state of play; the criterion of empirical significance endgame. Erkenntnis, 49, 1–20.
Gemes, K. (1998b). Logical content & empirical significance. In P. Weingartner, G. Schurz, & G. Dorn (Eds.), The role of pragmatics in contemporary philosophy. Vienna: Hölder-Pichler-Tempsky.
Hempel, C. (1965). Aspects of scientific explanation & other essays in the philosophy of science. New York: The Free Press.
Keynes, J. (1929). A treatise on probability (2nd ed.). London: Macmillan.
Maher, P. (2004). Bayesianism and irrelevant conjunction. Philosophy of Science, 71, 515–520.
Levels of Perceptual Content and Visual Images. Conceptual, Compositional, or Not?

Verena Gottschling

The analogy between visual perception and mental visual imagery is well established. What is less clear is what kind of perceptual representations images should be identified with and what kind of content – if any – they have. That issue is the focus of this paper. My concern is therefore the imagery debate and how the analogy of perception and imagery relates to the debate about the representational content of perceptual representations. The specific question I am interested in is: what is the best answer for a proponent of mental pictures? What are the minimal constraints for a promising account?

Address for correspondence: Philosophisches Seminar, Jakob-Welder-Weg 18, Johannes-Gutenberg Universität, D-55099 Mainz, Germany. E-mail: [email protected].
1 Imagery and Visual Perception: The Analogy
The topic of the imagery debate is whether it is necessary, or at least highly probable, that besides symbolic, 'language-like' or descriptive representations there is another kind of mental representation: pictorial, 'picture-like' or depictive representations (I will use these terms interchangeably) – 'images' for short. It is important to distinguish between two kinds of imagery theories: flat theories and hierarchical theories. In flat theories the pictorial representations are assumed to be on the same level as the descriptive or symbolic representations. That raises the question of how the two formats interact. There could be a third mediating format that would also be in need of further clarification, or else they could interact directly, a situation that would also have to be clarified. Another option is to claim that one format – typically the pictorial – is subordinated to the other one. The last type of theory is widely regarded as the more promising and will therefore be the focus of my discussion here. The leading researcher who defends a depictive theory about imagery and posits mental pictorial representations – images – is the psychologist Stephen M. Kosslyn. Even philosophers admit that his "empirical version of the pictorial view [...] seems much more
promising than any of its philosophical predecessors". (Tye, 1993, p. 357) Note that Kosslyn's theory is a hierarchical theory: images are considered to fulfill full representational functions only when an 'interpretive function' is applied to them. The question we deal with here is whether that should be understood as a statement about their content. Do images have content, and what kind of content do they have? In addition, a central part of Kosslyn's theory is the combination of the analogies of internal and external pictures and the perception and imagery analogy.1 For Kosslyn mental visual imagery involves the activation of information processing mechanisms at all levels of the visual system. Images are patterns of activation in a medium of visual buffers having the properties of a coordinate space. He states that imagery occurs in a functional buffer, which is also used in the early vision system and uses the same kinds of representations. The central idea in Kosslyn's theory is that while perception works by bottom-up activation, in imagery there is top-down activation from information in memory. Visual mental images are activations in the visual buffer which are the result of these top-down activations, i.e. they are not caused by immediate sensory input. Thus they are one form of short-term memory representations. A central feature of the visual buffer is the attention window. Its function is to select the configuration of activity in one region of the buffer for further processing, i.e. analyses such as would be carried out in perception. (Kosslyn, 1994, p. 76f.) Once a pattern of activity is evoked in the visual buffer, it is processed the same way regardless of whether the activation was invoked by sensory input or from memory. This further processing includes analyses in both the dorsal and the ventral system. Obviously, the analogy of perception and imagery plays an important role here. Imagery is often even understood as seeing in the absence of the appropriate input. (Kosslyn, 1994, p. 74) The analogy of perception and imagery suggests that both capacities share representation formats, functional organization, and brain regions.

I have argued throughout this book that imaged objects are interpreted using exactly the same mechanisms that are used to interpret objects during perception. Once a configuration of activity exists in the visual buffer, input is sent to the ventral and dorsal system and is processed in the usual ways – regardless of whether the activity arose from immediate input from the eyes or from information stored in memory. This notion lies at the heart of the similarities found between imagery and perception. (Kosslyn, 1994, p. 336, italics VG)

But there is more than one kind of perceptual representation in visual perception. The obvious question, then, is with which kind of perceptual representations images should be identified. If imagery is seeing in the absence of an external stimulus (the way I introduced it), it is tempting to suspect that the level where consciousness is located in vision is also the level where it is to be found in imagery. There are two main approaches in the running: visual consciousness is seen to be located either in early or in intermediate vision. Most participants in the imagery debate understand pictorialism as stating one or the other. But in fact recent empirical research suggests that conscious perceptual images are to be identified with intermediate-level perceptual representations. (Kosslyn/Thompson, 2003; Kosslyn/Thompson/Ganis, 2002) The accounts and views I discuss here make allowance for that. The relevant question here is how strictly we should understand this analogy; are perceptual representations and images only 'alike', or are they the same kind of states, and if so, do they have to have the same sort of content?

One central problem in the imagery debate is that the central term – image – is used ambiguously. The term 'image' was introduced as a conjunction of two components. One component is the short-term representation, which is pictorial or descriptive in a literal sense. The medium of these representations is the visual buffer. The second component of an image is the information in long-term memory, which is necessary to generate the short-term representation. Later in the debate 'image' came to be understood only as the alleged perceptual and depictive component. A mental image is then understood as a (perhaps functional) array, to which descriptive labels are appended.

1 Nonetheless, the issue whether something is a picture and the question whether images are like perceptual representations have to be distinguished. The reason is that it is possible that in visual imagery we use representations similar to those used in visual perception, but these representations are not pictorial at all. Nor is the issue whether imagery and perception share processing mechanisms, because these shared mechanisms do not have to rely on depictive representations. But in fact there seems to be empirical evidence that we have structure isomorphism in at least some perceptual representations.
2 The Options
A standard view regarding the content of visual experiences is that the content of an experience captures the way the world perceptually seems to the subject of the experience. (Byrne, 2002, p. 4) When Joe looks at a green circle in daylight, it seems to Joe that there is a green circle in front of him. If there is in fact a green circle in front of Joe, then the content of the experience is true; if there is no green circle in front of Joe, it is false. It seems the content of Joe's experience concerns at least the color and shape of the object in front of him. What is controversial is whether it specifies the object before him as a watermelon, for this seems to require the concept of a watermelon. Thus,
there is a difference between categorizing simple forms and colors on the one hand and categorizing percepts under more complex concepts (watermelon, car, toy) on the other hand. While the first might be due to mechanisms in early or intermediate vision, the second seems to require categorical information (like 'a fruit', 'a vehicle', 'something to play with', etc.). Let us assume that Joe does not perceive but deliberately produces an image of a green watermelon (or a green circle) in front of him. Is the content of his experience identical to the corresponding perceptual experience? Or is it sufficiently similar? If so, how should we grasp the difference? According to the analogy of perception and imagery both should be identical; therefore, they should have the same sort of content, perhaps even the same content. Thus, proponents of pictorialism seem to have several options:

(NoCont) As hierarchical pictorialists they can argue that images themselves do not even have fixed content; only the interpretations/descriptions that accompany them do. For that reason the question whether this content is conceptual or not does not even arise. The representational burden thus lies on the other component.

(NoCo) They can state that images are paradigm cases of nonconceptual contents. Information yielded by the perceptual systems has been argued to be nonconceptual. Perceptual states are supposed to be exemplary representational domains where we find content that is nonconceptual. It seems consistent to regard images, which are perceptual contents recalled from long-term memory, as having nonconceptual contents too. Thus the basic idea is that because perception and imagery are so similar, images, like perceptual representations, have nonconceptual content. This view comes in two flavors: either the content in question is completely nonconceptual or only partly nonconceptual.

(CoCo) They can maintain that images are themselves conceptual and therefore compositional. They are recalled percepts and as such necessarily have an entirely conceptual content.
I shall not give much consideration to the NoCont option: if images have no content, they can no longer have an important role in cognition. All they would accomplish is some additional function of low value – in effect we end up with epiphenomenalism, and images would have less and less of an explanatory role. Pictorialism of this kind does not seem worth spending too much time on. In addition, it is relatively undisputed that perceptual states have representational content – the controversial issue is whether this content is conceptual or not. Therefore, the
analogy of perception and imagery would be weakened to a large extent; there seems little reason to embark on this strategy. But there are still two remaining options.
3 Nonconceptualism, Contents and Compositionality
How we judge the options NoCo and CoCo obviously depends on our characterization of the central terms 'nonconceptual content' vs. 'conceptual content'. Therefore, some preliminary remarks are in order. To begin with, my considerations do not entail a specific theory about concepts.2 Nonconceptualism is typically understood as the claim that one can have an experience with representational content R without possessing any of the concepts which figure in a proper description of this content R.3 A more moderate version states that one can have an experience with representational content R without possessing all of the concepts which figure in a proper description of R. As we have seen, there are three possible views regarding nonconceptual contents of visual and imaginary experiences:

(CoCo) The representational content of experiences is entirely conceptual.

(EnNoCo) The representational content of experience is entirely nonconceptual.

(PaNoCo) The representational content of experience has both nonconceptual and conceptual properties; it is partly nonconceptual.
The relationship between conceptuality, compositionality and meaning is rendered variously in the literature. One reason for this is that there is a variety of characterizations on offer for what makes conceptual contents conceptual. Often mentioned are four necessary conditions for conceptual content (cf. Gunther, 2003, pp. 8–14, for the following): first, complex conceptual contents are compositional; they are functionally determined by their constituents, which are concepts. Second, conceptual contents are cognitively significant; in the moderate form this condition requires that the content R determines that the subject S has the corresponding belief that R.4 This criterion gives us the basis for distinguishing different concepts. Third, conceptual contents are individuated independently of their force; they are force independent. Fourth, the contents determine the reference. If a content is conceptual, the subject has knowledge of its referent and is able to identify, classify and/or recognize the referent of that content. Consequently, nonconceptual contents are contents for which at least one of the mentioned four conditions is not fulfilled.5

In addition, there are different ways to understand the force of 'non'. It can be understood as the claim that:

a) The contents cannot be rendered conceptually. The individual cannot have the concepts required to grasp the content of the states.
b) The contents can be represented conceptually but the concepts in question are not in fact grasped by the individual. The concepts can be grasped, but the subject does not have the concepts required to grasp them.
c) The subject might even have the relevant concepts but cannot apply the concepts to the states.
d) The subject might even have the relevant concepts but does not apply the concepts to the states.

My focus here has to be the weakest understanding of the 'non': the subject does not apply the concepts to the states – in the case of imagery, after generation of the image. If imagery is activation of spatial representations in short-term memory from information in long-term memory, the subject necessarily has the concepts relevant for the generation and is in principle able to exercise these concepts on the states. There is one other reading of nonconceptualism on the market, which I rule out from the beginning, namely the question whether concepts determine the representational content of experience partly or even entirely. We should distinguish between the claim that the representational content of the state is itself nonconceptual and the claim that concepts determine the representational content of the state. The second reading would exclude nonconceptual contents of images from the outset, because in imagery we have reactivation from conceptual information.

To summarize, the content of an image I with content R is entirely nonconceptual (EnNoCo) iff someone who is in I need not possess any of the concepts that characterize R. (cf. Crane, 1992, p. 143) Correspondingly, the content of an image I with content R is partly nonconceptual (PaNoCo) iff someone who is in I need not possess all of the concepts that characterize R. And finally, the content of an image I with content R is entirely conceptual (CoCo) iff someone who is in I has to possess all of the concepts that characterize R.

2 My considerations are restricted to visual mental imagery and visual perception, i.e. to perceptual contents. Therefore, I am only concerned with a subclass of concepts: concepts which are sensory and can be represented visually, like shapes, colors or combinations of these.
3 Cf. for example Wright, 2003, p. 40.
4 For a more detailed description, see Gunther, 2003, p. 10f.
5 At least in case this list is exhaustive. Consequently, if the content of images turns out to be nonconceptual, this does not imply that it is not compositional. For all we demand is that at least one of the mentioned four necessary conditions is not fulfilled.
Another distinction, introduced by Heck (2000), turns out to be important for our question as well. According to the state view, perceptual states are a different sort of state from conceptual ones like beliefs. These states are concept-independent or nonconceptual. The contents of these states deserve to be dubbed 'nonconceptual' contents because they are contents of nonconceptual states, not because the content itself is nonconceptual. Visual experiences are understood as nonconceptual states having conceptual contents (which are labeled 'nonconceptual' because they are the contents of nonconceptual states). In contrast, according to the content view, the contents of the states are themselves nonconceptual. The state view is less interesting to us than the content view, for we are interested in the content of images.
4 Images as PANICs: May Images Have Entirely Nonconceptual Content?
The debate on nonconceptualism is dominated by the discussion of the contents of visual experiences. The richness of the content of visual experiences, its fineness of grain, and the specificity of this content tempted many to claim that the content of visual experiences is nonconceptual. Experiences during visual imagery are similar, so it seems tempting to assume these contents are nonconceptual as well. This would also maintain a strong analogy between visual perception and imagery. Michael Tye explicitly claimed that images are representational states which are nonconceptual. He adopts Gareth Evans's (1982) idea that the initially nonconceptual information yielded by the perceptual system becomes conscious when it serves as input to a thinking, concept-applying, and reasoning system. In Tye's terminology, perceptual representations as well as images are cases of PANIC (states with 'poised abstract nonconceptual intentional content'). I will thus concentrate on his account and discuss the question in which sense PANICs are nonconceptual and what role being nonconceptual plays in the account. Tye's ambitious account is reductive in that phenomenal content is reduced to representational content. In addition, he argues for the content view; insofar as the contents of visual experiences are nonconceptual, the states are also nonconceptual. Beliefs, thoughts etc. have conceptual content; perceptions, images, emotions and others have content of another kind – 'poised', 'abstract' and 'nonconceptual' content. (Tye, forthcoming, sect. 1) Here 'IC' simply stands for 'intentional content', which is roughly representational content. Thus the explanatory burden lies on the three other characteristics of PANIC states: being poised, being abstract and being nonconceptual. By being 'poised' Tye means that these states stand ready and available to make a direct impact on the conceptual system and beliefs/desires. Being 'abstract' means being object-independent (like hallucinations), and finally we have being 'nonconceptual'.6 The core idea of Tye's PANIC theory of phenomenal consciousness in a nutshell is that "phenomenal character is one and the same as Poised, Abstract, Nonconceptual, Intentional Content". (Tye, 2000, p. 63; cf. 1995, p. 137) Intentional states lack phenomenal character precisely when they are not poised or do not have abstract or nonconceptual content. Consequently, beliefs lack phenomenal character because they lack nonconceptual content.7 Beliefs are unconscious states, and their manifestation, 'thoughts', are not conscious either. Whenever phenomenal character goes with thought it attaches to associated images – mainly linguistic auditory images. Images as well as perceptual states are a subset of PANICs; they are states located at the interface of conceptual and nonconceptual domains – on the nonconceptual side. (cf. Tye, 2002, p. 33) Like perceptual sensations they feed into the conceptual system without being part of that system, i.e. they are poised. (Tye, 1995, p. 104) And finally, these representations have abstract contents because their contents are not object-dependent. In other words, these are the characteristics that unify PANIC states – and thereby, amongst other states, images and perceptual states.

In my view all of these characteristics are problematic, especially since their linking is disputable. Let me start with being nonconceptual. The claim that the contents relevant to phenomenal character must be nonconceptual is to be understood as saying that the general features entering into these contents need not be ones for which their subjects possess matching concepts. (Tye, 1995, p. 139, italics VG) Thus, being nonconceptual is characterized by the subjects' not needing to possess the relevant concepts, although they may do so. And, as we have seen, this holds for all concepts relevant to the content. What does it mean to possess a concept? The standard view is that the subject needs knowledge of the referent of the concept and is able to identify, classify and/or recognize the referent of that content. In other words, the subject is able to exercise this concept and to categorize things under it: to possess the concept 'cow' means to be able to distinguish cows from non-cows and to have at least some knowledge about cows, for example that they are animals; to have the concept 'red' means to be able to distinguish red things from things that are not red and to know that red is a color. Tye has a different view. For him, categorization might be possible without implying that the subject has the concept in question. Rather, to possess the concept means, first, to have a stored memory representation and, second, to bring it to bear in an appropriate manner (by, for example, activating the representation and applying it to the sensory input). (Tye, 1995, p. 139; cf. 2000, pp. 61–2) Therefore, his view seems to be that concept possession "requires the ability to represent in thought and belief that something falls under the concept." (Tye, 1995, p. 108)8 We said that a mental state S with content P is nonconceptual iff someone who is in S need not possess any of the concepts that characterize P. This implies that we might have the ability to represent in thought and belief that something falls under the concept (the first condition is fulfilled), but the content P of that state S in question is nonetheless nonconceptual – because we do not need to have this ability in order to be in that state.

Furthermore, Tye not only concedes that conceptualization may happen but also that conceptualization can causally influence phenomenal character. For example, the way we conceive an ambiguous figure can influence how we break it up cognitively into spatial parts. The shapes we experience may differ under a different conceptualization. However, the contents are nonconceptual because "at the most basic level the experience of shapes does not require concepts". (Tye, 2002, p. 60f.) It is not easy to understand what exactly that means. It might be interpreted as meaning that basic shape experiences do not require the corresponding shape concepts. Or it might mean that the shape experience is only nonconceptual at one level, but conceptual on another. What is 'the most basic level of experience'? In the case of visual perception Tye is sympathetic to the second interpretation: "If I see a picture as a duck, then my visual state has a conceptual content, but it doesn't follow that it lacks any nonconceptual content. There are, it seems to me, many layers of perceptual content; and the possession by a perceptual state of one of these layers does not preclude it from having others." (Tye, 2002, p. 32) Since perceptual states are PANIC states, this implies that in addition to their nonconceptual content PANIC states may have a conceptual content as well! Thus, Tye states only that one layer of content has to be entirely nonconceptual, not that the content is entirely nonconceptual. This implies that we have only a PaNoCo view, in contrast to what Tye himself claimed. What unifies all PANIC states is that only one layer of their content is nonconceptual. Furthermore, this suggests that the experience might not give us any evidence whether the content P of a state is nonconceptual or conceptual. For I typically experience an ambiguous figure like the duck/rabbit as having a conceptual content, not as having a nonconceptual content. In fact, Tye understands the content of visual experience as operating not only on one level but on a number of levels. (Tye, 2000, p. 74) Hence, it is conceded that seeing as something is part and parcel of our visual experiences. We actually experience a green watermelon. But the content of this experience is nonetheless legitimately to be dubbed 'nonconceptual'. For part of the content of our experience, one basic level, is nonconceptual.

A similar problem arises with being abstract. This condition is introduced for imaginary states. In the case of veridical perception it is not clear that the content is object-independent. Tye claims analogously that even if we assume that veridical states have non-abstract contents, they could have abstract contents in addition – again, there are different layers of content. What unifies all PANIC states is again only that one layer is abstract. First, I do not think I fully understand what Tye means by different layers.9 Secondly, for a unified account he needs to argue (in my understanding) for an entirely nonconceptual, abstract and poised content: we need an explanation of what unifies the class of PANIC states.

Unfortunately, the notion of being poised shows no more promise either. Tye distinguishes the sensory systems and the cognitive/conceptual system; the reason is that he understands sensory systems as modular in Fodor's sense. He holds that states in the sensory systems cannot play a role in central cognitive capacities. And because states in the sensory systems are nonconceptual (for they have nonconceptual content) they cannot be part of the cognitive system, which is necessarily conceptual. In other words, he is an advocate of a strict division. His solution suggests that PANIC states are suitable simply for being the interface to the conceptual system, because they are defined that way, as being poised. They are neither fish nor fowl. The problem I see is that such different states as all experiences and emotions are PANICs in Tye's view. What constitutes their unity is merely being poised, abstract and nonconceptual. As we have seen, being nonconceptual and being abstract are both problematic as unifying criteria. Thus, we are left with being poised. Tye states that visual experience has a poised content as long as it is "apt for the production of the right beliefs in the right way". (Tye, 2002, p. 30) As Alex Byrne pointed out (cf. Byrne, 2002, p. 11f.), the poisedness requirement is formulated quite weakly. There is no constraint on the contents of the beliefs or desires that a poised state stands "ready and available" (Tye, 2000, p. 62; 2002, p. 30) to cause. But a characterization in terms of standard results under optimal conditions, in other words a purely dispositional characterization, is clearly unsatisfying. Therefore, Tye owes us an explanation of what justifies treating the states he labels PANICs as members of one class. In particular, he has given us no reason why all PANIC states have to have nonconceptual content. And as we have seen, even regarding perceptual experiences, his view is more accurately subsumed under PaNoCo.

6 For more information about the extensive debate about whether experiences are nonconceptual states having robustly or coarse-grained nonconceptual contents versus fine-grained nonconceptual contents, see Tye 2002b, Tye forthcoming.
7 But they might be abstract and poised.
8 In fact, Tye does not officially adopt this view, but seems sympathetic to it; see also Byrne, 2002, p. 14.
9 In a footnote he refers to Peacocke's (1992) scenario contents, protopropositional contents and conceptual contents. It seems problematic to use this distinction and at the same time argue for EnNoCo.
At best, all PANIC states have a nonconceptual layer of content. According to Tye, to possess a perceptual concept is to have a stored memory representation acquired through the senses that is available for retrieval. (cf. Tye, 2000, p. 37, p. 176) Reactivations of these concepts, images, are PANICs simply because PANICs are defined accordingly. Phenomenal concepts in his account are defined merely as something that disposes the subjects to generate images. These, in turn, are defined as being nonconceptual, for they have phenomenal character and are not stored in memory. Conceptual content is assumed to lack phenomenal character; therefore, images cannot be conceptual. Thus the experience is nonconceptual, and the phenomenal concepts are characterized as dispositions. It is obvious that such an account does not explain why images are nonconceptual; phenomenal concepts as well as natural kind concepts are defined in a way that excludes images and perceptual representations, at least as long as we accept that both of these are poised, abstract and nonconceptual. – Here, it seems to me, lies the main tension: on the one hand we want to keep the analogy between perception and imagery, and thus a unified account. If we are sympathetic to the thesis of nonconceptual contents of perceptual states, we are forced to concede that images are nonconceptual as well. On the other hand, imagery is supposed to be a cognitive capacity we use during learning, in reasoning and to make implicit information available. An account that divides states into the cognitive and the noncognitive according to whether or not they are conceptual invites trouble. For it compels us to interpret the states in question as neither part of the cognitive nor of the sensory system but as interfaces to the cognitive system. Without further arguments, this seems unsatisfying. And an argument why images might be treated as having nonconceptual content is still lacking, in particular because an advocate of this kind of account needs to concede that we indeed see a figure as a watermelon and not simply as a green circle. Perceptual experience is rich, and conceptualization plays an important role. Therefore, I do not think that Tye's account of NoCo is the best answer we could give.
5 NoCo Revised
It thus seems that CoCo remains the only option. This option would also give images an important role in cognitive processes. But it is problematic for reasons discussed sufficiently in the literature: introspectively, images seem to be particular instances of categories or concepts. They cannot abstract away from perceptual differences like concepts do. In addition they seem to be restricted to a particular point of view. Furthermore, if images have conceptual content, we need explanations of how the necessary constraints are fulfilled. In particular, compositionality would be a necessary constraint. Thus we need a theory about
the mechanisms used to build up the composed complex contents of images. But we are not committed to the view that the elements in images have to be mapped onto the elements of the composed meaning of the concepts. If you combine blue carnivorous insect, then blue, carnivorous and insect are the concepts involved; nonetheless they do not match the relevant visual properties of the image. We could try differentiating components of images and constituents of the content of images. But even then we should have to explain how the content constituents are combined to form a new 'whole', how the content is to be represented in images and how the components of images are to be combined. If constituents and components coincide, we would have a shared structure, that is, a structurally compositional representation. If they do not coincide, it does not follow that they cannot preserve content structure. There is another possibility. There might be complex representations (images) that still preserve content structure without sharing it and thereby ensure compositionality. For there might exist a recoverability function for such complex contents. Complex images do not share structure with their contents, but the existence of recoverability functions for such complex contents entails that such contents nonetheless preserve structure implicitly.10 In case the recovery function could be shown to be systematic, we could ensure systematicity and productivity. We would ensure functional compositionality. (cf. Cummins et al., 2001) This would provide us with a general productive algorithm for mapping the complex representations to their contents or contained concepts. But to work out a proposal along these lines would go beyond the scope of this paper. – Furthermore, we would have to commit ourselves to a view about the minimal constituents (basic concepts) of images. In other words, our theory about concepts determines our answers. That would certainly be too ambitious for one paper. The plan announced for this paper was rather to develop minimal constraints for a pictorialist view. The intermediate result was that EnNoCo is implausible for images. But we still have to deal with the question whether NoCo is justifiable. I would like to end by proposing not answers but some signposts pointing in the right direction, by arguing for a position that might be understood as intermediate between options NoCo and CoCo.

One might understand Tye's basic intuition as the consideration that even if you can show that concepts in fact correspond to features presented in one's experience, this is not sufficient for a conceptualist position. An advocate of full-blown conceptualism has to show that in perception as well as in imagery one could not have that particular experience without possessing the relevant concepts. In other words, it is a systematic option to state that images have a partly nonconceptual content but nonetheless are strongly bound to conceptual contents. In trying to trace this intuition, I want to draw attention to another systematic option we lost sight of, a view between options EnNoCo and CoCo, a view that mediates the aforementioned tension. – The special status of images as activations from recalled and generalized perceptual information forces us to concede that conceptual information plays an essential role in imagery. Images are always accompanied by conceptual contents. But that has to be systematically distinguished from the statement that images themselves have a conceptual content.

Support for this comes from empirical research. Recent research strongly suggests that images differ from perceptual states in regard to our ability to reinterpret them. This suggests that an image is not a raw display in a buffer having spatial, geometric properties, as a widespread reading of the perception analogy seems to suggest. In fact, this reading of the analogy is based on a misunderstanding. Images are rather internally structured, pre-organized and conceptually bound. If images were raw unstructured entities, it should be possible to find new interpretations of imagined figures. But in the case of ambiguous figures like the duck/rabbit, there is an important difference between imagined and perceived figures.11 In the case of imagery it is difficult, often even impossible, to find a new interpretation of an ambiguous figure. It is controversial what exactly these findings imply, but most participants agree that the aforementioned difficulty has to do both with the complexity of the figures and with the fact that in many ambiguous figures figure/ground, part/whole interpretation, up/down and so on change. Chambers and Reisberg's own conclusion was that images contain a specification of properties such as organization and figure/ground organization – a 'reference frame'. Because this reference frame changes in ambiguous figures, they are difficult to reinterpret. This is congruent with the identification of images with intermediate perceptual representations. But it shows something else: the necessary shift of focus in ambiguous figures is assumed to be mediated by high-level vision. The stored category plays a causal role in the selection of information contained in an image and for attention mechanisms. An image has to contain, or at least is strongly bound to, a descriptive element, a specification of properties such as orientation and figure/ground organization. In imagery processes we necessarily have a descriptive element. Intermediate representations are essentially involved, but mediation from high-level processes plays a role as well. High-level vision may not be the main correlate of images, but it is necessarily involved in imagery.12 The whole idea of interpreting and transforming images takes that for granted. This is not to say that intermediate representations alone constitute imagery or the experience of imagery. In fact, we need activation at the intermediate and high levels. Images are not raw displays – as intermediate representations they are pre-organized. Thus, we should not identify the correlates of images as intermediate representations alone: images incorporating organized units provide us with the essential hint: the information has to be adapted by information contained in high-level vision, and attention mechanisms must play an important role. In the meantime there is good empirical support for holding that there is little in visual scenes that is encoded directly. Therefore we have to attend to the items in question explicitly. (Henderson/Hollingworth, 1999; O'Regan et al., 2000; O'Regan/Noë, 2001) Thus, image generation and transformation processes rely on high-level mechanisms. High-level visual processing relies on previously stored information about visual properties. High-level processes play an essential role in imagery. This is consonant both with Kosslyn's hierarchical theory and with the function of the attention window. (cf. Kosslyn, 1994, p. 53) Images are subordinated to descriptive representations. Indeed, there is substantial empirical evidence to indicate that some high-level processes influence behaviors that are traditionally considered low-level or intermediate-level. This is in accordance with Kosslyn's imagery account as well.

My intermediate option is effectively close in spirit to Jackendoff's (1987, 2002) considerations of the conceptualized world as divided between the 'cognitive structure' (CS), which is approximately propositional, and the 'spatial structure' (SpS), which is geometric but is nonetheless not restricted to a particular point of view and is more abstract than experienced images and percepts. The spatial structure is part of conceptual knowledge and is to be identified with Marr's 3D sketch.13 In CS aspects like category membership and background knowledge are encoded. In contrast, in SpS knowledge about visual appearance is encoded; the spatial structure specifies the configuration of the object's parts relative to one another. These representations are 3D representations, which are roughly understood as images of a prototypic instance of a category. They are not visual in a strict sense but roughly imagistic. For in contrast to our experience during perception and imagery they are not restricted to an observer's point of view but are more abstract; they must support visual object categorization and identification. Encodings of possible shape variations of objects are to be found here. These representations are rather image schemas. For they are abstract structures from which a variety of different images can be generated and by means of which, in addition, a variety of percepts and images can be compared. We do not simply have a three-dimensional object; we have a complex hierarchy of representations which includes all parts of the object, including hidden parts. Thus we do not simply have shapes of objects but information on how objects can be regarded as assemblages of parts. Jackendoff's spatial structure is part of 'central cognition'. More precisely, it is concerned with judgements and inferences having to do with shapes and locations. According to this view, the image itself is an intermediate representation (in David Marr's terms a 2.5-D sketch) – a visual representation, but strongly bound to knowledge about visual appearances, knowledge that is encoded in SpS.

We have not yet arrived at the view I prefer: consider the mental rotation of an imagined object like my laptop. When the laptop is notionally rotated, new portions of the imagined surface of the object come into view (such as the sockets at the rear), while other parts disappear (such as the keyboard). The experienced image has the characteristics of a 2.5-D sketch, but in order to create an image with new properties, one must also have a constant encoding of the imagined object that is independent of viewpoint and incorporates all its parts. For that reason we need a constant encoding of information contained in the 3D sketch (in Jackendoff's terminology the spatial structure) to produce and maintain the 2.5-D sketch. It seems reasonable to take up the initial characterization we started with: images have two components, a short-term representation, which is 'quasi-pictorial', and the descriptive information in long-term memory, which is used to generate the short-term memory representation. Therefore it would be hasty to identify one component in the theory, the activation in the buffer, with the required image while neglecting its embedding in the theory as a whole.14 So intermediate perceptual representations are necessary for imagery. This is not to say that intermediate representations alone constitute imagery or the experience of imagery. In fact, we need activation at both the intermediate and the higher levels of spatial structure.15

My intermediate option has some advantages. First, we have a clear demarcation between conceptual and nonconceptual aspects, even if we allow ourselves the luxury of blurring the boundary between conception and perception. Second, we are not committed to equipping images with different nonconceptual and conceptual layers of content, the role of which in determining experience is unclear. Instead they are themselves nonconceptual but pre-organized and structured, as well as necessarily accompanied by conceptual information. Third, the perception-imagery analogy is maintained, although image schemas are part of the conceptual system. Fourth, we get an elegant explanation for why images are in some cases more strongly bound to conceptual information than perceptual representations.

Some considerations from Pacherie (2000) fit with this view. Taking up Dretske's distinction between non-epistemic seeing (simple perception), which does not depend on higher-level cognitive and conceptual abilities and is devoid of cognitive content, and epistemic seeing (cognitive perception), which does, Pacherie recently argued that we need a third intermediate level of perceptual content that is internally structured but nonetheless nonconceptual. This description of levels seems appropriate for characterizing the content of mental images and is congenial to my proposal. Images are structured and might be partly nonconceptual. But in addition there is a close connection to conceptual information and structured representation within the cognitive system.

10 Van Gelder (1990) distinguishes between concatenative and functional compositionality. One useful example of a coding scheme that is merely functionally compositional is the Gödel numbering scheme. A crucial feature of this scheme is that it is completely reversible. By using the prime decomposition scheme it is possible to calculate the Gödel numbers of the (primitive) constituents of any complex expression. (cf. van Gelder, 1990, p. 362)
11 See Chambers/Reisberg, 1985; Reisberg/Chambers, 1991; and Finke et al., 1989 as the starting point for an overview.
12 Chambers and Reisberg see the image as containing a reference frame; hence, my claim is more cautious.
13 Jackendoff's spatial structure makes use of Marr's 3D sketch and understands Biederman's geons as an extension. Jackendoff understands this structure as modality independent – in contrast to Marr, who saw it as part of vision.
14 Moreover, the question whether the short-term representation is to be located in primary or secondary cortex is of secondary importance here, because in both cases high-level processes have effects.
15 In fact, this leaves many consciousness theories as open possibilities.
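Footnote 10's Gödel-numbering example of merely functional compositionality can be made concrete. The following Python sketch is mine, not van Gelder's, and the toy vocabulary is invented for illustration: a sequence of primitive constituents is encoded as a single integer that shares no structure with its content, yet prime decomposition recovers the constituents completely.

```python
from itertools import count

VOCAB = ['blue', 'carnivorous', 'insect']        # hypothetical toy vocabulary
CODE = {w: i + 1 for i, w in enumerate(VOCAB)}   # each primitive gets a code >= 1

def first_primes(k):
    """First k primes by trial division (plenty for a toy example)."""
    out = []
    for n in count(2):
        if all(n % p for p in out):
            out.append(n)
            if len(out) == k:
                return out

def encode(constituents):
    # Goedel numbering: the i-th constituent's code becomes the exponent of
    # the i-th prime; the complex is one structureless integer.
    n = 1
    for p, w in zip(first_primes(len(constituents)), constituents):
        n *= p ** CODE[w]
    return n

def decode(n):
    # Prime decomposition recovers every constituent in order: the scheme is
    # completely reversible, so content structure is preserved, but only implicitly.
    out = []
    for p in first_primes(64):  # more primes than any toy expression needs
        if n == 1:
            break
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        out.append(VOCAB[e - 1])
    return out

g = encode(['blue', 'carnivorous', 'insect'])
print(g)          # 2**1 * 3**2 * 5**3 = 2250
print(decode(g))  # ['blue', 'carnivorous', 'insect']
```

The integer 2250 has no parts corresponding to blue, carnivorous or insect, which is why the scheme is not concatenatively compositional; decode is the systematic recoverability function that makes it functionally compositional in van Gelder's sense.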
6 Imagery: Any Arguments in Favor of Nonconceptual Content?
One question remains: are the arguments presented in favor of the view that visual experiences are nonconceptual convincing for visual imagery experiences as well? Let us consider first richness and then fineness of grain.

Richness and imagery

Richness is often understood as the thesis that visual experiences contain more information than the subject is able to extract cognitively, i.e. in beliefs or judgements. Regarding perceptual experience, Heck gives the following example.

Before me now, for example, are arranged various objects with various shapes and colors, of which, it might seem, I have no concept. My desk exhibits a whole host of shades of brown, for which I have no names. The speakers to the sides of my computer are not quite flat, but have curved faces; I could not begin to describe their shape in anything like adequate terms. [...] Yet my experience of these things represents them far more precisely than that, far more distinctively, it would seem, than any characterization I could hope to formulate, for myself or for others, in terms of the concepts I presently possess. The problem is not lack of time, but lack of descriptive resources, that is, lack of the appropriate concepts. (Heck, 2000, pp. 489–490)
Just for the sake of argument, let us accept that the experience described here is nonconceptual, because it is richer than any concepts I possess. Let us try to transfer the examples to imagery. I spend many hours a day at my desk, so I can easily imagine being in that situation. Is the experience in imagery richer than any concepts I possess? I have stored information about the shape of my speakers. If they have an extravagant Colani design, I might need to generate an image of them from simpler shapes, and the image thus generated might even be inadequate in detail. Should we conclude that the experience is nonconceptual? What about the exact shade of brown in the upper right corner, which has a stain due to water spotting? I at least have problems recalling the exact shade of brown. According to the debate, that would mean I lack the concepts brown27 (say) and Colani-speaker. Is the experience in imagery thus nonconceptual? Is the richness argument to be understood as saying that the image carries unexploited content that the subject must exploit in further analysis? This information is not available explicitly in long-term memory but can be exploited by using imagery. Trivially, it must be possible for a system to be able to learn to exploit the information carried by a representation or indicator signal. This is the basic idea of using imagery in reasoning processes or to discover implicit information (cf. Kosslyn, 1994, p. 324). In a way this implies that the information is there prior to its analysis. Unfortunately, this reading does not help. For we are dealing with the question of experience. Should we say that this experience in imagery contains more information than the subject is able to extract cognitively, i.e. that the experience is rich? No, for the richness of visual experience in imagery is very limited, because images rapidly fade and are hard to maintain. That is exactly the reason why we need activation at both the intermediate and higher levels to produce and maintain the representations during imagery.

Fineness of grain and imagery

Advocates of the nonconceptual content of visual experiences mostly argue that fineness of grain and richness both give us reason to assume nonconceptual content. Tye recently argued that richness is not something we can appeal to on behalf of the nonconceptual content of visual experiences.16 (cf. Tye, 2005, sect. 4) In contrast, fineness of grain is supposed to entail having a nonconceptual content. Fineness of grain is typically understood as meaning that perceptual experience contains a determinacy of details that goes beyond the concepts I possess.

16 Furthermore, according to Tye, richness entails neither fineness of grain nor the possession of a nonconceptual content.
Let us assume for a moment this is right for perceptual experience. Would the same argument work for experienced images? No, clearly not. For I cannot generate images for which I have no information in memory. I can only combine information or knowledge: I can imagine a cobalt blue lion hunting a king penguin, even if I never had any such visual experience. Does this experience have fineness of grain? Is the lion represented with a determinacy of detail that goes beyond the concepts I possess? Probably not. For I can only refer to details or information that are there17 prior to their analysis or before I attend to them. To be able to attend to the form of the penguin's orange ear patches presupposes that I have knowledge that king penguins have these characteristic ear patches. Maybe I could transform images in ways that give rise to a new experience with a determinacy of detail going beyond the concepts I possess. Let us assume I have a concept for cobalt blue and I imagine a cobalt blue square. Now I try to add slightly more yellow and thereby change the color. The result would be a new shade of blue for which I possess no concept. You may try it; I at least am not able to do so. People like artists who constantly work with paint might at least have a fleeting image. But they might be expected to have more color concepts as well. This result is not surprising at all. The experience of an image is greatly impoverished compared with the corresponding perceptual experience. For to maintain an image we need concentration and voluntarily directed attention. There is empirical evidence that images fade within an average duration of 250 ms (Kosslyn, 1994, p. 101). Effort is necessary to refresh them. This process of image maintenance involves not only storage processes but also active processes. And we have to refresh them, because otherwise they do not remain long enough to be used in imagery tasks, which normally take at least two seconds. Thus image maintenance is fundamental for all the other processes: in generating more complex images with different parts, and also during transformation and introspection of images. When we refresh an image the stored category plays a causal role. In imagery conceptual information is necessary.

17 In the short-term representation or in long-term memory.

What does this show?18 Experienced images and perceptual experiences may differ regarding richness and fineness of grain. Even if we accept the arguments for perception, they do not work for imagery. Is there any reason why we should postulate that both visual and imagery experiences have to have nonconceptual content? This has been claimed, but I doubt there is any good reason for it. Even if we concede that the argument works for visual experiences (and that is highly controversial), these arguments seem at best problematic in cases of visual imagery. In imagery, conceptual information is necessarily involved and plays a causal role both in the selection of information contained in an image and in attention mechanisms. The only reason to treat imagery and visual perception as parallel is the perception analogy. But this analogy does not force us to say that both the representations and their contents are identical. It only forces us to say that they are sufficiently similar. The correctness conditions for a scene in visual perception and in imagery differ, and the experiences differ as well: we are in most circumstances able to decide whether something is a visual perception or an imagined situation. The content of the perceptual state and the content of the corresponding image are thus not identical but similar. They cannot be identical because, according to Tye, any two visual experiences that are exactly alike in their representational contents are necessarily exactly alike in their phenomenal character. Are we forced to say that the type of both experiences is identical? If this is denied, the perception analogy becomes remarkably diluted. It seems odd to claim that both capacities use the same kinds of representation but do not even have the same kind of content: nonconceptual and conceptual content, respectively. But if we require both of them to have the same kind of states with the same kind of contents, we run into a problem. For images are necessarily accompanied by, and are more strongly bound to, conceptual information than perceptual representations are. The contents of images are necessarily at least partly conceptual. On a closer look, we could suggest that the important difference between the two might even be that images are more strongly restricted than perceptual representations. But this strategy carries a risk. For in that case we need an argument that unifies both kinds of representations and contents. An account in which both kinds of representations are assumed to have nonconceptual contents does not work. For the arguments in favor of nonconceptual content are much more convincing in relation to perceptual experience. In the case of images the examples seem much less convincing. As we have seen, being abstract and poised do not work either. This is far from being a decisive argument that perceptual representations cannot have nonconceptual content. It is rather a conditional argument. First, images are identified with perceptual representations. Second, if we also want to classify both images and perceptual representations as having the same sort of nonconceptual content, we need an argument for both kinds of representations being nonconceptual. Whether perceptual representations are nonconceptual is highly controversial. The arguments in favor of nonconceptualism seem at best hard to transfer to imagery. This is not surprising at all. For it is grounded in the difference between the two capacities that make use of these representations. But it means that there is a gap that needs to be filled even for advocates of PaNoCo.

18 As we have seen, the explanatory power of these thought experiments is limited. For our experience might not give us any evidence whether the content P of a state is nonconceptual or conceptual.
7 Conclusions
Let us take stock. We started with the question of what kind of perceptual representations images should be identified with and what kind of content – if any – they have. The options available seemed to be that they have no content by themselves (NoCo), that they have nonconceptual content, partly or even entirely (PaNoCo or EnNoCo), or that they have conceptual content (CoCo). I argued that the minimal condition for a pictorialist should be a position between (NoCo) and (CoCo). Images have to have partly conceptual content. Thus I argued for a revised version of PaNoCo as a minimal constraint.19 According to this improved partly nonconceptual option, one part of image processes – the stored image schemas – is necessarily conceptual. The experienced images are identified with intermediate perceptual representations, which are structured but might be nonconceptual. The view introduced is a minimal constraint for an explanatory pictorialist view, for my considerations factored out the question of what concepts are. Depending on whether you think that perceptual representations are partly or fully conceptual, we have to concede that the classical reading of the perception analogy is diluted. Images are not raw displays. I argued that the analogy needs to be understood at least as presupposing that both kinds of contents are type-identical. In this respect the problems posed here amount to a serious problem for advocates of nonconceptual content. We are in need of arguments for why images should be nonconceptual, since the arguments used in relation to visual perception do not work in the case of images. Depending on the theory of concepts you favor, the result might even be stronger. Advocates of prototype or even proxytype theories understand images as perceptual representations, which constitute concepts. Concepts are combinations of copies of perceptual representations, whereby we have activation from what I called image schemas to intermediate perceptual representations – the 2.5-D sketches. Thus images and perceptual representations are seen as conceptual, and we have a version of CoCo. Furthermore, we can rule out EnNoCo. Images have to have conceptual content to an important degree. If images and perceptual representations have the same special kind of content, entirely nonconceptual content is the wrong candidate. This does not deserve to be called a positive theory about the content of images, because I leave two possibilities open. But I have sought to broker a reconciliation. On the one hand, an adequate theory of images must have sufficient expressive power to accommodate the role imagery plays as a cognitive capacity. On the other hand, the perception analogy constrains appropriate accounts but is at the same time the source of many misunderstandings. A promising reconciliation demands at least the concession that images as well as perceptual states have (at least) a partly conceptual content. We need to blur the boundary between conception and perception.

19 In fact, the answer depends on the way you characterize images: if you are an advocate of the two-components view, images have partly conceptual content. If not, image processes are nonetheless strongly bound to conceptual knowledge. In both cases, in an important sense images have at least partly conceptual content.

References

Aizawa, K. (1997). The role of the systematicity argument in classicism and connectionism. In S. O'Nuallain (Ed.), Two sciences of mind: Readings in cognitive science and consciousness (pp. 197–218). Amsterdam: John Benjamins.
Anderson, J. A. (2003). Consciousness and nonconceptual content. Philosophical Studies, 113, 261–274.
Barsalou, L. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22(4), 577–660.
Bermudez, J. L. (1994). Peacocke's argument against the autonomy of nonconceptual content. Mind and Language, 9, 402–418.
Bermudez, J. L. (2003). Thinking without words. New York: Oxford University Press.
Bermudez, J. L., & Macpherson, F. (1998). Nonconceptual content and the nature of perceptual experience. Electronic Journal of Analytic Philosophy, 8, special issue on the philosophy of Gareth Evans, edited by Rick Grush.
Biederman, I. (1987). Recognition by components: A theory of human image understanding. Psychological Review, 94, 115–147.
Byrne, A. (2002). DON'T PANIC: Tye's intentionalist theory of consciousness. In A field guide to the philosophy of mind: Symposium on Tye's Consciousness, Color, and Content, winter 2002/3. http://host.uniroma3.it/progetti/kant/field/tyesymp.htm.
Chambers, D., & Reisberg, D. (1985). Can mental images be ambiguous? Journal of Experimental Psychology: Human Perception and Performance, 11(3), 317–328.
Crane, T. (1992). The nonconceptual content of experience. In T. Crane (Ed.), The contents of experience (pp. 136–157). Cambridge: Cambridge University Press.
Cummins, R., Blackmon, J., Byrd, D., Poirier, P., Roth, M., & Schwarz, G. (2001). Systematicity and the cognition of structured domains. Journal of Philosophy, 98, 167–185.
Dretske, F. (1969). Seeing and knowing. Chicago: University of Chicago Press.
Dretske, F. (1981). Knowledge and the flow of information. Cambridge, MA: MIT Press.
Dretske, F. (1995). Naturalizing the mind. Cambridge, MA: MIT Press.
Evans, G. (1982). The varieties of reference (J. McDowell, Ed.). Oxford: Oxford University Press.
Finke, R. A., Pinker, S., & Farah, M. J. (1989). Reinterpreting visual patterns in mental imagery. Cognitive Science, 13, 51–78.
Gelder, T. van (1990). Compositionality: A connectionist variation on a classical theme. Cognitive Science, 14, 355–384.
Gottschling, V. (2003). Bilder im Geiste: Die Imagery-Debatte. Paderborn: Mentis.
Gunther, Y. H. (2003). General introduction. In Y. H. Gunther (Ed.), Essays on nonconceptual content (pp. 1–20). Cambridge, MA: MIT Press.
Heck, R. G. (2000). Nonconceptual content and the space of reasons. Philosophical Review, 109, 483–523.
Henderson, J. M., & Hollingworth, A. (1999). The role of fixation position in detecting scene changes across saccades. Psychological Science, 10(5), 438–443.
Jackendoff, R. (1987). Consciousness and the computational mind. Cambridge, MA: MIT Press.
Jackendoff, R. (2002). Foundations of language. Oxford: Oxford University Press.
Kelly, S. (2001). The nonconceptual content of perceptual experience: Situation dependence and fineness of grain. Philosophy and Phenomenological Research, 62, 601–608.
Kosslyn, S. M. (1994). Image and brain: The resolution of the imagery debate. Cambridge, MA: MIT Press.
Kosslyn, S. M., Ganis, G., & Thompson, W. L. (2003). Mental imagery: Against the nihilistic hypothesis. Trends in Cognitive Sciences, 7(3), 109–111.
Kosslyn, S. M., & Thompson, W. L. (2003). When is early visual cortex activated during visual mental imagery? Psychological Bulletin, 129(5), 723–746.
Kosslyn, S. M., Thompson, W. L., & Ganis, G. (2002). Mental imagery doesn't work like that. Behavioral and Brain Sciences, 25(2), 198–200.
Marr, D. (1982). Vision. New York: W. H. Freeman.
McDowell, J. (1994). Mind and world. Cambridge, MA: Harvard University Press.
O'Regan, J. K., Deubel, H., Clark, J. J., & Rensink, R. A. (2000). Picture changes during blinks: Looking without seeing and seeing without looking. Visual Cognition, 7, 191–212.
O'Regan, J. K., & Noë, A. (2001). A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences, 24(5), 939–973.
Pacherie, E. (1998). What might nonconceptual content be? Philosophical Issues, 9.
Pacherie, E. (2000). Levels of perceptual content. Philosophical Studies, 100, 237–254.
Peacocke, C. (1994). A study of concepts. Cambridge, MA: MIT Press.
Peacocke, C. (2001). Does perception have a nonconceptual content? Journal of Philosophy, 98, 239–264.
Reisberg, D., & Chambers, D. (1991). Neither pictures nor propositions: What can we learn from a mental image? Canadian Journal of Psychology, 45(3), 336–352.
Stalnaker, R. (2003). What might nonconceptual content be? In Y. H. Gunther (Ed.), Essays on nonconceptual content (pp. 95–106). Cambridge, MA: MIT Press. (First published in 1998 in Philosophical Issues, 9, E. Villanueva (Ed.).)
Tye, M. (1991). The imagery debate. Cambridge, MA: MIT Press.
Tye, M. (1993). Image indeterminacy: The picture theory of mental images and the bifurcation of 'what' and 'where' information in higher-level vision. In N. Eilan, R. McCarthy, & B. Brewer (Eds.), Spatial representation:
problems in philosophy and psychology (pp. 356–372). Oxford: Oxford University Press.
Tye, M. (1995). Ten problems of consciousness: A representational theory of the phenomenal mind. Cambridge, MA: MIT Press.
Tye, M. (2000). Consciousness, color, and content. Cambridge, MA: MIT Press.
Tye, M. (2002a). To PANIC or not to PANIC? — Reply to Byrne. In A field guide to the philosophy of mind: Symposium on Tye's Consciousness, Color, and Content, winter 2002/3. http://host.uniroma3.it/progetti/kant/field/tyesymp.htm.
Tye, M. (2002b). Visual qualia and visual content revisited. In D. Chalmers (Ed.), Philosophy of mind: Classical and contemporary readings. Oxford: Oxford University Press.
Tye, M. (2005). Nonconceptual content, richness, and fineness of grain. In T. Gendler & J. Hawthorne (Eds.), Perceptual experience. Oxford: Oxford University Press. (Forthcoming)
Wright, W. (2003). McDowell, demonstrative concepts, and nonconceptual representational content. Disputatio, 14, 37–51.
Recognitional Concepts and Conceptual Combination

Pierre Jacob
In my opinion, there is little doubt that Jerry Fodor's incisive and insightful writings have more often than not shaped the agenda of recent philosophy of mind. It's always instructive and formative to disagree with him. This is just what I do in this paper. Over the past decade or so, Jerry Fodor has persistently argued that mere acceptance of compositionality rules out several influential contemporary accounts of the nature of concepts. In particular, he claims that compositionality rules out a family of views about concepts, which he vehemently objects to and which he labels "concept-pragmatism". Fodor claims that the only view of concepts that survives the argument based on compositionality is the view which he now calls "concept-Cartesianism" and which he endorses. Like many other philosophers, I find Fodor's argument based on compositionality unconvincing.1 I even think that it is not consonant with some of Fodor's other views. In this paper, I shall explain why. The paper falls into five sections. In the first section, I argue that Fodor's contrast between concept-pragmatism and concept-Cartesianism is spurious. In the second section, I question Fodor's claim that conceptual atomism is necessitated by informational semantics. In section 3, I discuss the tension between conceptual atomism and the role of content in the psychological explanation of actions. In section 4, I examine Fodor's notorious compositionality argument against recognitional concepts. In section 5, I examine another of Fodor's arguments, which purports to show that the content of thought has priority over the content of language, based on the thesis that, unlike the content of thought, the linguistic meaning of sentences is not compositional.

Address for correspondence: Institut Jean Nicod, UMR 8129, 1bis Avenue de Lowendal, F-75007 Paris, France. E-mail: [email protected].
1 Horwich (1998), Peacocke (2000, 2004), Recanati (2002).
1 Concept-Pragmatism vs. Concept-Cartesianism
Ever since he wrote a paper entitled "Concepts: a potboiler" in 1995 – a precursor to his 1998 book Concepts: Where Cognitive Science Went Wrong – Fodor has kept drawing a contrast between two competing approaches to concepts: a pragmatist approach, which he rejects and which he associates with much twentieth-century philosophy of mind and language, and a classical approach (going back to Hume and Descartes), which he recommends and which he has recently labelled "concept-Cartesianism". As I understand it, the main issue between concept-pragmatism and concept-Cartesianism is whether concept-individuation has priority over concept-possession or vice-versa. On Fodor's account, classical theories – in particular concept-Cartesianism – assumed that concept-individuation comes first and concept-possession is derivative upon concept-individuation. First, you say what a concept is by saying what it is true of, what it is about or what it represents. Concept-possession follows from concept-individuation: to have a concept is to be able to think about what the concept is a concept of. By contrast, according to concept-pragmatism, concept-possession is primary and concept-individuation is derivative. First, you fix the epistemic capacities required of a creature for possessing a given concept, e.g., being able to sort instances of a concept and/or to make inferences linking that concept to a network of surrounding concepts. And then you derive concept-individuation as whatever it is that the creature has in virtue of satisfying the possession conditions for that concept.

Frankly, I suspect that the distinction between concept-pragmatism and concept-Cartesianism is a distinction without much difference.2 First of all, Fodor repeatedly claims that according to concept-Cartesianism, to have, e.g., the concept DOG is to be able to think about dogs. Period. He does not, however, say much about what, on his view, constitutes the ability to think about dogs (as opposed to the ability to, e.g., see, smell or hear them). By Fodor's lights, unlike concept-Cartesianism, concept-pragmatism assumes that having a concept is being able to do things such as sort dogs from non-dogs, recognize the former from the latter, reason about the former or make inferences about them. Cartesians, he claims, "hold that concept-possession is an intentional state but not an epistemic one" (Fodor, 2004: 31). Is it really? Seeing, hearing and touching are intentional states, and arguably (if Dretske is right) not every episode of seeing, hearing or touching is an epistemic state. If Dretske is right, then one may see (hear or touch) a dog and not notice it. If so, then seeing (hearing or touching) a dog may be interestingly different from recognizing one or sorting a dog from non-dogs by visual or other sensory means. The reason not all seeing (hearing or touching) is epistemic is precisely that a creature lacking the concept DOG might nonetheless see (hear or touch) a dog. And although a creature with the concept DOG might see (hear or touch) a dog, her doing so might fail to trigger the tokening of her concept DOG. No doubt, concept-possession too is an intentional state. But how could it fail to be epistemic? Belief states, I take it, are paradigmatic epistemic states. Are not concepts fundamental building blocks of the contents of belief states? Fodor gestures towards a distinction between, on the one hand, using the concept DOG to think about dogs and, on the other hand, being able to recognize dogs, sort them from other things or make inferences about them, and come to know something about them, and thus to form beliefs about them. Using the concept DOG to think about dogs is OK by Cartesian standards. But what exactly could it be to have the ability to think about dogs if it were not the joint abilities to recognize dogs, to sort them from other things, to make inferences about them and to come to know things about them?

Secondly, in his 1998 book on concepts, Fodor argues that the pragmatist priority of concept-possession over concept-individuation is biased in favor of an ontology of concepts construed as behavioristic mental dispositions rather than genuine mental particulars. If so, then presumably the state of having a concept would fail to have causes and effects. As he puts it on behalf of the priority of concept-individuation over concept-possession, "concepts are mental particulars: specifically, they satisfy whatever ontological conditions have to be met by things that function as mental causes and effects" (Fodor, 1998: 23). In effect, a cognitive scientific study of concepts would be ruled out by concept-pragmatism. Is concept-pragmatism so biased? I don't think so. Only tokens, not types, I take it, can be causes and effects. So (assuming, as Fodor does, the language of thought hypothesis) it is wide open, it seems to me, for someone who thinks that the individuation of concepts cannot be dissociated from the use of concepts in such mental processes as sorting and inferring to grant that mentalese tokens of the concept DOG have causes and effects and nonetheless to assume that not unless a creature can sort dogs from other things and make appropriate inferences about dogs can she have a mentalese term-type with the meaning DOG.3

The lesson, I think, one should draw from this preliminary survey is that one should reject – not accept – Fodor's claim that one must choose between concept-pragmatism and concept-Cartesianism, i.e., between the priority of concept-possession over concept-individuation or vice-versa. In particular, informational semantics can suitably be construed as a theory of the possession conditions of concepts: one could not possess the concept DOG unless states of one's brain nomically covaried with instances of the property doghood.

2 See my (1995) for an early expression of that suspicion.
3 As I understand it, this is Peacocke's (2004) position.

2 Informational Semantics and Conceptual Atomism
On the face of it, Fodor accepts classical concept-Cartesianism and rejects concept-pragmatism because he thinks that, unlike the latter, the former fits with the kind of conceptual (or semantic) atomism to which he is currently drawn by his latest version of the computational representational theory of the mind (RTM), which involves the four following assumptions: (i) Psychological explanation involves laws that are both causal and intentional. They are causal in virtue of their being implemented by non-semantic, computational processes. They are intentional because they refer to the contents of an agent's propositional attitudes. (ii) Mental representations are the primitive (or primary) bearers of intentional content. (iii) Thinking and other mental processes are computations. (iv) Meaning is information, more or less (cf. Concepts, ch. 1). Conceptual atomism is the view that a creature could have a single concept (e.g., DOG) and no other concepts. According to pure informational semantics, a concept (or mental symbol) derives its content from its being nomically locked to instantiations of doghood. Fodor is drawn to conceptual atomism by two conspiring lines of thought. On the one hand, he believes that conceptual atomism is required by assumption (iv), i.e., his acceptance of pure informational semantics. On the other hand, he believes that any alternative to conceptual atomism is a threat to assumption (i), i.e., to the role of content in psychological explanation, or to there being robust intentional psychological laws that apply to the contents of propositional attitudes of more than one human agent at a time. According to Fodor, one basic constraint on a theory of content is that content be amenable to a naturalistic account. Pure informational semantics (assumption (iv)) is supposed to meet the naturalistic constraint.

I would like to make two observations about Fodor's endorsement of pure informational semantics. First of all, he is, I think, wrong to assume that pure informational semantics imposes conceptual atomism.4 Suppose a creature's brain were such that it could not get locked directly onto doghood. What if, as a matter of cognitive architecture, the nomic covariation between states of the creature's brain and instances of doghood was mediated by a pair of concepts, e.g., ANIMAL and BARK? In such a case, the creature could simply not have the concept DOG unless it had the concepts ANIMAL and BARK, and conceptual atomism would be violated. If you don't think my example is plausible, consider the concept ELM. Suppose a creature could not get locked onto elmhood unless it had the concepts TREE, BRANCH, LEAF, TWIG, ROOT, PLANT and BEECH. Would the naturalistic constraint be thereby violated? Frankly, I don't see why. Of course, it is metaphysically possible that some creature could get directly locked onto elmhood (or doghood, for that matter) without any intermediary. But metaphysical possibility is not what is at issue. What is at issue is what is true as a matter of empirical fact of the actual cognitive structures of humans and other non-human animals.

My second observation about pure informational semantics is that I do not think that Fodor's own Asymmetrical Dependency Theory (ADT) of content is entirely consistent with his endorsement of concept-Cartesianism. The goal of ADT is to provide a pure informational account of the possibility of misrepresentation. The basic idea is that false tokenings of a concept are asymmetrically dependent upon true tokenings.5 Nomic correlations being imperfect (and my not being infallible), tokens of my HORSE concept are less reliably correlated with instances of horsehood than with the disjunction of instances of horsehood and instances of, e.g., donkeyhood in perceptually imperfect conditions. According to Fodor's ADT, tokenings of my HORSE concept caused by non-horses asymmetrically depend upon tokenings of my HORSE concept caused by horses. Some philosophers have questioned ADT on the grounds that it cannot claim to be a genuinely naturalistic, i.e., non-intentional (non-semantic), account of content, since it presupposes the very semantic distinction between veridical and non-veridical tokenings of the concept HORSE that a naturalistic account is supposed to explain. My present observation is that built into ADT is a view of concept possession (of, e.g., the concept HORSE) that involves the ability to sort horses from non-horses, an ability disavowed by concept-Cartesianism. If ADT is part of an account of what it takes to have the concept HORSE (or to think about horses), then ADT is not compatible with the concept-Cartesian denial that sorting horses from non-horses is part of having the concept HORSE.

4 See my (1996) Review of Fodor's (1994).
5 For further discussion, see my (1997: ch. 3).
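Put schematically, ADT amounts to the following three conditions. This is a standard reconstruction of Fodor's proposal; the numbering and wording are mine, not Fodor's or Jacob's:

```latex
% A schematic reconstruction of the Asymmetrical Dependency Theory
% (my wording; Fodor's HORSE example as used in the text above).
\begin{enumerate}
  \item It is a law that horses cause tokenings of \textsc{horse}.
  \item Some tokenings of \textsc{horse} are caused by non-horses
        (e.g., by donkeys in perceptually imperfect conditions).
  \item Asymmetric dependence: if the law in (1) did not hold, the
        tokenings in (2) would not occur either; but (1) would hold
        even if the tokenings in (2) never occurred.
\end{enumerate}
```

Condition (3) is what is meant by saying that false tokenings depend asymmetrically on true ones, and it is the condition on which the objection just rehearsed turns.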
3 Conceptual Atomism and Psychological Explanation
One fundamental motivation in favor of RTM is acceptance of a second constraint on a theory of content, according to which the contents of an agent's propositional attitudes ought to be responsive to the demands of psychological explanation, i.e., the content of, e.g., an agent's belief may contribute to producing its effect in virtue of its content. From his book The Elm and the Expert (1994) on, Fodor has given up his earlier dual view whereby my beliefs about water (or H2O) and my twin's beliefs about twater (or XYZ) both have different broad contents and the same narrow content.6 In effect, he has given up on the Fregean assumption that two tokens of mentalese can have the same reference and different senses. Fodor may still distinguish two coreferential concepts (e.g., WATER and H2O) in virtue of the syntactic differences between two modes of presentation (or mentalese symbols) of one and the same substance. For example, one may have the concept WATER, but not the concept H2O, if one lacks the concepts H and 2. But modes of presentation cannot be senses: they are pure syntactic objects without a semantic role.

By embracing semantic (or conceptual) atomism, Fodor is driven towards one of two undesirable consequences or both. Either he must give up the constraint according to which content must be responsive to psychological explanation or he must espouse an implausible view of so-called Fregean cases. Or both. Consider a typical Fregean case, i.e., the case of an agent (e.g., Oedipus) who has two distinct propositional attitudes about one and the same object (e.g., a mug). Suppose that Oedipus believes that the mug contains water. That is why he drinks water from it. But he does not believe that the mug contains H2O. Although Oedipus' belief that the mug contains water has exactly the same content (or truth-conditions) as the belief that the mug contains H2O, the two belief states are different from each other, because one could have the former, not the latter, without having the concept H2O. But if the difference between two belief states with exactly the same content arises from a non-semantic difference, then Fodor's own approach to the role of content in psychological explanation becomes hardly distinguishable from Stich's (1983) purely syntactic theory: content becomes entirely irrelevant to psychological explanation. Alternatively, in The Elm and the Expert, Fodor has taken what I take to be the utterly implausible view that Frege cases are exceptions to psychological laws. In fact, Fodor has committed himself to the incredibly strong claim that "any intentional psychology has to take for granted that identicals are generally de facto intersubstitutable in belief/desire contexts for those beliefs and desires that one acts on" (Fodor, 1994: 40). In effect, this claim amounts to denying the opacity of those beliefs and desires on which an agent acts. Fodor cannot be right when he assumes, as he does, that belief-desire psychology is "committed to treating Frege cases as aberrations". Not knowing an identity is just lacking a piece of knowledge. Not being omniscient is not being irrational.7

6 Arguably, one strong motivation for distinguishing narrow content from broad content is to make room for mental causation. Let us assume the truth of token physicalism. Now, if the content of a mental state is an extrinsic property of an individual's brain state (if it does not supervene on the intrinsic physical properties of the brain state token), then it may not be one of the properties that play a causal role in the production of the individual's intentional behavior. The content dualist claims that although broad content may fail to play a causal role, narrow content does play a causal role. Content dualism can only resolve the tension between externalism and mental causation if indeed some intelligible notion of narrow content does supervene on the intrinsic physical properties of an individual's brain state. Arguably, Fodor came to doubt that this is possible. Furthermore, strictly speaking, he did not need any notion of narrow content to make room for mental causation, since on his view, although psychological laws do refer to the broad informational content of mental symbols, the causally efficacious properties of mental symbols are nonetheless their formal syntactic properties, not their semantic properties.
7 See my (1996) Review of Fodor's (1994).
4 Fodor's Compositionality Argument Against Recognitional Concepts
Fodor assumes – and I assume that it is common ground – both that conceptual content is, as he calls it, productive and systematic and that the explanation of the productivity and systematicity of conceptual content derives from the compositionality of primitive (or undefinable) concepts. According to Fodor's definition, a recognitional concept is a concept whose possession requires the ability to recognize its instances. In fact, recognition is merely one among a number of potential epistemic properties of concepts. Color concepts, e.g., RED, are presumably good instances of recognitional concepts. I do not know about the concept PET. Fodor sets up an argument for the claim that there are no recognitional concepts based on two premisses. The conclusion purports to be broad enough to encompass the claim that no epistemic property can be a constitutive property of concept-possession. First premiss: concepts are compositional, i.e., a complex concept (or "host") derives its content (or semantic value) from the contents (or semantic values) of its constituents. Second premiss: recognitional capacities (or epistemic properties in general) are not compositional. I will grant Fodor his second premiss. As Fodor points out, there are many reasons why recognitional capacities might not compose: you might be an expert at recognizing pets and at recognizing fish, and still you might be poor at recognizing instances of pet fish. Alternatively, the conditions for recognizing instances of one constituent might be inappropriate for recognizing instances of the other constituent. For example, consider Fodor's (2004: 38) fanciful example of the Night-Flying Bluebird, namely a blue bird that sings after dark: the favorable conditions for recognizing instances of one constituent are never favorable for recognizing instances of the other and vice-versa. Fodor's conclusion is that no concept can be recognitional, or that no epistemic property can be constitutive of any concept.

As several philosophers (Horwich, 1998, Peacocke, 2004, Recanati, 2002) have pointed out, according to the standard version of the principle of compositionality, the semantic value of a host concept must be a function of the semantic values of its constituents. If one accepts a Fregean view of content, then the principle of compositionality will apply at two distinct levels of content: the sense of a complex concept (or expression) is a function of the senses of its constituents, and the reference of the complex concept is a function of the references of its constituents. Now, if it is a possession-condition on the constituent concept BLUE that one be able to recognize instances of blue things, then one could not have the complex (or "host") concept BLUE DOG unless one could recognize instances of blue things. In order to form the complex concept BLUE DOG in accordance with the standard version of the principle of compositionality, whatever the conditions on the possession of the concept BLUE are, it is sufficient that one be able to put together the two constituent concepts BLUE and DOG. From the fact that one could not have a recognitional concept unless one could recognize instances of that concept, it does not follow from the standard version of the principle of compositionality that one could not have the complex (or "host") concept unless one had a complex recognitional capacity which is itself a joint product of the separate recognitional capacities required for having the constituent concepts.

But Fodor is not satisfied to accept the standard version of the compositionality principle, according to which a complex concept inherits its semantic value from the combination of the semantic values of its constituents via some syntactic rule of combination. In fact, Fodor explicitly endorses a biconditional version of the principle of compositionality, according to which nothing can be a property of a complex (or host) concept unless it is a property of its constituents, and vice-versa. As he puts it in one among many places, "the connection that compositionality imposes on the relations between the possession conditions of constituent concepts and the possession conditions of their hosts goes in both directions. That is compositionality requires not just that having the constituent concepts is sufficient for having a host concept, but also (and even more obviously) that having the host concept is sufficient for having its constituents. Or to put it slightly differently, compositionality requires that host concepts receive their semantic properties from their constituents, and also that their constituent concepts transmit all of their semantic properties to their hosts" (Fodor, 2001: 8–9). In other words, nothing can be a property of some constituent concept unless it is also a property of the host (or complex) concept of which it is a constituent. Nothing could be a possession condition of a constituent concept unless it is a possession condition of the host (or complex) concept of which it is a constituent. Notice that Fodor's formulation of the biconditional version of the principle of compositionality oscillates between talk of "possession conditions on concepts" and talk of "semantic properties".8 But the standard version of the principle of compositionality merely requires that the semantic value of a host concept be a function of the semantic values of the constituent concepts. Whatever the possession-conditions of a color concept such as RED, what RED contributes to its host is its semantic value, namely the property red.

Fodor uses his biconditional version of the principle of compositionality to argue that if the constituent concept BLUE has possession-conditions ABC, and if possession-conditions do not compose, then it is possible that the possession-conditions of the complex concept BLUE DOG be ABEFG, i.e., it is possible that the possession-conditions of the complex concept do not result from the composition of the possession-conditions of the constituent concept (ABC). If so, then one could have the complex concept BLUE DOG and lack its constituent concept BLUE, whose possession-conditions are ABC. But in fact, in accordance with the simple version of the principle of compositionality, one could not have the concept BLUE DOG unless one had the constituent concept BLUE. If the possession-conditions for the concept BLUE include ABC, then it follows, from the simple version of the principle of compositionality, that one could not have the concept BLUE DOG unless one satisfied conditions ABC.

To see why Fodor's biconditional version of the principle of compositionality is unacceptably strong, I now want to suggest that it could equally be turned against his own version of informational semantics. According to pure informational semantics, one could not possess the concept DOG unless states of one's brain were nomically locked onto the property doghood. But consider the part of the biconditional version of the principle of compositionality according to which nothing can be a possession-condition of a constituent concept unless it is also a possession-condition of the host (or complex) concept of which it is a constituent. Clearly, no state of any creature has actually been locked onto particular blue dogs.9 Locking onto a property is, therefore, not a compositional property of concepts. If the constituent concept DOG has possession conditions ABC (locking onto doghood), and if possession-conditions do not compose, then it becomes possible that the possession-conditions of the complex concept BLUE DOG be ABEFG. If so, then one could have the complex concept BLUE DOG and lack its constituent concept DOG, whose possession-conditions are ABC (locking onto doghood). So no concept has pure informational possession-conditions, not even DOG. I suppose that this is a reductio of the argument based on the biconditional version of the compositionality principle.10

8 Surely (as noticed by a referee for this paper), Fodor does not intend that the property of a complex (or host) concept of being complex also be a property of its non-complex constituents. Arguably, this unintended consequence can be avoided if the biconditional version of the compositionality principle is stated, not as a constraint on the semantic properties of concepts, but rather as a constraint on the possession conditions for concepts.
9 I am ruling out dogs that have been painted blue and I am considering only dogs that are naturally (or genetically) blue.
10 This argument is suggested by Horwich (1998).
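The contrast between the two versions can be displayed schematically. The notation is mine, not Fodor's or Jacob's: PC(X) abbreviates the possession conditions of the concept X, μ assigns semantic values, and r_σ interprets the syntactic rule σ of combination:

```latex
\begin{align*}
&\textbf{Standard version:}\quad
  \mu\bigl(\sigma(t_1,\dots,t_n)\bigr)
  = r_\sigma\bigl(\mu(t_1),\dots,\mu(t_n)\bigr),\\
&\text{so that, at the level of possession conditions,}\\
&\qquad PC(\mathrm{BLUE}) \wedge PC(\mathrm{DOG})
  \;\Rightarrow\; PC(\mathrm{BLUE\;DOG}).\\[4pt]
&\textbf{Biconditional version:}\quad
  PC(\mathrm{BLUE}) \wedge PC(\mathrm{DOG})
  \;\Leftrightarrow\; PC(\mathrm{BLUE\;DOG}).
\end{align*}
```

The reductio above turns on the requirement, absent from the standard version, that constituent concepts transmit all of their possession conditions (such as locking onto doghood) to their hosts.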
5 The Compositionality of Language and the Underdetermination of Thought by Language

Interestingly, Fodor (2001) also appeals to the biconditional version of the compositionality principle in an argument in favor of the thesis of the priority of the content of thought over language, to which I presently turn. Although, as it will turn out, I don't accept Fodor's argument, I have no qualms with the content of his conclusion: I accept the thesis of the priority of the content of thought over language. But I find Fodor's argument for this conclusion unconvincing. The argument is in four steps: (i) Either language or thought, but not both, must be compositional. (ii) Whichever of language or thought is compositional has original, underived or primitive content (or intentionality). (iii) Unlike thought, language is not compositional. (iv) Conclusion: Therefore, the content of thought has priority over the content of language. As I said, I have no problem with the conclusion, nor with the fact that Fodor embraces it. What is puzzling, however, is both Fodor's first premiss and his argument for premiss (iii). First, according to the first premiss, language and thought cannot both be compositional. Why not? According to the standard version of the principle of compositionality, the semantic value of either a complex thought or a complex linguistic expression depends on the semantic values of its constituents. If one accepts this principle, why could not both thought and language be compositional? By contraposition, if one denies that both language and thought can be compositional, then the suspicion arises that the principle of compositionality involved might not be the standard principle. Secondly, consider Fodor's argument for premiss (iii), that language is not compositional. At bottom, Fodor's argument for premiss (iii) is that the content of thought is underdetermined by linguistic meaning. In particular, to put it in John Perry's (1993) and others' terminology, the claim is that there are conceptual constituents of thoughts that remain unarticulated at the level of sentence meaning. For example, the sentence "It's raining" lacks a constituent for the location. The sentence "It's three o'clock" leaves out such conceptual constituents as in the afternoon (as opposed to in the morning) or perhaps in New York (as opposed to in London). Granted: there are constituents of thought that are unarticulated at the level of sentence meaning.
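Using the example just given, the alleged mismatch can be displayed schematically; the notation is my gloss, not Fodor's or Perry's own formalism:

```latex
\begin{align*}
\text{meaning of `It's raining':}\;\; & \mathrm{rain}(t)
  \quad\text{(no constituent for the location)}\\
\text{thought thereby expressed:}\;\; & \mathrm{rain}(t, l_{c})
  \quad\text{(the contextual location $l_{c}$ is unarticulated)}
\end{align*}
```

On this rendering, the thought has a constituent, the location, to which nothing in the sentence meaning corresponds.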
But now the question is: from the fact that there are constituents of thought that are unarticulated at the level of sentence meaning, how does it follow that language, or rather sentence meaning, is not compositional? How could it follow that the meaning of a sentence is not a function of the meanings of its constituents and the rules of syntax? My suspicion is that it could not follow on the standard construal of compositionality. Only on a different construal of compositionality could it follow that sentence meaning is not compositional. Which construal? The answer, I think, is: Fodor's unacceptably strong biconditional version of the compositionality principle. For the purpose of this argument, Fodor's picture of the relation between language and thought seems to be the following: the task of the hearer is to determine the speaker's thought on the basis of linguistic cues, i.e., on the basis of the linguistic meanings of the sentences uttered by the speaker. Now, as the linguistic evidence shows, thoughts have more constituents than sentences do. Thus, in the light of the hearer's task, the linguistic meaning of a sentence uttered is in effect a constituent of the full content of the speaker's thought, which functions as the host of the sentence's linguistic meaning. Now, consider Fodor's biconditional version of the principle of compositionality, according to which nothing can be a property of a host concept unless it is a property of its constituents, and vice-versa. In other words, nothing can be a property of some constituent concept unless it is also a property of the host concept of which it is a constituent. Since there are properties of the content of a thought that are not properties of the linguistic meaning of the sentence used to express it, it follows that the linguistic meaning of the sentence uttered is not compositional. Perhaps indeed sentence meaning is not compositional. Perhaps there is more in the meanings of some sentences than there is in the meanings of their parts. But I fail to see the force of an argument against the compositionality of sentence meaning based on the fact that there are more constituents in the content of a thought expressed than in the meaning of the sentence used to express it.

Acknowledgements

I am grateful to an anonymous referee for his comments and to the reactions of members of the audience who heard a partial version of this paper at the Paris conference on "New Aspects of Compositionality".

References

Fodor, J. A. (1994). The elm and the expert. Cambridge, MA: MIT Press.
Fodor, J. A. (1995). Concepts: A potboiler. In E. Villanueva (Ed.), Philosophical issues (Vol. 6: Content, pp. 6–24). Atascadero, CA: Ridgeview.
Fodor, J. A. (1998a). Concepts: Where cognitive science went wrong. Oxford: Oxford University Press.
Fodor, J. A. (1998b). There are no recognitional concepts – not even RED. In In critical condition (pp. 35–47). Cambridge, MA: MIT Press.
Fodor, J. A. (2000). Replies to critics. Mind and Language, 15, 350–374.
Fodor, J. A. (2001). Language, thought and compositionality. Mind and Language, 16, 1–15.
Fodor, J. A. (2004). Having concepts: A brief refutation of the twentieth century. Mind and Language, 19, 29–47.
Horwich, P. (1998). Concept constitution. In E. Villanueva (Ed.), Philosophical issues (Vol. 9: Concepts, pp. 15–19). Atascadero, CA: Ridgeview.
Jacob, P. (1995). Can semantic properties be non-causal? In E. Villanueva (Ed.), Philosophical issues (Vol. 6: Content, pp. 44–52). Atascadero, CA: Ridgeview.
Jacob, P. (1996). Review of J. A. Fodor's 'The elm and the expert'. The European Journal of Philosophy, 4, 373–378.
Jacob, P. (1997). What minds can do. Cambridge: Cambridge University Press.
Jacob, P. (1998). Conceptual competence and inadequate conceptions. In E. Villanueva (Ed.), Philosophical issues (Vol. 9: Concepts, pp. 169–174). Atascadero, CA: Ridgeview.
Peacocke, C. (2000). Fodor on concepts: Philosophical aspects. Mind and Language, 15, 327–340.
Peacocke, C. (2004). Concepts, knowledge, reference and structure. Mind and Language, 19, 85–98.
Perry, J. (1993). The problem of the essential indexical and other essays. Oxford: Oxford University Press.
Recanati, F. (2002). The Fodorian fallacy. Analysis, 62, 285–289.
Rey, G. (2004). Fodor's ingratitude and change of heart. Mind and Language, 19, 70–84.
Stich, S. P. (1983). From folk psychology to cognitive science: The case against beliefs. Cambridge, MA: MIT Press.
How Similarities Compose

Hannes Leitgeb
1 Preliminaries
The main question that we are going to deal with in this paper is: What can a study of "the" notion of similarity tell us about compositionality, concepts, and cognition? As far as cognitive science is concerned, typical answers to this question involve empirical findings on prototypes, subsymbolic representation, the "rules vs. similarity" dichotomy, and the like. In contrast to such an approach, we will consider the question from a strictly philosophical perspective. The answers that theories in epistemology, philosophy of science, and ontology can give us are of course not empirical; rather, they are part of the rational reconstruction of empirical knowledge as made prominent by Carnap's classic Der Logische Aufbau der Welt (briefly: Aufbau; Carnap, 1961; for recent interpretations and evaluations of the Aufbau see Friedman, 1999, and Richardson, 1998). We are going to proceed along the lines of this latter tradition.

The plan of the paper is as follows: we start with the study of different notions of similarity and see how they relate to concepts, in particular, to so-called "natural" concepts. The study of this type of concepts will lead us to the further question of how to represent natural concepts symbolically. Finally, we will turn from symbolic representations to the compositionality or non-compositionality of languages with respect to relations of meaning similarity. Each part focuses on a particular way of making our main question from above more specific by raising further questions, i.e.:

Question I (section 2): Is the constitution of concepts on the basis of similarity necessarily hampered, as has been argued by several authors? E.g.: is Carnap's method of quasianalysis in his Aufbau necessarily affected by what Goodman calls the difficulties of imperfect community and of companionship? We will show that this is not the case by pointing out necessary and sufficient conditions under which systems of concepts can be constituted from a binary similarity relation on individuals in a sound and complete manner. The concepts thus determined have to be considered as natural concepts, as opposed to concepts in general. Since natural concepts are supposed to play an important role for topics such as the confirmation of laws, and since similarity is assumed to play an analogous role for topics such as the categorization of entities by means of prototypes, a nice interplay between epistemology and cognitive science is the natural consequence. It is easy to see that a system of natural concepts as given by similarity is typically not closed under all the usual logical operations. Therefore, closure under the arbitrary application of logical operations extends such systems to systems of concepts in general.

Question II (section 3): Is there some way of generating systems of natural concepts from basic natural concepts by the application of restricted logical operations? As far as one representative example is concerned (Gärdenfors' "natural properties" in his Conceptual Spaces; Gärdenfors, 1990, 2000), we are going to see that the answer is affirmative: simple syntactic rules can be stated by which all and only those concept terms are generated that express such natural concepts. The natural concepts in question are no longer determined by a similarity relation between individuals, but rather by a similarity relation between individuals and sets of individuals. If we set up an assignment of concepts to concept terms by extending some given assignment of natural concepts to simple terms compositionally, compositionality is satisfied trivially with respect to every logical connective. But now assume that we have introduced some notion of similarity for concepts (as can be done, e.g., on the basis of some given notion of similarity for individuals):

Question III (section 4): Is there some way of "softening" compositionality, such that the concept that is expressed by any complex term is only determined up to similarity by the concepts that are expressed by its subterms? Another theorem tells us this is not the case: every notion of meaning similarity for concept terms necessarily collapses into some notion of meaning identity or equality (in this sense, Fodor's and Lepore's criticism of meaning similarity in Fodor & Lepore, 1999, is confirmed). However, restricted principles of "soft" compositionality may be seen not to be defective in a similar manner.

Each of the three parts will only consider paradigm cases; of course, we do not lay any claim to completeness or priority whatsoever concerning the kinds of similarity relations that we investigate, the classes of natural concepts that we consider, or the notions of compositionality that we focus on. For reasons of space, we omit all proofs (in the bibliography we cite two submitted articles which contain some of the proofs); the purpose of this paper is to consider our main question from above in terms of the three more specific questions I–III and to see how these latter questions are interrelated formally and philosophically.

Address for correspondence: Fachbereich Philosophie, Universität Salzburg, Franziskanergasse 1, A-5020 Salzburg, Austria. E-mail: [email protected].
2 From Similarity to Natural Concepts
Is it possible to constitute concepts on the basis of a notion of similarity for individuals? If yes: what concepts can be constituted in such a way? Of course, the answers to these questions depend on what kind of similarity is presupposed. Let us concentrate in this section on the simplest possible case, i.e., similarity as a binary relation on individuals. According to Carnap (1961), a similarity relation is just a binary reflexive and symmetric relation. The paradigm example is metric similarity: x is similar to y if and only if the distance d(x, y) of x and y is less than or equal to some fixed boundary ε > 0, where d is a given metric or measure of distance. Obviously, similarity need not be transitive, since small distances might add up in a way such that the given boundary is exceeded, but reflexivity and symmetry hold. We follow Mormann (1994) when we define:

Definition 1. (Similarity structure) A pair ⟨S, ∼⟩ is a similarity structure (on S) :iff
1. S is a non-empty set,
2. ∼ ⊆ S × S is a reflexive and symmetric relation on S.

If ⟨S, ∼⟩ is a similarity structure, the members of S will be called 'individuals', and if x ∼ y we say that x and y are similar (according to ⟨S, ∼⟩). Since reflexivity and symmetry are presupposed, we can represent similarity structures uniquely by their corresponding undirected graphs G∼ = ⟨S, {{x, y} | x ∼ y, x ≠ y}⟩.

How can we relate this type of similarity to concepts? We are not going to deal with the ontology of concepts; in particular, we will not say much about what kind of entities concepts are. For our purposes it is sufficient that certain individuals may be said to fall under a concept or, synonymously, that a concept may be said to apply to certain individuals. Throughout the paper we will only consider unary concepts, and we will not be very specific about the syntactic category of those terms by which we will express concepts. Formally, we define concept structures as follows:

Definition 2. (Concept structure) A pair ⟨S, C⟩ is a concept structure (on S) :iff
1. S is a non-empty set,
2. C is a set of subsets of S, ∅ ∉ C, and for every x ∈ S there is an X ∈ C such that x ∈ X.
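For finite S, both definitions can be checked mechanically. The following Python sketch — our own illustration, not part of the original text, with invented function names — models both kinds of structure extensionally, relations as sets of ordered pairs and concepts as frozensets:

```python
def is_similarity_structure(S, sim):
    """Definition 1: sim must be a reflexive, symmetric relation on the
    non-empty set S (represented here as a set of ordered pairs)."""
    return (len(S) > 0
            and all(x in S and y in S for (x, y) in sim)
            and all((x, x) in sim for x in S)
            and all((y, x) in sim for (x, y) in sim))

def is_concept_structure(S, C):
    """Definition 2: concepts are non-empty subsets of S that jointly
    cover every individual in S."""
    return (len(S) > 0
            and all(len(X) > 0 and X <= S for X in C)
            and all(any(x in X for X in C) for x in S))

# Example 1 below: four individuals, two concepts.
S1 = {1, 2, 3, 4}
C1 = {frozenset({1, 2, 3}), frozenset({3, 4})}
print(is_concept_structure(S1, C1))  # True
```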
If ⟨S, C⟩ is a concept structure, we call the members of S again 'individuals' and the members of C 'concepts' (according to ⟨S, C⟩). We "extensionalize" concepts by regarding them as sets: the members of every such set are considered to be precisely those individuals to which the concept in question applies. For convenience, we assume that there is no "empty" concept under which no individual falls and, furthermore, that every individual in S falls under at least one of the concepts in C. Concept structures such as ⟨S, C⟩ are so-called hypergraphs (see Berge, 1989), i.e., generalizations of simple graphs in which edges are not necessarily unordered pairs but rather sets of individuals of any positive cardinality.

Similarity structures can be related to concept structures in a manner already considered by Leibniz (see Leibniz, 1923, A64 107/P 13), who suggested reducing sentences such as 'Peter is similar to Paul' to sentences of the form 'Peter is A now and Paul is A now'. Essentially, this amounts to regarding two individuals as similar if and only if they fall under a common concept. Let us make this precise in the following way:

Definition 3. (Determined similarity structure) ⟨S, ∼⟩ is determined by ⟨S, C⟩ :iff for all x, y ∈ S: x ∼ y iff there is an X ∈ C such that x, y ∈ X.

Obviously, every concept structure on S determines a unique similarity structure on S. Here is an example:

Example 1. Consider a concept structure ⟨S1, C1⟩ with four individuals and two concepts: S1 = {1, 2, 3, 4}, C1 = {{1, 2, 3}, {3, 4}}. Graphically:
[Figure 1: Concept Structure ⟨S1, C1⟩ — the individuals 1, 2, 3, 4 with the concepts {1, 2, 3} and {3, 4} drawn as enclosing regions]

The similarity structure ⟨S1, ∼1⟩ that is determined by ⟨S1, C1⟩ is given by the following graph:
[Figure 2: Similarity Structure ⟨S1, ∼1⟩ — edges connect 1–2, 1–3, 2–3, and 3–4]
It is easy to see that every similarity structure can be determined by a concept structure on the same set of individuals. In fact, in the majority of cases, a similarity structure can be determined by several pairwise distinct concept structures. We will return to this point below, when we compare the cardinalities of similarity structures and concept structures.

Now for the other direction: in what sense may a similarity structure be said to determine a concept structure? Carnap's method of quasianalysis in his Aufbau is a natural suggestion of how this is to be done.¹ We need some terminology: X ⊆ S is a clique of ⟨S, ∼⟩ iff for all x, y ∈ X: x ∼ y. X ⊆ S is a maximal clique of ⟨S, ∼⟩ iff X is a clique of ⟨S, ∼⟩ and there is no Y ⊆ S such that X ⊊ Y and Y is a clique of ⟨S, ∼⟩. Quasianalysis is motivated as follows: every two individuals which fall under a common concept are similar precisely because they fall under a common concept; so concepts are cliques with respect to similarity. In order not to "forget about" members to which a concept applies, we should regard concepts not just as cliques but rather as maximal cliques. The formal definition is as follows:

Definition 4. (Determined concept structure; Quasianalysis) ⟨S, C⟩ is determined by ⟨S, ∼⟩ :iff C = {X ⊆ S | X is a maximal clique of ⟨S, ∼⟩}.

In this sense, every similarity structure ⟨S, ∼⟩ determines a unique concept structure ⟨S, C⟩. We might call the concepts which are determined in such a way "quasiconcepts", i.e., concepts as being given by quasianalysis. ⟨S1, C1⟩ and ⟨S1, ∼1⟩ of example 1 may also be used in order to state an example of quasianalysis:

Example 2. If ⟨S1, ∼1⟩ is now a given similarity structure, ⟨S1, C1⟩ is its determined concept structure.
¹ Carnap actually introduces two versions of quasianalysis in the Aufbau: quasianalysis of the first and of the second kind. We concentrate solely on the former and thus omit the qualification 'of the first kind'. We also do not consider the common predecessor to both kinds of quasianalysis that is developed in Carnap (1923) and has been studied by Mormann (1994, 1996).
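For finite structures, definitions 3 and 4 are directly computable. The sketch below (again our own illustration; for larger inputs a dedicated maximal-clique routine, such as the one networkx provides, would replace the brute-force enumeration) reproduces Example 2:

```python
from itertools import combinations

def determined_similarity(S, C):
    """Definition 3: x ~ y iff some concept in C contains both x and y."""
    return {(x, y) for x in S for y in S
            if any(x in X and y in X for X in C)}

def quasianalysis(S, sim):
    """Definition 4: quasiconcepts are the maximal cliques of (S, ~).
    Brute force over all subsets -- exponential, so for small S only."""
    def is_clique(X):
        return all((x, y) in sim for x, y in combinations(X, 2))
    cliques = [set(X) for r in range(1, len(S) + 1)
               for X in combinations(sorted(S), r) if is_clique(X)]
    return {frozenset(X) for X in cliques
            if not any(X < Y for Y in cliques)}

# Example 2: quasianalysis recovers C1 from the similarity it determines.
S1 = {1, 2, 3, 4}
C1 = {frozenset({1, 2, 3}), frozenset({3, 4})}
print(quasianalysis(S1, determined_similarity(S1, C1)) == C1)  # True
```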
Quasianalysis obviously inverts the previous determination of ⟨S1, ∼1⟩ by ⟨S1, C1⟩. Is this the case for every similarity structure that has been determined by a given concept structure? It is well known that the answer is 'no'; counterexamples can be found, among others, in Goodman (1951), Eberle (1975), and Kleinknecht (1980). Let us assume that a similarity structure is determined by a concept structure and that the similarity structure in turn determines a concept structure of "quasiconcepts" by quasianalysis. We introduce the following systematic treatment of the possible merits (or failures) of quasianalysis: we call the similarity structure sound if and only if all (quasi-)concepts that it determines are among the original concepts by which it was determined. So if a similarity structure is sound, quasianalysis does not add any "improper" concepts. The similarity structure is called complete if and only if all concepts by which it was determined are among the (quasi-)concepts that it determines. Given completeness, quasianalysis does not skip any of the "actual" concepts. Finally, we call the similarity structure adequate if and only if it is both sound and complete. This is the precise definition:

Definition 5. (Soundness; completeness; adequacy) Let ⟨S, C⟩ be a concept structure on S, let ⟨S, ∼⟩ be determined by ⟨S, C⟩, and let ⟨S, C∼⟩ be determined by ⟨S, ∼⟩:
1. ⟨S, ∼⟩ is sound with respect to ⟨S, C⟩ :iff C∼ ⊆ C.
2. ⟨S, ∼⟩ is complete with respect to ⟨S, C⟩ :iff C ⊆ C∼.
3. ⟨S, ∼⟩ is adequate with respect to ⟨S, C⟩ :iff C∼ = C.

Note that the definitions of 'sound', 'complete', and 'adequate' are conditional definitions. Since every concept structure determines a unique similarity structure, which in turn determines a unique (quasi-)concept structure, we could instead have turned our definitions into unconditional ones which would define 'sound', 'complete', and 'adequate' as unary predicates that apply to concept structures. However, the underlying motivation for the definitions seems to be clearer if we say that a determined similarity structure is sound or complete or adequate with respect to the determining concept structure, rather than saying that the determining concept structure is itself sound or complete or adequate.

All combinations of soundness/unsoundness vs. completeness/incompleteness can be realized. We have already seen that ⟨S1, ∼1⟩ in examples 1 and 2 is adequate with respect to ⟨S1, C1⟩. The following three examples exemplify the other three possible combinations:

Example 3. (Sound, but not complete) Let S2 = {1, 2, 3, 4}, C2 = {{1, 2}, {1, 2, 3}, {3, 4}}:
[Figure 3: Concept Structure ⟨S2, C2⟩ — the individuals 1, 2, 3, 4 with the concepts {1, 2}, {1, 2, 3}, and {3, 4} drawn as enclosing regions]
The similarity structure ⟨S2, ∼2⟩ which is determined by ⟨S2, C2⟩ is identical to ⟨S1, ∼1⟩ from above. ⟨S2, ∼2⟩ is sound but not complete with respect to ⟨S2, C2⟩, because the concept {1, 2} is "accompanied by" {1, 2, 3}: Goodman (1951) calls this the "companionship difficulty".

Example 4. (Not sound, but complete) Let S3 = {1, 2, 3, 4, 5, 6}, C3 = {{1, 2, 4}, {2, 3, 5}, {4, 5, 6}}:
[Figure 4: Concept Structure ⟨S3, C3⟩ — the individuals 1 to 6 with the three overlapping concepts {1, 2, 4}, {2, 3, 5}, and {4, 5, 6}]
The graph that corresponds to the similarity structure ⟨S3, ∼3⟩ which is determined by ⟨S3, C3⟩ is in this case:
[Figure 5: Similarity Structure ⟨S3, ∼3⟩ — the triangle on 2, 4, 5 together with the edges 1–2, 1–4, 2–3, 3–5, 4–6, and 5–6]
⟨S3, ∼3⟩ is not sound but complete with respect to ⟨S3, C3⟩. According to Goodman (1951), this is an instance of the "difficulty of imperfect community": the individuals 2, 4, 5 are mutually similar, but three different concepts have determined the similarity relationships in question.

Example 5. (Neither sound nor complete) Let S4 = {1, 2, 3}, C4 = {{1, 2}, {1, 3}, {2, 3}}:
[Figure 6: Concept Structure ⟨S4, C4⟩ — the individuals 1, 2, 3 with the three concepts {1, 2}, {1, 3}, and {2, 3} drawn as enclosing regions]

The similarity structure ⟨S4, ∼4⟩ that is determined by ⟨S4, C4⟩ is given by:
[Figure 7: Similarity Structure ⟨S4, ∼4⟩ — the complete graph on 1, 2, 3]
⟨S4, ∼4⟩ is neither sound nor complete with respect to ⟨S4, C4⟩. This example is another instance of Goodman's "difficulty of imperfect community".

Let us now consider more closely those cases where a determined similarity structure is not adequate with respect to the determining concept structure. How often do these cases occur? It can be shown that the proportion of concept structures that determine an adequate similarity structure, among all concept structures, goes to 0 for increasing cardinality of S. The reason is simply that the number of concept structures is exponentially larger than the number of similarity structures (on the same set, respectively). Thus, it is impossible that the majority of determining concept structures can be reconstructed by quasianalysis from the determined similarity structures. This is not because quasianalysis is deficient in any way; it
is just a consequence of the different amount of information that can be stored in concept structures in contrast to similarity structures. Every concept structure determines a unique similarity structure, and every similarity structure determines a unique (quasi-)concept structure, but there are far more concept structures than similarity structures. Therefore, several distinct concept structures must determine the same similarity structure, but only one of the former can be reconstructed from the latter. On the other hand, infinitely many concept structures do determine an adequate similarity structure; in fact, for every finite cardinality there is a set S of that cardinality such that there is at least one concept structure on S which determines an adequate similarity structure. So the next obvious question is how to characterize the concept structures which determine a similarity structure on which quasianalysis leads to adequate results. Fortunately, there is a theorem in hypergraph theory which gets this job done: it states a beautiful necessary and sufficient condition for soundness, a trivial necessary condition for completeness, and a necessary and sufficient condition for adequacy that is just the conjunction of the two previous conditions (this is the case although the condition for completeness is just a necessary one). The theorem is by Gilmore; proofs and references can be found in Berge (1989, pp. 22–31) and in Berge (1973, pp. 396f.):

Theorem 1. (Criteria; Gilmore) Let ⟨S, ∼⟩ be determined by ⟨S, C⟩, and let S be finite:
1. ⟨S, ∼⟩ is sound with respect to ⟨S, C⟩ iff for all A, B, D ∈ C there is an X ∈ C such that (A ∩ B) ∪ (A ∩ D) ∪ (B ∩ D) ⊆ X.
2. If ⟨S, ∼⟩ is complete with respect to ⟨S, C⟩, then there are no X, Y ∈ C such that X ⊊ Y.
3. ⟨S, ∼⟩ is adequate with respect to ⟨S, C⟩ iff each of the following two conditions is satisfied:
(a) for all A, B, D ∈ C there is an X ∈ C such that (A ∩ B) ∪ (A ∩ D) ∪ (B ∩ D) ⊆ X,
(b) there are no X, Y ∈ C such that X ⊊ Y.

The two parts of the adequacy criterion 3 of theorem 1 may be regarded as pointing to Goodman's difficulties of imperfect community and of companionship. As a first result of our investigations concerning question I we have found that all and only concept structures of a very particular kind – those which satisfy clause 3 of theorem 1 – determine similarity structures from which the determining concept structures can be reconstructed.
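Since the criterion refers only to the finite family C, it is easy to check mechanically. The following sketch (our own illustration, applied to the examples from the text) tests the soundness condition and the antichain condition of Theorem 1:

```python
def gilmore_sound(C):
    """Clause 1 of Theorem 1: for all A, B, D in C there is an X in C
    covering (A ∩ B) ∪ (A ∩ D) ∪ (B ∩ D)."""
    for A in C:
        for B in C:
            for D in C:
                core = (A & B) | (A & D) | (B & D)
                if not any(core <= X for X in C):
                    return False
    return True

def gilmore_adequate(C):
    """Clause 3: the soundness condition plus the requirement that no
    concept is a proper subset of another (clause 3(b))."""
    antichain = not any(X < Y for X in C for Y in C)
    return gilmore_sound(C) and antichain

# Example 3 (companionship): sound, but {1,2} ⊊ {1,2,3}, so not adequate.
C2 = [frozenset({1, 2}), frozenset({1, 2, 3}), frozenset({3, 4})]
print(gilmore_sound(C2), gilmore_adequate(C2))  # True False

# Example 4 (imperfect community): {2,4,5} is covered by no concept.
C3 = [frozenset({1, 2, 4}), frozenset({2, 3, 5}), frozenset({4, 5, 6})]
print(gilmore_sound(C3))                        # False
```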
Let us take a look at some further examples. Carnap himself used quality spheres as the very concepts from which similarity was to be determined in the phenomenalistic constitution system of the Aufbau (see section 72). However, the similarity structure that is determined in this way proves to be inadequate:

Example 6. (Quality spheres) Let d be a metric on S5 = ℝ² (or take some higher dimension), let ε > 0, and let Sph(x, ε/2) = {y ∈ S5 | d(x, y) ≤ ε/2} (the closed ε/2-sphere around x ∈ S5): the similarity structure ⟨S5, ∼5⟩ that is determined by ⟨S5, C5⟩ with C5 = {Sph(x, ε/2) | x ∈ S5} is not sound, but complete with respect to ⟨S5, C5⟩.

But if some of the spheres in our last example are excluded from the concept structure ⟨S5, C5⟩, the similarity structure that is determined by the resulting concept structure is adequate again:

Example 7. (Restricted quality spheres) Let S6 be a finite subset of the two-dimensional Euclidean space ℝ² (we choose dimension n = 2 merely for graphical reasons). Let C6 be a tessellation of spheres of the following form, such that every "spherical" concept in C6 has as its members those members of S6 which lie within a corresponding sphere (one might as well consider instead an infinite tessellation that extends this pattern in every direction):

[Figure 8: Concept Structure ⟨S6, C6⟩ — a staggered, grid-like tessellation of circles, each circle a "spherical" concept]
The similarity structure ⟨S6, ∼6⟩ that is determined by ⟨S6, C6⟩ is adequate with respect to ⟨S6, C6⟩.

Obviously, too many concepts in a concept structure may yield inadequate results as far as quasianalysis on the determined similarity structure is concerned. In fact, it may be shown that if C is a set of concepts on a set S of individuals with n members, such that the cardinality of C is larger than the binomial coefficient (n choose ⌊n/2⌋), then the similarity structure determined by ⟨S, C⟩ is not complete, let alone adequate (⌊x⌋ is the largest integer less than or equal to x). Our final example illustrates this point in a most dramatic manner and was turned by Goodman (1972) into an additional argument against the scientific applicability of the notion of similarity in general:
Example 8. Let S7 be an arbitrary set of individuals, and let C7 equal the powerset of S7 minus the empty set: the similarity structure ⟨S7, ∼7⟩ that is determined by ⟨S7, C7⟩ is simply the complete graph on S7; therefore ⟨S7, ∼7⟩ is sound with respect to ⟨S7, C7⟩, but its single quasiconcept S7 of course excludes completeness if there are at least two individuals.

From this example we see that if every non-empty set of individuals is a concept, the determined similarity structure is trivial in the sense that every two individuals are similar to each other; for the same reason, the similarity structure is hopelessly incomplete. However, this can only be used to discredit the notion of similarity if this abundance of concepts is presupposed as a premise. Contrary to Goodman, we suggest giving up that premise and considering similarity as being determined by natural concepts. E.g., if similarity is to be given by concepts as expressed by, say, colour terms, then not every Boolean combination of colour terms will express a natural concept: perhaps 'red' and 'orange' can be taken to express natural concepts broadly construed, i.e., considered as extended colour regions, say, not just as colour points; the same may be said about 'red and orange'. However, 'not red', 'red or green', and perhaps even 'red and green' presumably do not express natural concepts (in the last case because 'red' and 'green' seem to contradict each other, although in some extended sense also the empty set might be regarded as "natural"). What makes 'red' natural in contrast to, say, 'red or green'? As far as perceptual concepts are concerned, natural concepts somehow ought to correspond to the basic kinds of classifications that are "built into" our cognitive procedures. If we think of scientific concepts, natural concepts may be supposed to correspond to the factually determined division of individuals into kinds; we will not go into further details (cf. Quine, 1969, Lewis, 1997, Gärdenfors, 1990, 2000, Hirsch, 1993, Dunn, 1997). Unfortunately, each of these paraphrases is a platitude – the theory of natural concepts is not yet in good shape. While the study of similarity cannot itself provide a foundation for a theory of natural concepts, we follow Lewis (1997) in hypothesizing that there might be a common, successful theory of both similarity and natural concepts. If natural concepts can play the role that Goodman's so-called "projectible" concepts are supposed to play in his account of confirmation, then any such theory will be highly relevant also for epistemology and the philosophy of science. In any case, if similarity and natural concepts make up a "perfect pair" in which neither is conceptually prior to the other, this is also to be understood in the sense that every particular notion of similarity corresponds to its particular class of natural concepts (and vice versa) via a particular method of determination of the one from the other (and vice versa). Accordingly, instead of speaking of "the" notion of natural concept, one should rather speak of the notion of natural concepts as being given relative to a particular similarity relation (and vice versa). Our example of
similarity structures as being given by definition 1 versus concept structures as being given by definition 2 and constrained by clause 3 of theorem 1 might be regarded as one exemplification of this sort of correspondence. One type of natural concept structures are those concept structures that determine adequate similarity structures in the above sense; they are natural relative to the similarity structures that they determine.
3 From Natural Concepts to Symbolic Representations
Let us reconsider the concept structure of example 7: the similarity structure that is determined by this concept structure in turn determines the latter by means of quasianalysis. Thus, if the similarity structure ⟨S6, ∼6⟩ is "given" – e.g., ∼6 could be one of the basic relations of a cognitive system – the concept structure ⟨S6, C6⟩ which is depicted in figure 8 can be determined from it by quasianalysis. Assume this to be the case: the next question to ask is whether the concepts in C6 can be represented symbolically, as may be expected of concepts, and if so, how this can be done. Here we are of course not interested in just any kind of symbolic representation, but only in representations on the basis of a simple vocabulary of primitive terms from which complex representations are built up by the application of recursive procedures; moreover, we want to restrict ourselves to the study of those syntactic procedures which express, on the semantic level, the application of a Boolean function to concepts. Do natural concepts that are constituted from similarity relations conform to closure conditions which may be expressed by such rules?

As far as the members of C6 are concerned, the answer is more or less negative. We can of course represent each concept in a pattern such as C6 (even if extended infinitely) by, say, primitive concept terms p1, p2, . . ., but the closure of these terms under the application of, e.g., a negation sign ¬, a conjunction sign ∧, or a disjunction sign ∨ generates a large set of concept terms which no longer express natural concepts. E.g., if p1 and p2 express two distinct concepts in C6, none of ¬p1, p1 ∧ p2, and p1 ∨ p2 expresses a member of C6 (we presuppose that ¬p1 expresses the complement of the concept that is expressed by p1 relative to the set S6 of individuals; accordingly, the application of ∧ and ∨ corresponds to taking the intersection and the union of concepts, respectively). This type of "problem" is actually not specific to example 7 but is rather a general feature of all classes of natural concepts which are generated by quasianalysis. It is not so much soundness that causes the difficulties: theorem 1 tells us that ⟨S, ∼⟩ is sound with respect to ⟨S, C⟩ iff for all A, B, D ∈ C there is an X ∈ C such that (A ∩ B) ∪ (A ∩ D) ∪ (B ∩ D) ⊆ X. Hence, if C is closed under taking intersections and unions, soundness is definitely
satisfied; for A, B, D ∈ C there is indeed an X as demanded by the criterion – just take (A ∩ B) ∪ (A ∩ D) ∪ (B ∩ D) itself. However, completeness interferes with this type of closure: according to theorem 1, if ⟨S, ∼⟩ is complete with respect to ⟨S, C⟩, then there are no X, Y ∈ C such that X ⊊ Y. Therefore, if p1 and p2 express two distinct concepts in C, neither their conjunction p1 ∧ p2 nor their disjunction p1 ∨ p2 can express a concept in C, or else the corresponding similarity structure is incomplete with respect to the concept structure. This is the case simply because the concept expressed by p1 ∧ p2 would be a proper subset of both the concepts expressed by p1 and p2, just as the latter would be proper subsets of the concept expressed by p1 ∨ p2. If we added negation, even p1 ∨ ¬p1 would express a concept in C, where this latter concept would be a proper superset of every other concept, and so forth. So we find that if a similarity structure and a concept structure as being given by definitions 1 and 2 determine each other in the sense of definitions 4 and 3, the concepts of the concept structure in question cannot be represented symbolically by terms which are generated by the recursive application of the usual Boolean connectives. Indeed, the closure of natural concept terms p1, p2, . . . under such syntactic operations leads to a class of concept terms that stand for what might be called concepts in general or general concepts, whether natural or "unnatural". The terms which express such concepts may thus be called 'terms for concepts in general' or 'general concept terms' (the latter expression is not tied to the common syntactic distinction between singular terms and general terms). This corresponds to the two uses of concepts or properties described by Lewis (1997): as far as resemblance or confirmation is concerned, we are in need of a notion of natural concept (or, in Lewis' terminology, of natural property); but for other purposes, in particular for semantics, we have to understand 'concept' in a broader sense according to which every primitive or complex open formula expresses a concept.

But there are other types of natural concepts as determined by other forms of similarity relations. Consider the following example: let Sim now be a ternary relation between individuals a and sets X and Y of individuals, i.e., Sim ⊆ S × ℘(S) × ℘(S) (where S is again a set of individuals and ℘(S) is the powerset of S). Sim is supposed to be a comparative relation of similarity with the following interpretation: Sim(a, X, Y) if and only if every member of X is at least as similar to a as to any member of Y. If '. . . is at least as similar to . . . as to . . .' is understood as '. . . is at least as close to . . . as to . . .', where closeness is made precise by turning to a metric d that measures distances between individuals in a set S, then this amounts to: Simd(a, X, Y) if and only if for all x ∈ X, for all y ∈ Y: d(x, a) ≤ d(x, y). A relation such as Sim can also be viewed as a convenient first-order manner of expressing a polyadic similarity relation of the form Sim(a, x1, x2, . . . , y1, y2, . . .) which applies to an unbounded and perhaps even infinite sequence of
individuals. X from above is just the set of individuals x1, x2, . . ., while Y is the set with members y1, y2, . . .. Such polyadic resemblance relations have been considered by Quine (1973) and Lewis (1997). As Quine pointed out, "One is inclined to distinguish respects of perceptual similarity. . . this complication is convenient in practice, but I think it is dispensable in theory, by spreading the similarity polyadically" (Quine, 1973, p. 18).

How might natural concepts be determined from similarity relations of this type? One possibility is stated in the next definition: for every a ∈ S and every Y ⊆ S, take the largest set X ⊆ S (assume there is one) such that Sim(a, X, Y); X is thus the set of all individuals in S that are at least as similar to a as to the members of Y. Let the class of sets X of this kind be regarded as the class of (natural) concepts determined by Sim. In formal terms:

Definition 6. (Determined concept structure II) ⟨S, CSim⟩ is determined by ⟨S, Sim⟩ :iff CSim = {X ⊆ S | ∃a ∃Y (Sim(a, X, Y) ∧ ∀Z (Sim(a, Z, Y) → Z ⊆ X))}.

If Sim is a similarity relation the field S of which is the set of quality points in a quality space or conceptual space in the sense of Gärdenfors (1990, 2000), such that this space is a Euclidean vector space with Euclidean metric d, and if furthermore Sim = Simd, then CSim can be shown to be the class of closed convex regions in the space. A region X is called convex (in the space) if and only if for every two members x, y of X, every point that lies "between" x and y, i.e., is a member of the straight line segment that joins x to y, is also a member of X (cf. Matousek, 2002; in a more abstract setting, betweenness would not be defined in terms of straight lines but would rather be characterized implicitly by the axioms of Ordered Geometry or by closure conditions; see e.g. Vel, 1993). Being closed and convex and being a natural concept relative to Sim in the sense of being determined in the manner of definition 6 therefore coincide in the given circumstances. Fig. 9 is the graphical illustration of a particular convex set which is determined according to definition 6: every point within or on the quadrangle is at least as close to the inner point as to any of the outer points; the quadrangle is itself the largest set of points which have this property; and the quadrangle is obviously convex:
[Figure 9: Natural concepts and Simd — a convex quadrangle of points that are at least as close to the inner point as to any of the four outer points]
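For finite point sets, the "largest X" of definition 6 can be computed directly. The sketch below (our own illustration, using Simd with the Euclidean metric) collects all sample points that are at least as close to a point a as to every member of a set Y — which, in the Euclidean case, is just the sample restricted to the convex Voronoi cell of a relative to Y, echoing figure 9:

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def largest_sim_set(a, Y, S):
    """The largest X ⊆ S with Sim_d(a, X, Y): all points of the sample S
    that are at least as close to a as to every member of Y."""
    return {x for x in S if all(dist(x, a) <= dist(x, y) for y in Y)}

# A grid of sample points, one inner point a and four outer points Y.
S = {(i / 10, j / 10) for i in range(11) for j in range(11)}
a = (0.5, 0.5)
Y = {(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)}
cell = largest_sim_set(a, Y, S)
print(a in cell, len(cell))  # the cell contains a; it samples a convex region
```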
In the literature on geometry, in particular computational geometry, convex tessellations of the plane which are given in the manner of definition 6 are called 'Voronoi diagrams' (cf. Gärdenfors, 2000). Gärdenfors (1990, 2000) has argued on independent grounds that convex sets might "naturally" be regarded as natural regions in those conceptual spaces on which some notion of betweenness is presupposed. Convex sets show the closure or non-closure properties that might be expected of natural concepts with respect to set-theoretic operations: the complement of a convex set is not necessarily convex; the union of two convex sets is not necessarily convex; but the intersection of two convex sets in the same quality space is convex. Furthermore, as Gärdenfors has outlined, convexity seems to match the projectibility constraints that become transparent from the analysis of Hempel's raven paradox and Goodman's grue paradox: 'All non-black entities are non-ravens' might be regarded as not being confirmed by a white swan because the extension of 'non-black' is perhaps not a convex region in our human colour spaces, and something similar might be said about 'non-raven'; accordingly, the extension of 'grue' is the non-convex union of two convex regions, contrary to the convex sets of colour points that 'green' and 'blue' stand for. Convex sets also subserve prototype categorizations if bounded, since bounded convex sets have a unique "center of gravity". Mormann (1993) argues in favour of a topological account of natural concepts, according to which the class of open sets, or the class of closed sets, or perhaps the class of connected sets in a topological space are supposed to play the role the class of convex sets plays in Gärdenfors' theory. We will hold on to convexity, but it is clear that there are various other notions of natural concept and similarity that we might have a look at (moreover, convexity in Euclidean spaces implies connectedness and thus a topological property). As far as similarity is concerned, convex sets – at least some convex sets – are plausible candidates for respects of similarity: if two individuals x and y are similar to each other in respect r and a further individual z is qualitatively "between" x and y, z may be supposed to be similar to both x and y in the same respect r. If made precise along the lines sketched above, respects of similarity would thus be convex regions in quality spaces. In fact, convex sets in the Euclidean spaces ℝⁿ can even be shown to be closed under betweenness with regard to "close" (i.e., metrically similar) points in the following sense: a closed and connected set M ⊆ ℝⁿ is convex iff there is an ε > 0 such that for every x, y ∈ M with Euclidean distance d(x, y) ≤ ε, every point between x and y is a member of M (this was proved by Zamfirescu, 1971).

Let us now assume that a class CSim of natural concepts has been determined by a similarity relation Sim as introduced above and that this class of natural concepts is indeed identical to the class of closed convex sets in the linear Euclidean space
ℝⁿ. We will not elaborate on the questions of section 2 now; in particular, we are not going to study the ways in which similarity can in turn be determined from these natural concepts. Instead, we will only focus on the main question of this section: do natural concepts that are constituted from similarity conform to closure conditions that may be expressed by simple syntactic rules? We have already pointed out that CSim is not closed under taking arbitrary complements or unions (that would give us again concepts in general). But consider the following restricted rules:

• We presuppose a family (pi)i∈I of primitive concept terms.
• All primitive terms are natural concept terms.
• If ϕ is a primitive natural concept term, then ¬ϕ is a natural concept term.
• If ϕ and ψ are natural concept terms, then ϕ ∧ ψ is a natural concept term.

Let us now consider so-called half-spaces of ℝⁿ, i.e., sets which lie on one side of an (n−1)-dimensional hyperplane of ℝⁿ, where the hyperplane is taken to be an affine subspace (i.e., it does not necessarily contain the origin of the space as a member). A half-space is closed if the hyperplane itself is a subset of it; otherwise the half-space is open. E.g., a straight line in ℝ² that is defined by an equation ax + by = c in the Euclidean plane gives us the closed half-spaces {(x, y) | ax + by ≤ c} and {(x, y) | ax + by ≥ c}, as well as the open half-spaces {(x, y) | ax + by < c} and {(x, y) | ax + by > c}; accordingly for all other dimensions. Every half-space is a convex set, every set-theoretic complement of a half-space with respect to ℝⁿ is again a half-space and thus convex, and every intersection of half-spaces is convex. Thus, if we assume that the primitive terms pi (for i a member of a given index set I) from above express closed half-spaces, each of the natural concept terms derivable by the formation rules stated above expresses a convex set.

There is one drawback to this construction: the complement of a closed half-space is actually an open half-space, but open convex sets are not precisely the natural concepts which are given by definition 6 and our additional assumptions; our natural concepts are closed convex sets. So the extension of a natural concept term that is derivable by our rules is not necessarily a natural concept itself, but rather its topological closure is natural. Let us grant this in view of the fact that, for cognitive systems, the approximation of concepts up to topological closure by means of symbolic representations would surely be sufficient. Furthermore, if every closed half-space is expressed by one member of the family (pi)i∈I, it can even be shown that every closed convex set can be approximated by an infinite sequence of natural concepts that are expressed by natural concept terms derivable by our rules. This follows from the Hahn–Banach theorem, a well-known theorem of functional analysis.
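The closure behaviour that the restricted rules exploit can be made tangible extensionally. In the following sketch (our own illustration; the helper names are invented), primitive terms denote closed half-spaces of ℝ² sampled on a finite grid; negation of a primitive yields the complementary (open) half-space, and conjunction yields an intersection — both convexity-preserving, whereas an unrestricted disjunction would not be:

```python
# A finite sample of the plane, standing in for R^2.
GRID = [(i / 20, j / 20) for i in range(21) for j in range(21)]

def half_space(a, b, c):
    """A primitive term's extension: the closed half-space ax + by <= c."""
    return frozenset((x, y) for (x, y) in GRID if a * x + b * y <= c)

def neg(p):
    """Negation of a primitive term: the complement, an open half-space."""
    return frozenset(GRID) - p

def conj(p, q):
    """Conjunction of natural concept terms: intersection of extensions."""
    return p & q

# p1 ∧ p2 ∧ ¬p3 denotes an intersection of half-spaces, hence convex.
p1, p2, p3 = half_space(1, 0, 0.7), half_space(0, 1, 0.7), half_space(1, 1, 0.5)
region = conj(conj(p1, p2), neg(p3))
print(len(region) > 0)  # a non-empty convex region of the sample grid
```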
We might actually even add further rules, such as:

• If ¬ϕ and ¬ψ are natural concept terms, then ¬(ϕ ∨ ψ) is a natural concept term.
• If ϕ is a natural concept term, then ¬¬ϕ is a natural concept term.

However, by the laws of De Morgan and Double Negation, no additional natural concept is expressed by the terms which are derivable according to the extended set of rules; thus these latter rules are redundant. In the case of our similarity relation Sim and the natural concepts – certain convex sets – that we can define in terms of it, the latter may indeed be represented approximately by syntactic expressions which are governed by simple rules of formation and which consist only of atomic expressions and logical connectives. The rules have to be restricted, though, or the concepts which are expressed by the derivable terms might turn out not to be natural.
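The redundancy claim is a purely extensional matter and can be spot-checked on finite extensions (a trivial sketch of our own, continuing the set-based representation from above):

```python
U = frozenset(range(8))                       # a small universe of individuals
phi, psi = frozenset({1, 2, 3}), frozenset({3, 4})

def neg(X):
    return U - X                              # complement relative to U

print(neg(phi | psi) == neg(phi) & neg(psi))  # De Morgan: True
print(neg(neg(phi)) == phi)                   # Double Negation: True
```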
4 From Symbolic Representations to Compositionality
We have studied examples of how particular kinds of natural concepts can be defined on the basis of their corresponding similarity relations. For these natural concepts, systems of symbolic representations can be set up, but these systems are more or less constrained with respect to the syntactic closure conditions that apply to them. How constrained they are depends on the properties of the given similarity relation. If these systems of natural concept terms are closed under arbitrary syntactic operations, we end up with systems of concepts in general. Let us now focus in more detail on these latter systems, which contain the corresponding systems of natural concept terms as proper subclasses.

Assume that for a system of natural concept terms some binary relation ≡ of meaning identity or meaning equality is defined; 'ϕ ≡ ψ' is to express that ϕ has the same meaning as ψ. It is easy to extend meaning identity for natural concept terms to a relation of meaning identity for arbitrary concept terms by presupposing a compositionality thesis for each possible connective, e.g.: if ϕ ≡ ψ, then ¬ϕ ≡ ¬ψ; if ϕ ≡ ψ and ρ ≡ σ, then ϕ ∧ ρ ≡ ψ ∧ σ; and so forth. These postulates express compositionality since, by contraposition, they imply that there is no difference with respect to the meanings of ¬ϕ and ¬ψ without a corresponding difference in the meanings of ϕ and ψ, and accordingly for the other connectives. Therefore, the meaning of a complex concept term depends on the meanings of its subterms. Furthermore, ≡ will surely be assumed to be an equivalence relation on the set of general concept terms, i.e., ≡ is reflexive, symmetric, and transitive; after all, ≡ is supposed to express that the meanings of terms are identical or equal. Finally,
consider a concept term ϕ being logically equivalent to a concept term ψ in the same sense in which ϕ would be logically equivalent to ψ if they were regarded as propositional formulas or open first-order formulas with a fixed free variable: if ϕ is logically equivalent to ψ, we may conclude that ϕ ≡ ψ, at least if we presuppose – as we do now – that the meanings of concept terms are concepts and that these concepts are extensional or intensional entities rather than hyperintensional ones (cf. Carnap, 1947). This is all more or less straightforward.

Let us now investigate whether there is also some useful notion of meaning similarity for concept terms. Carnap (1961) states an example of how a relation of concept similarity can be introduced for natural concepts (in his case, these natural concepts were quality classes): two concepts X and Y are similar if and only if for every x ∈ X and every y ∈ Y, x is similar to y, where similarity for individuals is given by some similarity structure in the sense of definition 1. Meaning similarity of natural concept terms ϕ and ψ can thus be defined on the basis of the similarity of the concepts that they express (and there are many further ways of defining meaning similarity for natural concept terms; in fact, there are many definitions that are superior to Carnap's suggestion). But is there also a relation of meaning similarity for concept terms in general, not just natural ones, such that this relation (i) obeys compositionality postulates analogous to the ones stated above, (ii) respects logical equivalence, (iii) is a similarity relation in the sense of definition 1, but (iv) is not an equivalence relation, i.e., not transitive? As Fodor & Lepore (1999) have argued on the basis of examples from natural language, there is no compositional notion of meaning similarity. The following more general argument supports their thesis as far as unrestricted compositionality is concerned. Read 'ϕ ≈ ψ' as: concept term ϕ is meaning-similar to concept term ψ. We adopt as postulates for meaning similarity:

• (SIM) ≈ is a reflexive and symmetric binary relation (on a given, syntactically unconstrained language of general concept terms).
• (LOG) For all ϕ, ψ, ρ:
– If ϕ ≈ ψ and ψ is logically equivalent to ρ, then ϕ ≈ ρ.
– If ψ ≈ ϕ and ψ is logically equivalent to ρ, then ρ ≈ ϕ.
• (COM) For all ϕ, ψ, ρ, σ:
– If ϕ ≈ ψ, then ¬ϕ ≈ ¬ψ.
– If ϕ ≈ ψ and ρ ≈ σ, then ϕ ∧ ρ ≈ ψ ∧ σ.
– If ϕ ≈ ψ and ρ ≈ σ, then ϕ ∨ ρ ≈ ψ ∨ σ.
• (¬TRANS) ≈ is not transitive.

COM expresses the "softened" similarity-version of compositionality with respect to the negation, conjunction, and disjunction of concept terms. ¬TRANS states that transitivity of ≈ fails for at least some ϕ, ψ, ρ. Is there a relation of meaning similarity that obeys each of SIM, LOG, COM, and ¬TRANS? The following theorem states a negative answer:

Theorem 2. (Failure of unrestricted compositionality) No relation ≈ satisfies SIM, LOG, COM, and ¬TRANS simultaneously.

In other words: given the other postulates, meaning similarity does not cohere with compositionality concerning ¬, ∧, ∨; only meaning identity or meaning equality does. But SIM, LOG, ¬TRANS, and restrictions of COM may be shown to be satisfiable: if only one of the three clauses of COM is used as a compositionality postulate, then there are indeed examples of meaning similarity relations which satisfy SIM, LOG, ¬TRANS, and the given clause of COM. Therefore, meaning similarity for general concept terms can be partially compositional, though it cannot be completely compositional.
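The tension that Theorem 2 formalizes can be felt on a toy example (our own illustration, not the omitted proof): take propositional concept terms over two atoms, identify meanings with extensions (sets of valuations), and call two terms meaning-similar when their extensions differ at no more than one valuation. This relation satisfies SIM and LOG and is not transitive — yet the search below immediately finds a violation of the conjunction clause of COM:

```python
from itertools import product

VALS = list(product((False, True), repeat=2))  # valuations of atoms p, q

def ext(f):
    """The extension of a formula: the valuations at which it is true."""
    return frozenset(v for v in VALS if f(*v))

def sim(A, B):
    """Candidate meaning similarity: extensions differ at <= 1 valuation.
    Reflexive, symmetric, respects logical equivalence, non-transitive."""
    return len(A ^ B) <= 1

exts = {ext(f) for f in (
    lambda p, q: p,           lambda p, q: q,
    lambda p, q: not p,       lambda p, q: not q,
    lambda p, q: p and q,     lambda p, q: p or q,
    lambda p, q: p and not q, lambda p, q: q and not p,
    lambda p, q: True,        lambda p, q: False)}

for A, B, C, D in product(exts, repeat=4):
    if sim(A, B) and sim(C, D) and not sim(A & C, B & D):
        print("conjunction clause of COM fails for some quadruple")  # reached
        break
```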
5 Conclusions
We have made our way from similarity to natural concepts, from natural concepts to symbolic representations, and finally from symbolic representations to compositionality. As we have seen, different notions of similarity for individuals give us different notions of natural concepts, natural concept terms do not conform to unrestricted syntactic rules but may conform to restricted ones, and meaning similarity for concept terms does not obey unrestricted compositionality but it may respect restricted compositionality. While similarity and natural concepts make a perfect pair, similarity and general concepts do not. While the representation of natural concepts is not necessarily confined to the level of subsymbolic representations, their symbolic representations are subject to certain restrictions of syntax. While we can introduce principles of compositionality for meaning similarity, these principles have to be weaker than those for meaning identity. Similarity coheres with our usual understanding of compositionality, concepts, and cognition, but only given certain constraints that can (and have to) be made precise.
Acknowledgements

We want to thank an anonymous referee and the editors of this volume. This paper was supported by the Erwin-Schroedinger Fellowship J2344-G03, Austrian Research Fund FWF.

References

Berge, C. (1973). Graphs and hypergraphs. Amsterdam: North-Holland.

Berge, C. (1989). Hypergraphs. Amsterdam: North-Holland.

Carnap, R. (1923). Die Quasizerlegung – Ein Verfahren zur Ordnung nichthomogener Mengen mit den Mitteln der Beziehungslehre. (Unpublished manuscript RC-081-04-01)

Carnap, R. (1947). Meaning and necessity. Chicago: University of Chicago Press.

Carnap, R. (1961). Der logische Aufbau der Welt. Hamburg: Meiner.

Dunn, J. M. (1997). A logical framework for the notion of natural property. In J. Earman & J. Norton (Eds.), The cosmos of science (pp. 458–497). Pittsburgh: University of Pittsburgh Press.

Eberle, R. (1975). A construction of quality classes improved upon the Aufbau. In J. Hintikka (Ed.), Rudolf Carnap, logical empiricist (pp. 55–73). Dordrecht: Reidel.

Fodor, J., & Lepore, E. (1999). All at sea in semantic space: Churchland on meaning similarity. Journal of Philosophy, XCVI, 381–403.

Friedman, M. (1999). Reconsidering logical positivism. Cambridge: Cambridge University Press.

Gärdenfors, P. (1990). Induction, conceptual spaces, and AI. Philosophy of Science, 57, 78–95.

Gärdenfors, P. (2000). Conceptual spaces. Cambridge: MIT Press.

Goodman, N. (1951). The structure of appearance. Cambridge: Harvard University Press.

Goodman, N. (1972). Seven strictures on similarity. In Problems and projects (pp. 437–446). Indianapolis: Bobbs-Merrill.

Hirsch, E. (1993). Dividing reality. New York: Oxford University Press.
Kleinknecht, R. (1980). Quasianalyse und Qualitätsklassen. Grazer Philosophische Studien, 11, 23–43.
Leibniz, G. W. (1923). Sämtliche Schriften und Briefe. Berlin: Akademie Verlag.

Leitgeb, H. (2005a). Meaning similarity and compositionality. (Submitted)

Leitgeb, H. (2005b). Under what conditions does quasianalysis succeed? (Submitted)

Lewis, D. (1997). New work for a theory of universals. In D. Mellor & A. Oliver (Eds.), Properties (pp. 188–227). Oxford: Oxford University Press.

Matousek, J. (2002). Lectures on discrete geometry. New York: Springer.

Mormann, T. (1993). Natural predicates and topological structures of conceptual spaces. Synthese, 95, 219–240.

Mormann, T. (1994). A representational reconstruction of Carnap's quasianalysis. In M. Forbes (Ed.), PSA 1994 (Vol. I, pp. 96–104).

Mormann, T. (1996). Similarity and continuous quality distributions. The Monist, 79(1), 76–88.

Quine, W. (1969). Natural kinds. In Ontological relativity and other essays (pp. 114–138). New York: Columbia University Press.

Quine, W. (1973). The roots of reference. La Salle: Open Court.

Richardson, A. (1998). Carnap's construction of the world: The Aufbau and the emergence of logical empiricism. Cambridge: Cambridge University Press.

Vel, M. van de. (1993). Theory of convex structures. Amsterdam: North-Holland.

Zamfirescu, T. (1971/72). Trois caractérisations des ensembles convexes. Ist. Veneto Sci. Lett. Atti Cl. Sci. Mat. Natur., 130, 377–384.
The Structure of Thoughts

Menno Lievers

In recent years the philosophy of mind has been developed relatively independently from the philosophy of language. Only a few decades ago it was quite generally held that thought is linguistic: language was supposed to be both a vehicle for thought and a means of communication. As recently as 1981 Michael Dummett famously wrote that the following three claims jointly comprised 'the basic tenet of analytical philosophy': (i) an account of language does not presuppose an account of thought, (ii) an account of language yields an account of thought, and (iii) there is no other adequate means by which an account of thought may be given. (M. Dummett, 1981, p. 39) Due to the rise of cognitive science this idea has been abandoned. Thinking in language is now regarded as only a part of our cognitive apparatus. As a result of this development, new directions within the philosophy of language have been pursued without taking notice of what motivated philosophers to study language in the first place. Nowadays semantic theories about language are being developed (like Dynamic Predicate Logic) that simply ignore the fact that language serves as an expression of thoughts.¹ An important motivation for this development is a rejection of the claim that the underlying logic of language is predicate logic; an idea perhaps most prominently developed by Davidson, who required that a theory of meaning is compositional and that the compositional structure is that of predicate logic, as is the logical form of sentences.² It is striking, however, that at the same time it is still widely assumed within the philosophy of mind that the logic of thoughts is predicate logic. To give just one influential example: in Fodor's language of thought hypothesis, it is
¹ See f.i. H. Kamp and U. Reyle (1993), chapter 0; J. Groenendijk and M. Stokhof (1991).
² D. Davidson (1970/1984).
assumed without argument that the sentences of the language of thought can be analyzed in terms of predicate logic.³ A plausible explanation for this situation is to attribute it to the influence of Frege. As is well known, Frege tried to establish a tight connection between linguistic assertions and thoughts. In an act of assertion, according to Frege, the speaker judges that the thought expressed in the assertion is true. He then analyzed the thought, which he also referred to with the expression 'judgeable content', in terms of the mathematical notions 'function' and 'argument'. The mathematical function f(x) = x² gets a determinate value if we substitute an argument for the variable 'x', f.i. the value 9 for the argument 3. We ought to treat a predicate similarly, according to Frege. The predicate 'x is a philosopher' should be conceived of as a function that gets a determinate value, either the True or the False, if we substitute an argument, f.i. the name Socrates, for the variable. It has become a matter of some dispute whether Frege intended his analysis as one of thoughts or of sentences that express thoughts. Dummett defends the latter interpretation, which is supported by the famous footnote Frege added to his article 'Thoughts'.⁴ Others have contested Dummett's interpretation, most notably Evans and Burge, by pointing out that Frege always emphasized that his real interest was in the structure of thoughts and that he, as early as 1879 in his Begriffsschrift, ascribed the function/argument structure in the first instance to judgeable contents and not to sentences.⁵ Both sides follow Frege in maintaining that thoughts have the following four properties:
³ See J. A. Fodor (1975), pp. 79–97, J. A. Fodor (1987), pp. 135–154, J. A. Fodor (1994), passim, and J. A. Fodor (1998), pp. 1–22. For a rejection of Fodor's view that thoughts are structured see S. Schiffer (1991). The assumption that the logic of the mind is predicate logic is, of course, also rejected by philosophers who favour connectionist models of the mind, as f.i. P. M. Churchland (1989), chapter 5.
⁴ G. Frege (1918/1984), p. 360, f.n. 6: "I am not here in the happy position of a mineralogist who shows his audience a rock crystal: I cannot put a thought in the hands of my readers with the request that they should examine it from all sides. Something in itself not perceptible by sense, the thought, is presented to the reader – and I must be content with that – wrapped up in a perceptible linguistic form. The pictorial aspect of language presents difficulties. The sensible always breaks in and makes expressions pictorial and so improper. So one fights against language, and I am compelled to occupy myself with language although it is not my proper concern here. I hope I have succeeded in making clear to my readers what I mean by 'a thought'."
⁵ For this dispute about the proper interpretation of Frege, see M. Dummett (1981), chapter 3, M. Dummett (1986), M. Dummett (1988/1991), M. Dummett (1993b), chapter 13, T. Burge (1990), and G. R. Evans (1982), chapter 1.
1. thoughts are the carriers of truth and falsity,
2. thoughts are the objects of the propositional attitudes, such as belief and doubt,
3. thoughts can be shared, in the sense that the same thought can be judged to be true and talked about by different thinkers,
4. thoughts are composite, structured entities.

In this essay this interpretation of the concept of thoughts will be adopted. The debate between the different interpretations of Frege's philosophy is relevant, especially with respect to the fourth property, that thoughts are structured. If thought is linguistic, as Dummett claims, then the structure of sentences can be projected onto thoughts. The structure of thoughts can then be derived from the structure of sentences. If, however, the idea that thoughts are linguistic has to be rejected, which seems plausible, we need an independent justification for the claim that thoughts are structured. Is their structure derived from another medium, like pictures, or are thoughts intrinsically structured? Some kind of justification is required for the structure of thoughts, for it seems neither prima facie implausible nor impossible that corresponding to sentences there are unstructured thoughts, or thoughts with a logical structure far more complicated than that of good old predicate logic, as philosophers of language now claim with respect to linguistic utterances.

In this paper I examine, first, a well-known attempt to justify the claim that thoughts are intrinsically structured: Evans's justification of the Generality Constraint. I then compare this with a rival account, proposed by Peacocke. I end by suggesting that a naïve realist has no difficulty at all in providing a justification of the Generality Constraint, which is therefore a view that deserves serious consideration.
Evans’s Justification of the Generality Constraint
What could be the starting point for the justification of the claim that thoughts are essentially structured? In order to gain more clarity on this question, I have structured an informal argument for that conclusion into separate premises. Let us start by saying that a thought is the smallest unit of content that can be true or false. A thought about the external world is true if and only if it describes the way the world is. These thoughts are world-directed. In order to have a thought about an object in the world, we need to be able to identify and re-identify that object.
The notion of re-identification presupposes that we have a conception of that object as existing unperceived. When is a thought about an object true? If the object the thought is about indeed has the property that it is thought to have. Thus formulated, it seems that to entertain a thought about an object involves at least two separate ways of thinking: we need to be able to think about the object in a particular way (as the occupant of a particular place in time and space), and we need to think about that same object as the possessor of a particular property (quality). So when can we credit an individual with a true thought about an object? At the very least the subject must be able to think about that object and be able to identify it, and also must be able to think that it has that property.

This way of arguing for the claim that thoughts are intrinsically structured is basically Strawsonian. Strawson starts from the assumption that "the basic combination of subject and predicate" reflects some fundamental features of our thought about the world.⁶

In any ground-level linguistic expression of a judgement of our fundamental type we distinguish three functions: that of specifying the particular(s) concerned; that of specifying the general concept concerned (i.e. the general concept which the particular(s) is (are) judged to exemplify); and that of presenting particular(s) and general concepts as assigned to each other in such a way that you have a propositional combination, true if the particular (or pair, trio, etc.) exemplifies the concept, false if not. (P. F. Strawson, 1974, p. 22)

What distinguishes the informal argument cited above from Strawson's approach is that Strawson explains the structure of thoughts in terms of the structure of sentences that express these thoughts. As is well known, Evans has reversed this order of explanation in The Varieties of Reference, by interpreting Frege's notion of sense as a way of thinking about the reference.⁷ The reversal of the order of explanation implies, of course, that "the essential combination" of subject and predicate in linguistic propositions needs to be explained in terms of the structure of the thoughts that are expressed in these propositions. For this purpose Evans introduces his by now famous Generality Constraint.
⁶ P. F. Strawson (1959), Part II, P. F. Strawson (1970/1971), and P. F. Strawson (1974).
⁷ G. R. Evans (1982), chapter 1: "Frege".
[...], if a subject can be credited with the thought that a is F, then he must have the conceptual resources for entertaining the thought that a is G, for every property of being G of which he has a conception. (G. R. Evans, 1982, p. 104)

The Generality Constraint plays a crucial role in Evans's justification of the claim that thoughts are structured, as can be seen from the following quote. Evans writes:

With the Generality Constraint in mind, we may take a small step from our truistic starting-point, and say that in the case of a proposition of the form 'a is F', knowledge of what it is for it to be true must be the result of two pieces of knowledge, one of which can be equated with an Idea of an object, and the other with an Idea of property, or more familiarly, a concept. [...] An Idea of an object is part of a conception of a world of such objects, distinguished from one another in certain fundamental ways. For every kind of object, there is a general answer to the question 'What makes it the case that there are two objects of this kind rather than one (or three rather than two)?' (G. R. Evans, 1982, p. 106)

At this point we are in the following position. The claim that thoughts are essentially structured is based on the Generality Constraint. So if we want to justify that thoughts are essentially structured, we need a justification for the Generality Constraint. If the Generality Constraint can be justified, Evans seems to claim, we have all that is required for the thesis that thoughts are essentially structured.

Does this justification of the Generality Constraint achieve what it was meant to do? What we were after was a justification for the claim that all thoughts are essentially structured.⁸ The conclusion states that this is indeed the case. But is that general conclusion justified? It could be objected that all that has been justified thus far is the claim that thoughts of the form 'a is F' are essentially structured.⁹ The conclusion, however, is much more general and includes what might be called 'existential thoughts' of the form 'There exists at least one thing with property F'. What would Evans say, for instance, in response to Quine's insistence on the primacy of predicates?¹⁰ Presumably, Evans would defend Strawson's position that wherever thought concerns individuals the functions of 'this' and 'such' are mutually irreducible. Thoughts require designation as well as
8 G. R. Evans (1982), p. 102.
9 I owe this objection to an anonymous referee of this paper.
10 W. V. O. Quine (1950/1959), p. 218.
As it stands, this reply is not very satisfactory. For if it is a requirement on any thought that it involves the acts of both designating and predicating, then it seems that both the justification of the Generality Constraint and the claim that thoughts are essentially structured are based on that requirement. What is more, the structure of ‘existential thoughts’ is then still unaccounted for.
This leaves us with two options. First, we could argue that thoughts about objects have genealogical priority in our thinking, i.e. we learn to entertain thoughts about objects before we acquire the capacity to entertain ‘existential thoughts’. Because of this prior acquired ability, we are able to discern structure in ‘existential thoughts’. Secondly, we could try to construct a more general justification of the Generality Constraint; ‘more general’ in the sense that it is applicable not only to thoughts about objects, but also to existential thoughts.
Another route to a similar conclusion emerges if we observe that Evans’s justification of the Generality Constraint employs as crucial notions that of ‘the fundamental ground of difference of an object’ and that of a ‘fundamental Idea of an object’. These ideas are grounded in Wiggins’s sortal theory of identity.12 An Idea of an object is part of a conception of a world of such objects, distinguished from one another in certain fundamental ways. The most fundamental way that distinguishes an object from others is the fundamental ground of difference of that object. This fundamental ground of difference is a specific answer to the question what differentiates this object from all other objects. A subject possesses a fundamental Idea of an object if he thinks of it as the possessor of the fundamental ground of difference that it in fact possesses.
Given this explanation of what a fundamental Idea is, Evans clearly presupposes that the world is inhabited by objects that can be differentiated from one another in one and only one fundamental way. He therefore presupposes that there is a ready-made world out there, and his justification is a realist one. However, Evans inherits from Wiggins and Strawson a version of realism that could be called ‘conceptual realism’. It starts from the assumption that we possess thoughts and then proceeds by inquiring what the pre-conditions are for being able to possess thoughts about a mind-independent reality. Evans then justifies the Generality Constraint, the claim that thoughts are essentially structured, by considering thoughts about objects. But given his starting point from within the realm of thoughts, one could very well question why we ought to base the justification of the Generality Constraint on thoughts about objects.
11 Evans cites Strawson approvingly and emphasizes that he is concerned with an Idea of an object, which is to be capable of yielding indefinitely many thoughts about it. See G. R. Evans (1982), pp. 102–105.
12 D. Wiggins (1980) and (2001). I return to this influence below.
For, so one could claim, all our thoughts are structured, not just thoughts about objects. So again: either one ought to assign conceptual priority to thoughts about objects, or else the justification of the Generality Constraint should not be based on a fundamental level of thinking about objects. Peacocke chooses to follow the second route.
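Before examining that route, it may help to have the constraint before us in schematic form. The following rendering is a standard gloss, not Evans’s own formula: for a subject S, a singular Idea a, and concepts F and G in S’s repertoire,

∀S ∀a ∀F [CanEntertain(S, Fa) → ∀G (Possesses(S, G) → CanEntertain(S, Ga))]

Nothing below hangs on this notation; it merely records that the capacity for the thought Fa, together with possession of the concept G, suffices for the capacity for the thought Ga.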
2 Peacocke’s Justification of the Generality Constraint
Peacocke has criticized Evans’s approach, first, because it presupposes a questionable theory of identity and, secondly, because it is not general enough, in the sense that it cannot explain the structure of thoughts about objects that do not possess a fundamental ground of difference. He could have added that Evans’s account does not seem to be applicable to existential thoughts of the form ‘there exists at least one x that has property F’.13 His own justification of the Generality Constraint is given in terms of the notion of ‘knowing what it is for’. This notion is embedded in his theory of concepts. According to Peacocke, a thought is individuated by the possession conditions for its constituents, i.e. concepts. Those possession conditions state what conditions a thinker must meet in order to be credited with the possession of a particular concept. This theory has to be supplemented with an account of how concepts are combined into complete thoughts.14
Peacocke’s account starts with Frege’s dictum that sense determines reference. He uses this very requirement to explain how concepts are combined in one thought. The requirement, in his words, is:

Possessing a concept is knowing what it is for something to be its semantic value. (C. Peacocke, 1992, p. 43)

The fact that concepts possess a semantic value explains how the truth-value of a thought that results from combining two concepts is determined.
13 C. Peacocke (1992), “Appendix B: Evans’s derivation of the Generality Constraint: A Comparison”, pp. 231–235.
14 Although Peacocke explicitly situates himself in the Fregean tradition, it is a substantive question whether his theory of concepts can be faithful to Frege in all respects. In particular this concerns Frege’s analysis of thoughts in terms of functions and arguments or quantifiers over variables, which suggests an asymmetry between subject and predicate. Peacocke seems committed to the un-Fregean sounding claim that concepts can occupy the argument-place in a thought. However, this impression can be partly removed by realizing that ‘concept’ is employed by Peacocke as a translation of ‘mode of presentation’ or ‘sense’ (Frege’s ‘Art des Gegebenseins’ or ‘Sinn’) and not of Frege’s ‘Begriff’ to which predicates refer. Peacocke claimed in conversation that his theory of concepts could easily be expanded with an account of how concepts are combined in one complete thought. In this paper I assume that he is right in this.
Peacocke justifies the Generality Constraint by providing a Referential Explanation. This justification starts from two premises. The first is that attitudes are relations to complex contents, composed in a distinctive way from concepts possessed by the thinker. The second premise is that possessing a concept is knowing what it is for something to be its semantic value. Armed with these premises, we need to explain what it is for a thinker who has the thought Fa and the singular mode of presentation b to know what it is for the thought Fb to be true. If a thinker is capable of entertaining the thought Fa, then he has to know three things. First, he has to know, since he possesses the concept a, what it is for an arbitrary object to be the semantic value of a. Secondly, since he possesses the concept F, he must know what it is for an arbitrary object to be the semantic value of F. Thirdly, he must be able to grasp the semantic significance of the mode of combination of F and a. If a thinker possesses the concept b, he similarly has to know what it is for an arbitrary object to be the semantic value of b. Since the possession of any concept requires that a subject has the capacity to entertain propositional attitudes towards contents containing that concept, the subject also knows what it is to judge Fb. If he judges Fb, then he eo ipso aims at the truth when judging Fb. If he is able to aim at the truth when judging Fb, he must also be able to grasp the semantic significance of the mode of combination of F and b. In that case we cannot but conclude that the subject knows everything that is required for knowing what it is for the thought Fb to be true, so we have justified the Generality Constraint.15
Peacocke’s justification of the Generality Constraint is very abstract. This has two consequences. The first is a virtue, because the account is applicable not only to the name/bearer relation in the case of observable, concrete objects, but also to abstract and microscopic objects, and also, to use his own example, moments in time. It thus avoids the problems Evans’s account faces as a result of introducing the notion of a fundamental level of thought. Peacocke does have a point: why should we require, as Evans does, that our thoughts necessarily involve knowing the fundamental ground of difference of objects? Do we really need to know what ultimate sortal an object belongs to in order to be able to identify and re-identify that object?
It is instructive to try to reconstruct why Evans felt that thoughts have to be fundamental in order to be objective. The introduction of the Generality Constraint occurs in a chapter that is also a direct polemic with Dummett. As is well known, Dummett accused semantic realists of employing a verification-transcendent notion of truth in their theory of meaning. Consequently, according to Dummett, semantic realists cannot give a satisfactory account of how we manifest in our linguistic behaviour knowledge of the meaning of many sentences.16
15 Cf. C. Peacocke (1992), pp. 43–44.
Evans attempts to defend the realist’s contention that we possess a verification-transcendent notion of truth. Such a notion involves an absolute conception of the world. In such an absolute conception of the world each object possesses its fundamental ground of difference. Evans thus seems to have been motivated by the realist’s craving for objectivity, but from within the realm of thoughts. But why couldn’t a realist just start with the assumption that there is a world out there inhabited by structured entities?
This question brings us to the second consequence. Since Peacocke intends his account to remain neutral about the external world, he needs to provide us with an explanation of how we acquire the capacity to know what it is for something to be the semantic value of a concept.17 But that he does not provide. It is difficult to see how this can be achieved without presupposing the prototype situation of the referential name/bearer relation: names for observable entities. And indeed Peacocke suggests as much:

Some of the possession conditions presented earlier have a clause with the property that judging in accordance with that clause requires, in the most basic case, making a fundamental identification of the object of predication. [...] Judging in accordance with that clause involves identifying the object of predication in a perceptual-demonstrative way. In the most basic case, this provides an egocentric identification of the location of the object at the time of the judgement. I also emphasized in earlier chapters that the other clauses of these possession conditions ride on the back of the perceptual clause: these other clauses make reference to experiences of the sort mentioned in the perceptual case, but not vice versa. (C. Peacocke, 1992, p. 233)

If perceptual concepts are the most basic concepts of our conceptual repertoire, then we expect that to be reflected in an account of the structure of thoughts, not just in a clause, but also centrally in the framework of the explanation, unless we are being given very good reasons to give up that expectation.
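The mechanics of the Referential Explanation rehearsed above can be made vivid with a toy computational model. The sketch below is a mere illustration (the concept names, semantic values and extensions are invented, and nothing in Peacocke's text is couched this way): grasping a concept is represented as knowing its semantic value, and the mode of combination as a single evaluation rule; the capacity to evaluate Fa then automatically carries over to Fb, Ga and Gb, exactly as the Generality Constraint demands.

```python
# Toy model of the Referential Explanation (illustrative encoding only).
# Grasping a concept = knowing its semantic value; grasping predication =
# knowing how the values combine. Recombination then comes for free.

singular_concepts = {"a": "Socrates", "b": "Plato"}      # modes of presentation
predicative_concepts = {"F": {"Socrates"}, "G": {"Plato", "Socrates"}}

def evaluate(pred: str, subj: str) -> bool:
    """Mode of combination: the thought 'Fa' is true iff the semantic
    value of a falls under the semantic value of F."""
    return singular_concepts[subj] in predicative_concepts[pred]

# A thinker equipped to evaluate Fa thereby has all that is needed
# for Fb, Ga and Gb as well:
print([evaluate(p, s) for p in "FG" for s in "ab"])
# -> [True, False, True, True]
```

The point of the toy model is only that recombination requires no knowledge beyond what grasp of the constituent concepts and of the mode of combination already provides.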
16 M. Dummett (1976/1993a). R. G. Millikan (2000), pp. 180–184, makes a similar point.
17 That Peacocke intends his Referential Explanation of the Generality Constraint to be abstract is illustrated by the following quote: “[...] the Referential Explanation is independent of the kind of object referred to by the concepts quantified over in the constraint. It is of equal application to thoughts about material objects, numbers, sets, mental events, or anything else. As far as the explanation of the constraint is concerned, no subject matter has any explanatory priority.” (C. Peacocke, 1992, p. 44)
3 A Naive Realist Justification of the Generality Constraint
As we saw in the previous section, Peacocke questioned whether a fundamental level of thought is required in order to be able to credit someone with the capacity of referring to and thinking about an object. We do not need a fundamental, sortal concept in order to be able to refer to and think about an object. He provided an alternative justification of the Generality Constraint that did not require a fundamental level of thought. However, his account became so abstract that it did not do justice to the fact that our basic thoughts are intimately tied to our perceptions; it treated that fact as a mere afterthought. So the challenge that we now face is to come up with an explanation of the structure of thoughts that respects the fact that thought and perception are inextricably linked, without invoking the notion of a fundamental level of thought in the way Evans proposed.
In order to develop such an alternative, let us return to the informal argument for the Generality Constraint given above. The first premise in that argument was the following:

1. Let us start by saying that a thought is the smallest unit of content that can be true or false.

This premise might be true, but it invites the question whether thoughts are the only contents that can be true or false. The answer is that they are not, since there are other contents, like perceptual contents, that can be true or false as well.18 Let us focus on perceptual content. In perception objects are presented to us. It is the means through which our thought is connected with the world. We perceive objects that are presented to us in perception.19 This brings us to the second premise of the informal argument.

2. A thought about the external world is true, if and only if it describes the way the world is. These thoughts are world-directed.

The crucial notion in this premise is that of ‘world-directedness’. Is the ‘world-directedness’ of thoughts the same as that of perception? The answer is that it is not. The aboutness of perception requires an ongoing flow of information; the aboutness of thoughts does not. As Austen Clark has put it: “The ‘ofness’ of sensation is not the ‘ofness’ of thought.” (A. Clark, 2000, p. 115) This claim raises the question which of the two is the more fundamental notion. From the seventeenth century onwards the mainstream answer has been that only conceptual content has intentional content, which implies that experience, if it is about anything, has to be conceptual, an implication embraced by
18 See a.o. T. Crane (1992).
19 A point also stressed by A. Clark (2000), p. 115.
philosophers as diverse as Kant, Quine and McDowell. As is well known, this position has been challenged on the basis of the following arguments, which I repeat below because I want to connect them with the justification of the Generality Constraint. The main arguments are the following:

1) One should not confuse the content of a perceptual experience with a description of that content, just as one should not confuse an object with a description of that object.20 An indication of the truth of 1) is the observation that the contents of experiences are more finely grained than the concepts possessed by the experiencer.21 A further indication of the truth of 1) is that perceptions are resilient to conclusive counterevidence, which forces us to acknowledge that they are not beliefs.22

Perceptual content is a more primitive notion than that of conceptual content. We share it with animals and children.23

If these arguments are convincing, some obvious consequences for an account of thought follow. For we now have to say that

the propositional account of non-propositional content is based or grounded on that content, not simply caused by it. It has to be so grounded, in order to be an account of the thing of which it is an account. The cause of a description is not thereby its object. The object of a non-accidentally true description has both to cause and ground it. (M. R. Ayers, 2004, p. 251)

This view is supported by an account of knowledge of reference in which it is conscious attention that explains which non-conceptual, information-processing content is selected to enable a subject to have demonstrative thoughts about objects.24 Knowledge of reference is thus based on acquaintance with the object, on conscious attention to an object. Campbell writes:

Our grasp of the identity conditions of an object over time, or the boundaries of the object at a time, is grounded not in grasp of sortal
20 M. R. Ayers (2004), pp. 250–251.
21 C. Peacocke (1992), p. 67.
22 G. R. Evans (1982), pp. 123–124, T. Crane (1992).
23 F.i. C. McGinn (1989), p. 62.
24 J. Campbell (2002), pp. 13–16, and chapter 2.
concepts, but in the style of conscious attention we pay to the thing. And conscious attention to the object does not have to be focused by a grasp of sortal concepts; the various styles of conscious attention of which we are capable do not rely on our use of sortal concepts. Grasp of sortal concepts is a more sophisticated matter than the phenomena of reference and conscious attention.25

If that is correct, the implications for the justification of the Generality Constraint in terms of sortal concepts or fundamental Ideas at the fundamental level of thought are far-reaching. For such a justification seems to demand more sophistication on the part of the thinker than is required for the ability to entertain structured thoughts. Campbell concludes: “It seems evident that we cannot sustain this conception of a level of thought, more fundamental than the level of perceptual demonstratives, at which predicates of physical things are first introduced and explained.” (J. Campbell, 2002, pp. 109–113) However, he does not explicitly draw the obvious conclusion from his own argument that Evans’s justification of the Generality Constraint fails for the very reason that such a fundamental level of thought cannot be sustained, although it is implied by what he writes.
The reason that Evans’s justification of the Generality Constraint fails is thus his adherence to a fundamental level of thought. As mentioned above, what seems to have motivated Evans to adopt it is a vestige of conceptualism, probably under the influence of Wiggins. In Sameness and Substance the latter defends a sortal theory of identity according to which every object is individuated by the fact that it belongs to a natural kind and occupies a certain position in space and time. These are exactly the criteria an Idea of an object has to satisfy in order to qualify as a fundamental Idea of that object. In the course of expounding his views Wiggins defends the thesis that realism and conceptualism are compatible; in a later essay this thesis is supported by a reference to Leibniz:

In Leibniz’s account of ordinary human knowledge, a clear idea of horse is not an image or a likeness of horse. It is that by the possession of which I recognize a horse when I encounter one. [...] A clear idea of horse is confused or non-distinct if, even though I can recognize a horse when I encounter one, I cannot enumerate one by one the marks, which are sufficient to distinguish that kind of thing from another kind of thing. My understanding is simply practical and deictic. What I possess here I possess simply by having been brought into the presence of the thing. [...] Our idea of horse will begin to become distinct as we learn to enumerate the marks that flow from the nature of a horse
25 J. Campbell (2002), p. 83.
and that distinguish a horse from other creatures. (D. Wiggins, 1994, p. 213)

Conceptual realism thus requires an active role of concepts in the identification of objects, and at the same time demands a contribution of the object to be identified in that process. Wiggins emphasizes that duality:

[...] the object is there anyway, even if it took a particular sort of empirically and logically constrained looking to light up there. The mind conceptualizes an object that is there to be conceptualized, even as the object impinges upon the mind that has the right thought to single out that object. (D. Wiggins, 1986, p. 180)

Evans has not adopted Wiggins’s terminology, but the framework of conceptual realism is clearly visible in his account of an information-based particular thought about an object. Such a thought involves a duality of factors:

[...] on the one hand, the subject’s possession of information derived from an object, which he regards as germane to the evaluation and appreciation of the thought; and on the other hand, the subject’s [...] identification of the object which his thought concerns. (G. R. Evans, 1982, p. 138)

So even though Evans’s work provided many of the ingredients that enabled philosophers to develop theories about non-conceptual content, in his own work that notion mainly played a role in a causal explanation of how we are able to locate in space and time an object, belonging to a particular natural kind, of which we already possess a conception, thus essentially remaining faithful to conceptual realism.
If what motivated Evans to introduce the notion of a fundamental level of thought is no longer credible, then there is less urgency to defend it on independent grounds. And if the arguments against a fundamental level of thought are convincing, then we need an alternative justification for the Generality Constraint and for the idea that thoughts are structured. In earlier work Campbell has provided an alternative account of why thoughts are structured.26 However, his own recent work suggests yet another alternative. Campbell argues at some length that conscious attention to an object unites the different features our senses register as coming from one location. Conscious attention thus presents an object to our thinking as a material unity with different qualities. Subsequently we can apply concepts to that object and refer to it.
26 J. Campbell (1986).
It could be objected that this account focuses on perceptual identification and that an account and an explanation of reference ought to be more general, as Peacocke argues. But there is a simple rebuttal of this objection (already mentioned above): the notion of reference is acquired through the prototype of the reference relation, which is the name/bearer relation. Even the most prominent anti-realist has to resort to the standard situation of referring to medium-sized objects in order to explain what reference is.27 That means that there is a conceptual asymmetry between the concept of reference when applied to medium-sized objects and the concept of reference when it is used for referring to, for instance, deceased or abstract objects, in the following sense: the notion of reference when applied in the case of terms for abstract objects is explained in terms of reference to concrete medium-sized objects, but not vice versa.28 The question of what justifies this conceptual asymmetry then arises, and the answer surely must be that our notion of reference is based on a realist conception of the world around us. So we can derive from Dummett’s remark the following transcendental argument:

1. We possess a notion of (successful) reference.

2. A precondition for the possibility of possessing the notion of referring to objects is that we are able to refer to medium-sized objects in our vicinity.

3. Therefore: we are able to refer to medium-sized objects in our vicinity.

Our capacity to refer thus requires us to acknowledge the existence of medium-sized objects. The question then arises what explains our capacity to refer successfully to these objects, and the answer must be: through experience. In Campbell’s words: “concepts of individual physical objects, and concepts of the observable characteristics of such objects, are made available by our experience of the world. It is experience of the world that explains our grasp of these concepts.” (J. Campbell, 2002a, p. 128) The explanatory role experience plays is that knowledge of the reference of a term in the prototype situation (for instance, when a person is introduced to a hearer by a speaker with a phrase like “This is Miss so and so”) demands an ability to locate the object referred to. The ability to locate that object involves the binding of features located at the place occupied by the object and registered by our senses and our informational systems.
27 M. Dummett (1973/1981), several places, f.i. pp. 406–408.
28 As Peacocke also suggests in the passage quoted above from his (1992), p. 233.
Thus described, the ability to locate an object, which enables a thinker to determine whether his or her reference to an object has been successful, involves a capacity to unite different features into a single object. This cognitive capacity of binding features ought to be distinguished from the capacity to make a distinction between features and the thing they are features of. It is a more primitive capacity than the capacity to predicate, yet it is crucial for an account of the structure of thoughts.
The argument for the claim that thoughts are structured now proceeds as follows. The first step is to accept that we have to assign conceptual priority to the name/bearer relation in the case of middle-sized objects. That is the prototype situation in which we acquire the notion of reference. Secondly, in that prototype situation a thinker has a thought about that middle-sized object in front of him or her. That thought is true if and only if the object the thought is about indeed has the property the thought assigns to it.
Traditionally, the explanation of how thoughts were verified followed the Fregean model. “The two anchors of Frege’s semantics” are truth and reference.29 The relationship between truth and reference was thought to be reciprocal. Only if the proper name successfully refers to an object can the question whether the thought is true arise. Conversely, asking whether the thought is true forces a thinker to find out whether the proper name refers. Campbell’s explanation of knowledge of reference demands a modification of Frege’s model. Knowledge of reference ought to be distinguished from finding out whether a demonstrative or a proper name occurring in a thought refers. The identification of the object is more fundamental than its re-identification and recognition. We first have to lower the first anchor, reference, and then we can start to entertain thoughts about the object. Thoughts require pre-predicative experience.30 Campbell argues convincingly that we ought to distinguish between

(1) using an object’s possession of a property to single it out visually, and

(2) verifying a proposition to the effect that the object has that property.31
29 The phrase “The two anchors of Frege’s semantics” is Evans’s. See G. R. Evans (1982), p. 9.
30 The phrase ‘pre-predicative experience’ is borrowed from Husserl, who also argued for the need to acknowledge the existence of several different levels of content in sense perception. It would be interesting to compare Husserl’s account with contemporary theories of perception in which the notion of non-conceptual content is employed. I intend to do so elsewhere. See E. Husserl (1939/1985), Abschnitt I.
31 J. Campbell (2002), pp. 28–34.
The first is a pre-conceptual cognitive activity, the second a conceptual one. The distinction between these two cognitive activities thus explains why thoughts are structured, much in the way Aristotle once suggested: there is something immediately present in front of us (hypokeimenon) that has certain properties.32
This brings us back to the two options referred to at the end of the first section. After having concluded that Evans’s justification of the Generality Constraint is unsatisfactory, I argued that we could either defend the view that thoughts about objects have genealogical and conceptual priority in our thinking, or try to provide a more general justification of the Generality Constraint. Peacocke adopted the second alternative, which, I argued, was unsatisfactory because it did not assign a central role to perception. In this section I have constructed a justification of the Generality Constraint on the basis of Campbell’s account of knowledge of reference. The first option involved the challenge to meet the demand for a justification of the claim that thoughts about objects have priority. Campbell’s account of knowledge of reference provides us with a justification for basing our justification of the Generality Constraint on thoughts about objects. At the first stage of our conceptual development our thoughts have to be object-involving. The ability to entertain these thoughts is structured, as described in this section. Once we have acquired this ability, we are in a position to entertain more abstract thoughts, like ‘existential thoughts’.33
4 Conclusion
After the linguistic turn in philosophy, the structure of thoughts could be explained by and derived from the structure of sentences, since it was thought to be
32 For an account of Aristotle’s theory of demonstrative thoughts see E. Tugendhat (1958/1982).
33 This position implies two claims that need to be developed and defended elsewhere. First, contrary to what Dummett and Strawson have claimed, so-called feature-placing language cannot be the first stage in our conceptual development. See M. Dummett (1973/1981), pp. 76–77, pp. 228–232, 551; P. F. Strawson (1959), p. 202. Secondly, our first thoughts have to be Russellian in the sense that Evans described as follows: “[...] demonstrative thoughts about objects [...] are Russellian. If there is no one object with which the subject is in fact in informational ‘contact’ – if he is hallucinating, or if several different objects succeed each other without his noticing – then he has no Idea-of-a-particular-object, and hence no thought.” See G. R. Evans (1982), p. 173. It is a substantive issue whether this genealogical claim commits me to the claim that all our thoughts about objects need to be Russellian, or whether I could rest content with the claim that the structure of, f.i., hallucinations is parasitic upon the structure of Russellian thoughts. For a discussion of Russellian thoughts see a.o. P. Carruthers (1987) and (1988), G. McCulloch (1988), H. Noonan (1986), M. Pendlebury (1988).
definitive of the linguistic turn that thought is linguistic. If that thesis is rejected, as many feel it ought to be, the claim that thoughts are structured demands a new justification that does not rely on the structure of sentences. I have examined attempts to justify the structure and compositionality of thoughts that take as their point of departure the so-called Generality Constraint. Evans’s account, which was the first such attempt, failed because it relied on the notion of a fundamental level of thought, which is too demanding: first, its explanans seems more sophisticated than its explanandum, the capacity to entertain structured thoughts; secondly, it is incapable of accounting for the structure of many thoughts that do not involve a fundamental way of thinking, yet are clearly structured. I then examined Peacocke’s justification. Although it is in many ways congenial to the spirit of the proposal I favour, I concluded that its main structure, based on the notion of ‘knowing what it is for’, is too abstract, in particular because it does not assign a central role to perception, even though Peacocke states that this is the basic case. In Campbell’s account of knowledge of reference, perception is treated as fundamental to the ability to entertain conceptual thoughts and to the development of that ability. He treats knowledge of reference as a pre-conceptual cognitive capacity that enables us to entertain thoughts about objects presented to us in perception. If that is correct, an extremely straightforward justification of the Generality Constraint and the structure of thoughts presents itself: it is simply that the structure of thoughts mirrors the structure of reality. There is simply no other way to think about reality.
5 Acknowledgements
I would like to thank an anonymous referee for several penetrating comments on a previous draft of this essay.

References

Ayers, M. R. (1991). Locke. Routledge.
Ayers, M. R. (2001). What is realism? Proceedings of the Aristotelian Society, S. V. 74.
Ayers, M. R. (2004). Sense experience, concepts and content – objections to Davidson and McDowell. In R. Schumacher (Ed.), Perception and reality from Descartes to the present (pp. 239–262). Mentis.
Bell, D. (1996). The formation of concepts and the structure of thoughts. Philosophy and Phenomenological Research 56. Burge, T. (1990). Frege on sense and linguistic meaning. In D. Bell & N. Cooper (Eds.), The analytic tradition: Meaning, thought and knowledge (pp. 30– 60). Blackwell, Oxford. Camp, E. (2004). The generality constraint and categorical restrictions. Philosophical Quarterly 54, 209–231. Campbell, J. (1986). Conceptual structure. In C. Travis (Ed.), Meaning and structure (pp. 159–174). Basil Blackwell, Oxford. Campbell, J. (2002). Reference and consciousness. Clarendon Press. Campbell, J. (2002a). Berkeley’s puzzle. In T. S. Gendler & J. Hawthorne (Eds.), Conceivability and possibility (pp. 127–143). Clarendon Press, Oxford. Carruthers, P. (1987). Russellian thoughts. Mind 96, 18–35. Carruthers, P. (1988). More faith than hope: Russellian thoughts attacked. Analysis 48, 91–96. Churchland, P. M. (1989). Some reductive strategies in cognitive neurobiology. In P. M. Churchland (Ed.), A neurocomputational perspective. The nature of mind and the structure of science (pp. 77–110). MIT Press. Clark, A. (2000). A theory of sentience. Oxford University Press. Crane, T. (1992). The nonconceptual content of experience. In T. Crane (Ed.), The contents of experience. Essays on perception (pp. 136–157). Cambridge University Press. Cussins, A. (1998). Subjectivity, objectivity and frames of reference in Evans’s theory of thought. EJAP. Davidson, D. (1970/1984). Semantics for natural languages. In D. Davidson (Ed.), Inquiries into truth & interpretation (pp. 55–64). Clarendon Press, Oxford. Dummett, M. (1973/1981). Frege. Philosophy of language. Duckworth. Dummett, M. (1976/1993a). What is a theory of meaning? (II). In M. Dummett (Ed.), The seas of language (pp. 34–93). Clarendon Press, Oxford. Dummett, M. (1981). The interpretation of Frege’s philosophy. Duckworth, London.
Dummett, M. (1986). The philosophy of thought and the philosophy of language. In J. Vuillemin (Ed.), Mérites et limites des méthodes logiques en philosophie (pp. 141–155). J. Vrin.
Dummett, M. (1988/1991). The relative priority of thought and language. In M. Dummett (Ed.), Frege and other philosophers (pp. 315–324). Oxford University Press.
Dummett, M. (1993b). Origins of analytical philosophy. Duckworth, London.
Evans, G. (1982). The varieties of reference. Clarendon Press.
Fodor, J. A. (1975). The language of thought. The Harvard University Press.
Fodor, J. A. (1987). Psychosemantics. The problem of meaning in the philosophy of mind. MIT Press.
Fodor, J. A. (1994). The elm and the expert. Mentalese and its semantics. MIT Press.
Fodor, J. A. (1998). Concepts. Where cognitive science went wrong. Clarendon Press.
Frege, G. (1879/1977). Begriffsschrift und andere Aufsätze. Wissenschaftliche Buchgesellschaft.
Frege, G. (1891/1980). Funktion und Begriff. In G. Patzig (Ed.), Funktion, Begriff, Bedeutung. Vandenhoeck & Ruprecht, Göttingen.
Frege, G. (1918/1976). Der Gedanke. In G. Patzig (Ed.), Logische Untersuchungen (pp. 30–53). Vandenhoeck & Ruprecht, Göttingen.
Gibson, M. I. (2004). From naming to saying. The unity of the proposition. Blackwell.
Groenendijk, J., & Stokhof, M. (1991). Dynamic predicate logic. Linguistics and Philosophy 14.
Husserl, E. (1939/1985). Erfahrung und Urteil. Felix Meiner Verlag.
Johnson, K. (2004). On the systematicity of language and thought. Journal of Philosophy 101, 111–139.
Kamp, H., & Reyle, U. (1993). From discourse to logic. Kluwer.
McCulloch, G. (1988). Faith, hope and charity: Russellian thoughts defended. Analysis 48, 85–90.
McGinn, C. (1989). Mental content. Basil Blackwell.
Millikan, R. G. (2000). On clear and confused ideas. Cambridge University Press.
Noonan, H. (1986). Russellian thoughts and methodological solipsism. In J. Butterfield (Ed.), Language, mind and logic (pp. 67–90). Cambridge University Press.
Peacocke, C. (1992). A study of concepts. MIT Press.
Pendlebury, M. (1988). Russellian thoughts. Philosophy and Phenomenological Research 48, 669–682.
Quine, W. V. O. (1950/1959). Methods of logic. Holt, Rinehart and Winston, New York.
Schiffer, S. (1991). Does mentalese have a compositional semantics? In B. Loewer & G. Rey (Eds.), Meaning in mind. Fodor and his critics (pp. 181–199). Basil Blackwell, Oxford.
Strawson, P. F. (1959). Individuals. Methuen & Co. Ltd.
Strawson, P. F. (1970/1971). The asymmetry of subjects and predicates. In P. F. Strawson (Ed.), Logico-linguistic papers (pp. 96–115). Methuen & Co, London.
Strawson, P. F. (1974). Subject and predicate in logic and grammar. Methuen & Co. Ltd.
Tugendhat, E. (1958/1982). Ti Kata Tinos. Eine Untersuchung zu Struktur und Ursprung aristotelischer Grundbegriffe. Verlag Karl Alber.
Wiggins, D. (1980). Sameness and substance. Basil Blackwell.
Wiggins, D. (1986). On singling out an object determinately. In P. Pettit & J. McDowell (Eds.), Subject, thought and context (pp. 169–180). Clarendon Press.
Wiggins, D. (1994). Putnam’s doctrine of natural kind words and Frege’s doctrines of sense, reference, and extension: Can they cohere? In P. Clark & B. Hale (Eds.), Reading Putnam (pp. 201–215). Basil Blackwell, Oxford.
Wiggins, D. (2001). Sameness and substance renewed. Cambridge University Press.
Intensional Epistemic Wholes: A Study in the Ontology of Collectivity

Alda Mari
1 Groups and Kinds of Wholes
In formal ontology,1 for both concrete and abstract objects, the principle of compositionality amounts to the statement that the constitution and the representation of the whole is a function of the constitution and the representation of the parts and the way they are assembled. In this paper we analyze the notion of group, which we formally treat as an abstract whole whose parts are its members. Our investigation is based on natural language data, in particular plural (the boys) and conjoined noun phrases (John and Mary, the boy and the girl, the boys and the girls) in relation to distributive or singular predication (i.e. predicates that denote only singular atoms, such as walk, be nice, be in some place...).
The notion of group2 has been examined extensively in the literature on plurality in recent years. Major advances in this domain have shown that formal ontology provides useful means for understanding this notion.3 On the other hand, natural language data can lead us to revise existing models for part-whole structures and to elaborate new ones.

Address for correspondence: CNRS-ENST, 46, rue Barrault, 75013 Paris, France. E-mail: [email protected].
1 Following the doctrine (Husserl, 1901), by “formal ontology” we refer to the study of relations amongst objects and their parts, independently of their nature.
2 This notion is meant by various labels. The most popular are collection, set, whole, integrated whole, group atom. However, they profoundly differ in the way they capture it. The first two roughly refer to groups as sets without unity; the others emphasize their monadic character. However, as we show in this paper, only a close inspection of the theories within which they are defined can do justice to these differences.
3 See in particular the works of (Lasersohn, 1995), (Moltmann, 1997), (Schwarzschild, 1996) and (Landman, 2000), explicitly applying formal ontology techniques to the analysis of plurality.
From the theoretical point of view, our aim is to question the principle of compositionality in two respects. First, with respect to the parts, one has to state clearly under what conditions two ontologically individuated objects can be composed into a whole.4 Secondly, with respect to the whole, one has to predict the nature of the object resulting from the composition of some particular parts and a specific mode of assembling.
On the phenomenological side, we analyze a kind of collective interpretation that we label “collectivity as dependence,” from now on CODEP. This case poses a real challenge to existing theories of mereology and ultimately to compositionality in the domain of abstract objects. Our claim is that none of the foundational models for part-whole relations is able to explain this phenomenon, and a new conception of whole has to be worked out.
Let us introduce the phenomenon of CODEP by two examples. First of all, to capture this notion, one has to distinguish it from that of “juxtaposition.” Consider the propositional content of the following sentence.

(1) John and Mary are walking along the beach

The scene described in (1) is such that there are two people walking side by side along the same trajectory. This scene can be interpreted in two ways: (i) juxtaposition or genuine distributive interpretation – two people are accidentally walking side by side, or (ii) CODEP – two people are walking together (as a group) along the same trajectory. The first interpretation appeals to a minimal amount of composition, which only consists in acknowledging that two people satisfy the same description. The second interpretation appeals to a real process of composition, which leads one to see the two people as forming a whole, or a group.
Secondly, to understand CODEP, one has to distinguish it from the notion of “collective responsibility.” Consider (2).

(2) The boys sing

This sentence is comparable to (1) in that the predication is singular. Three interpretations are immediately available:

(i) juxtaposition or genuine distributive interpretation: each of the boys sings, but they are not coordinating their singings (= (1i));5
4 The parts of a whole are not always seen as ontologically independent, especially with respect to organisms and concrete objects, as Aristotle states in the De Anima, II 1–2. In the case of group members, they certainly are.
5 We discard here cases of distribution to collections in which subcollections of the set denoted by the plural NP are individuated as the proper loci for the application of the property. In this case, for instance, the boys would be separated into two – or more – subcollections and the property sing would be predicated of these subgroups. Once the proper level of individuability has been chosen, the property is distributed in the same way as in cases of distribution to genuine individuals. This procedure requires that a huge amount of information be supplied contextually, but seems to be explainable on an extensional basis (Gillon, 1987; Schwarzschild, 1997), in the same way as distributivity to non-group atoms. This phenomenon is to be predicted correctly by any model of plurality, and it is by our model. However, this would require a long and overly detailed explanation and is outside the scope of this paper.
(ii) collectivity as dependence interpretation: each of the boys necessarily sings, and they are all coordinating their singings with one another (= (1ii));

(iii) collective responsibility interpretation: all of them are not necessarily singing, but there is a collective responsibility, insofar as they are intended to form a chorus.

The major difference between (ii) and (iii) is that under (iii) the boys are expected to form an entity “chorus,” while in (ii) it is by virtue of coordinating their singings that they are conceived as a group. This group does not necessarily have a status independent of the actual coordination of singings. This points to the fact that the second interpretation is somehow in between the two others: it requires that each of the boys sings (= (i) and ≠ (iii)), and that they do coordinate their singings (= (iii) but ≠ (i)).
It is clear, then, that a proper mereological theory will have to predict, first, under what conditions, ceteris paribus, the CODEP interpretation is enhanced in some but not all cases: provided that the scene under the distributive and CODEP interpretations is exactly the same, one has to explain by virtue of what interpretative process the second interpretation arises (difference between interpretations (i) and (ii) of (1) and (2)). Secondly, the model will have to make explicit the nature of the whole that results from this composition (difference between (ii) and (iii) of (2)).6

6 The “collective responsibility” interpretation seems difficult for (1), unless one admits that in the case where one person carries the other on her shoulders they can nonetheless be said to be walking “as a group.” This seems an unacceptable interpretation to most speakers. Another case is more challenging, that in which one of the people is handicapped and moves in a wheelchair pushed by the other person. One could nevertheless respond that the wheelchair provides a way of moving and that this is not collective responsibility, but a CODEP interpretation according to which the two people have to adjust their movements to one another and no entity “group” pre-exists the mere event of walking. It is thus not unanimously recognized that “collective responsibility” is a possible interpretation for the predicate walk, and because its availability would at most be added to the CODEP interpretation, without replacing it, we prefer not to commit ourselves to its existence.
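To fix ideas, the three readings of (2) can also be given rough logical glosses. The schemas below are our own expository shorthand, anticipating the notation introduced in section 2 (where ⊔ is mereological sum, * pluralizes a predicate, σ yields the maximal plurality and ↑ forms a group atom); they are not part of the formal proposal:

(i′) σ(*BOY) ∈ *SING, with no further relation among the individual singings;
(ii′) σ(*BOY) ∈ *SING, and each boy’s singing counterfactually covaries with the others’;
(iii′) ↑(σ(*BOY)) ∈ SING, whether or not every individual boy sings.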
There exist different conceptions of part-whole relations in the light of which these data have been explained. In a very recent study, Meirav (2003) has shown that they can be reduced to two foundational ways of representing wholes, either as sums or as unities, and that, when they are conceived as unities, one can either explain their monadicity by the dependence relations linking the parts,7 or consider them as ontological primitives.8
In the first part of this paper, in section (2), we show that none of these conceptions of wholes can explain the notion of CODEP. On the phenomenological side, we show that two features characterize this case. First, with respect to the parts, one can easily state that they are linked by coherence – or dependence – relations: when walking “together,” if one of the people turns, the other will turn too; if one stops, the other will stop too; and so on. Secondly, with respect to the nature of the whole, one can equally easily state that this is neither the sum of its parts nor a unity existing above them: there is no common walk over and above the two distinct walks of the two people. On the theoretical side, we consider in turn models of wholes as sums and wholes as unities and analyze some formal implementations. We argue that theories of wholes as sums cannot capture the first feature, and that theories of wholes as unities would compel us to accept the existence of a whole above the parts, which does not fit the case of CODEP. Our conclusion is that another notion of whole is needed, one that we will call wholes as networks.
In the second part of the paper, in section (3), we first present the notion of whole as network informally, and then elaborate a formal account. The definition of wholes as networks is designed to grasp the fact that, under the CODEP reading, the parts are seen as related via their properties, and that this relation takes the form of an inferential constraint. From a theoretical point of view, this corresponds to counterfactual reasoning: the cognitive agent makes predictions about the possible evolutions of the entities, and, in the case where a covariation is foreseen, the entities are conceived as entering a network. In the case where the entities are only observed as covarying, and there is no counterfactual reasoning relating them within a network, the distributive reading arises. As a result of the inferential constraint by which the agent counterfactually relates properties of individuated objects, the whole is nowhere else but in the network of related entities.

7 This conception goes back to Aristotle; see Metaphysics, especially book Z, chs. 10, 17; book H, ch. 6.
8 This conception goes back to Plato; see Theaetetus 202e–205e, and particularly 203c–205a.
Under this account, the distinction between accidental vs. non-accidental association is expressed in terms of “observation” vs. “prediction” of a covariation of the properties of the parts. In so far as the parts are thought of as covarying, they can be thought of as forming a whole. The whole, on its side, is the structure resulting from this covariation.
In section (4) we extend our model to cases of stative eventualities, showing that the notion of dependence and property constraint is strictly correlated to that of acquisition of information. We show that, whenever the application of the constraint on properties does not bring new information, the collective interpretation is suspended.
Our conclusion from a theoretical perspective (section (5)) is that the notion of whole as network, going beyond extensionality, can conciliate, as a middle term, two competing notions: holism and compositionality.
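The distinction between observing and foreseeing a covariation can be given a rough computational paraphrase. The following sketch is our own construction, for illustration only (the formal account comes in section 3): it tests for CODEP by intervening counterfactually on one walker and asking whether the other is predicted to follow.

```python
# Rough paraphrase of the contrast between observed and foreseen
# covariation (illustration only). Trajectories are lists of moves
# such as "straight" or "left".

def covaries(path_a, path_b):
    """Mere observation: the two trajectories happen to coincide."""
    return path_a == path_b

def codep(predict_b):
    """Counterfactual test: intervene on walker a and ask whether
    walker b is predicted to follow. If so, the two form a network."""
    intervened_a = ["straight", "straight", "left"]
    return predict_b(intervened_a)[-1] == "left"

def stranger_b(path_a):
    # A stranger's path is fixed independently of a's moves.
    return ["straight", "straight", "straight"]

def companion_b(path_a):
    # A companion adjusts: b's moves covary with a's by prediction.
    return list(path_a)

same = ["straight"] * 3
print(covaries(same, same))   # True in both scenarios: extensions agree
print(codep(stranger_b))      # False: covariation is merely observed
print(codep(companion_b))     # True: covariation is foreseen; CODEP reading
```

The design point is that the two scenarios are extensionally identical; only the counterfactual test, which consults a prediction rather than a record, separates them.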
2 Kinds of Wholes
There exists a huge amount of literature about the notions of wholes as sums and of wholes as unities.9 In this section, we recall the foundations of these two conceptions, consider recent linguistic theories that rely on them, and show, in turn, why none of these conceptions is appropriate to explain CODEP. We conclude this section by identifying the specificities of this case, which make it resist the existing explanations.

Wholes as sums

One way of conceiving wholes is as sums. Theories that subscribe to this conception generally use operations of set union (or, in mereological terms, mereological sum) over a domain of eventualities or individuals and recognize the axioms of closure (3a) and uniqueness (3b). Let A be a finite set, and ⊔ a finitary operation of union.

(3a) Closure: A is closed under the operation ⊔, i.e. for any a, b ∈ A there is an element c ∈ A such that a ⊔ b = c
9 See, among many others, (Russell, 1903) for the contrasting distinction between sums and unities, called, respectively, “aggregates” and “unities”; see (Frege, 1884), (Goodman, 1951), (Leśniewski, 1916), (Lewis, 1986) for conceptions of wholes as sums; see (Nagel, 1952), (Simons, 1987), among many others, for theories of wholes as unities. The notion of unity is strictly related to and largely inspired by that of “organic unity” (Husserl, 1901), and shares its essential features with that of “Gestalt” (Wertheimer, 1925).
(3b) Uniqueness: If a = a′ and b = b′ then a ⊔ b = a′ ⊔ b′

On this view, groups are not considered as being of a different nature from their members (they are nothing else than their members). In the mereological literature, the term union (of individuals and of sets) corresponds to that of mereological sum (of parts), and it is generally defined in terms of overlapping.10

(4) u is a sum of x1, x2, ..., xn =def for all ys, y overlaps u if and only if y overlaps one of the xs

From the perspective of compositionality, these facts point to the following one: there is no other entity besides the sum of the parts, and sum is the only “way of composition.” This means that “sum” and “compose” are the very same operation.11

(5) x1, x2, ..., xn compose u =def u is a sum of x1, x2, ..., xn

This means that there are no extra relations other than the sum of the parts that compose a whole. It follows that wholes as sums are perfectly coextensive with their parts:12

(6) Coextensive determination. Wholes are coextensively determined if and only if for all us, for all vs, for any xs, for any ys, if u is a whole which corresponds to the xs, v is a whole which corresponds to the ys, then u is identical to v only if the xs are coextensive with the ys.

Finally, the notion of sum brings with it the principle of universal existence of sums:

(7) Universal existence of sums. Whenever we specify individuals, some individual exists which is a sum of those individuals.

These axioms ground Lasersohn’s (1995) definition of the togetherness effect, which, if appropriate, should generate CODEP in all and only the appropriate contexts. This definition is stated, in its original form (Lasersohn, 1995, p. 190), as in (8). Let e, g, P be event, group and property variables; {x1, x2, ..., xn} denotes the group consisting of x1, x2, ..., xn.
10 See, among others, (Simons, 1987, p. 73), (Moltmann, 1997, p. 12) and, more recently, (Meirav, 2003, p. 39sqq.). The statements of principles (4), (5) and (6) are from Meirav (ibid.).
11 Again, see (Simons, 1987, p. 73sqq.), (Moltmann, 1997, p. 12sqq.) and, more recently, (Meirav, 2003, p. 40).
12 See (Simons, 1987, ch. 2), (Moltmann, 1997, ch. 1) and (Meirav, 2003, p. 224).
(8) First condition for group formation.
λP, e, g [g ∈ P(e) & ∀e′ ⊑ e (∃x (x ∈ P(e′)) ⇒ P(e′) = P(e))]

This condition states that a group g is the set of people that satisfies a property in each proper and improper part of the collective event. No other entity can be added in any subpart of e. So a group g is the minimal set satisfying property P. A corollary requirement is that the property does not have to be distributed to the members constituting the group (Lasersohn, ibid.).

(9) Principle of no redistribution. A group g has a property P together in eventuality e iff e has a smaller eventuality e′ as a part, such that g has P in e′, and e′ does not have parts such that the members of g have P in those parts.

Consider (10) and its interpretation (11):

(10) John and Mary are carrying the piano upstairs
(11) ∃E, ∃e1 ⊑ E [carry({j, m}, e1)]

According to the Davidsonian (Davidson, 1990) view of events endorsed by Lasersohn (Lasersohn, 1995, p. 191), the assertion of the sentence involves demonstrative reference to a particular eventuality (the collective event E). This eventuality, in the situation we have described, consists of one sub-event, e1, the one in which John and Mary are carrying the piano together. The eventuality E in (10) is collective since there is a sub-event (e1) in which the individuals do not satisfy the property separately.
In spite of conforming to the intuition of what a collective action is, this account presents a major hurdle when one tries to extend it to the cases of CODEP. Representation (12) seems to be the only possible one for (1). However, it cannot differentiate the distributive from the CODEP reading.

(12) ∃E, ∃e1, e2, e3 ⊑ E [walk({{j}, {m}}, e1), walk({j}, e2), walk({m}, e3)]

In the case of a singular predicate such as walk the collective event, if any, can only be represented as ewalk = {{walkj}, {walkm}}: by the nature of the predicate, the property has to be distributed to the individuals, so in ewalk John and Mary act both individually and as a group. Moreover, this very same symbolization represents any set satisfying condition (8). Consider the case where John and Mary are two persons walking accidentally side by side, going exactly from point A to point B. This set of people satisfies the definition of group given in (8) since it is the minimal set that satisfies the property of walking in every proper and improper part of the
event of going from A to B. It follows that (8) cannot capture the difference between accidental and non-accidental association. This is due to the principle of universal existence of sums (7), to which classical extensional mereology subscribes. Even in the case of purely accidental association the individuals can be summed up, in such a way that there is no criterion to distinguish the distributive/accidental reading from the collective/regular interpretation. This distinction is fundamental in the case of (1), for which the distributive and the collective interpretations can only be distinguished on the basis of the distinction between accidentality and regularity. This distinction coincides with that between sum and dependence relation, and thus an appropriate account of the collective interpretation for singular predicates will have to integrate the notion of dependence.

Wholes as unities

An alternative way of conceiving wholes is as unities. The basic assumption endorsed by this conception is that a whole is a primitive, which has parts without being dependent, for its own existence, on the existence of each of them. The principle of composition to which theories of wholes as unities subscribe is not that of sum, but that of making up:13

(13) For any xs, for any y, if the xs make up y, then each of the xs is a part of y

Very importantly, the notion of making up is such that the converse of (13) does not hold: if some xs are part of y, they do not necessarily make up y. Consequently, some basic principles grounding the notion of whole as sum are not obeyed by theories of wholes as unities.

(14) Wholes as unities do not obey the principle of universal existence of sums. Whenever some elements exist, a sum of these elements does not necessarily make up a whole.

(15) Wholes as unities do not obey the principle of uniqueness. This is because some elements can be assembled in such a way that more than one whole can result.

(16) Wholes as unities do not obey the principle of coextensive determination. Wholes are not identical with the sum of their parts.

From the ontological point of view, (Landman, 1989b) and (Landman, 2000) claim that groups are plural entities seen under a certain perspective, that is to
13 According to Meirav (2003, ch. 9), the notion of making up is to be understood in contrast with that of "coextensiveness": parts contribute to the existence of the whole, but the whole is not coextensive with them.
say, monadic entities of a nature different from that of the plural entity that underlies them. Once the existence of a unit has been recovered, the accessibility to each of the members is blocked, i.e. the internal structure of the unit becomes completely opaque.

In Landman's ontology only atoms count as entities. Predicates differ with respect to whether they take group atoms or individual atoms in their denotation. Singular predicates (17) never denote group atoms but only individual atoms. They can be pluralized14 (the "*" indicates the pluralization of the predicate).

(17) John and Mary walk
John ⊔ Mary ∈ *WALK
John ∈ WALK & Mary ∈ WALK → ∀a ∈ John ⊔ Mary, a ∈ WALK

Collective predicates (18) take group atoms and do not distribute the property to the members (the ↑ indicates that the entities in its scope form an indivisible atom):

(18) John and Mary meet
↑(John ⊔ Mary) ∈ MEET ↛ ∀a ∈ John ⊔ Mary, a ∈ MEET

Some other predicates are ambiguous (19), and a type shifting operation allows one to switch from the distributive to the collective reading (the σ indicates that the plurality in its scope is maximal):

(19) The boys carry the piano individually −→ The boys carry the piano together
σ(*BOYS) ∈ *CARRY −→ ↑(σ(*BOYS)) ∈ CARRY

On the ontological level, the type shifting operation corresponds to a change in perspective and can be translated by "as a group." It is of major importance to state under what criteria one can shift the interpretation from a sum reading to a group reading. Landman (2000) provides the collectivity criterion (20):

(20) Collectivity criterion. The predication of a predicate to a plural argument is collective iff the predication is a predication of a thematic basic predicate to that plural argument, i.e. a predication where the plural argument fills a thematic role of the predicate.
14 See (Link, 1983).
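To make the interplay of (17)–(19) concrete, here is a minimal computational sketch, in Python, of Landman-style operations over a toy domain. The encoding is entirely our own illustrative assumption (the names Group and star are ours, and sums are modelled as frozensets of individual atoms); it is not Landman's formalism, only an executable picture of it.

from itertools import combinations

class Group:
    """An opaque group atom: a sum seen as a single entity (the up-arrow of (18))."""
    def __init__(self, members):
        self._members = frozenset(members)  # hidden: members are inaccessible after formation
    def __eq__(self, other):
        return isinstance(other, Group) and self._members == other._members
    def __hash__(self):
        return hash(('group', self._members))

def star(P):
    """*P: close a set of individual atoms under sum (all non-empty pluralities)."""
    atoms = sorted(P)
    return {frozenset(c) for n in range(1, len(atoms) + 1)
            for c in combinations(atoms, n)}

WALK = {'john', 'mary'}            # singular predicate: individual atoms only
MEET = {Group({'john', 'mary'})}   # collective predicate: group atoms only

jm = frozenset({'john', 'mary'})   # the sum John ⊔ Mary

# (17): John ⊔ Mary ∈ *WALK, and the property distributes to the atoms.
assert jm in star(WALK) and all(a in WALK for a in jm)

# (18): ↑(John ⊔ Mary) ∈ MEET, but MEET does not distribute to the atoms.
assert Group(jm) in MEET and not any(a in MEET for a in jm)

The type shifting of (19) would then correspond to replacing a sum in a pluralized denotation by the corresponding Group object, mirroring the change of perspective described in the text.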
In order to keep the theory coherent, Landman assumes that distributive predication is not thematic. However, in some cases, the property can be distributed to each singular entity of the group, with the collectivity criterion still satisfied. Consider (21):

(21) The journalists asked the president five questions

Landman (2000, pp. 171–172) uses the notion of collective responsibility or team credit, guaranteeing that, even if each singular journalist asks the president five questions, each question is ascribed to the press body to which each journalist belongs. In this case, the existence of a group has to be presupposed and in fact has to pre-exist each of the members.

This account raises a major difficulty when we try to apply it to the CODEP interpretation of (1) and (2). According to the collectivity criterion, there are only two ways to model this interpretation:

1. either one assumes that the individuals forming a group must not satisfy the property separately, i.e. singular predication can never be interpreted collectively, as Landman assumes at one point in his argument (Landman, 2000, p. 148), in contrast to the fact that (21) can be interpreted collectively;
2. or one has to retrieve the existence of a group from the context or from the lexical information (this is the option to which Landman (2000, pp. 164–176) finally subscribes for cases such as (21)).

The first statement cannot be accepted if one recognizes that there is a collective interpretation for (1):

(22) John and Mary are walking (as a group)
↑(John ⊔ Mary) ∈ WALK and ∀a ∈ John ⊔ Mary, a ∈ WALK

The second statement would force the speaker and the hearer to assume that John and Mary form a couple, for instance. This is, however, not mandatory at all. The group does not necessarily pre-exist the eventuality described in the scene. It is precisely "in walking" that John and Mary form a group. So we have either to admit that the CODEP interpretation is incompatible with singular predicates, or to abandon the collectivity criterion. We assume that singular predication can give rise to the CODEP reading and that in these cases no group pre-exists the scene described by the sentence. Consequently, it cannot be grounded on the collectivity criterion. Instead, we maintain, and will develop in detail, the view that the notion of group is strictly related to that of "perspective."
Unity by virtue of dependence

Theories of dependence with unity are a variant of theories of wholes as unities tout court. It is worth mentioning these theories since our purpose is to analyze the notion of dependence without unity. The basic claim endorsed by theories of dependence with unity is that the whole is a unity if and only if there exist some dependence relations among its parts: the unity exists by virtue of the parts functioning together. Nevertheless, the ontological claim is still too strong when it is invoked, as in fact it is, to capture cases of CODEP interpretation.

According to Moltmann (1997), the reasoning goes as follows: the parts are dependent, so they form a unity, and once the whole as unity has come into existence, the access to the parts is blocked. An integrated whole is such that the parts do not have an ontological existence independent of the whole. This amounts to stating that a collective sentence necessarily denotes a collective event such that the composing events are no longer accessible. Formally this is given by the following condition (Moltmann, 1997, p. 56), where <_s is a proper part relation in situation s:

(23) (Strict) Collective interpretation. For entities e and x, a verb f, and situations s and s′, f is interpreted collectively in s with respect to e, x, s and s′ iff [f]_S(e, <x, s′>) = 1, and there is no e′, e′ <_s e, such that for some x′, x′ <_s x, [f]_S(e′, <x′, s′>) = 1.

An integrated whole is taken to be the kind of whole that explains the cases of CODEP. However, like the other approaches to the notion of whole as unity, this definition fails to capture the fact that, under CODEP, the sub-events have to remain accessible: a common walk (1ii) exists nowhere but in the co-ordination of accessible walks.

Neither sums nor unities

Let us sum up. There exist two conceptions of wholes in the light of which one could explain the notion of collectivity: wholes as sums and wholes as unities. We have shown that neither of these models captures the features that characterize CODEP and, at this point, one seems to come to a certain impasse. Let us first recall the two characteristic features of CODEP.

(24) Elements for CODEP
1. a coherence relation differentiates CODEP from purely accidental association or distributive interpretation (e.g. two people walking as a group have coordinated trajectories),
2. an access to the group members differentiates CODEP from the "collective responsibility" interpretation (e.g. a walk independent of the walk of each of the people does not exist).

Theories subscribing to the view of wholes as sums cannot make a distinction between two extensionally identical situations and are thus too weak to capture the first of the two features. Theories of wholes as unities are ontologically too strong to capture the second feature. It is clear now that we are looking for a notion of whole stronger than sum and weaker than monadicity. We claim that coherence relations are necessary and sufficient. They guarantee that the collectivity reading is possible by virtue of the existence of a network. There is, then, a third way of looking at wholes: as networks. Individuals functioning together make up a complex object without this abstract object existing per se. This is precisely the case for a collective walk.
3 Wholes as Networks
The notion of whole as network is nowadays largely studied in computer science, particularly in theories of distributed systems and communicating processes (see, in particular, Barwise & Seligman, 1997; Milner, 1999; Stirling, 2001), and the model we are about to present is inspired by Barwise and Seligman's theory of information flow (ibid.). Let us first introduce an informal definition of wholes as networks.

(25) Whole as Network. To a person with prior knowledge k, f having property p carries the information that f′ has property p′, in all possible worlds compatible with k, if the person could legitimately infer that f′ has property p′ from f having property p. f and f′ are then seen as entering a network.

The core of the definition rests on the notions of counterfactuality and property relation. From the point of view of the speaker, two entities can enter a network if, for every possible world, their properties maintain a certain relation. More precisely, two entities are considered to enter a network as long as the speaker, given her/his previous knowledge, can foresee that their properties will entail each other in every possible accessible world. In this way, the entities are seen as mutually dependent via their properties. In the case of a CODEP interpretation for the predicate walk, the agent can foresee that the people will coordinate their trajectories. If this prediction cannot be made, and the association is only observed, the distributive interpretation will be the only available one.
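As a rough illustration of (25), the informal definition can be read as a simple check over the worlds compatible with the agent's prior knowledge. The following Python fragment is our own toy encoding (worlds as sets of atomic facts, knowledge k as a list of such worlds), offered only as a sketch of the intended inference, not as part of the formal model developed below.

# Toy reading of (25): given prior knowledge k, the fact that f has p carries
# the information that f' has p' iff p' holds of f' in every world compatible
# with k in which p holds of f.

def carries(k, fact, fact_prime):
    relevant = [w for w in k if fact in w]          # worlds where f has p
    return bool(relevant) and all(fact_prime in w for w in relevant)

# Worlds the agent foresees: John and Mary either both walk or both stop.
k = [{('john', 'walks'), ('mary', 'walks')},
     {('john', 'stops'), ('mary', 'stops')}]

assert carries(k, ('john', 'walks'), ('mary', 'walks'))   # network reading
assert carries(k, ('mary', 'stops'), ('john', 'stops'))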
This notion of epistemic dependence has to be distinguished, on the one hand, from that of juxtaposition, and, on the other hand, from that of cause. Two juxtaposed entities can form a collection, but not a unity nor a coherent whole. Juxtaposition is an extensional relation that links entities that belong to a collection. This relation may rely on the fact that the entities share some common properties.15 Nevertheless, the properties of each entity exist totally independently of the properties of every other element in the same collection. On the other hand, epistemic dependency is weaker than cause, though sharing very deep resemblances. If, on the one hand, cause can be understood as "bringing into existence," there is another notion of cause in the light of which we can understand dependence: that of covariation of properties (Lewis, 1973). In (1) for instance, the walk of one of the two people exists independently of the walk of the other. However, under the CODEP interpretation, the properties of each of them constrain the properties of the other: they influence each other's trajectories, for instance.

Crucially, like causality, epistemic dependence relies on types. Causality is not random, but can be foreseen by virtue of the types of the events involved. This is also the case for dependency, though in a weaker way: types are called into play when the cognitive agent epistemically links the occurrences of two events. If the knowledge that one has about the properties of one event entails the knowledge of the properties of another event, then these two events are epistemically dependent. The informational feature is crucial for dependence: for two people to be seen as walking collectively, the property of the walk of one of them has to provide certain information about the walk of the other.

It is reasonable to ask whether the non-accidental character is provided by the existence of a constraint or by the modal notion of possible world. Our answer is that both of these notions are needed. Given one point in time, it is observationally impossible to distinguish between the cases of accidental and non-accidental association, and thus to formalize the situation by constraining the descriptions. Under the collective interpretation one foresees that the coordination will be kept, in such a way that the notion of constraint and that of maintenance of the constraint go hand in hand.16
15 This is the case of the collections studied by (Schwarzschild, 1996).
16 This is corroborated by the fact that, with achievements, the collective interpretation is lost by together, which is, in these cases, synonymous with at the same time; achievements are incompatible with with, which requires that two entities influence one another. See (Mari, 2003, ch. 4–5) and (Jayez and Mari, 2005).
Implementation

In this section we work out the model for the notion of network, providing an event-based account (Parsons, 1990). The model is articulated in two domains: objects and descriptions. Individuals and eventualities are objects (26):

(26) D domain of individuals; E domain of eventualities.

Eventualities are temporal entities of any kind, dynamic or stative. Insofar as we are considering the propositional content of a sentence, we need to analyze its constituents and assign to each of them the appropriate task in the construction of the overall scene. Singular NPs denote singular objects; plural and conjoined NPs denote sets of plural entities, without requiring any particular structure on this set.

(27) ‖NP_plural‖ = {E ⊂ D | #E > 1}

Let I be, for a predicate f, the set of entities that occupy a certain role (agent, patient, theme, ...) in the eventuality denoted by the predicate.

(28) I = {d_R | ∃e(‖f‖(e) & Role(e) = d)}

Following Landman (2000), we assume that when a singular predicate is combined with a plural argument, it is pluralized. So, if the predicate is singular, there will be only one event; if the predicate is plural, there will be as many events as participants, irrespective of whether the interpretation is collective or distributive. Typically, this is the case for the predicate walk. (29) states that for every individual in the set denoted by the plural argument, there is an event such that the predicate assigns the truth value 1 to every pair <e, d> (indexed on R), if d occupies a certain role in one of the events denoted by the pluralized predicate.

(29) ∀d_R ∈ I (∃e(‖f(e, d_R)‖ = 1))

The second domain in the model is that of descriptions or types.

(30) Θ is the set of types.

The introduction of types into the model allows us to integrate the cognitive agent's perspective on entities. The agent can assign a description to any entity, minimally recognizing its location in space and time. A classification (31) is the object's type assignment.

(31) Classification. A classification is a triple (Objects, Types, |=), where Objects is a set of objects, Types a set of categories or types, and |= a relation between Objects and Types. If o ∈ Objects and σ ∈ Types, o |= σ means that o is of type σ.
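For concreteness, the triple of (31) can be rendered directly as a small data structure. The Python sketch below is our own illustrative encoding (the class name Classification and the extensional representation of |= as a set of pairs are our assumptions):

# A classification in the sense of (31): a triple (Objects, Types, |=).

class Classification:
    def __init__(self, objects, types, models):
        self.objects = set(objects)   # tokens: individuals and events
        self.types = set(types)       # categories or types (descriptions)
        self.models = set(models)     # pairs (o, sigma) such that o |= sigma

    def of_type(self, o, sigma):
        """o |= sigma: the object o is of type sigma."""
        return (o, sigma) in self.models

# John's walking event, classified by a trajectory description:
c = Classification(objects={'e_john'}, types={'trajectory_tau'},
                   models={('e_john', 'trajectory_tau')})
assert c.of_type('e_john', 'trajectory_tau')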
Types can be assigned to either individuals or events. Types assigned to events are called phases; they register the content of an event, i.e. its past and future developments.17 For an event of walking, for instance, the phases register the trajectory of the walk. For each point in time, part of the content of an event is represented by its future developments. Assuming a branching time representation (Penczek, 1995) for the future, given a certain point in the trajectory, there exists a particular set of points that can continue the walk under description at t′ > t. This set is Markovianly determined, i.e. it does not depend on the whole history of the events but only on the point at which the system is at time t. However, it depends on the content of the description. For instance, a walk cannot evolve by itself into a telephone call.

(32) State space. s is a set of classifications for which each token is assigned exactly one type. The state space s is complete if every type is the type of some token.

Note that a state space is a situation, i.e. an agent-oriented structured part of reality. Very importantly, a state space is relative to a time t. If one agrees that the agent assigns a description only to tracked objects (Pylyshyn, 2003), a situation, in this perspective, is not generally a smaller "world" compatible with the conceptual capacities of a cognitive agent (Barwise & Perry, 1983). The agent focuses her attention on some events in which s/he is interested, and by "situation" we precisely mean all and only the events that can be assigned a description. This is acknowledged in our model by the fact that we can only use descriptions for the events mentioned in the sentence. The content of these descriptions is retrieved on the basis of contextual and encyclopedic knowledge.

A state space can evolve. Let (33) represent the state space that describes the events of walking of John and Mary at t.

(33) s := { trajectory_τ |=_s e_j ; trajectory_τ′ |=_s e_m }

This state space can evolve in different ways, such as those given in (34).

(34) Possible evolutions of state space (33)
• s′_1 := { trajectory_τ |=_s′ e_j ; trajectory_τ′ |=_s′ e_m }
• s′_2 := { changetrajectory_τ |=_s′ e_j ; trajectory_τ′ |=_s′ e_m }
• s′_3 := { trajectory_τ |=_s′ e_j ; changetrajectory_τ′ |=_s′ e_m }
• s′_4 := { stoptrajectory_τ |=_s′ e_j ; trajectory_τ′ |=_s′ e_m }
17 They can be compared to object files for abstract objects (Pylyshyn, 2003), which can be seen as memory structures or folders that store information about a given object.
• s′_5 := { trajectory_τ |=_s′ e_j ; stoptrajectory_τ′ |=_s′ e_m }

S′ is the set of state spaces into which s can evolve at time t′ immediately following t.18
(35) S′ = { s′ | ∀t, t′ such that t < t′, s_t ↪ s′_t′ }

This is all we need to introduce in relation to particular events and their descriptions. Let us now consider the formalization of the proper notion of dependence. Its formal counterpart is that of constraint.

(36) Constraint. A constraint is a closed formula of the general form Q((τ |= o) =⇒ ρ), where Q is a series of quantifiers, τ a type, o an object and ρ a well-formed formula.

Types correspond to observations, and constraints are entailments between observations: x having a certain property entails that y has a certain property. From the perspective of the speaker, to observe that entity x has a certain property means to infer that another entity in the domain has another property. Constraints then express the fact that, if one observation can be made, another observation can also be made. This amounts to the acquisition of a piece of information. At this point we can provide a formal condition for the wholes-as-networks interpretation.

(37) Whole as network: condition for CODEP
‖f_sing(NP_pl)‖^{sit,t}_{coll} = {e_i, i∈I | ∀i ∈ I (‖f‖^{sit,t}(e_i) = 1 & ∀s′ such that s ↪ s′, ∀τ ((τ |=_s′ e_i) ⇒ ∀j ∈ I ∃τ′ (τ′ |=_s′ e_j)))}

When it is applied to a plural NP, a singular predicate f_sing is interpreted collectively in situation sit at time t if it denotes the set of events indexed on individuals and thematic roles such that:

1. the predicate is true of any event involving an individual with respect to a certain thematic role, and
2. for any possible state space s′ accessible from s, every description of the eventuality involving any of the participants entails another description of any other eventuality involving any other participant.

As we show in section 4, it is not the case that any description of any object/event provides information about any other object/event, for any observer. Formula (37) states that the content of the description is such that it provides information about another object/event. Moreover, this informational content is observer dependent.
18 The "immediateness" depends on the granularity one has chosen.
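Operationally, the constraint clause of (37) says that across every state space accessible from s, a (trajectory) description of one participant's event must entail a description of the other's. The following Python sketch is again our own simplification: state spaces are dicts from events to their single type, and the predicate is_ongoing (the division of phases into ongoing vs. stopped trajectories) is an assumption made purely for illustration.

# Toy check of the covariation required by (37)/(38): in every accessible
# evolution s', e_j bears an ongoing trajectory description iff e_m does.

def is_ongoing(phase):
    # Illustrative assumption: these phases continue the walk; 'stoptrajectory' ends it.
    return phase in {'trajectory', 'changetrajectory'}

def codep(accessible, e1, e2):
    """Collective (CODEP) reading: descriptions covary in all evolutions."""
    return all(is_ongoing(s[e1]) == is_ongoing(s[e2]) for s in accessible)

# Foreseen evolutions in which John and Mary go on (or stop) together:
together = [{'e_j': 'trajectory',       'e_m': 'trajectory'},
            {'e_j': 'changetrajectory', 'e_m': 'changetrajectory'},
            {'e_j': 'stoptrajectory',   'e_m': 'stoptrajectory'}]

# Add an evolution in which one stops while the other goes on:
apart = together + [{'e_j': 'stoptrajectory', 'e_m': 'trajectory'}]

assert codep(together, 'e_j', 'e_m')      # collective reading licensed
assert not codep(apart, 'e_j', 'e_m')     # only the distributive reading remains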
Note that we are not supposing that the collective interpretation calls into play a collective event without accessibility to its subevents. Instead, the collectivity is engendered by a constraint with scope over the descriptions of particular events, and it exists nowhere but in the constraint relating the entities via their descriptions. The instantiation of the definition for the collective vs. distributive reading of (1), in (38) and (39) respectively, will clarify our point.

(38) CODEP interpretation for (1)
‖John and Mary are walking along the beach‖^{sit,t}_{coll} = {e_(agent{j}), e_(agent{m}) | walk(e_j) = 1 & walk(e_m) = 1 & ∀s′ such that s ↪ s′, ∀trajectory_τ ((trajectory_τ |=_s′ e_j) ⇐⇒ ∃trajectory_τ′ (trajectory_τ′ |=_s′ e_m))}

(39) Distributive interpretation for (1)
‖John and Mary are walking along the beach‖^{sit,t}_{distr} = {e_(agent{j}), e_(agent{m}) | walk(e_j) = 1 & walk(e_m) = 1 & ∃trajectory_τ ∃s((trajectory_τ |=_s e_j) & ∃trajectory_τ′ (trajectory_τ′ |=_s e_m))}

The CODEP interpretation (38) contrasts with the distributive one (39) in two respects: (i) in (39) there is no constraint; (ii) in (39) the possible evolutions are not taken into account (the association is accidental and can only be captured step by step). They resemble each other in that there is no independent collective event in (38).

These interpretations run as follows. Consider an s in which John and Mary are walking. In S′ at t′ one of the following configurations can be verified:

1. John keeps on walking/stops and so does Mary. There is a covariation (Lewis, 1973), so they can be said to walk collectively in S.
2. John stops (or keeps on walking) and Mary keeps on walking (or stops). There is no covariation, so the formula is false at S.

When there is no covariation, John and Mary are viewed as walking distributively. When a covariation is observed, there are two possibilities. On the one hand, John and Mary can be considered as walking "as a group": the agent will foresee that the covariation will be maintained, and thus constrains the descriptions of the two events to one another. On the other hand, if the two walks are only observed as evolving in parallel and no prediction is made, the parallelism is considered to be accidental. In other terms, (37) is a rule of interpretation. It
states that it is necessary, under the CODEP interpretation, that the agent foresee that the two events covary. It does not exclude that a parallelism is observationally verified in the case of an accidental association.
4 Co-localization, States and Definitory Properties
At this point, we have considered only dynamic eventualities. Even though our model seems to capture the specificities of the cases illustrated by (1), its predictive power can be appreciated when it is applied to cases of stative eventualities (more commonly, states), and co-localization in particular. Note that these cases belong to the kind of collective interpretations that we have been considering so far, for "to be localized in some place l" is trivially distributive.

It is very often the case that one can link two co-localized entities in a non-accidental manner, as illustrated in (40). Some authors even claim that any time two entities are co-localized they can be considered as associated.19

(40) The glasses and the decanters are in the cupboard

Under the CODEP interpretation, the positions of the glasses and those of the decanters are seen as related to one another. In wholes-as-networks terms, we suggest that a cognitive agent coordinates the two sets of entities and recognizes that entities of the same type form a structure or a network when they are in the same location. In inferential terms, this means that one can retrieve the position of the glasses from that of the decanters and vice versa. However, this is not a general rule of interpretation. One cannot interpret (41) collectively:

(41) The forest and the lake are at the top of the mountain

Our model explains this impossibility by the fact that the descriptions of the entities are (epistemically) unrelated: the position of the lake (on the top of a mountain) and that of the forest (on the top of the same mountain) are seen as independent of one another. We now have to state why this is the case. Let us first emphasize that this is not due to the fact that these are non-movable entities; the collective interpretation can in fact be unavailable in cases where the entities are movable:20
19 Among others, see (Moltmann, 1997).
20 Of course, any entity in the world can be destroyed or removed. However, we can clearly distinguish between entities that (can) continuously change their position and those that keep it for a considerable amount of time, generally exceeding a human's life.
(42) John and President Clinton are in New York City at the moment

(42) cannot be interpreted collectively unless one considers that John and President Clinton share a certain activity while in NYC, or that they know each other. If John is in NYC independently of President Clinton, the collective interpretation cannot be enhanced solely by virtue of the fact that they share the same localization.

To understand why the constraint cannot be applied in these cases, one has to consider the nature of the property, and to evaluate whether the application of the constraint brings with it a gain of information. Recall, in fact, that the constraint founding the notion of wholes as networks is epistemic, and amounts to relating the knowledge that the cognitive agent has about one entity to the knowledge that she has about another entity, via a counterfactual reasoning. This entailment brings the benefit of acquiring new information about the structure relating the entities and, ultimately, about the entities themselves.

In this respect, for (41), it turns out that it would be totally informationless to apply a constraint between the properties of localization of the lake and of the forest. It is in fact useless to epistemically link the positions of the lake and of the forest to one another when they can be known independently. Once it is possible to know the localization of an entity in a definitory manner, it is totally redundant to epistemically associate it to the localization of another entity.21 In the light of the fact that individual-level properties (i.e. definitory properties22) can never be involved in a whole-as-network interpretation, the reader will easily conclude that this is a general rule in the cognitive grammar.

In the case of (42), on the other hand, the co-localization is considered to be irrelevant. It is informationally irrelevant to know the localization of a President with respect to the localization of a citizen (and vice versa), unless the agent previously knows that they share some activity, or that they have a certain relation. In both of these cases, even though for different reasons, applying the constraint is informationally worthless, and thus the collective interpretation is not available.
21 Note that when the collective interpretation is explicitly instantiated (by using the preposition with, for instance), the epistemic dimension becomes central. Consider the following discourse: A: Which lake are you talking about? B: That with the forest nearby! In this case, speaker A is looking for a particular lake. The fact of having a forest nearby is relevant for the individuation.
22 See (Carlson, 1977). It is not possible to interpret John and Mary are handsome collectively (unless handsome characterizes their behavior each time they are together).
5 Conclusion
In this paper we have considered the collective interpretation of distributive predicates, which we have analyzed in the light of the notion of compositionality in a formal ontology framework. In this perspective, the scene denoted by the sentences has been understood as a whole whose parts are represented by the entities involved in the scene and explicitly mentioned in the sentence. More specifically, we have focused our attention on conjoined and plural NPs and on their relation with singular (or distributive) predication. There exist different types of collective interpretations, which involve different types of relations between the parts and the whole, and which appeal to different conceptions of the notion of whole. In compositionality terms, this amounts to the standard statement that, given some individuated objects which can possibly be brought together into a whole, there exist different modes of assembling such that the nature of the whole is not only a function of the nature of its parts, but also of the particular strategy of composition.

We have considered a set of data characterized by the fact that every member satisfies the same property. The mode of composition that we have taken into account appeals to two specific factors: (i) there exist some coherence or dependence relations among the members; (ii) these dependence relations do not make up a unity. Existing theories of part-whole structures cannot explain this particular type of collective interpretation, which we have labeled CODEP.

As Meirav (2003) has shown, most (possibly all) of the theories of part-whole structures elaborated in the course of history can be classified according to two different conceptions: wholes as sums and wholes as unities. Under the first account there exists no mode of composition other than the juxtaposition of the entities, which can enter a unique class by virtue of their similarity. Theories of wholes as sums are based on the axiom of universal existence of sums, stating that, whenever some elements are given, it is always possible to sum them together. When applied to the CODEP interpretation of singular predication, these theories fail to capture the essential feature that distinguishes it from the distributive one: the entities entertain a non-accidental relation and are not simply juxtaposed.

Another option is to capture the notion of collectivity by that of whole as unity. A unity is characterized by the fact that the parts of the whole are no longer accessible or, in other terms, do not have an existence independent of the whole, and, hence, of one another. This ontological claim is too strong for CODEP cases. If we consider two people following the same trajectory together, it is clear that the association exists by virtue of the coordination of their walks, and nowhere else.
Consequently, we have elaborated a different model to explain this type of collective interpretation, which we have labeled "wholes as networks." This model is based on the sole notion of dependence, which has to be understood as an entailment between properties of entities through possible worlds. The cognitive agent authorizes this entailment whenever she foresees that the entities will covary with respect to some particular properties. This model abstracts from the entities themselves and considers the predictions that the cognitive agent makes about the possibility that their properties will keep on covarying through possible accessible worlds. The reason to introduce an intensional relation is that the record of mere extensional events is not sufficient to distinguish a purely accidental from a regular association. In this respect, the difference between the distributive and the collective reading is formulated as a difference between mere observation and prediction.

Crucially, our account is based on the notion of information flow and gain of information. Whenever the agent associates two entities via their properties by a counterfactual reasoning, s/he is gaining information about one entity from what she knows of the other. This model seems not only appropriate to capture the specificities of the particular type of collective interpretation in the case of dynamic eventualities, but it also provides interesting predictions about the nature of the properties that can be involved in a whole-as-network interpretation. Definitory properties are epistemic-constraint-proof, insofar as no extra information would be added by relating to each other entities that can each be steadily known by one of their definitory properties. The data about co-localization in particular can be explained in the light of this statement. Co-localization is not sufficient to enhance the collective interpretation, contrary to what is often argued. If some entities can be localized in an absolute way, it is informationless to relativize their position to that of other entities. Whenever the wholes-as-networks interpretation arises, the agent has to make sure s/he can gain some information.

It is highly predictable that this general pattern of reasoning is instantiated in the grammar by specific constructions and lexical items. Let us mention the adverbial together, the preposition with and the expressions of reciprocity. A detailed study of these constructions is outside the scope of this work,23 whose aim was restricted to the introduction and elaboration of a different manner of conceiving wholes, which, leaving extensionality far behind and adding intensionality with state spaces and epistemic constraints, opens a middle way between the set-theoretic notion of sum and holism.
23 See (Mari, 2003).
References

Barwise, J., & Perry, J. (1983). Situations and attitudes. Cambridge, Mass.: MIT Press.

Barwise, J., & Seligman, J. (1997). Information flow: The logic of distributed systems. Cambridge: Cambridge University Press.

Carlson, G. (1977). Reference to kinds in English. (PhD dissertation, University of Massachusetts)

Davidson, D. (1967). The logical form of action sentences. In N. Rescher (Ed.), The logic of decision and action. Pittsburgh: University of Pittsburgh Press.

Dretske, F. (1981). Knowledge and the flow of information. Cambridge, Mass.: MIT Press.

Frege, G. (1884). Die Grundlagen der Arithmetik: Eine logisch-mathematische Untersuchung über den Begriff der Zahl. Breslau: Koebner.

Gillon, B. (1987). The readings of plural noun phrases in English. Linguistics and Philosophy, 10, 199–219.

Goodman, N. (1951). The structure of appearance. Cambridge, Mass.: Harvard University Press.

Husserl, E. (1901/1970). Logische Untersuchungen: Untersuchungen zur Phänomenologie und Theorie der Erkenntnis. London: Routledge & Kegan Paul. (Findlay, J. N., Transl., Logical Investigations.)

Jayez, J., & Mari, A. (2005). Togetherness. In E. Maier, C. Bary, & J. Huitink (Eds.), Proceedings of Sinn und Bedeutung 9 (pp. 155–169). Nijmegen.

Landman, F. (1989a). Groups II. Linguistics and Philosophy, 12.6, 723–744.

Landman, F. (1989b). Groups I. Linguistics and Philosophy, 12.5, 559–605.

Landman, F. (2000). Events and plurality. Dordrecht: Kluwer.

Lasersohn, P. (1995). Plurality, conjunction and events. Dordrecht: Kluwer.

Leśniewski, S. (1916/1992). Foundations of the general theory of sets. I. In S. Surma, J. T. Srzednicki, & D. I. Barnett (Eds.), Collected works (Vol. I, pp. 227–264). Dordrecht: Kluwer.

Lewis, D. (1973). Counterfactuals. Oxford: Basil Blackwell.

Lewis, D. (1986). Against structural universals. Australasian Journal of Philosophy, 64, 25–46.
Link, G. (1983). The logical analysis of plural and mass terms: A lattice-theoretic approach. In R. Bäuerle, C. Schwarze, & A. von Stechow (Eds.), Meaning, use and interpretation of language (pp. 179–228). Berlin: de Gruyter.

Mari, A. (2003). Principes d'identification et de catégorisation du sens. Le cas de 'avec' ou l'association par les canaux. Paris: L'Harmattan.

Meirav, A. (2003). Wholes, sums and unities. Dordrecht: Kluwer.

Milner, R. (1999). Communicating and mobile systems. Cambridge: Cambridge University Press.

Moltmann, F. (1997). Parts and wholes in semantics. Oxford: Oxford University Press.

Nagel, E. (1952). Wholes, sums and organic unities. Philosophical Studies, 3, 17–32.

Parsons, T. (1990). Events in the semantics of English. Cambridge, Mass.: MIT Press.

Penczek, W. (1995). Branching time and partial order in temporal logic. In L. Bolc & A. Szalas (Eds.), Time and logic: A computational approach (pp. 179–228). London: UCL Press.

Pylyshyn, Z. (2003). Seeing and visualizing: It's not what you think. Cambridge, Mass.: MIT Press.

Russell, B. (1903). The principles of mathematics. London: Allen & Unwin.

Schwarzschild, R. (1996). Pluralities. Dordrecht: Kluwer.

Simons, P. (1987). Parts: A study in ontology. Oxford: Oxford University Press.

Stirling, C. (2001). Modal and temporal properties of processes. Berlin: Springer.

Wertheimer, M. (1925). Gestalt theory. Social Research, 11, 78–99.
Impossible Primitives

Jaume Mateu
1 The Unresolved Problem
In this paper1 I review the theoretically interesting debate on impossible words between Fodor and Lepore (1999) and Hale and Keyser (1999), and I put forward some theoretical and philosophical arguments in favor of Hale and Keyser's syntactic theory of minimal decomposition of complex words. In particular, I will be concerned with what counts as an (im)possible (argument structure) primitive. The relevant story is nicely summarized by Uriagereka as follows:

Suppose you tell Fodor and Lepore that the word pfzrrt does not exist because it is really derived from CAUSE x to do something, or any such variant, which violates principle P (for example, the Empty Category Principle / ECP: JM). Say they accept your (technical: JM) argument; here is what they will ask you: Why could not pfzrrt mean whatever it means as a primitive, just as CAUSE or whatever-have-you is a primitive? You complain: But pfzrrt cannot be a primitive!? Their next line: Why, do you have intuitions about primitives!? So either you have a great theory of those primitives, or else you loose, and you do simply because you do not want what you see to be what you get (...) In sum, you know you need a limited set of primitives. Fodor and Lepore invite us to think of the lexicon as such as, more or less, that very set of

Address for correspondence: Departament de Filologia Catalana, Universitat Autonoma de Barcelona, E-08193 Bellaterra (Barcelona), Spain. E-mail: [email protected].
1 The research presented in this paper has been supported by the grants BFF2003-08364-C02-02 (Spanish MCyT) and 2001SGR-00150 (Catalan DGR). For useful comments and suggestions, I would like to thank some participants at CoCoCo 2004, especially Wolfram Hinzen. Special thanks also go to an anonymous reviewer who helped me clarify some basic ideas. Finally, I am also grateful to Markus Werning and Edouard Machery for their patience in editing my paper. Needless to say, all possible errors and misunderstandings are my own.
primitives; that might be large, but nobody said the primitives have to be few, so long as they are finite. A serious, sophisticated theory of a (small?) number of primitives will arguably fare better, but you have to produce that theory; Fodor and Lepore do not have to produce the lexicon, because it's there. (Uriagereka, 2000, pp. 3–4)

In order to provide some useful background, let us exemplify both positions (that is, Fodor and Lepore's atomist one and Hale and Keyser's decompositional one) by focusing on some relevant examples that are discussed in their respective works. For example, consider Hale and Keyser's (1993: 60) explanation of the ungrammaticality of a sentence like (1), which is assumed to have the same argument structure as that in (2). Their relevant explanation is quoted below these examples:

(1) * It cowed a calf.
(2) A cow had a calf.

It is well known that a subject (i.e., a subject that originates as an external argument) cannot incorporate into the verb that heads its predicate (...) Presumably, incorporation from the subject position, external to VP, would violate the Empty Category Principle (...). We will argue later that the subject of verbs of the type represented in ((1)-(2): JM) is external in the sense that it is not present at all in Lexical Relational Structure. Lexical incorporation would therefore be impossible. (Hale and Keyser, 1993, p. 60)

However, Fodor and Lepore are not convinced by Hale and Keyser's explanation, and their reply is as follows:

There must be something wrong with Hale and Keyser's account of cases like (1) since, even if it did explain why there could not be a derived verb to cow with the paraphrase (2), it does not explain why there could not be a primitive, underived verb to cow with the paraphrase (2). As far as we can tell, this sort of point applies to any attempt to explain why a word is impossible by reference to the impossibility of a certain transformation (...) We assume, along with Hale and Keyser, that the intuition about (1) is that it is impossible and not just that if it is possible, then it is underived. (We do not suppose that anyone, except perhaps linguists, has intuitions of the latter kind.) So we claim that Hale and Keyser have not explained the intuition that to cow is impossible (...) (Patently, if E is not a possible meaning, then one does not need an explanation of why no word can mean it) (...) (Hale and Keyser's account fails because: JM) it is left open that underivable words might not be
impossible (because it is left open that they might be primitive). (Fodor and Lepore, 1999, pp. 449–450/452)

Unfortunately, Hale and Keyser's (1999: 463) rejoinder does not address Fodor and Lepore's main objection: in fact, the former limit themselves to pointing out the following:

Fodor and Lepore object that we do not explain why there could not be a primitive, underived verb to cow with the paraphrase A cow had a calf. We guess that such a verb could only come about through illicit conflation (i.e., specifiers/subjects do not undergo conflation, only complements do: JM), in which case the conflation account is more successful than we have hoped to show. (Hale and Keyser, 1999, p. 463)

However, notice that Fodor and Lepore do not actually want to call into question the technical device ruling out the alleged derivation, for example, the Empty Category Principle (ECP) or whatever other principle rules out the incorporation/conflation of specifiers/subjects. Consider their concessive clause: "(...) even if it did explain why there could not be a derived verb to cow with the paraphrase (2) (...)". That is, the technical details of the derivational account of the example in (1) are not what they actually want to call into question. Fodor and Lepore's objection is then clear: Hale and Keyser's account fails precisely because it is left open that underivable words might not be impossible (because it is left open that they might be primitive) (p. 452). Therefore, Fodor and Lepore appear to be right in concluding the following: "so we claim that Hale and Keyser have not explained the intuition that to cow is impossible".

His rebus cognitis, it will be necessary to sketch out the basics of the framework I am going to assume in order to deal properly with Fodor and Lepore's main objection (to be refuted in section 3 below).
The Framework
I argue that there is a strong homomorphism between the syntax and semantics of argument structure (Mateu 2002). My present proposal benefits from both Hale and Keyser’s (1993) paper, where certain meanings were associated to certain syntactic structures, and their more recent (2002) monograph, where a refinement of the basic argument structure types is presented. According to Hale and Keyser (2002), the argument structure relations a head X can enter into are those in (3): in (3a) X only takes a complement; in (3b) X takes both a complement and a specifier; in (3c) X only takes a specifier; finally, in (3d) X is a non-relational element.
216
Jaume Mateu
(3) Basic Structural Types of Argument Structure Head (X); complement (Y of X); predicate (X of Z) (3a) [X Y] (3b) [Z [X Y]] (3c) [Z [W X]] (3d) X According to Hale and Keyser (2002), the prototypical or unmarked morphosyntactic realizations in English of the syntactic heads in (3) (i.e., the X’s) are the following: V in (3a), P in (3b), Adj in (3c), and N in (3d). As argued by Mateu (2002), an important non-trivial reduction and/or modification of (3) appears to be necessary when the above-mentioned homomorphism is seriously considered. In Mateu (2002), it is claimed that the head X in (3c) is not a primitive element of the argument structure theory, as in Hale and Keyser’s approach, but a composite unit. The prototypical category to be associated to the X in (3c) (i.e., Adjective) can be argued to be decomposed into two elements: a non-relational element (similar to that instantiated by N) plus a relational element (similar to that instantiated by P), the former being incorporated into the latter. That is, our claim is that the structural combination in (3b) can also be argued to account for the argument structure properties of Adjectives. Accordingly, the small clause-like argument structure involved in two examples like those in (4a-4b) turns out to be the same, that in (4c). 2 (4a) went [John [to Paris]] (cf. John1 went [SC t1 to Paris]) (4b) went [John [crazy]] (cf. John1 went [SC t1 crazy]) (4c) went [ Z [ X Y]] At first sight, the present modification/reduction of four to three basic types could be said to be at odds with Hale and Keyser’s approach: basically, it is the causative/inchoative alternation that allows them to maintain the structural distinction between those denominal verbs that involve merge of (3b) into (3a) (see (5a)), and those deadjectival verbs that involve merge of (3c) into (3a) (see (5b)). According to them, this explains why the former are always transitive, while the latter have an intransitive variant (the W verbal head in (3c) being then inflected with Tense): 2
2 As pointed out by an anonymous reviewer, there is another reason to disbelieve the possibility that (3c) could exist within Chomsky's (1995f.) Minimalist Program series of assumptions: specifiers simply cannot exist without complements. See also Harley (1995, 2002), where it is also concluded that (3c) does not exist as a fundamental structural unit of argument structure.
(5a) [X1 [Z [X2 Y]]] (i.e., [PUT [the books [ONTO shelf]]])
John shelved the books / *The books shelved.
(5b) [X1 [Z [W X2]]] (i.e., [CAUSE [the screen [BECOME clear]]])
John cleared the screen / The screen cleared.

However, as pointed out by Kiparsky (1997) and Mateu (2001), Hale and Keyser's structurally-based generalization is not well-grounded: denominal verbs can participate in the causative/inchoative alternation if they denote events which can proceed without an explicit animate agent (e.g., pile (up), carbonize, oxidize, etc.). On the other hand, there are deadjectival verbs that cannot participate in such an alternation (e.g., legalize, visualize, etc.). Given this, the relevant conclusion for our present purposes is the following: the fact that transitive denominal verbs like to shelve do not enter into the causative alternation is not due to a structural reason, as Hale and Keyser propose, but to the fact that they usually involve an animate agent. Therefore, the main/basic objection that could be raised against our eliminating the apparently basic combination in (3c) vanishes. This reduction accepted, the basic, irreducible argument structure types turn out to be those in (6):

(6) Basic Structural Types of Argument Structure revisited (Mateu 2002)
Head (X); complement (Y of X); predicate (X of Z)
(6a) [X Y]
(6b) [Z [X Y]]
(6c) X

The reduction of (3) to (6) allows the promised homomorphism to come to the fore, this being informally expressed in (7). Given (7), the relational syntax of argument structure can be argued to be directly associated to its corresponding relational semantics in a uniform way.

(7a) The head X in (6a) is associated to an eventive relational element.
(7b) The head X in (6b) is associated to a non-eventive relational element.
(7c) The head X in (6c) is associated to a non-relational element.
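To fix ideas, the type inventory in (6), together with its semantic glosses in (7), can be written down as a tiny inventory of data types. The Python sketch below is our own illustrative rendering (all class and field names are ours); its only point is structural: relational heads take arguments, while non-relational roots take none.

# A sketch of the basic argument-structure types in (6)/(7).

from dataclasses import dataclass
from typing import Optional, Union

@dataclass(frozen=True)
class Root:                        # (6c)/(7c): a non-relational element
    name: str                      # open-ended conceptual content; takes no arguments

@dataclass(frozen=True)
class NonEventive:                 # the X of (6b)/(7b): TCR or CCR
    relation: str                  # 'TCR' (terminal) or 'CCR' (central coincidence)
    figure: Root                   # its specifier (Z)
    ground: Root                   # its complement (Y)

@dataclass(frozen=True)
class Eventive:                    # the X of (6a)/(7a): CAUSE/HAVE or BECOME/BE
    relation: str
    complement: Union[NonEventive, Root]
    originator: Optional[Root] = None   # external argument, introduced by F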
In turn, the eventive relational element can be instantiated via two different semantic relations. If there is an external argument in the specifier position of the relevant F(unctional) projection,3 the eventive relational element will be instantiated as a source relation, the external argument being interpreted as Originator (cf. Borer, 1994f.). Following Uriagereka (1999), I assume that both transitive and unergative verbs are marked with a strong feature to be checked in the relevant functional projection introducing the external argument. The so-called assignment of a theta-role to the external argument can then be said to be licensed through this checking process. The source relation can be static (cf. transitive stative verbs like to know, to fear, etc.) or dynamic (cf. causative verbs like to clear, to thin, etc.; we, along with Baker (1997), also include unergative verbs like to laugh, to cry, etc., as causative verbs). Notice that positing such a binary distinction, namely the one between static and dynamic relations, does not force us to assume Baker's (1997: 132; fn. 43) claim that transitive stative verbs be regarded as involving causative predicates.

If there is no external argument, the eventive relation will be instantiated as a transitional relation, which always selects a non-eventive spatial or abstract relation, whose specifier and complement are to be interpreted as Figure and Ground, respectively, this terminology being adapted and borrowed from Talmy (2000). As above, the transitional relation can be dynamic (e.g., cf. inherently directed verbs of motion like to arrive) or static (e.g., cf. verbs of existence like to exist); see Levin and Rappaport Hovav (1995) for an extensive descriptive survey.

In turn, the non-eventive relational element can also take a dynamic value (i.e., Terminal Coincidence Relation (TCR)) or a static one (i.e., Central Coincidence Relation (CCR)); both TCR and CCR are adopted from Hale & Keyser's (1993, 2002) work: a terminal coincidence relation involves a coincidence between one edge or terminus of the Figure's path and the Ground, while a central coincidence relation involves a coincidence between the center of the Figure and the center of the Ground. See also Mateu (2001) for some grammatically relevant correlations that can be established between (lexical) telicity and TCR, and between (lexical) atelicity and CCR.

The resulting syntactic argument structures are those depicted in (8) (strictly speaking, notice that the F(unctional) projection introducing the external argument (Z1) is not part of the l-syntactic argument structure).

(8a) transitive structure: [Z1 [F [X1 [Z2 [X2 Y2]]]]]
3 Alternatively, one could assume, along with Hale & Keyser (2002: 249; fn. 2), that the external argument is not introduced by a functional projection, but rather is an adjunct to VP. My analysis is compatible with both hypotheses, the truly relevant point here being that the external argument is external to the syntactic argument structure. Moreover, the relevant F(unctional) projection is not necessarily to be seen as meaningful (cf. Kratzer 1996), but just as a syntactic projection that introduces the external argument.
(8b) unergative structure: [Z1 [F [X1 Y1]]]
(8c) unaccusative structure: [X1 [Z2 [X2 Y2]]]

The main structural difference between transitive and unergative structures concerns the type of complement selected by the eventive relation X1: a non-eventive relation X2 is selected as complement in (8a), while it is a non-relational element Y1 that is selected in (8b). Moreover, the transitive structure in (8a) can be argued to contain both an unergative structure (notice that it includes the eventive relation X1 to be associated with an external argument Z1 via F) and an unaccusative structure (notice that it includes the non-eventive relation X2, relating two non-relational elements, Z2 and Y2). It should also be emphasized that the mere relational structures in (8) are meaningful: the non-relational element Z1 is the Originator, X1 is the eventive relation, and X2 is the non-eventive one relating two non-relational elements, the Figure (i.e., its specifier) and the Ground (i.e., its complement). Y1 is to be regarded as the true Incremental Theme in the sense of Harley (2002, 2003). Notice then how the hypothesis that the syntactic argument structures in (8) are meaningful relates well to Gleitman's (1990ff.) syntactic bootstrapping hypothesis (see also Borer (2004) for related interesting discussion). For reasons of space, I will not discuss this correlation here.

On the other hand, I want to emphasize that an important tenet of the present theory is that there is no configurationally based decomposition beyond the syntax of argument structure. For example, I argue that the decomposition of complex verbs like the selected ones in (9) stops at the coarse-grained level of the syntax of argument structure, the root always being associated to a non-relational element encoding pure conceptual content.

(9a) John corraled the horse. [John [CAUSE [the horse [TCR CORRAL]]]]
(9b) John cleared the screen. [John [CAUSE [the screen [TCR CLEAR]]]]
(9c) John killed the rat. [John [CAUSE [the rat [TCR KILL]]]]
(9d) John pushed the car. [John [CAUSE [the car [CCR PUSH]]]]
(9e) John loved the city. [John [HAVE [the city [CCR LOVE]]]]
(9f) John laughed. [John [CAUSE LAUGH]]
(9g) The screen cleared. [BECOME [the screen [TCR CLEAR]]]
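Given the toy classes sketched above, the decompositions in (9) can be spelled out mechanically. The snippet below builds (9a) and (9g) purely as an illustration; note that the roots remain atomic objects, matching the claim that decomposition stops at the level of argument structure.

# Building two of the structures in (9) with the classes sketched earlier.

corraled = Eventive('CAUSE',                                     # (9a)
                    NonEventive('TCR',
                                figure=Root('the horse'),
                                ground=Root('CORRAL')),
                    originator=Root('John'))

cleared = Eventive('BECOME',                                     # (9g)
                   NonEventive('TCR',
                               figure=Root('the screen'),
                               ground=Root('CLEAR')))

# CORRAL and CLEAR are atoms: nothing in the structure decomposes them further.
assert corraled.complement.ground == Root('CORRAL')
assert cleared.originator is None          # unaccusative: no external argument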
Some comments on some non-prototypical transitive verbs are in order. Here I assume Hale and Keyser's (2002: 37–45) syntactic analysis of transitive activity verbs like to push (cf. (9d)) and transitive stative verbs like to love (cf. (9e)): according to them, both the impact noun push and the psych nominal love must be coindexed with their source, the external argument, i.e., the s(entential)-syntactic subject. In other words, here we are dealing with a special case of linking that is known in the literature as obviation: in particular, here I follow Hale and Keyser's (2002: 44) hypothesis that both the impact noun and the psych nominal should be supplied with a bracketed subscript representing a variable that must be bound obviatively. However, since the connection between features of lexical meaning and the argumental positions is not a matter of the basic argument structure relations themselves, we can ignore this point here (see Hale and Keyser (2002: chap. 2) for a particular proposal).4

Whatever the final right analysis of non-prototypical transitive verbs turns out to be,5 I want to embrace the non-trivial hypothesis that the ONLY open-ended
4 An anonymous reviewer points out that, as tempting as it is to try within the present framework, it is impossible to maintain that John pushed the car is constructed from something like [John [CAUSE [the car [CCR PUSH]]]], i.e., John caused [the car be pushed]. In particular, the reviewer points out that the semantic tests tell one that there is no change of state, and the syntactic tests tell one that the car is not an inner subject (e.g., there is no middle of push with the car as its subject). Our reply is as follows: first, I, along with Hale and Keyser (2002: 43), do NOT consider these verbs as involving a change of state. Rather, as pointed out by these authors, a more adequate (structural) paraphrase would be the following one: the syntactic argument structure of a sentence like He punched the bag would be similar to the one assigned to He gave the bag a punch, the CCR being interpreted as the central coincidence relation with: cf. He provided the bag WITH a punch. Concerning the middle test, I, along with Hale and Keyser (2002: 43), attribute the ill-formedness of *This bag punches easily or *This car pushes easily to the obviation facts involved in these verbs: the middle involves eliminating the external argument, the relation between the non-relational element PUNCH or PUSH and the causer/originator being then broken (cf. Hale and Keyser (2002: 37–45) for more discussion).
5 An anonymous reviewer points out that not all eventive transitive structures can boil down to (8a) above. In particular, this reviewer claims that there has to be room for a transitive structure that is analogous to (8b), where what is needed is to allow the incorporated Y1 in (8b) to potentially take an argument (see Harley (2003) for a particular implementation of this analysis). However, given the present assumptions, I cannot assume Harley's (2003) analysis of PUSH-verbs to be correct, since (the potentially infinite number of) non-relational roots can NOT take complements; only (a very small number of) relational elements can (cf. infra for the possible correlation between relational elements and functional categories). It seems to me that some terminological problems add confusion to the present issue, since the reviewer does not appear to take the following point into consideration: the fact that PUSH or KICK are relational event-denoting concepts does NOT necessarily require that they also be relational in terms of syntactic argument structure. Indeed, from a conceptual point of view, it is indisputable that a root like PUSH denotes a relational event-denoting concept, while a root like CAR denotes a non-relational non-event-denoting concept. However, this obvious point is not what is at issue here. I think that the very same unfortunate confusion between conceptual content and semantic construal one can typically find in many works on lexical semantics (cf. Jackendoff 1990) appears to be behind the reviewer's comments. The same confusion emerges when one considers concepts like FATHER or HAND as relational elements. Indeed, from a conceptual content-based point of view, they are. However, when the semantic construal dimension is considered (for our present purposes, when the syntactic argument structure dimension is considered), both FATHER and CAR are NON-relational elements (cf. Langacker (1987, 1999) for an excellent semantic analysis of nouns and verbs; see also Mateu (2002) for an attempt to adapt Langacker's (1987, 1999) insights on the important distinction between conceptual content and semantic construal to the generative framework).
Whatever the ultimately correct analysis of non-prototypical transitive verbs turns out to be,5 I want to embrace the non-trivial hypothesis that the ONLY open-ended class of roots is that corresponding to those non-relational elements occupying the specifier and complement positions in (9), those encoding syntactically irrelevant conceptual content. As far as syntactically-based decomposition is concerned, I claim that the non-relational element corresponding to the root in italics in (9) is an atom. One important caveat is in order here for followers of Fodor’s lexical atomism: the conceptual stuff depicted in caps in (9) must not be interpreted as it stands. For example, I do not actually claim that the non-relational element CORRAL in (9a) is to be interpreted as the noun corral. Rather, what is required is that CORRAL be interpreted as the non-relational element (i.e., the abstract Ground) included in the locative verb to corral (see Mateu (2001) for further discussion). The same holds for morphologically less transparent cases: e.g., in (9c) what is meant by KILL is the non-relational element (i.e., the abstract Ground) included in the caused change of state verb to kill. It should then be clear that, unlike what is said by Fodor & Lepore (1999), those adopting Hale & Keyser’s (1993) framework do not actually claim what Generative Semanticists did claim illo tempore, i.e., that the verb to kill means to cause to die (or alternatively CAUSE (X) to GO TO DEATH/BECOME NON ALIVE).6 Notice then that we have arrived at a very simple theory of what a possible (argument structure) primitive element could be. There are two kinds of primitive elements in the present syntactic theory of argument structure: relational elements (cf. (7a,b)) and non-relational ones (cf. (7c)). While the number of the former, i.e., the number of eventive and non-eventive relations, can be argued to be finite (in fact, very limited!), the number of the latter, i.e., the number of non-relational elements, can be argued to be potentially infinite.
6 On the other hand, unlike Hale and Keyser and many others, here I follow Harley’s (2002) (and Fodor’s (1970f.)) view that morphologically simple causative verbs are NOT to be decomposed as CAUSE-BECOME predicates, i.e., as bi-eventive predicates.
In striking contrast, because of his assuming that all lexical concepts are primitive elements,
Fodor is obliged to embrace the following non-trivial consequence pointed out by Jackendoff:
An especially unpleasant consequence of Fodor’s position is that, given the finiteness of the brain, there can be only a finite number of possible lexical concepts. This seems highly implausible, since one can coin new names for arbitrary new types of objects and actions (This is a glarf; now watch me snarf it), and we have no sense that we will someday run out of names for things (...) It is hard to believe that nature has equipped us with an ability to recognize individual things in the world that is limited to a finite number. (Jackendoff, 1990, pp. 40–41)
Indeed, the present theory allows us to maintain the basic intuition involved in the creativity of concept formation that is alluded to by Jackendoff. For example, we should not be surprised if there appears to be a non-trivial learning process involved in forming concepts by combining a very limited/finite set of relational elements (i.e., CAUSE/HAVE, BECOME/BE, and TCR/CCR) with a potentially infinite set of non-relational elements that have very specific meanings, like those that could be associated with glarf/SNARF, which by no means could be assigned the status of innate monads. It is also in this context that it seems appropriate to address Wolfram Hinzen’s (p.c.) criticism:
If these abstract notions (i.e., those encoded by the relational elements: CAUSE/HAVE, BECOME/BE, and TCR/CCR: JM) are not the concepts expressed by the corresponding English words, you have to tell me what they mean. And I find it incredible that the child can be burdened with knowing what CCR and TCR, or even things like telicity are. This is pure theoretical vocabulary of the linguist, as Fodor rightly points out. And even if the child did know these things, I do not see any intrinsic connection between knowing such abstract semantic concepts and specific structures. These notions could be structuralized in an almost arbitrary number of ways.
My reply to this criticism is as follows: although argument structures can be seen as purely syntactic objects (Hale & Keyser 1999; Mateu 2002), they have been argued to be associated with some very limited semantic notions. For example, the well-known syntactic distinction between unaccusative and unergative verbs has been argued to be semantically motivated (Perlmutter 1978): basically, the former can be described as transitional verbs (Pustejovsky 1995; Mateu 2002, i.a.), while the latter express an internal cause (Levin & Rappaport Hovav 1995; Baker 1997). In order to capture such a linguistic generalization, it has been shown that unaccusatives can be
decomposed as BECOME/BE predicates, while unergatives can be decomposed as DO/CAUSE predicates. Moreover, telicity has been argued to be a semantic notion that is crucially involved in determining syntactic unaccusativity (Zaenen 1993; Sorace 2000, among others). Indeed, notice that, without such a minimal decomposition of verbs, it would be very difficult to express the semantic commonalities and generalizations that can be found in both classes of intransitive verbs: for example, despite the fact that to arrive and to die are verbs that refer to different semantic content, they have a lot in common with respect to their syntactically relevant semantics: both can be seen as transitional verbs involving the BECOME function (Dowty 1979; Pinker 1989; Jackendoff 1990; Pustejovsky 1995; Hale & Keyser 1993, 2002; Mateu 2002, among many others). Moreover, despite the fact that both to arrive and to walk are motion verbs, it has been shown that, linguistically speaking (i.e., by considering their syntactically relevant semantics), they have nothing in common. What is actually relevant is that the former is a transitional verb, while the latter is a verb involving an internal cause, and that’s why the former is unaccusative and the latter unergative (Levin & Rappaport Hovav 1995, i.a.). Be this as it may, in the present context, I will not review the linguistic arguments for positing such relevant associations between syntactic and semantic notions (cf. Hale & Keyser 1993, 2002; Mateu 2002), nor will I repeat the linguistic arguments for positing a finite set of semantic functions (cf. Dowty (1979), Jackendoff (1990) or Baker (1997), among many others, for CAUSE and BECOME, Pinker (1989) for HAVE, and Hale (1986) for TCR and CCR). Rather, here I will limit myself to mentioning a powerful philosophical argument for the present minimal decomposition theory: indeed, notice that my positing above a finite list of relational elements (CAUSE/HAVE, BECOME/BE, TCR/CCR) and a potentially infinite list of non-relational ones allows me to avoid Jackendoff’s relevant criticism reviewed above. Otherwise, I, along with Fodor (and Hinzen), would be forced to embrace the non-trivial consequences pointed out by Jackendoff in his quote above. Quite probably, Hinzen’s criticism also points to the more general fact that there appear to be no solid arguments for so-called lexical decomposition. Indeed, it is well-known that Hale and Keyser (1993, 2002) claim that the above creativity or generativity is due to the fact that there appears to be a MINIMAL syntax inside the lexicon (contrast Jackendoff (1990), Pinker (1989), Levin (1993), and Pustejovsky (1995), who claim that it is a rich semantics, not a poor syntax, that is linguistically relevant when decomposing lexical items). This said, it should be clear that Hale and Keyser’s division between l(exical)-syntax and s(entential)-syntax is but one theoretical option. If we want to avoid the assumption that there is syntax (or, for the sake of the argument, let’s say generativity) inside the lexicon, another proposal, compatible with the one that is being
argued for by Borer (2004) or Marantz (1997), could be assumed here: the present distinction between relational elements and non-relational elements could be argued to coincide with the one between functional categories and encyclopedic-like roots, respectively. Indeed, assuming the latter perspective, it would be inappropriate to say that so-called lexical decomposition is involved when decomposing the causative verb kill into two functional relations (e.g., TCR and CAUSE) plus the non-relational element KILL. In the latter analysis there would be no lexical decomposition, since the truly lexical item, i.e., the non-relational element, would remain undecomposed or atomic. Be this as it may, it should be clear that whatever syntactic terminology we want to assume (i.e., Hale & Keyser’s or Borer’s/Marantz’s) is not the key issue, the real one being that we should all be aware that relational (or functional) elements do form a finite set, while non-relational ones do involve a potentially infinite one (a contrast made concrete in the sketch at the end of this section). Given this, it should be clear that it is our theory (not our intuitions!) that prevents us from taking lexical items such as to corral, to kill, to love, to cow, etc. as primitives, i.e., as innate lexical concepts à la Fodor. To be sure, with Hale and Keyser, we cannot take what we see to be what we get, since a minimal decomposition can be shown to be necessary in order to provide an appropriate answer to questions like the following ones: (i) Why are there so few theta-roles? (Hale and Keyser 1993; Mateu 2002, i.a.); (ii) Why is it the case that the limited number of theta-roles turns out to be related to the limited number of so-called lexical categories? (Hale and Keyser 1993; Mateu 2002, i.a.); (iii) Why is there no verbal predicate having more than three arguments? (Baker 1997; Mateu 2002, i.a.); (iv) Why is there no verbal predicate having a Theme as external argument and an Agent as internal argument? (Jackendoff, 1990; Johnson, 2004, i.a.). Without a minimal (syntactically-based) decomposition theory like the one argued for by Hale & Keyser (1993, 2002) and Mateu (2002), it is not clear to me which theoretically interesting answer could be provided to these non-trivial questions. To the best of my knowledge, no principled account has been given by Fodor and his followers concerning these non-trivial questions. No doubt, I am fully convinced that the appropriate answers to these important questions will also shed light on what a(n argument structure) primitive is. His rebus cognitis, i.e., with the previous theoretical background in mind, we are well prepared to refute Fodor and Lepore’s objection put forward in section 1.
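The two-sorted inventory of primitives just defended can be made vivid with a minimal sketch in Haskell (an illustration only, not part of Mateu’s or Hale and Keyser’s formal apparatus; all type and constructor names are hypothetical). The relational elements form a closed, finite datatype, the non-relational roots form an open-ended class, and structures are projected by relational heads alone:

    -- The closed, finite inventory of relational elements.
    data Relational = CAUSE | HAVE | BECOME | BE | TCR | CCR
      deriving (Show, Eq, Enum, Bounded)

    -- Non-relational roots: an open-ended class of atoms carrying pure
    -- (syntactically irrelevant) conceptual content; they never take
    -- complements.
    type Root = String

    -- Argument structures are projected only by relational heads, each
    -- with a specifier slot and a complement slot.
    data Struct
      = Atom Root                     -- a bare non-relational element
      | Rel Relational Struct Struct  -- [Spec [Head Compl]]
      deriving Show

    -- (9a) "John corralled the horse":
    -- [John [CAUSE [the horse [TCR CORRAL]]]]
    corralled :: Struct
    corralled = Rel CAUSE (Atom "JOHN")
                          (Rel TCR (Atom "THE HORSE") (Atom "CORRAL"))

The finiteness claim then corresponds to Relational being an instance of Bounded and Enum — its constructors can be exhaustively listed — while nothing whatever bounds the inhabitants of Root.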
3
The Objection Refuted
Recall that we owe Fodor and Lepore an explanation concerning their main objection against transformational approaches to decomposition of complex
words. This is repeated below:
There must be something wrong with Hale and Keyser’s account of cases like *It cowed a calf since, even if it did explain why there could not be a derived verb to cow with the paraphrase A cow had a calf, it does not explain why there could not be a primitive, underived verb to cow with the paraphrase (A cow had a calf). As far as we can tell, this sort of point applies to any attempt to explain why a word is impossible by reference to the impossibility of a certain transformation (...) We assume, along with Hale and Keyser, that the intuition about *It cowed a calf is that it is impossible and not just that if it is possible, then it is underived. (We do not suppose that anyone, except perhaps linguists, has intuitions of the latter kind.) So we claim that Hale and Keyser have not explained the intuition that to cow is impossible (...) (Patently, if E is not a possible meaning, then one does not need an explanation of why no word can mean it) (...) (Hale and Keyser’s account fails because: JM) it is left open that underivable words might not be impossible (because it is left open that they might be primitive). (Fodor and Lepore, 1999, pp. 449–450/452)
To be sure, I agree with them that nobody (linguists included!) has intuitions about primitives. So nothing is gained by pointing out that to cow in *It cowed a calf (with the paraphrase A cow had a calf) cannot be a primitive. It is clear that it is not our intuitions that should tell us what is a primitive and what is not. Indeed, I think that the success of such a task will depend on having an adequate theory. And here is a significant contribution of the theory I sketched out in section 2: as I emphasized above, the only open-ended class of roots corresponds to those NON-relational elements occupying the specifier and complement positions in (8) or (9). By contrast, it is quite plausible to argue that the relational elements (the eventive and non-eventive relations) do belong to a limited/closed class of elements. Notice then that the distinction between relational vs. non-relational elements becomes crucial in my reply to Fodor and Lepore: the mere relational nature of the verb to cow in *It cowed a calf with the paraphrase A cow had a calf should prevent us from taking this lexical item as a primitive, since in the present theory only NON-relational elements can be argued to encode pure (i.e., syntactically irrelevant) conceptual content. Moreover, the kind of background knowledge to be encoded into the alleged primitive relational verbal head to cow in *It cowed a calf (with the paraphrase A cow had a calf) could not be placed on a par with the non-encyclopedic-like meanings that are typical of the very limited set of relational elements encoding semantic construal (i.e., CAUSE/HAVE, BECOME/BE, and TCR/CCR). Accordingly, the relevant methodological moral appears to be the following one: the question What
is an impossible primitive is logically prior to the question What is an impossible word. Rebus sic stantibus, ONCE we (that is, the theoretical principles we sketched out in section 2, not our intuitions!) have eliminated the possibility that the verb to cow in *It cowed a calf (with the paraphrase A cow had a calf) can be granted a primitive status (because (i) it is not a primitive element from a potentially infinite set of NON-relational elements and (ii) it is not a primitive element from a very limited set of eventive and non-eventive/spatial relations: CAUSE/HAVE, BECOME/BE, and TCR/CCR), it is THEN correct to conclude that Hale and Keyser’s suggestion, repeated below, can be on the right track.
Fodor and Lepore object that we do not explain why there could not be a primitive, underived verb to cow with the paraphrase A cow had a calf. We guess that such a verb could only come about through illicit conflation (i.e., specifiers/subjects do not undergo conflation, only complements do: JM), in which case the conflation account is more successful than we have hoped to show. (Hale and Keyser, 1999, p. 463)
That is, once one has shown that the verb to cow in *It cowed a calf (with the paraphrase A cow had a calf) cannot be a primitive element, it then becomes plausible to argue that the only way for one to form the (derived) verb to cow (with the paraphrase A cow had a calf) would be through illicit conflation: namely, specifiers (i.e., subjects of predication) do not undergo conflation, only complements do; see Hale and Keyser (1993, 2002) for extensive argumentation on this point. The conclusion we have arrived at can then be exemplified with the (im)possible to cow/COW case as follows (and, more schematically, in the sketch following (10)): in (10a) the non-relational elements cow and COW are both presented as primitive elements. While the first one is morphologically realized as a noun, the second one is to be found in the decomposition of complex words (cf. (10b–c)): in (10b) the non-relational element COW is licitly involved in the formation of the complex verb to cow, this being due to its occupying a complement position; by contrast, in (10c), the non-relational element COW is illicitly involved in the formation of the complex verb to cow, this being due to its occupying a specifier position.
(10) The to cow/COW case:
(10a) primitive elements: cow (cf. What a cow!) and COW (cf. 10b–10c).
(10b) possible derived words: COW as complement (cf. John cowed Lucile is analyzed as [John [CAUSE [Lucile [TCR COW]]]]).
(10c) impossible derived words: COW as specifier (cf. *It cowed a calf (with the paraphrase A cow had a calf) is analyzed as [COW [HAVE a calf]]).
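The constraint can be added to the Haskell sketch from section 2 (again only an illustration; conflatable is a hypothetical simplification of Hale and Keyser’s conflation mechanism, not their own formulation): a root may conflate into a verbal head only if it sits on the complement spine of the structure, never in a specifier.

    -- Conflation is licensed only from complement positions.
    conflatable :: Root -> Struct -> Bool
    conflatable _ (Atom _)       = False
    conflatable r (Rel _ _ comp) = isRoot comp || conflatable r comp
      where isRoot (Atom r') = r == r'
            isRoot _         = False

    -- (10b): COW occupies a complement position, so conflation is licit.
    cowedLucile :: Struct
    cowedLucile = Rel CAUSE (Atom "JOHN")
                            (Rel TCR (Atom "LUCILE") (Atom "COW"))

    -- (10c): COW occupies a specifier position, so conflation is illicit.
    cowedACalf :: Struct
    cowedACalf = Rel HAVE (Atom "COW") (Atom "A CALF")

    -- conflatable "COW" cowedLucile == True
    -- conflatable "COW" cowedACalf  == False

The sketch assumes the Relational, Root, and Struct declarations given at the end of section 2.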
References
Baker, M. (1997). Thematic roles and syntactic structures. In L. Haegeman (Ed.), Elements of grammar. Dordrecht: Kluwer.
Borer, H. (1994). The projection of arguments. In E. Benedicto & J. Runner (Eds.), Functional projections (Vol. 17). Amherst, MA: University of Massachusetts.
Borer, H. (2004). Structuring sense (2 vols.). Oxford, New York: Oxford University Press.
Chomsky, N. (1995). The minimalist program. Cambridge, MA: MIT Press.
Dowty, D. (1979). Word meaning and Montague grammar. Dordrecht: Reidel.
Fodor, J. (1970). Three reasons for not deriving kill from cause to die. Linguistic Inquiry, 1, 429–438.
Fodor, J. (1998). Concepts. Where cognitive science went wrong. Oxford: Clarendon Press.
Fodor, J., & Lepore, E. (1999). Impossible words? Linguistic Inquiry, 30(3), 445–453.
Gleitman, L. (1990). The structural source of verb meanings. Language Acquisition, 1, 3–35.
Hale, K. (1986). Notes on world view and semantic categories: Some Warlpiri examples. In P. Muysken & H. Riemsdijk (Eds.), Features and projections. Dordrecht: Foris.
Hale, K., & Keyser, J. (1993). On argument structure and the lexical expression of syntactic relations. In K. Hale & J. Keyser (Eds.), A view from Building 20: Essays in linguistics in honor of Sylvain Bromberger. Cambridge, MA: MIT Press.
Hale, K., & Keyser, J. (1999). A response to Fodor and Lepore: Impossible words? Linguistic Inquiry, 30(3), 453–466.
Hale, K., & Keyser, J. (2002). Prolegomenon to a theory of argument structure. Cambridge, MA: MIT Press.
Harley, H. (1995). Subjects, events and licensing. Unpublished doctoral dissertation, MIT.
Harley, H. (2002). A minimal(ish) linking theory. (Unpublished manuscript)
Harley, H. (2003). How do verbs get their names? Denominal verbs, manner incorporation and the ontology of verb roots in English. In T. Rapoport & N. Shir (Eds.), The syntax of aspect. Oxford University Press. (To appear)
Jackendoff, R. (1990). Semantic structures. Cambridge, MA: MIT Press.
Johnson, K. (2004). From impossible words to conceptual structure: The role of structure and processes in the lexicon. Mind and Language, 19(3), 334–358.
Kiparsky, P. (1997). Remarks on denominal verbs. In A. Alsina, J. Bresnan, & P. Sells (Eds.), Complex predicates. Stanford, CA: CSLI Publications.
Kratzer, A. (1996). Severing the external argument from the verb. In J. Rooryck & L. Zaring (Eds.), Phrase structure and the lexicon. Dordrecht: Kluwer.
Langacker, R. W. (1987). Nouns and verbs. Language, 63, 53–94.
Langacker, R. W. (1999). Grammar and conceptualization. Berlin: Mouton de Gruyter.
Levin, B. (1993). English verb classes and alternations: A preliminary investigation. Chicago: The University of Chicago Press.
Levin, B., & Rappaport Hovav, M. (1995). Unaccusativity: At the syntax–lexical semantics interface. Cambridge, MA: MIT Press.
Marantz, A. (1997). No escape from syntax: Don’t try morphological analysis in the privacy of your own lexicon. UPenn Working Papers in Linguistics, 4, 201–225.
Mateu, J. (2001). Locative and locatum verbs revisited: Evidence from Romance. In J. R. Y. d’Hulst & J. Schroten (Eds.), Romance languages and linguistic theory 1999. Amsterdam: John Benjamins.
Mateu, J. (2002). Argument structure: Relational construal at the syntax–semantics interface. Unpublished doctoral dissertation, Universitat Autònoma de Barcelona.
Perlmutter, D. (1978). Impersonal passives and the unaccusative hypothesis. BLS, 4, 157–189.
Pinker, S. (1989). Learnability and cognition: The acquisition of argument structure. Cambridge, MA: MIT Press.
Pustejovsky, J. (1995). The generative lexicon. Cambridge, MA: MIT Press.
Sorace, A. (2000). Gradients in auxiliary selection with intransitive verbs. Language, 76(4), 859–890.
Talmy, L. (2000). Toward a cognitive semantics. Cambridge, MA: MIT Press.
Uriagereka, J. (1999). Rhyme and reason. An introduction to minimalist syntax. Cambridge, MA: MIT Press. Uriagereka, J. (2000). So what’s in a word? Some syntactic notes on the decomposition debate. (Unpublished manuscript) Zaenen, A. (1993). Unaccusativity in Dutch: Integrating syntax and lexical semantics. In J. Pustejovsky (Ed.), Semantics and the lexicon. Dordrecht: Kluwer.
Is Compositionality an Empirical Matter? Jaroslav Peregrin
1
The Status of the Principle of Compositionality
The principle of compositionality of meaning is often seen as a kind of a ‘natural law’ of semantics: we, finite beings, so the story goes, cannot grasp an infinite stock of meanings otherwise than as composed out of a finite stock of primitive building blocks. Therefore we are restricted, out of the many possible kinds of languages, to the compositional kind. Hence although there might be noncompositional languages, they would not be intelligible for us. This received wisdom has not been substantially shattered by periodically appearing attempts at showing that, as a matter of fact, our actual natural language is not compositional. However, in 1983 there appeared a different kind of challenge which was taken more seriously: Janssen (1983) presented a proof of a theorem which suggested that the principle cannot be considered a real law because it is simply vacuous. The theorem stated, in effect, that any range of expressions can be mapped on any range of entities in a compositional way, and hence appeared to imply that the principle is not capable of excluding any kinds of meanings. Recently, a more sophisticated version of the same argument was presented by Zadrozny (1994), who gives a simple algorithm for constructing, given an alphabet A, a set M, and a mapping m (‘meaning assignment’) of A on M, a function µ mapping all concatenations of elements of A on M with the following properties: (i) The value of µ for the concatenation of s and t is always µ(s)(µ(t)) (hence µ is not only compositional in that its value for a whole is uniquely determined by the values of its parts; it is, moreover, the result of the application of the value of one of its parts to those of the others). (ii) The value of µ for a simple symbol s is trivially transformable into its antecedent ‘meaning’ m(s), namely µ(s)(s) = m(s).
Address for correspondence: Department of Logic, Institute of Philosophy, Academy of Sciences, Jilská 1, 110 00 Prague, Czechia. E-mail: [email protected].
The upshot is taken to be that every assignment of any kinds of meanings to any kinds of expressions is trivially compositional; and hence that “the property that the meaning of the whole is a function of the meanings of its parts does not put any material constraints on syntax and semantics” (pp. 340–341). This conclusion has given rise to a wave of criticism (Kazmi and Pelletier, 1998; Westerståhl, 1998; Szabó, 2000). I think that the critical papers clearly indicate that the inference from the mathematical results of Janssen and Zadrozny to the conclusion that the principle is void is flawed; however, I also think that there is still an important aspect of compositionality which is not reflected by the current discussion and which, when scrutinized, poses another kind of challenge to the ‘natural law’ understanding of the principle of compositionality. Therefore I am convinced that despite the recent discussion, which may appear to scrutinize compositionality in an exhaustive way, the standard, pervasive understanding of the principle is much less indisputable than it may prima facie appear. I think this is partly caused by underestimating the distinction between matters of pure mathematics and matters of (mathematical treatment of) empirical phenomena.1 By now, the situation is relatively perspicuous on the mathematical side of the problem: it seems that since the seminal work of Montague (1970) it has become clear that the proper mathematical counterpart of the intuitive concept of compositionality is the concept of homomorphism (a meaning assignment µ being compositional if µ(σ(e1, ..., en)) = σ*(µ(e1), ..., µ(en)) for every syntactic operation σ and some corresponding semantic operation σ*). And given this, two (trivial) facts become relevant (see Westerståhl, ibid., for a more detailed discussion):
FACT 1. Not every mapping of the carrier of an algebra A on the carrier of an algebra B is a homomorphism from A to B.
FACT 2. Every mapping of the carrier of an algebra A on the set S is a homomorphism from A to an algebra with the carrier S.
Now with a certain amount of oversimplification we can say that what Janssen and Zadrozny pointed out was, in effect, Fact 2 (and more elaborated variations on it); while what their critics urge is that the interesting issue is not Fact 2, but rather Fact 1. On the empirical side, the situation might seem similarly clear: the question whether natural languages are compositional (whether their meaning assignments can be reconstructed as homomorphisms) seems to be bound to be a matter of empirical investigation. Turning our attention from formal to natural languages, and hence to the realm of empirical phenomena, we must obviously switch from deductive, mathematical reasoning to inductive, empirical investigations; hence claims about the compositionality of natural languages appear to have to be empirical findings.
1 See Peregrin (2000a) for a general discussion of this methodological problem.
Thus, Szabó (ibid., p. 478) writes: “Not all languages are compositional: surely a hypothetical language where the meanings of complex expressions are influenced by the weather, while their structure and the meanings of their constituents are not, would be non-compositional by definition.” This seems a straightforward argument: as natural languages are empirical entities, finding out whether they all have a property, e.g. whether they are compositional, is bound to be an empirical enterprise. But is it really? Consider the following analogous claim: since bachelors are empirical individuals, finding out whether they all have a property, e.g. whether they are married, is bound to be an empirical enterprise. Of course this is far from plausible: surely we have not come to be sure that bachelors are unmarried by empirical generalization! We know that bachelors are unmarried, for being unmarried is the constitutive property of bachelorhood – to be a bachelor simply is to be unmarried; this is what the word “bachelor” means. Now is the conviction that the compositionality of natural languages is bound to be an empirical issue any more plausible? Compositionality perhaps is not the constitutive property of languagehood, but could it not be one of a cluster of properties which together are constitutive of it? Prima facie, this may not be apparent: why on earth could there not be a language with meanings distributed in a non-compositional manner? Why could we not, as Szabó suggests, create such a distribution by definition? The question we have to ask here is whether we can make sense of the concept of meaning without relying on the principle of compositionality (hereafter PC) in the first place. The point is that the possibility of creating a noncompositional language stipulatively makes sense only provided the relation between the concept of meaning (and hence that of language) and the concept of compositionality is contingent, i.e. empirical – for if it were the case that compositionality were (co-)constitutive of the concept of meaning, and thereby of the concept of language, this possibility would be simply precluded. Understanding PC as an empirical thesis clearly presupposes prior distinct knowledge of what meanings are, and hence is impossible if PC is what takes part in the constitution and individuation of meanings. Compare this status of PC with that of the principle of extensionality of sets, stating that two sets with identical elements are identical: it makes no sense to try to discover whether sets are extensional, for to be extensional is part of what it takes to be a set; and I claim that to be compositional is part of what it takes to be a meaning. Of course, if you assume that you are free to tamper with the senses of words (like good old Humpty Dumpty), you can ‘make’ some meanings noncompositional – just as you can ‘make’ some bachelors married by letting the word “bachelor” refer to, say, butchers. Exactly here is where I think the pernicious conflation of ‘the formal’ and ‘the natural’ takes place. Within mathematics, ‘natural’ meanings are usually not crucial: what is crucial are definitions. You can take, say, the word “group” and furnish it with a meaning quite different from its
“natural” one: if you do it by means of a correct definition (and if you prevent any confusion of the old and the new sense of “group”), everything is in order. Similarly, you can redefine the term “meaning” so that it means, say, ‘any object assigned to an expression and called meaning’ – but then you must keep in mind that studying meanings in this new sense is something different from studying meaning in the intuitive sense. Clearly studying, say, the lengths of expressions is something far removed from studying their meanings – but nothing prevents us from creating a new, artificial sense of “meaning” by calling the mapping of expressions on their lengths a meaning assignment. Szabó obviously feels that he should block the possibility of reinterpreting the term “meaning” too weirdly, and hence he stipulates (p. 480): “The meaning of an expression is something that plays a significant role in our understanding of the expression.” But this is odd. What is a significant role? I would assume, for example, that almost everybody’s understanding of the word “brontosaurus” has been significantly aided by a picture – but could such a picture be taken as what the word means? To make Szabó’s postulate at least minimally plausible, we would need to change it to something like “The meaning of an expression is what one grasps when one understands the expression”, but even so it would be of little use, for the concept of “grasping” in this context is just as obscure as the concept of meaning itself. As far as I can see, our predicament is the following: either we can satisfactorily explicate the concept of meaning without the help of PC, and then we are vindicated in taking the principle as an empirical thesis, or we cannot do this, and then the compositionality of meaning, and hence of language, is a conceptual, ‘a priori’ matter. And what I am going to argue for is that there are some good reasons (going back to the work of the ‘founding father’ of modern semantics, Gottlob Frege) to take this possibility seriously. I will discuss Frege in greater detail shortly, but here I would like to provide an illustration of what I have just suggested. Consider, as an example, Frege’s semantic terminology: each expression has a sense (Sinn) and a meaning (Bedeutung). Which of the two values (if any) is the meaning, in the intuitive sense of the word, of the expression? If we hold the Humpty Dumpty view, our answer is bound to be: obviously the one which Frege calls meaning! But could we hold that to know the meaning of ‘the president of the USA’ is to know its Bedeutung, namely (now, in 2001) G. W. Bush? It is clear that she who understands ‘the president of the USA’ need not know the actual president, but rather needs only to know a criterion (which, as a matter of fact, picks out Bush right now). It thus seems better to say that what Frege calls meaning is in fact not meaning in the ordinary sense – and that it is Frege’s sense which in this case provides the more plausible counterpart of the intuitive concept of meaning.
2
The Story of Frege’s Bedeutung
So far I have argued that we should think twice before taking for granted that PC is an empirical matter, and that we should consider the possibility of PC being an ‘analytic’ principle taking part in shaping the concept of meaning. Now let me present a more explicit story about how the role of PC within the enterprise of this shaping can be seen; it is the story of what Frege called the ‘Bedeutung’ of an expression. In this section I will use the term ‘meaning’ as the equivalent of ‘Bedeutung’. Let me briefly review Frege’s account. First, Frege pointed out that meaning cannot be a matter of what is going on in somebody’s head, that it cannot be a matter of psychology.2 The argument he gave was, in effect, the following: (i) What is true and what entails what is an objective matter, independent of what anybody thinks is true or entailed – truth and entailment are not a matter of anybody’s subjective psychology (hence also logic is to be sharply separated from psychology).3 (ii) Whether a sentence is true is generally a matter of how things are in the world, but also of what the words of which it consists mean. (It is clear that the sentence “Snow is white” would not be true if, say, “is” meant has beaten in chess.) (iii) Hence meanings cannot be a matter of subjective psychology – on pain of what is true being influenceable by subjective views. Frege thus concluded that meanings must be some objective entities; and he also concluded that the meanings of at least some expressions cannot but be extracted from claims in which the expressions occur.
2 See especially his 1918/9 paper Der Gedanke.
3 I think this holds even for what is nowadays called cognitive psychology.
Let us consider Frege’s analysis of the meaning of a predicate. We know that a predicate can combine with a name to form a sentence. So if we take a predicate p, we know that together with the name n1 it yields a sentence s1, with n2 it yields s2, etc.:

p + n1 = s1        “to have wings” + “Frege” = “Frege has wings”
p + n2 = s2        “to have wings” + “Batman” = “Batman has wings”
...

Hence we can see p as a means of assigning s1 to n1, s2 to n2, etc.:

p: n1 → s1         “to have wings”: “Frege” → “Frege has wings”
   n2 → s2                          “Batman” → “Batman has wings”
   ...

Now suppose we know what the meanings of both names and sentences are. If we denote the meaning of a as ‖a‖ (with quotes, if any, omitted), we can transfer the whole consideration to the level of semantics:

‖p‖: ‖n1‖ → ‖s1‖   ‖to have wings‖: ‖Frege‖ → ‖Frege has wings‖
     ‖n2‖ → ‖s2‖                    ‖Batman‖ → ‖Batman has wings‖
     ...

Now suppose further that the meanings of names are the objects named by them (the meaning of “Frege”, ‖Frege‖, is the person Frege, that of “Batman”, ‖Batman‖, is Batman) and that the meanings of sentences are their truth values (‖Frege has wings‖ being the falsity, F, that of ‖Batman has wings‖ being the truth, T). (Let us, for the sake of the argument, disregard the obvious implausibility of this kind of explication of the concept of meaning of sentences.) Given this, we have

‖to have wings‖: Frege → F
                 Batman → T
                 ...

In other words, ‖to have wings‖ is what must be added to Frege to yield F, to Batman to yield T, and so on for all conceivable individuals. Frege’s ingenious idea was to explicate this entity as a function in the mathematical sense, so that

‖to have wings‖(Frege) = F
‖to have wings‖(Batman) = T
...

Now can we take PC to vindicate Frege’s proceeding here? Not really. The principle does not tell us more than that if the meanings of Frege is a philosopher and Frege has wings are different, then so must be those of to be a philosopher and to have wings. (The point is that the principle of compositionality implies that if the meanings of two wholes differ, then the meanings of at least one of their corresponding parts are bound to differ too.) It excludes some possibilities of capturing the meanings, but it is far from pinning their range down to the very one entertained by Frege, namely to identifying the meanings of predicates with functions mapping meanings of names on meanings of sentences. It thus seems that to conclude what Frege did requires some additional principles.
4 In Peregrin (1994) I called it the principle of Occam’s razor, but this term might perhaps be misleading. What I had in mind was the general maxim governing every scientific explanation, namely to take for extant only the minimal forces needed to bring about an observable effect.
However, what I suggest is that Frege’s conclusion can be seen as the result of his realization of the fact that there is no additional meaning-characterizing principle which would narrow down the scope of the functions available; and that hence we are justified in taking recourse to a kind of a general ‘minimality maxim’4 – i.e. to see meaning assignment as the simplest of the available functions. From this viewpoint, the meaning of an expression is the simplest thing which must be added to the meanings of the expressions accompanying it to reach the meaning of the complex expression they form together. Hence the meaning of to have wings is the simplest thing which must be added to the meaning of Batman to reach the meaning of Batman has wings, to that of Frege to reach the meaning of Frege has wings, etc. From this angle, the problem of finding the meaning of to have wings, given we have the meanings of both Frege, Batman, etc. and Frege has wings, Batman has wings, etc., is the problem of ‘subtracting’ the meanings of names, individuals, from the meanings of sentences containing the names, i.e. from certain truth values. This is the problem Frege solved by bringing in functions mapping the former on the latter (which eventually provided for the divorce of semantics from psychology and its marriage to mathematics). What has just been summarized is, of course, far from the whole story of Frege’s semantics. And the way the story continued is instructive precisely from the viewpoint advocated here: Frege soon discovered that he could not make do with his Bedeutung, because it is not compositional. (Frege, for instance, realized that to determine the truth value of a sentence like Napoleon, who realized the danger for his right wing, himself led his guards against the position of the enemy we may need more than just the truth value of the relative clause.5) In general, if Bedeutung were compositional, any two sentences with the same truth value would have to be interchangeable within any complex sentence salva veritate; and it is easy to see that this is not the case. Frege’s conclusion is that in certain contexts a sentence, or indeed any expression, can come to stand for something else than its usual meaning. This was why Frege postulated, in addition to its Bedeutung, his other semantic value, the Sinn. Hence the corrected story says that the meaning of a complex expression is not always a function of the (ordinary) meanings of its parts, but sometimes a function of their senses – or, expressed differently, that it is always a function of the meanings, but that the place of the meaning of an expression can sometimes come to be occupied by its sense.
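The functional explication just reviewed is easy to render concretely. Here is a minimal Haskell sketch (an illustration only; the toy two-individual domain and all names are assumptions of the sketch, not Frege’s or the author’s):

    -- A toy domain of individuals and classical truth values.
    data Individual = Frege | Batman deriving (Eq, Show)
    type TruthValue = Bool

    -- ‖to have wings‖, explicated as a function from individuals to
    -- truth values: what must be "added" to ‖Frege‖ to yield F and to
    -- ‖Batman‖ to yield T.
    hasWings :: Individual -> TruthValue
    hasWings Frege  = False
    hasWings Batman = True

    -- The Bedeutung of a predication is then functional application:
    -- ‖n has wings‖ = ‖to have wings‖(‖n‖), e.g.
    --   hasWings Batman == True
    --   hasWings Frege  == False

On this picture, compositionality at the level of Bedeutung is simply the fact that the value of the whole is obtained by applying the value of one part to the value of the other.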
3
Meanings as ‘Contributions’
There is one more aspect of the Fregean method which is worth investigating in some detail.
5 See Frege (1892).
The mechanism by which Frege established the Bedeutungen of predicates seems to be applicable to establishing the meanings of one kind of words (e.g. predicates) only given we know those of other kinds of words (names, sentences) – on pain of an infinite regress. We can obtain meanings in this way only if we have some meanings to subtract from and some meanings to subtract. But this is not necessarily the case – there is a principle which stops the regress, and this is the principle stating that sentences are bound to differ in meaning if they differ in truth value.6 This means that when subtracting meanings from meanings, we ultimately subtract meanings from truth values. To see what is meant by this, consider the following reformulation of the principle of compositionality: If the meanings of two wholes differ, they cannot be composed in the same way, or else the meanings of all their corresponding components cannot be the same. While the standard formulation of PC claims that sameness of meaning projects from parts to wholes, this one reverses the angle and states that difference of meaning projects from wholes to parts. Let us refer to this reformulation of the principle as PC* (and let us postpone the discussion of the question whether we are right in taking it as nothing more than a reformulation of PC). However, if we see the differences of meanings of parts as grounded in the differences of meanings of the wholes which contain them, the question which naturally arises is: what are the differences of meanings of the wholes grounded in? In the differences of meanings of larger wholes? But is this not an infinite regress? And here, I suggest, the Fregean answer is that at some ultimate point the difference of meanings is grounded in the difference of truth values. If the meanings of some two expressions can be seen as differing in virtue of the fact that the meanings of some expressions containing them differ, then the latter difference, and thereby the former, must be traceable to the fact that some sentences containing the expressions differ in truth value. (And truth is, as Frege stresses, the ultimate subject matter of logic and semantics.) Frege accounted for this fact by simply identifying the meanings of sentences with their truth values, but this worked only for the ‘extensional’ core of our language – once we take into account also the variety of modal sentences present within natural language (not to mention propositional attitude reports), it results in a noncompositional mapping. We can see the situation in such a way that although truth is the ultimate source of meaning,7 the meaning of a sentence is not a matter of merely its truth value, but also of the sentences of which it is a part.8
6 For lack of a better name, I called it the principle of verifoundation elsewhere (see Peregrin, 1994). Cresswell (1982) considers it to be the most certain principle of semantics.
7 A point urged forcefully especially by Davidson (1984).
8 See Peregrin (2001, Chapter 8) for more details.
Thus, the meaning of Batman has wings cannot be its truth value, because it contributes more than its truth value to the truth value of such sentences as Because Batman has wings, he can fly. (Surely we could not replace Batman has wings in this complex sentence by any other true sentence without changing the truth value of the whole!) So on the ‘upper’ end, the regress-stopper is truth – but we still seem to need a regress-stopper on the ‘lower’ one. Even if what we subtract from, ultimately, are truth values, to do away with the regress completely we appear to need some ultimate subtrahends. From what we have said about Frege, it seems that the role of this opposite regress-stopper can be played by individuals, which may appear to be the natural meanings for names. But in fact the story may be more complicated. Consider Frege’s (1884, p. 73) famous claim that “It is only in the context of a proposition that words have any meaning.” This indicates that Frege was not willing to establish the meanings of any words, not even names, independently of investigating the truth values of the sentences in which they occur.9 Hence it would suggest that we do not really subtract, but rather decompose. We seem to subtract the meanings of names from the meanings of sentences to reach the meanings of predicates, but this is only because, as a matter of methodology, we do not want to deal with all meanings in one sweep. We simply analyze the meaning of one part of a sentence (the predicate) relative to another part (the name). In principle, we could have done it the other way around – namely, have taken the meaning of the predicate as primitive and analyzed that of a name as a function assigning truth values to objects of these kinds.10 This leads to the conclusion that the meaning of an expression is the contribution which the expression brings to the truth of the sentences in which it occurs. This is, to repeat, the result of the thesis that what is constitutive of the concept of meaning is (i) the principle of compositionality; (ii) the principle of verifoundation (stating that difference in truth value forces difference in meaning); and (iii) the maxim of ‘minimality’. This also indicates that from the Fregean viewpoint, another constitutive characteristic of meaning, besides compositionality, is its interlinkedness with truth – nothing can be the meaning of a sentence unless it determines its truth conditions.11 Now consider Szabó’s (ibid.) example of “Crypto-English”, which arises from English by the interchange of the meanings of “Elephants are gray” and “Julius Caesar was murdered on the Ides of March”.
9 See Peregrin (2000b) for a discussion of why Frege took the meanings of sentences as constitutive of the meanings of their parts and not vice versa.
10 While in the first case a ‘property’ comes to be analyzed as a class of ‘individuals’, in the latter an individual comes to be analyzed as a bundle of ‘properties’.
11 This was urged also by David Lewis (1972) in the course of his criticism of Katz.
If we were to subscribe to the Fregean attitude just outlined, then if we change the meaning-assignment of English in the suggested way, what results is no longer a meaning assignment – not that it is not a meaning assignment for English, but that it is not a meaning assignment for any language. The reason is that if the meaning of a word is the contribution it brings to the meanings of the wholes in which it occurs, there is no way of changing the meaning of a sentence without changing the meaning of some of its parts.
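Schematically (a reconstruction in notation not used in this paper), let µ be the meaning assignment and σ a fixed syntactic mode of composition; then

\[ \textit{PC:}\quad \mu(a_1)=\mu(b_1),\ \ldots,\ \mu(a_n)=\mu(b_n)\ \Longrightarrow\ \mu(\sigma(a_1,\ldots,a_n))=\mu(\sigma(b_1,\ldots,b_n)) \]

\[ \textit{PC*:}\quad \mu(\sigma(a_1,\ldots,a_n))\neq\mu(\sigma(b_1,\ldots,b_n))\ \Longrightarrow\ \mu(a_i)\neq\mu(b_i)\ \text{for some } i \]

So read, the two are contrapositives of one another, which is why treating PC* as a mere reformulation of PC is at least prima facie defensible; the next section takes up the objection that PC says more than this.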
4
Compositionality as a Regulative Principle
I anticipate the objection that rewriting PC as PC*, which played a substantial role in our above considerations, is not legitimate. If PC is understood as an empirical thesis about the way meanings of parts add up to the meaning of a whole consisting thereof, then turning PC into PC* will probably appear as a mutilation – for from this empiricist perspective it is part and parcel of what PC says that parts are primary to the whole composed of them. And what I suggest is that we give up such a metaphysical reading of PC in favor of the ‘structural’ reading which takes it to say nothing more than what PC* also says. This is, I think, an integral part of Frege’s ‘contextualism’, exhibited by the above quotation and also by his repeated insistence that concepts which add up to a proposition are not primary to the proposition.12 I do not expect everybody to accept the particular Fregean story about meaning just outlined; and it is not my purpose to agitate for it here (although I do believe there is much to be endorsed in it).
12 Hence I think that this view, according to which PC states nothing over and above what is stated also by PC*, can throw some light on the apparent tension between Frege’s (implicit) endorsement of PC and his contextualism. (See Janssen, 2001, and Pelletier, 2001, for recent contributions to the discussion.) What I suggest is that what we have here is not an incompatibility, but rather a certain kind of complementarity. While contextualism leads to the view that the meanings of subsentential expressions are to be seen as based on those of sentences, PC (together with the principle of verifoundation and the ‘minimality’ maxim) instructs us how to individuate their meanings: namely, as the contributions these expressions bring to the meanings of sentences (and eventually to their truth values). See Peregrin (2001, Chapter 4) for more details. The point that there is a sense in which compositionality is not incompatible with – but rather complementary to – contextuality is pointed out also by Hodges (2001, p. 19). That compositionality and contextuality are complementary aspects of Frege’s viewpoint is argued also by Rott (2000). A thoughtful general discussion of the possibilities of reading PC as compatible with a holistic view of language (necessitated by contextualism) is given by Pagin (1997).
What I do want to argue for is that we should see that there are good reasons for taking PC not as an empirical, but rather as a conceptual, and hence regulative, principle: as a principle which is needed to single out the level of meanings from the many levels of semantic values which may be associated with meaningful expressions.13 The analogy between the concept of meaning and that of set is worth pursuing here. Sets are, we may say, one peculiar kind of a species whose other kinds are collections, groups, herds, etc. What distinguishes sets from the others? It is especially that they are individuated solely by their elements – you can possibly have two different groups of people consisting of the very same individuals, but never two such different sets. Hence, what is distinctive of the concept of set is the principle of extensionality. Now meanings are, analogously, a peculiar kind of the entities accompanying expressions, other kinds of which are mental images, referents, etc.; and we need a way of singling them out. And what I suggest is that it is PC which can plausibly perform this task. Of course, our expressions are also often associated with various kinds of entities which do concern their semantics and which are not compositional. (Fregean Bedeutungen or extensions are an example.) Now it is hard to prevent somebody from calling some of these noncompositional entities “meanings” – just as it is hard for a set-theoretician to prevent somebody from using the term “set” for something which is nonextensional. But if the term meaning is not to be ‘free-floating’ in the sense of being freely applicable in the Humpty Dumpty way, there must be a criterion for singling out the kind of entities to which it applies properly.
5
Compositionality of Meaning vs. Compositionality of Other Things
To avoid misunderstanding: I do not want to claim that the question Is natural language compositional? cannot be interpreted in a nontrivial way. I do not want to dismiss any kind of empirical research done under this heading as misguided. What I claim is that in such a case what is in question is usually not the compositionality of meaning. Surely, for example, the question whether the assignment of truth values to statements of a natural language is compositional is a meaningful empirical question.
13 Of course what really is the meaning of the word ‘meaning’ is largely a truly empirical question: a question to be answered by an investigation of the ways in which we actually employ the term. Thus although I suggest that ‘meaning is compositional’ is an analytic truth, that this is so is at most an empirical one – hence I do not think there can be a knock-down argument here. What I put forward are rather ‘pragmatic’ arguments indicating that PC, if endorsed as analytic, can smoothly do us a job which needs to be done.
The concept of truth, in contrast to that of meaning, does not depend on PC; and hence we can simply answer that, as a matter of fact, no natural language is likely to be compositional in this sense (with the exception of its narrow core which has given rise to the classical propositional calculus, and whose acceptable assignments of truth values hence are also compositional). Similarly, we can pose the meaningful empirical question whether the historical development of a natural language is always compositional in the sense that the introduction of a new kind of phrase always merely builds on the previous meanings of the words of which the phrase is composed. And again, the answer would most probably be negative – for example, in English there are many cases where this is obviously not so: it is clear that the meaning of, say, “blue chips” was not uniquely determined by the antecedent meanings of “blue” and “chip”. (From the viewpoint of the theory of meaning, then, we have either to take the phrase as an undecomposable idiom with a primitive meaning, or to conclude that the meaning of “blue” and/or “chip” has been extended by the introduction of the new phrase – that, say, knowing what “chip” means has come to involve knowing what it means as part of this phrase.) Similarly, we can investigate the compositionality of many other kinds of things systematically connected to natural language expressions. What I claim is that the question about the compositionality of meaning is of a basically different character than those about the compositionality of the other things. Whereas in the latter case we face an empirical question, in the former we do not. We presume that expressions of a meaningful language must have some compositional semantic values, the only nontrivial problem being to find and to explicate them. Consider the language of first-order logic. The language is taken to be extensional (i.e. based on semantic values of the kind of Frege’s Bedeutungen), and hence it might seem that what its formulas should be taken to mean are their truth values. However, if we work on the level of the predicate calculus (rather than the propositional one), we know that there is more than truth values and their combinatorics in play; we know that to account for the semantics of the formulas of the language we need the machinery of satisfaction. Now what is to be seen as the meaning of such a formula? When we check the origins of model theory, we can see that the reason why Tarski introduced the concept of satisfaction was precisely that he could not build a compositional semantics out of merely truth values;14 and that the level at which the semantics becomes compositional is the level where the formulas are taken to denote classes of valuations of variables (or, equivalently, classes of sequences of objects, or functions from the valuations or the sequences into truth values, etc.).
14 See Kirkham (1992, Chapter 5), and also Peregrin (2001, Section 6.2).
It is, then, from our perspective, only the knowledge of which valuations satisfy a given formula which can be taken as amounting to knowledge of what the formula means (cf. the sketch at the end of this section). Take an example of an allegedly non-compositional language: Hintikka’s independence-friendly logic (IFL),15 in which formulas are interpreted as representing certain games. The author repeatedly claims that the language is not compositional, and indeed the game associated with a formula can generally not be built out of those associated with its subformulas. On the other hand, even the exponents of IFL do not doubt that there must be some level at which the language is compositional:
“Various compositional semantics for IF logics have been designed (...). There was no reason to expect otherwise, since the syntax of IF logic is finitely generated. The reason why it is still worth insisting on non-compositionality is that non-compositional semantics by means of games of imperfect information may turn out to be simpler and easier to apprehend than compositional ones; not to mention the philosophical virtues of game-theoretical semantics such as its being a representation of the human activities of seeking and finding connected with quantifiers, decision-making at connectives, and a responsibility-shift at game negation.”16
From our viewpoint, this simply shows nothing more (and nothing less) than that the game introduced as represented by a formula cannot be seen as its meaning; that the real meaning is to be sought on a semantic level which is compositional (and the specification of which appears not to be trivial). Note, however, that to descend to this level is required only if what we are after are the meanings of individual formulas or symbols, i.e. the individualized contributions these bring to the truth of the formulas in which they occur as components – if we are interested in more holistic semantic aspects of the language, we may indeed find it preferable to operate at the non-compositional level in which the language is specified. Or consider Kamp’s DRT,17 which associates sentences and larger units of discourse with discourse representation structures (DRS’s). Some sentences (like A farmer owns a donkey) link readily with DRS’s (like, in linear notation, [x, y | farmer(x), donkey(y), own(x, y)]), and hence it might seem that these DRS’s are what the sentences mean. But for sentences containing anaphoric elements (like He beats it) there is no self-standing DRS to be associated with them: they need an input DRS (like the above one) and they use it to produce an output one (like [x, y | farmer(x), donkey(y), own(x, y), beat(x, y)]). This indicates that generally a sentence can be associated not directly with a DRS, but rather with a mapping of DRS’s on DRS’s (which in the ‘non-anaphoric’ cases amounts to trivial addition). It is only this level which can aspire to being compositional.
15 See Hintikka (1996) and Hintikka and Sandu (1997).
16 Pietarinen and Sandu (2000, p. 153).
17 See Kamp (1981) and Kamp and Reyle (1993).
But even if it is, its aspirations to being the level of meanings would be frustrated, from the Fregean viewpoint, by the fact that it does not provide for the interlinkage with truth. This might lead us to seek the true semantics of DRT, as Kamp himself does, at the level of the embeddings of DRS’s into models of reality and their classes.
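To make the two compositional levels just invoked concrete, here is a minimal sketch in standard notation (not Peregrin’s own formalism). For the predicate calculus, formulas denote classes of valuations: let V be the set of all valuations of variables over a model M; then

\[
\begin{aligned}
\llbracket R(x_1,\dots,x_n)\rrbracket &= \{v \in V \mid \langle v(x_1),\dots,v(x_n)\rangle \in R^{M}\},\\
\llbracket \neg\varphi\rrbracket &= V \setminus \llbracket\varphi\rrbracket, \qquad \llbracket\varphi\wedge\psi\rrbracket = \llbracket\varphi\rrbracket \cap \llbracket\psi\rrbracket,\\
\llbracket \exists x\,\varphi\rrbracket &= \{v \in V \mid v[x\mapsto a] \in \llbracket\varphi\rrbracket \text{ for some } a \in \mathrm{dom}(M)\}.
\end{aligned}
\]

On this level the value of a compound is a function of the values of its parts – exactly what bare truth values cannot deliver once quantification is in play. Analogously, the level of DRT that can aspire to compositionality assigns a sentence S not a DRS but a function \(\llbracket S\rrbracket : \mathrm{DRS} \to \mathrm{DRS}\), with the ‘non-anaphoric’ cases coming out as trivial additions to the input DRS.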
6 Conclusion
To summarize, I think that it is simply mistaken to assume that PC must be a kind of naturalistic generalization, which some or all empirical languages obey, but which languages we produce artificially need not obey. I think we should seriously consider the possibility that the connection between meaning (and hence language) and compositionality is not an empirical (but rather a conceptual, an analytic) issue. And we should be aware that if we take compositionality as coming ‘after meaning’, then we must have another principle to enable us to single out the level of meanings from the various levels of semantic accessories of expressions. I believe that we should follow Frege and realize that it is precisely PC which is particularly suitable for this task. If we do so, we start to see PC as a regulative principle, a principle which helps us settle controversies about what meaning is and what it is not. (Moreover, I believe that if we do this, then we should follow Frege further to the conclusion that the meaning of an expression is best seen as a kind of contribution, namely the contribution that the expression brings to the truth of sentences in which it occurs – but this is, so to say, ‘another story’.) In any case, I am convinced that the idea of creating a non-compositional language by definition may be no more plausible than that of so creating a married bachelor.
Acknowledgements
Work on this paper has been supported by grant No. 401/04/0117 of the Grant Agency of the Czech Republic.
References
Cresswell, M. J. (1982). The autonomy of semantics. In S. Peters & E. Saarinen (Eds.), Processes, beliefs and questions (pp. 69–86). Dordrecht: Reidel.
Davidson, D. (1984). Inquiries into truth and interpretation. Oxford: Clarendon Press.
Frege, G. (1884). Grundlagen der Arithmetik. Breslau: Koebner.
Frege, G. (1892). Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik, 100, 25–50.
Frege, G. (1918/9). Der Gedanke. Beiträge zur Philosophie des deutschen Idealismus, 2, 58–77.
Hintikka, J. (1996). The principles of mathematics revisited. Cambridge: Cambridge University Press.
Hintikka, J., & Sandu, G. (1997). Game-theoretical semantics. In J. van Benthem & A. ter Meulen (Eds.), Handbook of logic and language (pp. 361–410). Oxford / Cambridge (Mass.): Elsevier / MIT Press.
Hodges, W. (2001). Formal features of compositionality. Journal of Logic, Language and Information, 10, 7–28.
Janssen, T. M. V. (1983). Foundations and applications of Montague grammar (dissertation). Amsterdam: Mathematical Centre.
Janssen, T. M. V. (2001). Frege, contextuality and compositionality. Journal of Logic, Language and Information, 10, 115–136.
Kamp, H. (1981). A theory of truth and semantic representation. In J. Groenendijk, T. Janssen, & M. Stokhof (Eds.), Formal methods in the study of language (pp. 9–52). Amsterdam: Mathematical Centre.
Kamp, H., & Reyle, U. (1993). From discourse to logic. Dordrecht: Kluwer.
Kazmi, A., & Pelletier, F. (1998). Is compositionality vacuous? Linguistics and Philosophy, 21, 629–633.
Kirkham, R. (1992). Theories of truth. Cambridge (Mass.): MIT Press.
Lewis, D. (1972). General semantics. In D. Davidson & G. Harman (Eds.), Semantics of natural language (pp. 9–52). Dordrecht: Reidel.
Montague, R. (1970). Universal grammar. Theoria, 36, 373–398.
Pagin, P. (1997). Is compositionality compatible with holism? Mind and Language, 12, 11–33.
Pelletier, F. (2001). Did Frege believe Frege’s principle? Journal of Logic, Language and Information, 10, 87–114.
Peregrin, J. (1994). Interpreting formal logic. Erkenntnis, 40, 5–20.
Peregrin, J. (2000a). The “Natural” and the “Formal”. Journal of Philosophical Logic, 29, 75–101.
Peregrin, J. (2000b). “Fregean” logic and “Russellian” logic. Australasian Journal of Philosophy, 78, 557–575.
Peregrin, J. (2001). Meaning and structure. Aldershot: Ashgate.
Pietarinen, A., & Sandu, G. (2000). Games in philosophical logic. Nordic Journal of Philosophical Logic, 4, 143–173.
Rott, H. (2000). Words in contexts: Fregean elucidations. Linguistics and Philosophy, 23, 621–641.
Szabó, Z. G. (2000). Compositionality as supervenience. Linguistics and Philosophy, 23, 475–505.
Westerståhl, D. (1998). On mathematical proofs of the vacuity of compositionality. Linguistics and Philosophy, 21, 635–643.
Zadrozny, W. (1994). From compositional to systematic semantics. Linguistics and Philosophy, 17, 329–342.
The Compositionality of Concepts and Peirce’s Pragmatic Logic
Ahti-Veikko Pietarinen
1 Introduction
This paper discusses the contribution of the American philosopher and scientist Charles S. Peirce (1839–1914) to some questions concerning the compositionality of concepts and the role of context in logic. Peirce developed the iconic, diagrammatic logic of Existential Graphs (EG) and provided for it a semantics which, in 1905, earned the ahead-of-its-time but little-known title of the Endoporeutic Principle (EP) (Pietarinen, 2004a).1 This highlights the importance of context construction and context updating during interpretation. In conjunction with EGs, Peirce also emphasised strategic communication, mutual knowledge, and the presence of the common ground (Pietarinen, 2005b). His one-time student John Dewey (1859–1952) stated that the neglect of context is “the greatest single disaster which philosophic thinking can incur”. It is the notion that is “inescapably present” without which we cannot “grasp the meaning of what is said in our language” (Dewey, 1931). In addition to the role of context in logic, my purpose is to shed light on some of the ways in which EGs and the EP instantiate aspects of compositionality as well as non-compositionality, and to put these aspects into a historical perspective. On the logical side, Peirce’s pragmatist theory addresses precisely the Dewey contention. That is to say, there is no sentence, expression or depiction that has a meaning independent of the environment, context or circumstance within which it is uttered and within which it gets interpreted. This is not, of course, an objection to the compositionality of meaning per se (apart perhaps from some very
Address for correspondence: Department of Philosophy, University of Helsinki, P.O. Box 9, FIN-00014 Helsinki, Finland. E-mail: [email protected].
1 Not all iconicity is diagrammatic: pictures and metaphors are also iconic representations, the logic of which is still being sought.
strong notions of compositionality in which the meaning of an expression would be exhaustively defined by its parts), since context may itself be provided as a set of utterances, or perhaps some representational schema, script, frame, or what have you, to which a linguistic or logical description can be attached. In other words, one may encode any contextual information into the model-theoretic consequence relation.2 Compositionality has often been thought to be a prerequisite for learnability, systematicity, or even communicativity and comprehension of language (Pagin, 2003). For if proper subexpressions or subformulas are context dependent, it is said, they may not have a self-supporting meaning. It is for this reason, advocates of the compositional approach claim, that these expressions cannot be considered to be natural constituents of some larger unit, typically a formula or a sentence, the meaning of which ought to be morphically imaged on those constituents. It is, however, by no means clear that such a claim for compositionality is valid from the historical point of view. I seek to substantiate this claim by investigating Peirce’s pragmatic conception of logic deriving from the late 19th century.
2 Interpreting Existential Graphs
In Peirce’s diagrammatic logic, graphs are used to assert things about individuals. The main components of such a system involve (1) spots, which are bounded regions of a surface, the sheet of assertion (the Phemic Sheet) upon which graph-instances are scribed, (2) closed, continuous, simple curves or cuts around a graph-instance representing negation of that instance, and (3) continuous, thick lines connected at the hooks on the peripheries of spots, called lines of identity, representing existence, identity and predication. Spots represent unsaturated predicate terms. Cuts are recursively nested. Identity lines are graph-instances, but their compositions, ligatures, are not.3 The outermost free
2 Encoding may be done directly into the sub-expressions of language as in context-carrying in logic programming or in dynamic theories of meaning, or else into extensions of assignment as in compositional semantics for ‘Independence-Friendly’ (IF) first-order logic, intended to provide proper semantic attributes via non-empty sets of sequences of assignments instead of merely sequences of them (Hintikka, 1996; Hodges, 1997; Janssen, 1997, 2002; Sandu & Hintikka, 2001).
3 To be precise, Peirce held ligatures to be “composites” of several graph-instances (MS 669, 1911, Assurance Through Reasoning, available at http://www.helsinki.fi/~pietarin/ with another unpublished manuscript with the same title, MS 670). The reference is to the Peirce Manuscripts (Peirce, 1967) by manuscript number, and, if applicable, followed by page number. If known, I also give the title and the year of the manuscript when first referred to.
Figure 1: BETA graph with two cuts, two ligatures and two spots.
extremity of an identity line or ligature determines whether the quantification it represents is universal (the outermost free extremity lies within an odd number of cuts) or existential (the outermost free extremity lies within an even number of cuts). We must skip over much of the details concerning EGs and refer to (Pietarinen, 2004a, 2004c, 2005a, 2005b; Roberts, 1973; Zeman, 1964). To see an example, Figure 1 depicts an EG that pertains to the BETA part of the system, in which the ligatures correspond to the existential (∃y, denoted by the innermost extremity) and universal (∀x, denoted by the outermost extremity) quantifiers of predicate logic. There is also a one-place spot S1 and a two-place spot S2.4 This graph may be correlated with the symbolic sentence ∀x∃y(S1(x) → S2(x, y)) of first-order logic. However, when doing things in this iconic and diagrammatic fashion, some characteristics are brought to the fore in iconic representations but not in symbolic expressions. Among them is continuity (see section 5). Graph-instances are interpreted according to the Endoporeutic Principle (EP): the outermost cuts (contexts) are evaluated on the model M before proceeding to the inner, contextually-constrained cuts. The evaluation of lines involves an instantiation of a value to the outermost end of an identity line, and this value then propagates continuously along the identity line towards the interiors of the inner cuts and to the spots to which the lines are attached. Peirce termed the process of instantiation, together with the type of the identity line, the selective (Figure 2). The EP thus provides the basic fashion in which contexts are created and updated. As Figure 2 testifies, the evaluation takes place between two parties “in our make believe”,5 the GRAPHIST (alias the UTTERER) who scribes the graphs and proposes modifications to them, and the GRAPHEUS (alias the INTERPRETER) who authorises the modifications. We may think of them as players much in the sense of game-theoretic semantics (Hilpinen, 1982; Hintikka, 1973; Pietarinen, 2003).
4 If identity lines and hooks are omitted, one gets the ALPHA part of EGs, the theory of which is isomorphic to the theory of propositional logic.
5 MS 280, c.1905, The Basis of Pragmaticism.
Figure 2: BETA graph, first two steps of the evaluation.
The graphs scribed by the GRAPHIST are true, because “the truth of the true consists in his being satisfied with it” (MS 280: 29).6 The GRAPHIST is therefore the verifier of the graphs. To end with a true atomic graph amounts to a win for the GRAPHIST, and to end with a false one amounts to a win for the GRAPHEUS. The truth of the whole graph agrees with the notion of the existence of a winning strategy for the GRAPHIST, in Peirce’s terms a habit that is of “a tolerable stable nature” (MS 280: 30). Likewise, falsity is the existence of such habits for the GRAPHEUS. Peirce assumes further that these players have common knowledge concerning the universe of discourse and thus a reasonable common ground well understood between the two of them, without which the discoursing that proceeds according to the presumption of collaborative enterprise would not be possible (Pietarinen, 2004e). Summarising, a semantic game is played between the GRAPHIST and the GRAPHEUS on a given graph-instance Gi and a model M, according to the following conventions:
1. Juxtaposition of graph-instances on positively (negatively) enclosed areas: the GRAPHEUS (resp., the GRAPHIST) chooses between two graphs. The evaluation proceeds with respect to that choice. The winning conventions change with respect to choices on negative areas.
2. Polarity of the area of the outermost extremity of a ligature determines whether the GRAPHIST (if positive area) or the GRAPHEUS (if negative area) is to make a choice from dom(M) to be the interpretation of the identity line. Evaluation proceeds with respect to that choice and with the graph-instance that has the ligature removed up to its connexions with spots. What
6 Note the resonance “being satisfied with” has with Tarski’s somewhat less concrete idea of satisfaction. Also Löwenheim used a similar term of being “satisfied” [“erfüllt”] in his early model-theoretic explorations well before Tarski (Badesa, 2004: 139).
will remain at the hooks of spots are zero-dimensional dots.7
3. When a spot is reached, its truth-value determines the winner: a spot that is satisfied in an interpretation means that the GRAPHIST wins a particular play on the graph, and a spot that is not satisfied in an interpretation means that the GRAPHEUS wins a particular play.
4. The existence of a winning strategy (i.e., a stable habit that leads to a win in all plays) for the GRAPHIST agrees with Gi being true in M. The existence of a winning strategy (i.e., a stable habit that leads to a win in all plays) for the GRAPHEUS agrees with Gi being false in M.
As regards the EG in Figure 2, the evaluation commences with the selections of two individuals for the ligatures, the first by the GRAPHEUS and the second by the GRAPHIST. After that, the GRAPHEUS chooses between the spots S1 and S2 with a cut around it. If he chooses S1, and S1 is satisfied in M, then the GRAPHEUS wins. If the GRAPHIST chooses S1, and S1 is not satisfied in M, then she wins. The whole EG is true precisely in the case there exists a stable habit for the GRAPHIST, and false precisely in the case there exists a stable habit for the GRAPHEUS.
What are Peirce’s views on the matters related to the compositionality of such graphs? In speaking about his EGs circa 1905, he made the following note in an unpublished draft:
The meaning of any graph-instance is the meaning of the composite of all the propositions which that graph-instance would under all circumstances [= in all contexts] empower the interpreter to scribe. (MS 280: assorted pages 35)
A similar statement exists among other draft pages of the same manuscript:
The meaning of any graph-instance is the meaning of the sum total or aggregate of all the propositions which that graph-instance enables the interpreter to scribe, over and above what he would have been able to scribe. (MS 280: assorted pages 35)
What the INTERPRETER (i.e., the GRAPHEUS) is empowered to scribe are thus, on the one hand, experimental and evidenced facts derived from experiments upon these diagrams, and on the other, inferential propositions that follow from graphs by the rules of transformation. According to Peirce, meaning therefore involves both inductive and deductive elements of reasoning. Peirce’s renowned Pragmatic Maxim (PM) says that the meaning of a concept is the sum total of its implications for possible observations and possible actions:
7 Uninstantiated dots attached to hooks may be regarded as free variables of the extension of the system of EGs whose graph-instances would correspond to formulas.
The rule for attaining the third grade of clearness of apprehension is as follows: Consider what effects, that might conceivably have practical bearings, we conceive the object of our conception to have. Then, our conception of these effects is the whole of our conception of the object. (5.402, 1878, How to Make our Ideas Clear).8
This formulation of the PM first appeared in the January 1878 issue of Popular Science Monthly. Several versions of it exist in Peirce’s large corpus. A very succinct and unambiguous one says that “the maxim of logic [is] that the meaning of a word lies in the use that is to be made of it” (CN 2.184, 2 February 1899, Matter, Energy, Force and Work).9 Given the PM, according to which the meaning of an assertion is the sum totality of all its actual and possible practical consequences under a given interpretation, a pragmatic principle of compositionality would be thus:
The Pragmatic Principle of Compositionality. The meaning of an assertion A is the meaning of all assertions that follow from A either by inductive or deductive principles and permissions under all authorised circumstances (i.e., those arising out of mutual consent by the GRAPHIST and the GRAPHEUS).
Here we have an outward-looking, indefinitely-progressing principle for meaning. Moreover, in the light of such a pragmatic approach to logic, the Context Principle would be thus:10
The Pragmatic Principle of Context. A proposition has no meaning in isolation from its consequences. Accordingly, a proposition that has no consequences is meaningless.
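Before turning to critics of this interpretation, it may help to see the four game conventions above in action. The following sketch is only an illustrative reconstruction, not Peirce’s notation: a nested tuple stands in for a graph-instance, each negation stands in for a cut, and exhaustive search over a finite domain stands in for the existence of a “stable habit” (a winning strategy).

# Endoporeutic evaluation, reconstructed game-theoretically (illustrative only).
# A formula stands in for a graph-instance; each negation stands in for a cut.
# The Graphist (verifier) wins iff she has a winning strategy, i.e. the graph
# is true in the model M; the role swap at each cut implements the change of
# winning conventions on negative areas.

def graphist_wins(formula, model, env=None):
    """True iff the verifier (Graphist) has a winning strategy."""
    env = env or {}
    op = formula[0]
    if op == 'atom':                      # a spot: the play ends here
        _, pred, args = formula
        return tuple(env[x] for x in args) in model[pred]
    if op == 'not':                       # a cut: the players swap roles
        return not graphist_wins(formula[1], model, env)
    if op == 'and':                       # juxtaposition: the falsifier picks a juxtaposed graph
        return all(graphist_wins(f, model, env) for f in formula[1:])
    if op == 'exists':                    # evenly enclosed ligature: the verifier selects an individual
        _, var, body = formula
        return any(graphist_wins(body, model, {**env, var: a})
                   for a in model['domain'])
    raise ValueError(op)

# The graph of Figure 1, read as: for all x there is y (S1(x) -> S2(x, y)),
# rendered with cuts (negations) and juxtaposition only.
g = ('not', ('exists', 'x',
        ('and', ('atom', 'S1', ('x',)),
                ('not', ('exists', 'y', ('atom', 'S2', ('x', 'y')))))))

M = {'domain': {1, 2}, 'S1': {(1,)}, 'S2': {(1, 1), (1, 2)}}
print(graphist_wins(g, M))  # True: the Graphist has a stable habit here

On a finite model the Graphist’s stable habit can thus be checked by brute force, and the outside-in order of the recursion is precisely the endoporeutic passage from outer cuts to inner, contextually-constrained ones.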
3 The Endoporeutic Principle and Compositionality
In her exploration of the ALPHA and BETA parts of Peirce’s EGs, Shin (2002) argues that one should dispense with the Endoporeutic Principle (EP). The claim Shin puts forward, namely that Peirce’s obsession with the EP has foiled any comprehensive understanding of his system and its establishment in a wider 8
8 The reference is to Peirce (1931–58) by volume and paragraph number.
9 The reference is to Peirce (1975–87) by volume and paragraph number.
10 According to the usual formulation of the Context Principle, a word meaning cannot exist unless there is a sentence in which words are embedded. Such views are found in Frege’s (1884) Grundlagen der Arithmetik: “The meaning [Bedeutung] of a word must be asked for in the context of a proposition [Satzzusammenhang], not in separation” (cf. Beaney, 1997: 90).
perspective, is nevertheless mistaken. Her claim turns out to be a plea for compositionality, the notion of which we should not anachronistically attribute to Peirce. Shin does not refer to compositionality, but she laments that no real challenge has been voiced against the EP for reading the graphs. According to her, the principle does not “reflect visually clear facts in the system”, and in fact “forces us to read a graph in only one way” (Shin, 2002: 63). What are these allegedly visually clear facts? Shin refers to the impossibility of reading graphs so as to determine which are oddly enclosed and which are evenly enclosed by cuts. While the endoporeutic reading may deliver correct truth conditions for graphs, she thinks that the substitution of corresponding implications for nested cuts is often forced in an unnecessary way. Also, there is no mention in such a reading of a disjunction, namely the juxtapositions of sub-graphs encircled within an even number of cuts. Likewise, Shin claims, no vocal difference obtains between existential and universal quantification in the endoporeutic, outside-in reading. To rectify these defects, Shin strives to reproduce the ALPHA and BETA parts in their respective equivalent negation-normal forms, namely those in which cuts are pushed in as deeply as possible. She terms the preferred reading a “direct reading” or a “multiple reading”, whereas the endoporeutic reading is “indirect”. Stenning (2002) has employed similarly vague terminology in his discussion of the iconic nature of diagrammatic methods. Symptomatically, there is no mention of the dialogical and communicational interpretation of graphs in either Shin’s or Stenning’s works. I strongly doubt that a comprehensive overview of EGs can be obtained without such an interpretation. For instance, the difference between existential and universal quantification is precisely that which determines which of the parties, the GRAPHIST or the GRAPHEUS, is to select an instance from the universe of discourse as intended by the proposition that the graph depicts. The switching between them is then facilitated by the system of cuts. Likewise, the distinction between reading the graphs as representing conjunctions and reading them as representing disjunctions amounts to this role-playing view of game-theoretic semantics. Shin’s remarks turn out to be a reflection of the wider and more far-reaching issues implicit in her discussion than ones merely related to the direction of reading the graphs. Although she does not spell it out explicitly, the absence of the two interlocutors in Shin’s discussion is characteristic of a more general presupposition, namely that of compositionality. It is this presupposition that lurks behind her intention to read iconic features of graphs in a way that she claims is not brought out by endoporeutic reading. The communicative and dialogical interpretation of EGs nevertheless reveals that such preoccupations are ill conceived. Given the EP, there would be nothing missing in the way EGs
are read, since the distinction between different quantifiers and different logical connectives is exposed by the system of choices performed by the utterers and interpreters associated with the graph whose meaning is to be disclosed. Another way of putting across a similar point is to note the equivocation in Shin’s use of the term “endoporeutic” when compared to the way in which Peirce uses it. Shin speaks about “endoporeutic reading” of the graphs, while for Peirce, the EP was not a matter of reading the graphs, but a necessary consequence of the diagrammatisation that readily dictates how the graphs are interpreted. The EP provides a method for expressing the truth of the graphs, and only perhaps as a consequence, a method for reading them. Full understanding necessitates the two interlocutors, who choose and assign semantic values to each component in a definite order respecting the passage from outer instances to the inner, contextually-constrained instances. This is spelled out in an unambiguous manner in the game-theoretic interpretation, and is evoked in Peirce’s notion of “nests” of graphs (4.472, c.1903; 4.494, c.1903; 4.617, 1908).11 Peirce noted the importance of the contextual and model-theoretic aspects encountered by those who interpret EGs:
The rule that the interpretation of a graph must be endoporeutic, that is, that the graph of the place of a cut must be understood to be the subject or condition of the graph of its area, is clearly a necessary consequence of the fundamental idea that the Phemic Sheet itself represents the Universe, or primal subject of all the discourse. (MS 500: 48, 1911, A Diagrammatic Syntax).
The usefulness of this interpretation is beyond dispute in the interpretation and understanding of anaphoric expressions (and not only nominal anaphora but also those involving temporal co-references). In comprehending discourse, what other way is there to bring the values of anaphoric pronouns into being than by looking at what has happened in previous rounds of interpretation, interspersed with the contextual and environmental input that the Phemic Sheet, namely all that is well understood between the utterer and the interpreter, is supposed to provide? According to Peirce, “Endoporeusis, or inward-going” (MS 650: 18), is like a global clock that synchronises interpretation and arranges it in an outside-in fashion. Metaphorically, it is like:
11 “The interpretation of existential graphs is endoporeutic, that is proceeds inwardly; so that a nest sucks the meaning from without inwards unto its centre, as a sponge absorbs water” (MS 650: 18, 1910, Diversions of Definitions). The “nest” is here a technical term meaning the sequence of cuts from those that have the largest number of enclosures to those that have the fewest.
a march to a band of music, where every other step only is regulated by the arsis or beat of the music, while the alternate steps go on of themselves. For it is only the iteration into an evenly-enclosed area that depends upon the outer occurrence of the iterated graph, the iteration into an oddly enclosed area being justified by your right to insert whatever graph you please into such an area, without being strengthened or confirmed in the least by the previous occurrence of the graph on an evenly-enclosed area. So the analogy to a march is pretty close. (MS 650: 18–19).
The question for Peirce was therefore not how to read off what the graphs are intended to convey. There are numerous logically equivalent ways of doing so. The question was: What is the most appropriate “system for expressing truth endoporeutically” (MS 650: 19)? This is what the pragmatist has in view, namely a definite purpose in investigating logical questions, since “he wishes to ascertain the general conditions of truth” (2.379, 1901, Negation). Peirce predicted that “if anybody were to find fault with the system . . . I should be disposed to admit that it is a poetical fault” (MS 650: 19). In pure aesthetic terms, there may well be alternatives, but such alternatives ought to achieve at least as simple a set of rules as endoporeusis does, and in addition predict the same merits for its essential features. Alternatives to the EP have not fulfilled these predictions.
Shin’s proposal to reduce the ALPHA graphs to the negation-normal form involves a recursive definition that starts from atomic graphs outwards. Proof-theoretically, replacing Peirce’s five transformation rules of erasure, insertion, iteration, deiteration and double cut, Shin’s reformulated inference rules come in seven parts, four of which are partial decompositions of the rules of iteration and deiteration into the rules that have been used in natural deduction systems. In the case of the BETA part, there are five rules for their transformation, which Shin rewrites into nine rules divided into twenty-six sub-cases. In general, then, compositional methods of interpretation are far more complex in that they are liable to have a considerably larger number of transformation rules than those that Peirce proposed. Furthermore, extensions of EGs, such as adding more dimensions to the sheet of assertion (Pietarinen, 2004c), are bound to increase the complexity of any compositional alternative to the EP to an unnatural extent.
In stark contrast to the animadversions professed in Shin’s account of EGs, Peirce did not possess comparable predispositions concerning the preferred interpretation of natural language. For him, the reason was historical rather than systematic. His sin was not that he overlooked the distinction between what is compositional and what is non-compositional in logic, since no such distinction was forthcoming in EGs. Nor was it to remain preoccupied with only one, the
endoporeutic, semantics of his graphs, since this was the only context-abiding game of interpretation in town. His sin – if indeed it was such – was simply to fix his ideas by presenting a multiplicity of diagrammatic systems that far exceeded what was comprehensible (and printable) at that time.
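For comparison, the symbolic analogue of “pushing cuts in as deeply as possible” is the familiar rewriting of a formula into negation normal form. The sketch below is ordinary propositional rewriting, not Shin’s own rule set for graphs; it merely shows the kind of recursive, inside-out definition that a direct reading presupposes.

# Negation normal form: push negations ('cuts') inward until they rest on atoms.
# Offered only as the symbolic analogue of Shin's reduction, not her rules.

def nnf(f):
    op = f[0]
    if op == 'atom':
        return f
    if op in ('and', 'or'):
        return (op, nnf(f[1]), nnf(f[2]))
    # op == 'not'
    g = f[1]
    if g[0] == 'atom':
        return f                                  # negation rests on an atom
    if g[0] == 'not':
        return nnf(g[1])                          # double cut removed
    if g[0] == 'and':                             # De Morgan: cut over juxtaposition
        return ('or', nnf(('not', g[1])), nnf(('not', g[2])))
    if g[0] == 'or':
        return ('and', nnf(('not', g[1])), nnf(('not', g[2])))

# not (p and not q)  ==>  (not p) or q
print(nnf(('not', ('and', ('atom', 'p'), ('not', ('atom', 'q'))))))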
4 The Composition of Concepts
In discussing context and compositionality, I alluded to the contributions by diagrammatic, iconic approaches to logic. These approaches are rich both in content and historical significance. What is achieved via representations such as EGs has not been brought to bear on compositionality before. The acute nature of such a discussion is shown by the well-argued similarities between, on the one hand, the thought in its relation to the mind, and on the other, iconic representations as ‘snapshots’ or images of the essence of the mind in action. I will turn to this topic next. In their argumentation concerning the status of compositionality, Fodor & Lepore (2002) started with the premise that the content of a sentence is the thought or proposition it is used to express. I will leave it to others to decide whether substantial issues exist which prohibit equating content with meaning. In the light of the PM, this premise is nevertheless clearly inept. Thoughts, let alone propositions or propositional contents, have little to do with the characterization of meaning, which ought to be intersubjectively and cognitively testable, verifiable, or observable, or, at the very least, potentially or subjunctively so. What is more, Peirce’s logic and his EGs exhibit some direct links with cognitive issues. I will adduce evidence for the view that Peirce did recognise the significance of how concepts are composed and the relationship between the logic of EGs and the prospects of achieving a compositional mapping between a representational system and the mind. Peirce demonstrates the reality of such links by stating that, just as thoughts are determinations of the mind, graph-instances are determinations of the blank sheet of assertion.12 The mind is a composition of thoughts in a manner analogous to that in which graphs are compositions of their graph-instances. The mind is a comprehensive thought in a manner analogous to that in which the sheet of assertion, with all its permissible transformations and actual transformation-instances, is a well-formed graph. The second main interconnecting feature is that graph transformations and identities are continuous, and so is the thought that a graph, together with all its practical consequences, represents. I return to this feature in the next section.
12 MS 490, 1906, Introduction to Existential Graphs and an Improvement on the Gamma Graphs.
Unfortunately, some crucial passages, including parts of manuscripts 498 and 499 in which Peirce promises to tackle the nature of the composition of concepts and the nature of the proposition through EGs, have not been recovered among his vast corpus of unpublished manuscripts.13 It is likely that these drafts were left incomplete and Peirce never brought the promised issues to a conclusion. In MS 498 he notes that:
The solid core of all this discussion is the question of how concepts can be compounded. Suppose two concepts, A and B, to be combined. What unites them? There must be some cement; and this must itself be a concept C. So then, the compound concept is no AB but ACB. Hereupon, obviously arises the question how C is combined with A or with B. The difficulty is obvious, and one might well be tempted to suspect that compound concepts were impossible, if we had not the wish manifest evidence of their existence. Here, then, are the two puzzles of logic upon which I am going [to] try what light can be shed by the system of Existential Graphs. They are the puzzle of the relation of signs to minds, and of their communication from one mind to another, and the puzzle of the composition of concepts and the nature of the judgment, or, as we of the antipsychological school say, of the proposition.
The rest of the manuscript provides a preparatory discussion of the system of EGs. In the slightly later MS 499, he also starts in a promising way:
Another great puzzle of logic is that of the composition of concepts, or thoughts. It is evident that a thought may be complex. . . . there are thoughts compounded of thoughts. But let A and B be two simple thoughts which can be compounded. But how are they compounded? They are compounded in thought. Very well, then, the composition must then be a third thought, which we may denote by C; so that the compound is not AB but is ACB. Then the question arises, how are A and C compounded, as they certainly are in ACB; and how are C and B compounded? This puzzle has been formulated in all its generality by some great logicians.
There is then a lengthy intermission concerning the nature of assertions and judgements. A couple of pages further on, Peirce characteristically admits that:
13 These drafts were prepared as preliminary addresses to one of the 1906 meetings of the National Academy of Sciences. They are entitled On Existential Graphs as an Instrument of Logical Research (MS 498) and On the System of Existential Graphs Considered as an Instrument for the Investigation of Logic (MS 499). MS 490 is the version Peirce ended up presenting at the April 1906 meeting in Washington DC.
It is still unsettled and is the most prominent perhaps among unsettled questions in the logical literature of recent years. I wish in this communication to exhibit to you the unexpected solution not merely of this problem but of the more general problem of the composition of concepts to which the systems of existential graphs leads. I do not propose today to enter into the demonstration of the truth of the solutions suggested by existential graphs, of the two problems in regard to the relation of signs to the mind and of the composition of concepts. The reasons to exclude this part of the discussion are, first, that it would render my paper tedious since the proof presents no very striking idea, or other great novelty, and second, that if I should go into the tedious development it is unlikely that any one member would carry it away from an oral statement without dropping out some point which is essential to its cogency; and at any rate it would be unintelligible to the great majority. In case the paper should be printed, I will append the proof for the benefit of those who may desire to examine it. I have now sufficiently indicated two great logical puzzles. I do not call them problems because it is the nature of logical difficulties that until they are solved we cannot distinctly state what the problem is. Even after they are solved, it is often no easy matter to say what the problem was. The puzzle of the measure of force is one of [many] instances of this that I might adduce. These two puzzles relate to the mode of composition of concepts in general and to the nature of the proposition in particular, and to the relation of concepts and signs to the mind.
Following this, Peirce goes on to discuss some general issues concerning the system of EGs. Luckily, however, when addressing related issues in 4.572, Peirce briefly notes how EGs are intended to solve the puzzle concerning the composition of concepts. The problem is that:
the ways in which Terms and Arguments can be compounded cannot differ greatly from the ways in which Propositions can be compounded. A mystery, or paradox, has always overhung the question of the Composition of Concepts. Namely, if two concepts, A and B, are to be compounded, their composition would seem to be necessarily a third ingredient, Concept C, and the same difficulty will arise as to the Composition of A and C.14
The solution to this is, in Peirce’s own words, the following:
14 From Prolegomena to an Apology for Pragmaticism, published in The Monist in 1906, submitted in April that year.
As far as propositions go, and it must evidently be the same with Terms and Arguments, there is but one general way in which their Composition can possibly take place; namely, each component must be indeterminate in some respect or another; and in their composition each determines the other. Speaking extensionally (viz., ‘on the recto’ of the EGs), this is effectuated such that the indefinite sentence “Some man is rich” is composed of “Something is a man” and “something is rich,” and the two somethings merely explain each other’s vagueness in a measure. Two simultaneous independent assertions are still connected in the same manner; for each is in itself vague as to the Universe or the “Province” in which its truth lies, and the two somewhat define each other in this respect.15 The interconnectedness of signs on the sheet is thus but one manifestation of Peirce’s synechist doctrine of continuous connections and mutual attraction between signs (representamens) that the mind composes in order to form wholes, thus becoming capable of comprehending and interpreting. As far as the conditional is concerned, a similar interplay between the concepts of the antecedent and the consequent is manifest:
The composition of a Conditional Proposition is to be explained in the same way. The Antecedent is a Sign which is Indefinite as to its Interpretant; the Consequent is a Sign which is Indefinite as to its Object. They supply each the other’s lack. (4.572)
In a similar vein, in parts of manuscript 490 that were omitted from its publication in 4.582, Peirce notes that graphs are compounded and united by combinations of “being the relate and the correlate of a relation”. This natural union of the relate and the correlate as a special determination of the sheet of assertion, the union signifying existential relations, verb actions and identities alike, was the diagrammatic counterpart of the special determination of the mind by an idea.16
15 See also 4.561fn, which is from MS 300: 33–39, the intended follow-up to the Prolegomena series entitled The Bed-Rock Beneath Pragmaticism, written in 1905. Cf. 499 (supplement): “Existential Graphs shew beyond all doubt to the discerning mind that the Composition of Concepts can only take place by the reciprocal precisions of indefiniteness”.
16 Furthermore, it was this union of the relate and the correlate that gave rise to the difficulties of any straightforward composition with respect to modal and actual universes and to the puzzles concerning possible individuals, subsequently part of ‘cross-world identification’ of quantified modal logics (Pietarinen, 2005a).
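Peirce’s solution can be glossed symbolically (a gloss only, since his point is made on the recto of the graphs): the two indeterminate components compose by the identification of their indeterminate slots, not by a third cementing concept, as in

\[
\exists x\,\mathrm{Man}(x) \quad\text{and}\quad \exists y\,\mathrm{Rich}(y)
\;\;\longmapsto\;\;
\exists x\,(\mathrm{Man}(x) \wedge \mathrm{Rich}(x)),
\]

where the merging of the two “somethings” – diagrammatically, the joining of two ligatures – is what effects the mutual determination, so that no further concept C is called for.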
5 Continuity
To provide an example of the application of Peirce’s synechist outlook on logic, the following remark can be made. It has been one of the grand mysteries in his later logical writings as to what he may have meant in speaking about continuous predicates in relation to his EGs. For instance, at the age of 69, he wrote to Victoria Welby the following:
A predicate which can thus be analyzed into parts all homogeneous with the whole I call a continuous predicate. It is very important in logical analysis, because a continuous predicate obviously cannot be a compound except of continuous predicates, and thus when we have carried analysis so far as to leave only a continuous predicate, we have carried it to its ultimate elements. I wont lengthen this letter by easily furnished examples of the great utility of this rule. (SS: 72, 1908, Letter to Lady Welby).17
As far as I have been able to see, no explanation of what Peirce could have meant by continuous predicates is forthcoming elsewhere in his writings. However, an enlightening possibility here is to think of predicates p as subsets A of X, via the mappings from a space X to the set {0, 1}, where X is an arbitrary topological space. Continuity of the predicate would now depend on the topology defined for {0, 1}. Moreover, continuous functions to a two-element lattice now correspond to open sets of X. Hence, a relationship has been drawn between predicates and open sets. This is, in fact and in brief, the formal way to derive continuous predicates. Now, is this something that Peirce could have envisioned? The answer is positive, because as Peirce stated in the above quotation, a continuous predicate is also “a predicate which can thus be analyzed into parts all homogeneous with the whole”. This is but an alternative way of saying that a continuous predicate can be characterised as the set of all predicates consistent with it. Moreover, the iconic, or diagrammatic, method of representing assertions and reasoning about them is, as Peirce noted, best conducted by establishing logical principles upon topological, continuous principles. The state of these principles was quite elementary at that time, but also quite rich in the sense of not being preoccupied with the later developments concerning set theory and the nature of the continuum during the early phases of the symbolic era in logic (Pietarinen, 2005a). The explication of continuous predicates may seem somewhat cursory or overly specialised for the main purposes here, but I believe that it illustrates
17 The reference is to Peirce (1977) by page number.
a more general and serious concern raised by this early contribution to cognitive science and logic than the technicalities indicate. Namely, the general point is that semantics can and should be based on principles that have a continuous core. This may happen either in the semantic attributes or in describing the parts of the representations (say, in iconic, diagrammatic or metaphoric mental images) as being continuously connected (‘path-connected’) with one another. To substantiate the associations, we may allude to works such as Wolfgang Köhler’s Field Semantics for the theory of perception dating back to the 1920s, namely that “neural functions and processes with which the perceptual facts are associated . . . are located in a continuous medium” (Köhler, 1940: 55). Notable is also Peirce’s remark in his statement quoted above that spots, propositions and inferences are to be continuously connected with one another. Topologically, this expresses connectivity between different parts of the surface, thus forming propositions, and in terms of propositions being connected, in the equally topological sense, under continuous transformations from one graph to another, thus forming inferential arguments. By itself, the conceptualisation of a continuous predicate and more generally the continuous connections between separate parts of the representations already bore the traces of the later Gestalt or perceptive schemes rather than anticipating anything like the Tarski-type logical semantics which have an inherently compositional and recursive nucleus. However, Peirce did not make the connection with the later truth-conditional and semantic issues explicit, as he was first and foremost striving to account for what would count as necessary reasoning in terms of the dynamic understanding of ‘motion’, for instance through continuous morphing from one diagram onto another according to given permissions (Pietarinen, 2005b). Hence, these wider concerns were easily sidestepped by the advocates of discrete-symbolic logic who started to gain ground shortly after the forgetfulness of these early contributions commenced.
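The topological reading sketched above can be stated in one line, on the standard (and, to be clear, post-Peircean) convention of giving {0, 1} the Sierpiński topology, in which {1} is open but {0} is not:

\[
\chi_A : X \to \{0,1\} \text{ is continuous} \iff A = \chi_A^{-1}(\{1\}) \text{ is open in } X,
\]

so that predicates, taken as characteristic functions of subsets A of X, correspond exactly to the open sets of the space X.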
6 Dynamics and Discourse
The aforementioned perspectives on Peirce’s pragmatic logic also highlight what is misconceived in recent dynamic theories of meaning. They have attempted to characterise central semantic notions in terms of context-change potential (van Benthem, Muskens & Visser, 1997). What these formalisms have been seeking, though, is not meaning in the pragmatic sense of a list of practical consequences, but a transfer between varying ways of assigning interpretations to structures within which expressions are evaluated. It is effected by input-output pairs of assignments to a formula.18
18 By doing this, dynamic semantics has come to characterise a limited form of relevance for expressions in terms of context change, but certainly not in terms of their meaning.
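For reference, the input-output character just mentioned is explicit in the clauses of dynamic predicate logic (Groenendijk & Stokhof, 1991), where a formula denotes a relation between an input assignment g and an output assignment h:

\[
\begin{aligned}
\llbracket R(x_1,\dots,x_n)\rrbracket &= \{\langle g,h\rangle \mid h = g \text{ and } \langle h(x_1),\dots,h(x_n)\rangle \in R^{M}\},\\
\llbracket \varphi\wedge\psi\rrbracket &= \{\langle g,h\rangle \mid \text{for some } k,\ \langle g,k\rangle \in \llbracket\varphi\rrbracket \text{ and } \langle k,h\rangle \in \llbracket\psi\rrbracket\},\\
\llbracket \exists x\,\varphi\rrbracket &= \{\langle g,h\rangle \mid \text{for some } a \in \mathrm{dom}(M),\ \langle g[x\mapsto a],\,h\rangle \in \llbracket\varphi\rrbracket\}.
\end{aligned}
\]

Because conjunction is relational composition, the assignments produced by ∃x S1(x) are passed on to S2(x) in ∃x S1(x) ∧ S2(x); this is also why dynamic conjunction, unlike juxtaposition on the sheet of assertion, fails to be commutative.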
Note how smoothly this links with the agenda laid out in the introduction (note 2), in which I implicitly called for proper motivation for providing the semantic entities in some of the recent formulations of compositional semantics. Like dynamic semantics, these compositional semantics have accomplished little more than an extension of the concept of assignment. For instance, in building compositional semantics for logics of imperfect information (a.k.a. IF logic) the extension is from sequences of assignments to sets of sequences of assignments. In dynamic semantics, the extension is from sequences of assignments to pairs of sequences of assignments. Neither case explains what the relevance of the excessive semantic material smuggled in by the extensions of assignments amounts to in the course of an interpretation. Surely not all meaning lies in the assignments.19 The centrality of context is cogently emphasised in comparing the EP of EGs with that of dynamic semantics. Given the iconic and topological nature of the method by which graphs are scribed on the assertion sheets, the logic is readily and inherently dynamic. For consider the EGs with two spots and an identity line that has one free extremity in Figure 3. These three graphs are equivalent. But they do not make the distinction that symbolic sentences make with two different accounts of the scope of the existential quantifier, namely ∃x(S1(x) ∧ S2(x)) and ∃x S1(x) ∧ S2(x).
Figure 3: Three equivalent existential BETA graphs.
Accordingly, the logic of EGs is dynamic in that it does not differentiate between, on the one hand, S2(x) within the syntactic scope of the initial existential quantifier but not within its semantic scope, as in ∃x S1(x) ∧ S2(x), and, on the other hand, S2(x) being within both the syntactic and semantic scopes of the quantifier, as in ∃x(S1(x) ∧ S2(x)).
19 What the relevance-theoretic framework of Sperber & Wilson (1995) attempts is to accept only those semantic entities that are context-effective in the cognitive, strategic and collateral senses, in other words items that have some tangible, realistic and pragmatic role to play in future cycles of interpretation.
As soon as the system of lines of identity, the ligature, extends to both of the spots juxtaposed on the sheet and branches to attach to the hooks on the peripheries of both S1 and S2, the spots become semantically bound by the line. Such a line always has an interpretation. Since the spots are thus assigned a semantic value, there is no fear of any violation of the compositionality of the truth of these graphs in the sense of including components that lack an interpretation. On the other hand, conjunction as juxtaposition is commutative, which nevertheless is not the case in dynamic semantics. However, similar dynamism obtains just as well with connectives other than conjunction. For instance, we may have an EG that represents a dynamic conditional, which Peirce termed iconically the scroll, in other words the continuous line of a cut folded in on itself. The scroll shows how information on the outer interior of the cut is transmitted to the inner interior of the cut. (This was used already in Figure 1.) Figure 4 now depicts three equivalent EGs in a fashion similar to those in Figure 3.
Figure 4: Three equivalent existential BETA graphs.
These three graph-instances are all equivalent in that, as soon as the ligature extends from the outer area of the scroll to the inner area, the corresponding spots become bound by it, and accordingly are saturated by the values selected for the identity line in question.20 The selection of values and their instantiation in identity lines (i.e., the selectives) is crucial. As noted, such a selection is performed by evoking two make-believe parties, the GRAPHEUS and the GRAPHIST, both of whom have a specific, truth-conditional objective in view. Purpose is, of course, also at the heart of pragmatic logic itself. Peirce elaborated this by saying that the pragmatist has in view a “definite purpose in investigating logical questions”, and that he “wishes to ascertain the general conditions of truth” (2.379).
20 It is a measure of the generality of diagrammatisation that pronouns may, with no extra effort, be also anaphorically bound to a universal quantifier (Every player picks a card. He puts it in his hand). This has proved to be beyond the expressive power of standard dynamic semantics.
The parties are assumed to have acquired collateral information from observation and experience. It is thus the purpose, together with the common knowledge that constitutes their common ground, including the memory of their own factual and conceptual knowledge, which enables them to check for relevance and to make correct predictions (Pietarinen, 2004b). But they also update that knowledge and the common ground as interpretation proceeds, since in encountering logical constants while evaluating the graphs according to the EP, the values chosen for them are added to their stock of information. By doing this, the parties also come to a mutual realisation of the general conditions concerning the truth of the assertions that they undertook to discourse upon. It is precisely in this manner that the logic of EGs fulfils the pragmatist agenda that Peirce set out. It is also in these pragmatistic mindsets that the logic of EGs and their endoporeutic interpretation is at the same time both verificationist (in a manner and extent similar to that in which the pragmatic maxim is verificationist) and truth-conditional, taking the best of both worlds.21 John Sowa (1997) has argued that EGs are isomorphic to the discourse-representation structures (DRS) of the discourse-representation theory (DRT; Kamp, 1981; Kamp & Reyle, 1993). DRT is nevertheless non-compositional in that no algorithm exists that would reproduce the structure of natural language in a compositional way at the level of DRSs. Dynamic semantics was created to make the processing compositional by eliminating the level of DRSs and replacing it with the language of symbolic logic (Groenendijk & Stokhof, 1991, 2000). The non-compositionality of DRT appears, according to this view, to be on a par with that of the non-compositionality of EGs. However, I also noticed that EGs are similar to dynamic theories of logic in relaxing the notion of the scope of quantifiers, and in distinguishing between binding a variable and logical connectives being syntactically subordinated in terms of adhering to linear, step-by-step interpretation. Far from exhibiting any tension, this affinity of EGs to both DRT and dynamic semantics only serves to vindicate the versatility of setting up logical systems in an iconic, diagrammatic manner. It would be a non-sequitur from this to expect a thoroughly compositional process to run all the way from natural language to its interpretation, possibly via some intermediate iconic representation, since compositionality is understood quite differently in diagrams. The sense in which EGs may be rendered compositional is to take the dynamic approaches that extend the concept of the assignment as carrying over to the corresponding extensions of the semantics of EGs, starting with Tarski-type semantics for the
21 Accordingly, arguments for game-theoretic semantics (Hintikka, 1987) also hold for EGs.
existential BETA graphs (Burch, 1997; Hammer, 1998).22 Without such an extension, the EP of EGs is indeed non-compositional in the broad sense in which DRT is non-compositional, namely representing natural language in more than one dimension and then utilising an outside-in interpretation of instantiation of selectives. Equivocation in compositionality is imminent when moving from discrete-symbolic representations to continuous-iconic ones.
7 Conclusions
The question of the relative priority of compositional and non-compositional systems was something that simply did not present itself as the most central issue in Peirce’s logical investigations or, in broad terms, in his semeiotics. In setting up EGs, several issues were encountered which, only in retrospect, pertain more or less intimately to issues concerning the compositionality of interpretation. While Peirce made several choices that ultimately led to compositional meaning, it is unlikely that these choices were motivated by considerations featuring in contemporary literature, including those of learnability, systematicity or communicativity of language. The most important choice of all, the Endoporeutic Principle, erupted upon him with such cataclysmal force that it remained the one and the only game in town that he ever came to consider and endorse.23 That Peirce would have been keen to repudiate the utility of any alternative compositional version of semantics over his non-compositional one is supported not only by his overall pragmatic and exterior-to-interior method of interpretation, but also by his ever-present holistic attitude to recognition, perception, and other cognitive tasks. Such an Aristotelian holon takes graph-instances, definite but incomplete snapshots of the contents of the mind, as a unity regulated by the true continuity in synechism, mathematically rendered as topological, truth-preserving mappings from one graph-instance to another. If, as a result of violating transformation rules, the relative positions of graph-instances on a given graph change, this change must affect the composition of the given graph as a whole. Analogously, if the consequences that follow from the graph and which constitute the
22 These authors fail to note that their semantics should not distinguish between the three graphs depicted in Figures 3 or 4.
23 The reviewer notes the paper of Pagin & Westerståhl (1993), which presents a flexible outside-in binding of variables for predicate logic to account for natural language anaphora, and which has a postscript in which the authors briefly compare their system with Zeman’s (1964) rendering of Peirce’s EGs.
aggregate of meaning as the collection of graph-instances change, this change must affect the nature of the meaning of the given graph. In doing this, Peirce came to anticipate the later Gestalt theories of representation and Gestalt-like organisation and the “mental composite photograph”24 type of composition of percepts (Pietarinen, 2004d). He also anticipated the later non-monotonic logics via his tychism (Marostica, 1997), a doctrine of absolute chance which, synthesised with pragmatism, gave rise to early excursions in logically exact accounts of continuity.
Acknowledgements
Supported by the Academy of Finland (Project No. 1103130). My thanks go to an anonymous reviewer as well as to the organisers and participants of the Compositionality, Concepts and Cognition Conference held in Düsseldorf in 2004.
References
Badesa, C. (2004). The birth of model theory: Löwenheim’s theorem in the frame of the theory of relatives. Princeton: Princeton University Press.
Beaney, M. (1997). The Frege reader. Oxford: Blackwell.
Benthem, J. van, Muskens, R., & Visser, A. (1997). Dynamic semantics. In J. van Benthem & A. ter Meulen (Eds.), Handbook of logic and language (pp. 587–648). Amsterdam: Elsevier.
Burch, R. W. (1997). A Tarski-style semantics for Peirce’s beta graphs. In J. Brunning & P. Forster (Eds.), The rule of reason: The philosophy of Charles Sanders Peirce (pp. 81–95). Toronto: University of Toronto Press.
Dewey, J. (1931). Context and thought. University of California Publications in Philosophy, 12, 203–224.
Fodor, J., & Lepore, E. (2002). The compositionality papers. Oxford: Clarendon Press.
Groenendijk, J., & Stokhof, M. (1991). Dynamic predicate logic. Linguistics and Philosophy, 14, 39–100.
2.438, c.1893, The Grammatical Theory of Judgment and Inference.
Groenendijk, J., & Stokhof, M. (2000). Meaning in motion. In K. von Heusinger & U. Egli (Eds.), Reference and anaphoric relations (pp. 47–76). Dordrecht: Kluwer.
Hammer, E. M. (1998). Semantics for existential graphs. Journal of Philosophical Logic, 27, 489–503.
Hilpinen, R. (1982). On C. S. Peirce's theory of the proposition: Peirce as a precursor of game-theoretical semantics. The Monist, 65, 182–188.
Hintikka, J. (1973). Logic, language-games and information. Oxford: Oxford University Press.
Hintikka, J. (1987). Game-theoretical semantics as a synthesis of verificationist and truth-conditional meaning theories. In E. Lepore (Ed.), New directions in semantics (pp. 235–258). London: Academic Press.
Hintikka, J. (1996). The principles of mathematics revisited. New York: Cambridge University Press.
Hodges, W. (1997). Compositional semantics for a language of imperfect information. Logic Journal of the IGPL, 5, 539–563.
Janssen, T. M. V. (1997). Compositionality. In J. van Benthem & A. ter Meulen (Eds.), Handbook of logic and language (pp. 417–473). Amsterdam: Elsevier.
Janssen, T. M. V. (2002). On the interpretation of IF logic. Journal of Logic, Language and Information, 11, 367–387.
Kamp, H. (1981). A theory of truth and semantic representation. In J. Groenendijk (Ed.), Formal methods in the study of language (pp. 475–484). Amsterdam: Mathematical Centre.
Kamp, H., & Reyle, U. (1993). From discourse to logic: Introduction to model-theoretic semantics of natural language, formal logic and discourse representation theory. Dordrecht: Kluwer.
Köhler, W. (1940). Dynamics in psychology. New York: Liveright.
Marostica, A. (1997). A nonmonotonic approach to tychist logic. In N. Houser, D. D. Roberts, & J. Van Evra (Eds.), Studies in the logic of Charles Sanders Peirce (pp. 545–559). Bloomington: Indiana University Press.
Pagin, P. (2003). Communication and strong compositionality. Journal of Philosophical Logic, 32, 287–322.
Pagin, P., & Westerståhl, D. (1993). Predicate logic with flexibly binding operators and natural language semantics. Journal of Logic, Language and Information, 2, 89–128.
Peirce, C. S. (1931–58). Collected papers of Charles Sanders Peirce. Cambridge: Harvard University Press.
Peirce, C. S. (1967). Manuscripts in the Houghton Library of Harvard University. Amherst: University of Massachusetts Press. (Annotated Catalogue of the Papers of Charles S. Peirce)
Peirce, C. S. (1975–1987). Contributions to the Nation. Lubbock: Texas Tech University Press.
Peirce, C. S. (1977). Semiotics and significs: The correspondence between Charles S. Peirce and Victoria Lady Welby. Bloomington: Indiana University Press.
Pietarinen, A.-V. (2003). Peirce's game-theoretic ideas in logic. Semiotica, 144, 33–47.
Pietarinen, A.-V. (2004a). The endoporeutic method. Digital Encyclopedia of Charles S. Peirce. (www.digitalpeirce.fee.unicamp.br/endo-p.htm)
Pietarinen, A.-V. (2004b). Grice in the wake of Peirce. Pragmatics & Cognition, 12, 295–315.
Pietarinen, A.-V. (2004c). Peirce's diagrammatic logic in IF perspective. In A. Blackwell, K. Marriott, & A. Shimojima (Eds.), Diagrammatic representation and inference, Lecture Notes in Artificial Intelligence (Vol. 2980, pp. 97–111). Berlin: Springer-Verlag.
Pietarinen, A.-V. (2004d). Logic, phenomenology and neuroscience: in cahoots? In G. Büchel, B. Klein, & T. Roth-Berghofer (Eds.), Proceedings of the first international workshop on philosophy and informatics (Vol. 112). Technical University of Aachen (RWTH): CEUR Workshop Proceedings.
Pietarinen, A.-V. (2005a). Signs of logic: Peircean themes on the philosophy of language, games, and communication. Springer.
Pietarinen, A.-V. (2005b). Peirce's magic lantern: Moving pictures of thought. Transactions of the Charles S. Peirce Society: A Quarterly Journal in American Philosophy.
Roberts, D. D. (1973). The existential graphs of Charles S. Peirce. The Hague: Mouton.
Sandu, G., & Hintikka, J. (2001). Aspects of compositionality. Journal of Logic, Language and Information, 10, 49–61.
Shin, S.-J. (2002). The iconic logic of Peirce's graphs. Cambridge: MIT Press.
Sowa, J. F. (1997). Matching logical structure to linguistic structure. In N. Houser, D. D. Roberts, & J. Van Evra (Eds.), Studies in the logic of Charles Sanders Peirce (pp. 418–444). Bloomington: Indiana University Press.
Sperber, D., & Wilson, D. (1995). Relevance: Communication and cognition. Oxford: Blackwell.
Stenning, K. (2000). Distinctions with differences: Comparing criteria for distinguishing diagrammatic from sentential systems. In M. Anderson (Ed.), Diagrams 2000, Lecture Notes in Artificial Intelligence (Vol. 1889, pp. 132–148). Berlin: Springer.
Zeman, J. J. (1964). The graphical logic of C. S. Peirce. Doctoral dissertation, University of Chicago. (Online edition, 2002, http://web.clas.ufl.edu/users/jzeman/)
Semantic Holism and (Non-)Compositionality in Scientific Theories
Gerhard Schurz
1 The Semantics of Theoretical Terms
It is a well-established view in the philosophy of science that scientific theories introduce new terms, so-called theoretical terms. They denote properties, usually functional properties, which go beyond what is observable or given by common sense. Their meaning is not pre-theoretically given. It is specified by nothing else but the theory itself. The same holds for the denotations of theoretical terms – these denotations are not pre-theoretically given but are hypothetically postulated by the theory. Consider, as our main example, the way mass and force are characterized by the axioms of Newtonian physics:
(N1) For all physical objects x and times t: The sum of all forces acting on x at t equals the mass of x times the acceleration of x at t.
(N2) Whenever x exerts a force on y, then y exerts a force of the same amount and opposite direction on x.
(NG) For all x, y and t, the gravitational force between x and y at t equals the amount of g·m(x)·m(y) / |s(x,t) − s(y,t)|², with the direction pointing from x to y, and vice versa.
Four things are very typical of this theory T: (1.) The theory T itself is specified by a finite set of axioms. So we may identify T with a single statement – the conjunction of its axioms. (2.) The only observational or pre-theoretical concepts in Newtonian physics are object, time, and position as a dependent function of object and time, plus its derivatives with respect to time, velocity and acceleration. The meaning of the two
T-theoretical concepts, namely mass and force, is simultaneously characterized by the axioms of T. (3.) The meaning of mass and force is not given by observation, common sense, or in any other theory-independent way. For example, it would be inadequate to regard the common-sense meaning of force as 'something which pushes or pulls' as the semantic core of the meaning of this term. (4.) The theory is expandable by adding more special force laws, such as frictional forces, which emerge only under certain conditions.
More specifically, we call a theoretical term a T-theoretical term; this indicates that the meaning of the term is characterized by or within the theory T. When we call a term T-theoretical we always assume that T contains all of the theory that is relevant for specifying the meaning of this term. Those terms in a theory T whose meaning is given independently of T, for example by observation, common sense or by means of pre-theories of lower level, are called pre-T-theoretical terms. So the general picture and notation is this: A theory T = T(τ₁, …, τₙ) is a statement with T-theoretical terms τ₁, …, τₙ, which are usually predicate or function terms, and with pre-T-theoretical terms p₁, …, pₘ, whose meaning is independently given (by observation or pre-theoretical means). The underlying language L(T) consists of all formulas recursively built up from logical and mathematical symbols, pre-T-theoretical terms p₁, …, pₘ and T-theoretical terms τ₁, …, τₙ. L_p(T) is the pre-T-theoretical (observational) sublanguage of L(T), consisting of all formulas recursively built up from the logico-mathematical and the pre-T-theoretical terms alone.
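To make the L(T)/L_p(T) distinction concrete, here is a minimal computational sketch (my illustration, not Schurz's; the tuple encoding of formulas and the symbol inventory are assumptions): a formula belongs to the pre-T-theoretical sublanguage exactly when no T-theoretical term occurs in it.

```python
# Minimal sketch (illustrative, not from the text): classifying formulas of L(T).
# Formulas are encoded as nested tuples; everything that is not a T-theoretical
# term counts as logico-mathematical or pre-T-theoretical vocabulary.

THEORETICAL_TERMS = {'mass', 'force'}   # the tau_1, ..., tau_n of Newtonian physics

def symbols(formula):
    """Collect all symbols occurring in a tuple-encoded formula."""
    if isinstance(formula, tuple):
        return set().union(*(symbols(part) for part in formula))
    return {formula}

def in_pre_theoretical_sublanguage(formula):
    """A formula is in L_p(T) iff it contains no T-theoretical term."""
    return not (symbols(formula) & THEORETICAL_TERMS)

# A fragment in the spirit of (N1): force = mass * acceleration -> only in L(T).
n1_fragment = ('=', ('force', 'x', 't'),
               ('*', ('mass', 'x'), ('acceleration', 'x', 't')))
# A purely kinematic statement -> in L_p(T).
kinematic = ('=', ('velocity', 'x', 't'), ('velocity', 'y', 't'))

print(in_pre_theoretical_sublanguage(n1_fragment))  # False
print(in_pre_theoretical_sublanguage(kinematic))    # True
```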
2 Six Problems of Semantic Theory Holism
That the meaning of T's theoretical terms is characterized by the theory T itself is usually called the phenomenon of semantic holism. Let me now explain six frequently discussed problems associated with semantic holism in scientific theories:
1.) The problem of definability: In the times of Hilbert and Schlick, characterizations of terms by sets of theoretical axioms were called implicit definitions. But usually these implicit definitions are not definitions at all, because they do not uniquely fix the extensions of the implicitly characterized terms (this is made especially clear by Beth's theorem). It seems that we have to conclude, as many philosophers of science did, that theoretical terms are simply undefinable.
2.) The problem of the analytic-synthetic distinction: Moreover, if the whole of T characterizes the meaning of the T-theoretical terms, then it seems to be
impossible to divide T's axioms in a non-arbitrary manner into analytic stipulations and synthetic conjectures. That the analytic-synthetic distinction is generally unclear has been urged by Quine. In the case of scientific theories this consequence seems especially compelling.
3.) The problem of compositionality: According to a standard view in semantics, the meaning of sentences or complex terms is compositional, that is, it is determined as a function of the meanings of the syntactic constituents or parts. For example, the meaning of Peter is tall is determined by the meanings of the name Peter, the predicate tall and the copula is. Prima facie, the meaning determination of theoretical terms conflicts with this principle. For the meaning of the T-theoretical terms τ₁, …, τₙ occurring in the sentence T(τ₁, …, τₙ) is determined by the meaning of the sentence T(τ₁, …, τₙ): you must understand the theory in the first place in order to understand its T-theoretical terms τ₁, …, τₙ. So the meaning determination does not go from the parts of T to the whole of T, but from the whole of T to its T-theoretical parts – this is the core thesis of semantic holism. The same holds for the determination of (extensional) denotations. Very generally, meanings are uniquely associated with functions from possible worlds to denotations. So every way of meaning determination brings with it a structurally similar way of denotation determination within a given world.
4.) The problem of circularity: If the theory T is theoretically non-trivial in the sense that it is not synonymous with a pre-theoretical statement, then T's meaning must depend on the meaning of its theoretical terms. But on the other hand, the meaning of T's theoretical terms is determined by the meaning of T. We must understand T in order to understand T's theoretical terms, and we must understand T's theoretical terms in order to understand T. So the computation of the meaning of T's theoretical terms seems to be circular and hence does not lead to any result. Something must have gone badly wrong here.
5.) The problem of incommensurability: How can the defenders of two rival physical theories, say T₁ and T₂, rationally disagree with each other? Let us assume that T₁ and T₂ contain the same theoretical terms, e.g. mass and force, but make different theoretical assertions about these terms. Then it seems to follow that the meaning of these terms is different and, hence, that the two physicists don't really disagree with each other but just speak about different things. It seems that distinct theories are semantically incommensurable. This is one version of Kuhn's and Feyerabend's famous incommensurability thesis, and it is also one of Fodor's main arguments against strong semantic holism.
6.) The problem of synthetic (non-analytic) content: Connected with the problem of incommensurability is the danger of analytic triviality: if it is solely my theory T which determines the meaning of my T-theoretical terms, how can I ever be wrong? Whoever attacks some of my theoretical claims seems
to speak about different things (cf. Papineau, 1996). Does semantic holism turn every theory into an analytically true statement? How can the synthetic content of a theory be characterized in a non-circular way?
So far, so good. In the next two sections I will reconstruct the Ramsey-Carnap-Lewis account of theoretical terms, and I will try to explain how this account solves five of these six problems in a straightforward way. The exception is the problem of compositionality, which requires further exploration.
3 The Ramsey-Carnap Account of Scientific Theories
According to an idea which goes back to Ramsey (1931, pp. 212–215) and was taken up by Carnap, the synthetic content of a theory can be expressed without T's theoretical terms at all. We just have to quantify existentially over these terms. So the synthetic content of

(1) T: T(τ₁, …, τₙ)   (T's postulate)

is expressible in a non-circular manner by:

(2) R(T): ∃X₁, …, Xₙ T(X₁, …, Xₙ)   (the Ramsey sentence of T).
Here I use Xᵢ as second-order variables. The assumption that R(T) expresses all of T's synthetic content is perfectly reasonable given that the meaning of T's theoretical terms is solely fixed by what T says about them – then all that T says about the world is R(T): there exists an n-tuple of properties which satisfies what the theory says about these properties. Obviously,

(3) T ⊢ R(T),

where ⊢ stands for logical consequence, and it can easily be proved that

(4) ∀A ∈ L_p(T): T ⊢ A iff R(T) ⊢ A (cf. Ketland, 2004, p. 293, Th. 3),

so the Ramsey sentence has the same pre-theoretic consequences as T, which additionally strengthens the view that R(T) expresses T's synthetic content (for more details cf. Tuomela, 1973, and Ketland, 2004). Accordingly, Carnap (1963; see also Carnap, 1966) proposed to formulate the global analytic content of the theory T as:

(5) C(T): R(T) → T   (the Carnap sentence).
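For concreteness, here is a toy instance of the construction (my own illustration, not an example from the text): a theory T with a single theoretical predicate τ and two pre-theoretical predicates O₁ and O₂.

```latex
% Toy illustration of the Ramsey and Carnap sentences for a one-term theory.
\begin{align*}
T    &:\quad \forall x\,(O_1 x \rightarrow \tau x) \wedge \forall x\,(\tau x \rightarrow O_2 x)\\
R(T) &:\quad \exists X \bigl(\forall x\,(O_1 x \rightarrow X x) \wedge \forall x\,(X x \rightarrow O_2 x)\bigr)\\
C(T) &:\quad R(T) \rightarrow T
\end{align*}
```

If the second-order variable X ranges over arbitrary extensions, R(T) is here equivalent to the purely pre-theoretical ∀x(O₁x → O₂x) – the extension of O₁ serves as a witness – which illustrates how the Ramsey sentence preserves the synthetic content while eliminating the theoretical vocabulary.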
It is easily seen that

(6) ⊢ (R(T) ∧ C(T)) ↔ T,

so the Ramsey and Carnap sentences taken together are logically equivalent with T, and it can be proved that
(7) ¬∃A ∈ L_p(T): ⊬ A but C(T) ⊢ A (cf. Tuomela, 1975, p. 59),

so C(T) has no synthetic pre-T-theoretical consequences at all, which gives additional support to the view that C(T) expresses T's global analytic content. All this seems very nice. I call this the RC (Ramsey-Carnap) account of theoretical terms.
Let me first mention a subtle problem connected with the Ramsey sentence: is it a T-theoretical or a pre-T-theoretical statement? This depends: 1.) If you let the second-order variables range over arbitrary extensions, i.e. over arbitrary subsets of the domain, and if the domain consists only of objects to which one has pre-theoretical or empirical access, then the Ramsey sentence is a pre-T-theoretical statement. Its content is then the same as the empirical or pre-theoretical content of the theory, and the same holds for the associated classes of pre-theoretical models. This position corresponds to what is called the instrumentalistic theory-view: in this view, no entities beyond observable entities are postulated. This view has been taken by many philosophers of science, from Sneed (1971) and the structuralists up to Ketland (2004). 2.) On the other hand, if you let the second-order variables range over theoretical properties, or if you postulate the existence of new theoretical individuals, then the Ramsey statement is a T-theoretical statement. Then its content is logically stronger than its pre-theoretical content, because then only those expansions of T's pre-theoretical models in which the extensions of the theoretical terms correspond to existing properties can be models of T. This position corresponds to what is called the realistic theory-view.
After this clarification let me try to show that the RC account already solves four of our six problems – all except definability and compositionality. (1.) The problem of the synthetic content of T (No. 6) is solved: this content is non-circularly specified by R(T). (2.) The problem of incommensurability (No. 5) is solved, provided we assume that the two rival theories T₁ and T₂ share a common underlying pre-theoretical language. The defender of T₁ can understand the rival theory T₂ perfectly well because T₂'s synthetic content can be formulated in the shared pre-theoretical language. And vice versa. The proponents of two rival theories can rationally disagree because both can formulate their Ramsey sentences in their shared pre-theoretical language, and their Ramsey sentences may logically contradict each other. (3.) The problem of the analytic-synthetic distinction (No. 2) is solved, or at least dissolved, but only in a global way, by the division of T into the Ramsey and Carnap sentences of T. There is no analytic-synthetic distinction among T's axioms: each subset of T's axioms has its Ramsey and Carnap sentence. So every one of T's axioms figures simultaneously as a synthetic assertion and as part of a meaning characterization. However, this situation creates a successor
problem – namely to separate the semantic core axioms of a theory, which define the identity of the denotations of the T-terms, from peripheral axioms, which can be changed without changing the reference of T's theoretical terms. This distinction is important when considering theory expansions or revisions. I cannot enter into this problem here (cf. Papineau, 1996). (4.) The circularity problem (No. 4) is solved: the meaning of T's theoretical terms is characterized in a non-circular way, by the Carnap sentence C(T), which says this: in every world or model, the n-tuple of T's theoretical concepts denotes some n-tuple of properties which satisfies the open theory formula T(X₁, …, Xₙ), provided there exists such an n-tuple at all. Otherwise T's theoretical concepts are denotationless and T is false. (5.) However, the problem of definability is not solved by the RC account, for three reasons. First, the meaning characterization of T's theoretical terms is global but not local – it does not apply to the terms separately. Second, this meaning characterization is not unique: there may exist several different n-tuples of properties in our world which satisfy T; so the meaning remains indeterminate. Third, the meaning characterization is only partial: it does not specify the denotation in worlds where the theory is false. (6.) Finally, the compositionality problem remains unsolved; I will turn to this problem in Section 5.
4 The Lewis Strengthening of the Ramsey-Carnap Account
Lewis (1970) attempts to repair the troublesome fact that the Ramsey sentence does not uniquely fix the denotation of the T-terms in models of T. Lewis (1970) and Papineau (1996) argue that a situation of non-uniqueness might be possible in abnormal cases, but normal or at least good scientific theories do fix the denotation of their T-terms uniquely in our world. At least this is something which the defender of a theory should expect and, hence, should implicitly assert. So Lewis strengthens the Ramsey sentence to what I call the Lewis-Ramsey sentence:¹

(8) LR(T): ∃!X₁, …, Xₙ T(X₁, …, Xₙ)   (the Lewis-Ramsey sentence),

that is: ∃X₁, …, Xₙ ∀Y₁, …, Yₙ (T(Y₁, …, Yₙ) ↔ ⋀_{1≤i≤n} Xᵢ = Yᵢ).
Observe that

(9) T ⊬ LR(T).
1 Lewis reconstructs properties by singular terms and relates them to individuals via a relation of "having"; but like Papineau (1996) I think it is more appropriate to reconstruct properties by general terms.
Since LR(T) ⊢ T, LR(T) is stronger than T. But Lewis claims that every defender of T implicitly postulates LR(T). So Lewis takes LR(T) to express the entire synthetic content of the theory T understood in this implicit way.
Lewis's key move is the following: based on the fact that the theory strengthened in this way implies the uniqueness condition, Lewis suggests expressing T's analytic content by the following local definitions of T's theoretical terms by definite descriptions:

(10) Def(T) (the definition set of T): for 1 ≤ i ≤ n: τᵢ := (that Xᵢ : ∃!X₁, …, X_{i−1}, X_{i+1}, …, Xₙ T(X₁, …, Xₙ)).

In words: τᵢ denotes the ith entity in the unique n-tuple of entities satisfying the open theory formula. First note that if we treat definite descriptions classically as in Russell's theory (cf. Russell, 1905),

(11) A(that x : Fx) :↔ ∃x(A(x) ∧ ∀y(Fy ↔ x = y)),

which entails the uniqueness condition ∃!x Fx, then these 'definitions' would be false in all models where the uniqueness condition fails, in other words, where the Lewis-Ramsey sentence is false. So these 'definitions' would not really be analytic statements and, hence, would not be definitions. Therefore Lewis assumes a free logic of the Dana Scott type (for an overview of free logics cf. Bencivenga, 1986). In this free logic, definite descriptions with false uniqueness conditions are denotationless; identity formulas are true iff both sides have the same denotation or are both denotationless. In such a free logic Lewis's definition sentences can indeed be verified in all models of the defining language by assigning either the unique denotation or no denotation to the defined T-terms. So the Lewis definitions can indeed be regarded as analytically true. Moreover, it can be proved that the Lewis definition set is free-logically equivalent with the following global meaning postulate, which I call the Lewis-Carnap sentence:

(12) LC(T): (LR(T) → T) ∧ (¬LR(T) → ⋀_{1≤i≤n} ¬∃X(X = τᵢ))
(the Lewis-Carnap sentence). In words: if the open T-formula is uniquely realized, then T is true; otherwise T's theoretical terms are denotationless.
Again the synthetic content and the analytic content of T are clearly separated in Lewis's account. But now we have a formulation of T's analytic content by local and seemingly explicit definitions, namely definitions by definite second-order descriptions. If we eliminate the theory's theoretical terms by their Lewis definitions, we get the so-called expanded postulate of T:
(13) EP(T): T(that Xᵢ : ∃!X₁, …, X_{i−1}, X_{i+1}, …, Xₙ T(X₁, …, Xₙ) : 1 ≤ i ≤ n)   (the expanded postulate of T).

In words: T is true for that n-tuple of entities which uniquely satisfies the open T-formula. The expanded postulate reflects what the theory T asserts in a most direct way, and it will become significant below when we discuss compositionality.
It is important for Lewis's account that one assumes the realistic theory-view, that is, that the second-order quantifiers range over theoretical properties and not over arbitrary subsets of the domain (cf. Papineau, 1996, p. 6, fn. 5). Therefore the Lewis-Ramsey sentence and the Lewis definitions are T-theoretical, and not pre-T-theoretical, in nature. If the quantifiers ranged over arbitrary extensions of the domain, the demanding uniqueness assumption could not be satisfied. This follows from well-known results about theoretical measurement. The value of theoretical functions is usually determined only for objects under special measurement conditions, and even then only up to scale invariance. Only if the quantifiers range over intensionally specified properties is there realistic hope for the uniqueness conditions to be satisfied.
Under this condition I think Lewis has partially solved the definability problem, insofar as he has given the best kinds of definitions which are possible. Nevertheless I want to point out that definitions by definite descriptions are different from properly explicit definitions in two crucial respects:
(1.) First, properly explicit definitions, such as 'bachelors are unmarried men', specify the denotation of the defined term in every model of the defining language. They are unconditional and full definitions. In contrast, definite-description definitions specify the denotation of the defined term only in those models of the defining language in which the unique existence condition is satisfied. So they are conditional and merely partial definitions. Lewis argues that in his free logic definite-description definitions provide a full meaning characterization because they assign to the term a formal denotation, namely the 'non-denotation', even in models where the unique existence condition fails. But I regard this merely as a formally useful trick: a 'non-denotation' is no more a denotation than 'nothing' is an object. I conclude that also in Lewis's free logic definite descriptions are conditional definitions which provide only a partial meaning specification.
(2.) Second, properly explicit definitions provide us with a measurement method for the defined term in the following sense: they give us a method for computing the (truth) value of the defined term, for any given instance, from the given (truth) values of the defining terms for that instance. If the values of the defining terms are empirically measurable, then this computation method yields a measurement method for the defined term. In contrast, definitions by
definite descriptions do not give us any computation or measurement method at all. A second-order definite description tells us only that there exists a unique property satisfying a certain theory. In order to evaluate applications of this property for certain instances, we would first have to run through an enumeration of all properties and check which properties in this list satisfy the open theory formula. But an enumeration of all properties is impossible. Theoretical measurement requires the derivation of empirically contentful first-order consequences from the theory under special boundary conditions; this question is not even touched by the Lewis method. In this respect, note that the Lewis definition method works also for theories without any empirical content. Consider the example of Feynman (1973, p. 12–2):

(14) Vorce := that property which causes objects free of forces to keep their constant velocity.

This is a perfectly reasonable Lewis definition, although it does not produce any new empirical content, which is why physicists do not believe in the existence of vorces. I conclude: only if a theory has strong empirical content is it reasonable to interpret its theoretical terms realistically.
So far I have shown that the RCL account solves all of the six problems except the problem of compositionality. Let me turn to this final problem.
5 The Question of Compositionality in the RCL Account
Consider again the expanded postulate of T in (13) above, which reflects T's assertion in a most direct way. To simplify considerations, consider a maximally simple sentence of type (13) with just one F-theoretical, singular term:

(15) F(that x : Fx).

Note: (15) is analytically equivalent with the uniqueness condition ∃!x Fx. We assume, as Lewis does, that the sentence (15) receives the truth-value false if the term that x : Fx is denotationless. This implies that the Russell schema (11) is true in every possible model and, hence, that (15) is analytically equivalent with the uniqueness condition.
Now, how is the meaning or denotation of the sentence (15) and that of its terms computed? By computing, first, the extension of the predicate Fx; second, the truth value of the uniqueness condition ∃!x Fx and hence the truth value of (15); and third, computing therefrom the denotation of the singular term that x : Fx. So the meaning (or denotation) of the singular term is computed from the meaning (or denotation) of the entire sentence. This is the opposite direction of a compositional meaning computation in the standard bottom-up way, which would have to proceed from the meaning of the
subsentential terms to the meaning of the sentence. Therefore the computation of (15) is not compositional in this standard sense. I think that this is exactly the point which defenders of theory holism have in mind. And this point is right.
But this flies in the face of a theorem of Hodges (2001). To understand this theorem, we first explain the formal notion of compositionality. We assume a syntactic term language consisting of a set of terms ∆ built up recursively from a set of atomic terms ∆_at by a set of syntactic operations Σ. A complex term of the language has the syntactic structure σ(t₁, …, tₙ), where the tᵢ are terms and σ is a syntactic operation. A semantics for such a term language is a meaning function µ : ∆ → M which assigns to each term t its meaning µ(t); M is a set of meanings, whatever these meanings are (for example, they may be denotations, or functions from possible worlds to denotations). Such a semantics is called compositional if the following holds:

(16) Formal compositionality of (∆, Σ, µ): For each syntactic operation σ ∈ Σ there exists a semantic operation µ_σ such that for all complex terms σ(t₁, …, tₙ) ∈ ∆: µ(σ(t₁, …, tₙ)) = µ_σ(µ(t₁), …, µ(tₙ)).

Now Hodges (2001) proves an extension theorem which implies the following:

(17) If S is the set of all (meaningful) sentences and µ : S → M is a partial meaning function over S which is compositional and Husserlian over S, then µ has a compositional extension to the set of all terms ∆ which is uniquely determined up to isomorphism.

The proof of (17) rests on the fact that S is cofinal in ∆, which means that every term in ∆ occurs in some sentence in S. The conditions of compositionality and Husserlianicity are clearly met for the set of sentences of a first-order language including definite descriptions.²
2 Husserlianicity of S boils down to the condition that whenever µ(A) = µ(B) for A, B ∈ S, then for every sentence C[A] ∈ S which contains A, also C[B] is in S.
It follows that there exists a compositional semantics for sentences of the form (15), so that

(18) µ(F(that x : Fx)) = µ_∈(µ(Fx), µ(that x : Fx)),

where µ_∈ is the meaning function of predication. True enough. But the point is: the existence of such a compositional bottom-up meaning function does not imply that we actually compute the meanings along this bottom-up function. There may exist several different meaning functions at the same time. Some of them may start somewhere in the middle of the complexity of terms and spread from there both downwards and upwards in complexity. Some of them may even be entirely top-down functions, starting from the meaning of sentences and computing the meaning of subsentential terms therefrom. In fact, the proof of Hodges's extension theorem (Hodges, 2001, p. 21)
rests on a general method of constructing such a top-down meaning function from S to ∆ which determines the meaning of subsentential terms in ∆ up to isomorphism, in such a way that at the same time a compositional bottom-up function exists which uniquely determines the meaning of sentences from their subsentential terms.
The point I want to make is that we should distinguish between (i) the question of the existence of a formally compositional meaning function, and (ii) the question of the meaning function according to which we actually compute the meanings of terms. In the case of sentences of type (15), although there exists a compositional bottom-up meaning function, we actually compute the meaning according to a partially top-down meaning function, which goes from the meaning of the predicate (the open theory formula) upwards to the meaning of the sentence (the theory) and from there downwards to the meaning of the singular terms (the theoretical terms). More precisely, the meaning function we actually use is this one (where |X| denotes the cardinality of X):

(19) µ(Fx) is given in a pre-F-theoretical way.
µ(F(that x : Fx)) = µ_∃!(µ(Fx)) := 1 if |µ(Fx)| = 1, and 0 otherwise.
µ(that x : Fx) = µ_that(µ_∃!(µ(Fx)), µ(Fx)) := the element of µ(Fx) if µ_∃!(µ(Fx)) = 1, and 'denotationless' otherwise.

The moral which I want to draw is this: whether the actual semantics over a language is compositional depends not only on the formal existence of a bottom-up compositional meaning function, but also on whether we actually use this function in our computation (or determination) of meanings. The latter question depends crucially on the following: for which terms can meanings be independently specified, for example by observation or by pre-theoretical means, and for which terms is this not the case? In other words: which meanings figure as independent variables, and which meanings figure as dependent variables, in our actual meaning computation?
I suggest strengthening the formal notion of compositionality into an epistemic notion of procedural compositionality. To explicate this condition, I need to introduce certain cognitive concepts. A semantic computation algorithm A can be implemented in a human or non-human mind, a computer program, or whatever. Such an algorithm accepts as input the meanings of some terms in ∆. Let M be our range of possible meanings. An input function is a function I : ∆_I → M which assigns meanings to a subset ∆_I of the set of all terms ∆. The terms in ∆_I need not all be syntactically primitive; top-down computation methods are admitted. Starting with the input meanings I(t) for t ∈ ∆_I, the algorithm A calculates the meanings of certain other terms in ∆, one after the other, in a certain ordering. With ∆(I) we denote
the subset of all terms in ∆ which A computes from the input meanings of the terms in ∆_I. So we have:

(20) ∆_I ⊆ ∆(I) ⊆ ∆,

where in nontrivial computation cases the left ⊆-relation is a proper subset relation. With µ_I : ∆(I) → M we denote the meaning function which A computes over ∆(I) when we feed it with I. It is an important condition that our algorithm outputs every meaning it has computed immediately after computation. So the ordering of the algorithm's outputs reflects the computational route which the algorithm takes. I suggest formulating the condition of procedural compositionality as follows:

(21) Procedural compositionality: The meaning function generated by a semantic algorithm A is procedurally compositional over a set of terms ∆ iff:
(a) for every possible input I : ∆_I → M, the meaning function µ_I : ∆(I) → M computed by A agrees with some formally compositional meaning function ν : ∆(I) → M (that is, for all t ∈ ∆(I): µ_I(t) = ν(t)), and
(b) for every possible input I : ∆_I → M and complex term σ(t₁, …, tₙ): whenever A computes the meaning of σ(t₁, …, tₙ), A will first have computed the meanings of t₁, …, tₙ.

In this sense, the meaning of F(that x : Fx) is not procedurally compositional for us, because we compute µ(F(that x : Fx)) not after but before we can compute µ(that x : Fx). In conclusion, the semantics of theories is indeed procedurally non-compositional and holistic. This explains several typical semantic facts associated with scientific theories. I mention only one of them: it is often observed in the literature on the didactics of physics that, in order to explain the meaning of a theoretical term of a physical theory such as 'force', we really must explain the entire relevant theory. It is impossible to explain the meaning of 'force' first and then proceed to explain the meaning of Newton's axiom.
Finally, a word on Fodor (1987, pp. 73–94), who has argued against semantic holism in scientific theories. Some of Fodor's points – for example his arguments about the danger of semantic circularity – are refuted in a straightforward way by the RCL account. One of Fodor's central arguments in favor of compositionality says that compositionality implies the following condition of systematicity, which is crucial for explaining the semantic creativity of human language speakers (cf. Fodor, 1997):

(22) Systematicity: If the meanings of 'a is F' and of 'b is G' are understood, then also the meanings of 'a is G' and 'b is F' are understood.
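The contrast between formal and procedural compositionality can be made vivid computationally. The following sketch is my own illustration (representing meanings by extensions over a toy domain and tracing output order are assumptions, not part of the text): it evaluates a sentence of type (15) along the meaning function (19) and shows that the output ordering violates clause (b) of (21).

```python
# Minimal sketch: top-down evaluation of F(that x: Fx) along (19).
# The meaning of the whole sentence is output BEFORE the meaning of its
# subsentential term 'that x: Fx', violating clause (b) of procedural
# compositionality (21). Extensions as meanings are an illustrative choice.

def evaluate(extension_of_F):
    """Return the (expression, meaning) pairs in the order they are computed."""
    trace = []

    mu_F = frozenset(extension_of_F)          # mu(Fx): given pre-F-theoretically
    trace.append(('Fx', mu_F))

    mu_sentence = 1 if len(mu_F) == 1 else 0  # mu(F(that x: Fx)) = mu_E!(mu(Fx))
    trace.append(('F(that x: Fx)', mu_sentence))

    # mu(that x: Fx) = mu_that(mu_E!(mu(Fx)), mu(Fx)): computed last, from the
    # truth value of the whole sentence.
    mu_term = next(iter(mu_F)) if mu_sentence == 1 else 'denotationless'
    trace.append(('that x: Fx', mu_term))
    return trace

for extension in [{'a'}, {'a', 'b'}]:
    for expression, meaning in evaluate(extension):
        print(expression, '->', meaning)
    print('---')
```

In both runs the complex sentence is evaluated before its singular term – the whole determines the part, exactly the top-down route described above.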
I want to demonstrate, finally, that systematicity is also guaranteed for theories according to the RCL analysis, although these theories are not procedurally compositional. Assume a and F are T₁-theoretical terms, and b and G are T₂-theoretical terms. According to the RCL account we have:

(23) F = (that X : ∃!x(T₁(X, x)))
a = (that x : ∃!X(T₁(X, x)))
G = (that X : ∃!x(T₂(X, x)))
b = (that x : ∃!X(T₂(X, x))).

One might suspect that because there is no theory which unifies both T₁ and T₂, systematicity is not possible. But this is not true. Our actual meaning computation is this:

(24) µ(Fa) = µ_∈(µ(that X : ∃!x(T₁(X, x))), µ(that x : ∃!X(T₁(X, x)))),
µ(Gb) = µ_∈(µ(that X : ∃!x(T₂(X, x))), µ(that x : ∃!X(T₂(X, x)))).

So, understanding the meanings of Fa and of Gb presupposes understanding the meanings of 'that X : ∃!x(T₁(X, x))', 'that x : ∃!X(T₁(X, x))', 'that X : ∃!x(T₂(X, x))', and 'that x : ∃!X(T₂(X, x))'. But this implies that also the meanings of Fb and Ga are understood, namely as follows:

(25) µ(Fb) = µ_∈(µ(that X : ∃!x(T₁(X, x))), µ(that x : ∃!X(T₂(X, x)))),
µ(Ga) = µ_∈(µ(that X : ∃!x(T₂(X, x))), µ(that x : ∃!X(T₁(X, x)))).

In conclusion, although epistemic (or procedural) theory holism is true, it does not do any harm – it does not even destroy the central feature of systematicity.

References
Bencivenga, E. (1986). Free logics. In D. Gabbay & F. Guenthner (Eds.), Handbook of philosophical logic (Vol. III, pp. 373–426). Dordrecht: Reidel.
Carnap, R. (1963). Replies and systematic expositions. In P. A. Schilpp (Ed.), The philosophy of Rudolf Carnap (pp. 958–965). La Salle: Open Court.
Carnap, R. (1966). Philosophical foundations of physics. New York: Basic Books.
Feynman, R. (1973). Lectures on physics. Munich: Oldenburg Bilingua.
Fodor, J. (1987). Psychosemantics. Cambridge, MA: MIT Press.
Fodor, J. (1997). Connectionism and the problem of systematicity. Cognition, 62, 109–119.
Hodges, W. (2001). Formal features of compositionality. Journal of Logic, Language, and Information, 10, 7–28.
Ketland, J. (2004). Empirical adequacy and Ramsification. British Journal for the Philosophy of Science, 55, 287–300.
Lewis, D. (1970). How to define theoretical terms. Journal of Philosophy, 67, 427–446.
Papineau, D. (1996). Theory-dependent terms. Philosophy of Science, 63, 1–20.
Ramsey, F. P. (1931). Theories. In R. B. Braithwaite (Ed.), The foundations of mathematics and other logical essays. London: Routledge and Kegan Paul.
Russell, B. (1905). On denoting. Mind, 14, 479–493.
Sneed, J. D. (1971). The logical structure of mathematical physics. Dordrecht: Reidel.
Tuomela, R. (1973). Theoretical concepts. Berlin: Springer.
Right and Wrong Reasons for Compositionality
Markus Werning

In this paper I would like to cast a critical look at the potential reasons for compositionality. I will, in particular, evaluate whether and to what extent the most often cited reasons in favor of compositionality, viz. productivity, systematicity and inferentiality – each taken as a property of either language or cognition – may justly be regarded as justifications for compositionality. The results of this investigation will be largely negative: Given reasonable side-constraints, the appeal to productivity faces counterexamples of productive languages that cannot be evaluated compositionally. Systematicity has less to do with compositionality than with the existence of semantic categories. The belief that inferentiality is only warranted in compositional languages is a pious hope rather than a certainty. Alternative reasons will be explored at the end of the paper. Before I turn to these reasons, I will explicate the notion of compositionality and say something about its alleged vacuity.
1 The Notion of Compositionality
Although the idea of compositionality is perhaps much older, the locus classicus is Frege's posthumously published manuscript Logic in Mathematics:

[...] thoughts have parts out of which they are built up. And these parts, these building blocks, correspond to groups of sounds, out of which the sentence expressing the thought is built up, so that the construction of the sentence out of parts of a sentence corresponds to the construction of a thought out of parts of a thought. And as we take a thought to be the sense of a sentence, so we may call a part of a thought the sense of that part of the sentence which corresponds to it. (Frege, 1914/1979, p. 225)

Frege's claim can be put in more general terms to make it acceptable even to someone who refuses to adopt thoughts and senses into his universe of discourse. To do so, we must distinguish three aspects of Frege's statement.
First, he claims that there is a part-whole relation between sentences and less complex items of language ('groups of sounds'). Cases of ambiguous expressions call for a distinction between mereological and syntactic parts (or constituents). The mereological relation is defined as follows:

Definition 1 (Mereological constituency). A (spoken or written) utterance s is called a mereological part (or constituent) of an utterance t if and only if for any physical token of t in some region of space in some interval of time, s is physically tokened in the same region and the same interval.

Mereological constituency, thus, is a relation of spatio-temporal co-occurrence. If we, however, contented ourselves with mereological constituency, we would be likely to run into a problem with the second aspect of Frege's statement: Sentences express thoughts, where for Frege thoughts are nothing but the meanings of sentences. Mereological constituency is a relation apparently too weak to cope with the difficulty that some expressions are either lexically or grammatically ambiguous and need disambiguation in order to be related to their meanings by a function rather than by a many-many relation. The common way to achieve disambiguation is to construe expressions as (abstract) terms combined from syntactic parts and not as (material) utterances combined from mereological parts:

Definition 2 (Syntactic constituency). A term s of a language L is called a syntactic part (or constituent) of a term t of L if and only if a) there is a partial function α from the n-th Cartesian product of the set of terms of L into the set of terms of L such that t is a value of the function α with s as one of its arguments, and b) there is a syntactic rule of L according to which α(s₁, …, sₙ) is a well-formed term of L if α is defined for (s₁, …, sₙ) and if s₁, …, sₙ are well-formed terms of L.

In short, a term and any of its syntactic parts stand in the relation of value and argument of a syntactic operation. To guarantee unique reference to terms, they should be identifiable by their syntactic parts and the way they have been combined therefrom. This is expressed by the property of unique term identification (cf. Hendriks, 2001):

Definition 3 (Unique term identification). The terms of a language are called uniquely identifiable just in case, for any terms t₁, …, tₙ, t′₁, …, t′ₙ and syntactic operations σ, σ′ of the language, the following conditional holds:
σ(t₁, …, tₙ) = σ′(t′₁, …, t′ₙ) ⇒ σ = σ′ ∧ t₁ = t′₁ ∧ … ∧ tₙ = t′ₙ.
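A toy encoding may make Definition 3 vivid (my sketch, not Werning's; the tuple representation and operation names are assumptions): if terms are built as tagged tuples that literally contain their operation and their immediate syntactic parts, unique term identification holds by construction.

```python
# Minimal sketch: terms as tagged tuples satisfy unique term identification.
# Operation names and the example sentence are illustrative.

def apply_operation(sigma, *parts):
    """Build the complex term sigma(t_1, ..., t_n) as a tagged tuple."""
    return (sigma, *parts)

t1 = apply_operation('S', apply_operation('NP', 'Peter'),
                     apply_operation('VP', 'is', 'tall'))
t2 = apply_operation('S', apply_operation('NP', 'Peter'),
                     apply_operation('VP', 'is', 'tall'))

# Tuple equality entails equality of the operation and of all arguments,
# which is exactly the conditional of Definition 3:
assert t1 == t2
assert t1[0] == t2[0] and t1[1:] == t2[1:]
```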
To link terms to utterances, it is common to introduce a surface function for a language, i.e., a surjective function that maps the set of terms onto the set of (types of) utterances. The third aspect Frege maintains is that the part-whole relation in the linguistic realm, which I will now identify with the relation of syntactic constituency, corresponds to some part-whole relation in the realm of meaning. The sort of correspondence that fits best is that of a homomorphism. This analysis of Frege's statement leads us to the modern (and precise) notion of semantic compositionality as it has been successively developed by Montague (1970/1974), Janssen (1986), Partee, ter Meulen and Wall (1990), and Hodges (2001).
We define the grammar G of a language L as a pair G = ⟨T, Σ⟩, where T is the set of terms of L and Σ is the list of basic syntactic operations α₁, …, α_j of L. The set T is the closure of a set of primitive terms with regard to recursive application of the syntactic operations. The set of atomic terms is uniquely determined by the grammar as the set of terms that are not in the range of any basic syntactic operation. For technical reasons, we allow terms to have variables ξ, ξ′, ξ₁, etc. as syntactic parts. The set of grammatical terms GT(G) is the set of terms that do not contain any variables. We understand a meaning function µ of a language to be a function that maps a subset of the language's set of grammatical terms to their µ-meanings. A grammatical term of the language is called µ-meaningful if the term is in the domain of the meaning function µ. Having introduced all these notions, we can now define the notion of a compositional meaning function:

Definition 4 (Compositional meaning function). Let µ be a meaning function for a language with grammar G, and suppose that every syntactic part of a µ-meaningful term is µ-meaningful. Then µ is called compositional if and only if, for every syntactic operation α of G, there is a function µ_α such that for every non-atomic µ-meaningful term α(t₁, …, tₙ) the following equation holds:
µ(α(t₁, …, tₙ)) = µ_α(µ(t₁), …, µ(tₙ)).

A language is called compositional just in case it has a total compositional meaning function. A language, it follows, is compositional just in case the algebra of its grammatical terms ⟨GT(G), {α₁, …, α_j}⟩ is homomorphic to a semantic algebra ⟨µ[GT(G)], {µ_α₁, …, µ_α_j}⟩.
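Definition 4 can be rendered in a few lines of code (my illustration; the toy grammar, the stipulated extensions, and all names are assumptions): compositionality amounts to each syntactic operation having a semantic counterpart that makes the displayed equation hold.

```python
# Minimal sketch of a compositional meaning function for a toy grammar with
# a single syntactic operation 'pred' (predication). Extensions are stipulated.

mu_atomic = {
    'Peter': 'peter',
    'Mary': 'mary',
    'tall': frozenset({'peter'}),
}

def pred(name_term, predicate_term):
    """Syntactic operation: combine a name and a predicate into a sentence term."""
    return ('pred', name_term, predicate_term)

def mu_pred(individual, extension):
    """Semantic counterpart mu_alpha of the syntactic operation 'pred'."""
    return individual in extension

def mu(term):
    """Meaning function obeying the homomorphism equation of Definition 4."""
    if isinstance(term, str):
        return mu_atomic[term]
    _, name_term, predicate_term = term
    return mu_pred(mu(name_term), mu(predicate_term))

sentence = pred('Peter', 'tall')
assert mu(sentence) == mu_pred(mu('Peter'), mu('tall'))  # compositionality
print(mu(sentence), mu(pred('Mary', 'tall')))            # True False
```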
2 The Alleged Vacuity of Compositionality
The condition of compositionality can fairly easily be trivialized in various ways. Van Benthem was the first to raise this issue:

The general outcome may be stated roughly as 'anything goes' – even though adherence to the principle [of compositionality] often makes for elegance and uniformity of presentation. [...] we are entitled to conclude that by itself, compositionality provides no significant constraint upon semantic theory. (van Benthem, 1984, p. 57)

First, for every syntax whatsoever one can take the identity mapping as a compositional meaning function. Every syntax may serve as a semantics for itself. For in that case we have an isomorphism between syntax and semantics, and consequently the homomorphism required by the principle of compositionality is warranted. The price, though, is a certain form of semantic hyper-distinctness: meanings are as fine-grained as expressions because the meaning function is injective. In languages with a hyper-distinct meaning function there are no synonymous expressions at all. Not even directly logically equivalent sentences would be synonymous. To avoid hyper-distinctness, one may supplement the principle of compositionality by a requirement of non-hyper-distinctness, defined as follows:

Definition 5 (Non-hyper-distinctness). Given a language L with the set of grammatical terms GT(L), a meaning function µ with domain GT(L) is called non-hyper-distinct if there are grammatical terms s, t ∈ GT(L) such that s ≠ t and µ(s) = µ(t).

Second, if one does not in some way restrict the surface function, which maps terms to utterances, the syntax algebra one chooses as underlying a language is virtually free. The danger of vacuity with regard to the principle of compositionality of meaning is a side product of the dissolution of ambiguities. After we have disambiguated expressions, our semantic theory, which initially only had to deal with material utterances and meanings, has been augmented by a realm of terms: the syntax algebra. In contrast to the other two realms, the latter figures as a black box in our theory: We can take for granted that the structure and elements of the realm of utterances are directly accessible through observation. Let's also grant for the moment that we, by intuitive judgements on analyticity, synonymy and other semantic issues, have some, though still meager, access to the structure and elements of the semantics. Terms, though, are nothing but unobservable posits.
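The first trivialization – syntax serving as its own semantics – can be replayed in a few lines on a toy term algebra like the one sketched above (again my own sketch, hedged): the identity mapping is compositional for any syntax whatsoever, but it is injective and hence hyper-distinct.

```python
# Minimal sketch of van Benthem's vacuity point: the identity mapping is a
# compositional meaning function for any term algebra.

def apply_operation(sigma, *parts):
    return (sigma, *parts)

def mu_identity(term):
    """Hyper-distinct meaning function: every term 'means' itself."""
    return term

def mu_sigma(sigma):
    """Semantic counterpart of sigma: simply rebuild the term from the parts."""
    return lambda *meanings: (sigma, *meanings)

term = apply_operation('pred', 'Peter', 'tall')
assert mu_identity(term) == mu_sigma('pred')(mu_identity('Peter'),
                                             mu_identity('tall'))
# The homomorphism condition holds, yet no two distinct terms are ever
# synonymous: the meaning function is injective (hyper-distinct).
```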
To nevertheless explore the syntax of a language, constraints are required that sufficiently reduce the degrees of freedom. Compositionality in itself is too weak a constraint because it only links the realm of terms to the realm of meanings, but leaves the relation between utterances and terms unrestrained.
3 Mereological Surface and Unique Readability
In the attempt to curb the arbitrariness of the surface function, the practice of calling the arguments of syntactic operations the parts of the operations' outcome may easily lead to confusion. For calling an argument of a syntactic operation – one that underlies a certain expression and is responsible for its syntactic structure – a part of the expression might suggest that there is a determinate relation between terms, regarded as elements of the syntax, and expressions, regarded as material utterances. It might arouse the illusion that the formal principle of compositionality in its own light would allow only those terms that stand in a part-whole relation to an utterance to contribute their semantic values to that of the expression. If so, the introduction of hidden terms in the syntactic structure of the sentence would be prevented. This sleight of hand lets the formal principle of compositionality appear much stronger than it really is. For calling a term part of an utterance simply is a category mistake. There is no way to decide whether the term ⌜cookₜᵣ⌝ or ⌜cookᵢₙₜᵣ⌝ is a part of the utterance 'I want my cat to cook'.¹
1 I am using inverted commas to denote material utterances, corner quotes for terms and square brackets for concepts.
In fact, the principle of compositionality does not exclude the introduction of arbitrary terms in the syntactic analysis of a complex expression for whatsoever reason. It hands out a carte blanche to the syntactic analysis of language. Moreover, compositionality downright invites us to postulate hidden terms if other means of achieving compositionality, like the postulation of homonymy or the differentiation of syntactic structure, seem inappropriate. The idea of a true mereological part-whole relation between a complex expression and its syntactic parts really makes use of a constraint on syntax that is logically independent of the principle of compositionality. It will here be called the mereological surface property. It is not included in the formal notion of compositionality as it was first presented by Montague (1974), but it had already been anticipated in the Fregean picture as cited above:
Definition 6 (Mereological surface property). Let f_S : T → U be a surface function for the syntax algebra S = ⟨T; {σ₁, …, σₙ}⟩ and the set of (types of) utterances U. Then S is said to have a mereological surface if and only if it is true that, for every i = 1, …, n, if f_S(σᵢ(t₁, …, t_k, …, t_{jᵢ})) is well defined and occurs in some region of space at some interval of time, then f_S(t_k) also occurs in that region of space at that interval of time, for each k = 1, …, jᵢ.

Figure 1: The Montagovian picture of semantics. The set of utterances or materially understood expressions U is distinguished from the set of terms T. Utterances are the surfaces of terms in the sense that T is surjectively mapped onto U by a surface function f_S. Terms are mapped to meanings in the set M by a compositional meaning function µ.

Figure 1 illustrates the picture of a theory of semantics one attains if one differentiates between terms and utterances. One may justly call it the Montagovian picture because Montague already made this difference and at the same time advocated the principle of compositionality. The Montagovian picture collapses into the original Frege-picture again if we require the syntax of a language to be uniquely readable from the utterances of the language:
Definition 7 (Unique readability). Let f_S : T → U be a surface function with the set of terms T and the set of materially individuated utterance (types) U of the language. Then the syntax S = ⟨T; {σ₁, …, σₙ}⟩ is uniquely readable from U if and only if f_S is injective.

The reason for the collapse is obvious: If a syntax algebra S = ⟨T; {σ₁, …, σₙ}⟩ with a set of terms as carrier is uniquely readable from a set of utterances U, then the surface function f_S is bijective – injectivity comes from the definition of unique readability, surjectivity from the definition of a surface function. In this case, the syntax S is isomorphic to an algebra 𝒰 with the set of utterances as carrier, viz.

𝒰 = ⟨U; {f_S ∘ σ₁ ∘ f_S⁻¹, …, f_S ∘ σₙ ∘ f_S⁻¹}⟩.

The isomorphism makes sure that any homomorphism between the algebra S of terms and an algebra ℳ = ⟨M; {µ₁, …, µₙ}⟩ of meanings transfers to the new algebra of utterances. In other words, the compositionality of the syntax S with respect to the semantic algebra ℳ implies that the algebra of utterances 𝒰, in its own right, is compositional with respect to the algebra of meanings. The distinction between terms and utterances becomes superfluous. Mereological structure within utterances becomes syntactic structure. See Figure 2 for an illustration.
Given the numerous lexical and syntactic ambiguities of natural languages, most linguists would nowadays probably reject unique readability for natural languages. A more controversial issue, however, is whether natural languages are compositional if the mereological surface property is assumed. As Lewis (1986), Partee (1984) and Braisby (1998) point out, the question of whether all terms have a mereological surface becomes pressing, e.g., in the case of so-called complex nominals. These include nominal compounds like 'color television' and noun phrases with non-predicating adjectives like 'musical criticism'. The meaning of 'color television' (television showing color), for example, is thought to be not predictable from the meanings of its mereological parts.
Figure 2: The collapse of the Montagovian picture into the Fregean picture. If it is assumed that the syntax is uniquely readable from the utterances of a language, the Montagovian picture of Figure 1 collapses into the Fregean picture, in which utterances themselves can be compositionally evaluated by a meaning function µ* = µ ∘ f_S⁻¹.

This becomes plausible if one contrasts it with the related examples 'color palette' (palette of color), 'color consultant' (consultant for color), 'pocket television' (television that fits in pockets) and 'oak television' (television encased in oak). In all cases the meaning of the complex expression seems to involve some unexpressed – or mereologically surfaceless – relation concept. Something similar happens with regard to the adjective 'musical' in noun phrases. A musical clock is a clock that produces music; a musical comedy is a comedy that contains music; and musical criticism is criticism of music. One way to rescue compositionality in these cases is to postulate a relational term in the syntactic structure of the expression that is either surfaceless or whose surface fails to be a mereological part of the complex expression. This would, however, clearly constitute a violation of the mereological surface property. But if one were to give up the requirement that terms have a mereological surface, how would one then constrain the syntax with regard to the materially individuated utterances? These considerations show that if one approaches the issue of the compositionality of meaning from an empirical point of view, i.e., in the face of concrete linguistic examples, one ends up in a dilemma: Either one tries to avoid van Benthem's vacuity objection by holding on to the mereological surface property,
property, in which case any compositional analysis of compound nouns, certain adjective-noun combinations and many other cases seems defective. Or one drops the requirement of a mereological surface for every term, in which case compositional analyses of those cases are possible, but also trivial.
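Before turning to content, the collapse construction of definition 7 can be made concrete in code. The following Python sketch is purely illustrative – the toy grammar, the function names and the data are my own and not part of the formal apparatus – and shows how, under unique readability, a syntactic operation σ on terms induces the operation f_S ∘ σ ∘ f_S⁻¹ directly on utterances:

```python
# Toy illustration of the collapse in definition 7 (my own example): if the
# surface function f_S is injective, it is a bijection onto the utterances,
# and every syntactic operation sigma on terms induces the transferred
# operation f_S . sigma . f_S^-1 on utterances.

def f_S(term):
    """Surface function: map a term onto its materially individuated utterance."""
    if term[0] == "np_vp":
        return f_S(term[1]) + " " + f_S(term[2])
    return term[0]

def sigma_np_vp(t1, t2):
    """A syntactic operation on terms: combine a noun phrase and a verb phrase."""
    return ("np_vp", t1, t2)

# Terms are nested tuples; utterances are strings.
terms = [("dog",), ("barks",), sigma_np_vp(("dog",), ("barks",))]

# Unique readability: f_S is injective, so it can be inverted on its image.
f_S_inv = {f_S(t): t for t in terms}

def sigma_on_utterances(u1, u2):
    """The transferred operation f_S . sigma . f_S^-1, acting on utterances."""
    return f_S(sigma_np_vp(f_S_inv[u1], f_S_inv[u2]))

print(sigma_on_utterances("dog", "barks"))  # -> 'dog barks'
```

Composing a term-level meaning function µ with f_S⁻¹ would then yield the direct evaluation µ* of utterances shown in Figure 2.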
4 Asymmetries Between Meaning and Content
The definition of compositionality can easily be applied to mental concepts with contents as their semantic values. However, the problems related to the mereological surface property and to unique readability mark two important asymmetries between debates about the compositionality of meaning (= the semantic value of expressions) and the compositionality of content (= the semantic value of mental concepts). Regarding the first problem, the debate on whether concepts have to contain their syntactic parts as mereological parts follows considerations completely different from the debate over the mereological surface of terms and has gained wide attention in controversies between classicism and connectionism (Smolensky, 1991/1995; Fodor, 1997). I have discussed those issues elsewhere (Werning, 2003, 2005). As for the second problem, if we are to validate the principle of compositionality of content in the realm of concepts, we can outright assume that the arguments of our content function are uniquely readable from the structure of concepts and thoughts. For, as Pinker makes very explicit: '[...] thoughts, virtually by definition, cannot be ambiguous' (Pinker, 1997, p. 297). Ambiguity in natural languages is rooted in the fact that one expression expresses two different concepts or thoughts. Concepts, however, just are representations of external contents. One cannot represent two different contents by the same concept, because a concept must nomologically co-vary with its content in order to have a content.
5 Productivity
By far the most frequently used justification for compositionality in language and cognition is that language and cognition are productive. Fodor (1998) summarizes the productivity argument for compositionality in the following words:

There are infinitely many concepts that a person can entertain. (Mutatis mutandis in the case of natural language: there are infinitely many expressions of L that an L-speaker can understand.) Since people's representational capacities are surely finite, this infinity of concepts must
itself be finitely representable. In the present case, the demand for finite representation is met if (and as far as anyone knows, only if) all concepts are individuated by their syntax and their contents, and the syntax and contents of each complex concept is finitely reducible to the syntax and contents of its (primitive) constituents. (Fodor, 1998, p. 95)

Fodor then concludes that concepts (mutatis mutandis: expressions) must compose and takes this to be the following claim:

[...] the claim that concepts compose is the claim that the syntax and the content of a complex concept is normally determined by the syntax and the content of its constituents. ('Normally' means something like: with not more than finitely many exceptions. 'Idiomatic' concepts are allowed, but they mustn't be productive.) (Fodor, 1998, p. 94)

Fodor's caveat regarding idiomatic concepts and, mutatis mutandis, idiomatic expressions has to do with the fact that there certainly are idiomatic expressions in language and that there may be idiomatic concepts in thought. Idiomatic expressions and concepts, however, are typically regarded as exceptions to compositionality. For their meanings (respectively, contents) are commonly regarded as not being a function of the meanings (contents) of their syntactic parts. The meaning of 'red herring' is not derivable (and hence not predictable) from the meanings of 'red' and 'herring'.² A similar violation of compositionality might occur with regard to the concept [red herring], although this is less obvious because it is not clear whether the concepts [red] and [herring] are syntactic constituents of the former.

Now, is Fodor's argument in the first quote really an argument for the claim in the second quote? In the first quote Fodor states that language and cognition are productive. What Fodor says about productivity might be captured by something like the following definition:
² I do not intend to make any substantial claims about idioms here, but I should mention that Westerståhl (2002), objecting to the received view reflected by Nunberg, Sag and Wasow (1994), according to which some idioms violate semantic compositionality, argues that idioms can always be embedded in compositional languages. He proposes three ways of doing so: (i) extend the set of atomic expressions by a holophrastic reading of the idiom, (ii) extend the list of syntactic operations so that the literal and the idiomatic reading of the idiom turn out to be outcomes of different syntactic operations, or (iii) take the syntactic parts of the idiom as homonyms of their occurrences in its literal reading and add them to the set of atomic expressions. Westerståhl's solution, however, strikes me as somewhat artificial. In our context, though, not much depends on the question whether idioms really are exceptions to compositionality or not.
Definition 8 (Productivity). A language (mutatis mutandis: a conceptual structure) is called productive just in case the following three conditions hold:

a) The syntax of the language (of the conceptual structure) comprises no more than finitely many primitive terms (concepts).

b) The syntax of the language (of the conceptual structure) contains syntactic operations such that potentially infinitely many terms (concepts) can be computationally generated.

c) The meaning (content) function of the language (of the conceptual structure) is computable given the meanings (contents) of the primitive terms (concepts) and the syntax of the language (of the conceptual structure).

Although there are open questions as to whether finite subjects really have the potential to generate infinitely many, or at least potentially infinitely many, expressions or concepts, one might concede that language and cognition indeed are productive in this sense. In my definition, the sort of finite reducibility of syntax and semantic value (meaning, respectively content) that Fodor has in mind is accounted for by the computability conditions (b) and (c). Fodor's notion of compositionality as stated in the second quote is the same as ours.³

The question with regard to the validity of Fodor's argument now is whether the productivity of a language or conceptual structure in the sense of definition 8 implies that the language or conceptual structure is compositional in the sense of definition 4. The answer is negative: As the following argument shows, languages with a syntactic rule of holophrastic quotation and a non-hyper-distinct meaning function are productive, but not compositional. Productivity does not imply compositionality.

Assume the grammar of a language with the set of expressions T and the meaning function µ contains the following syntactic operation of quotation:

q: T → T, s ↦ 's'   such that   µ(q(s)) = s.

The inclusion of this operation in a language with finitely many primitive expressions warrants that the language is productive, because quotation can be iterated and the meaning function is computable. This account of quotation might
³ Although the formulation might allow for various interpretations, most, if not all, compositionality arguments Fodor has given over time undoubtedly assume a notion of compositionality in the sense of a homomorphism from syntax to semantics.
be called holophrastic quotation because it takes well-formed phrases as unanalyzed wholes and sets them in quotation marks. The meaning of a quotation is the quoted expression. It can now be shown that this account of quotation violates compositionality, provided the language under consideration contains synonyms and thus abides by the – as I have argued above – virtually indispensable requirement of non-hyper-distinctness.

Assume that the sentences ⌜Lou and Lee are brothers⌝ and ⌜Lee and Lou are brothers⌝ are synonymous in the language – if you don't agree that the two sentences are synonymous, you can choose any other example of synonymous terms. If we stick to the convention of using corner quotes as our meta-linguistic quotation marks, we can express the synonymy as follows:

µ(⌜Lou and Lee are brothers⌝) = µ(⌜Lee and Lou are brothers⌝). (1)

Although the two expressions are synonymous, they are not identical:

⌜Lou and Lee are brothers⌝ ≠ ⌜Lee and Lou are brothers⌝. (2)

From our definition of the syntactic operation of quotation q we derive the following:

µ(⌜'Lou and Lee are brothers'⌝) = µ(q(⌜Lou and Lee are brothers⌝)) = ⌜Lou and Lee are brothers⌝. (3)

µ(⌜'Lee and Lou are brothers'⌝) = µ(q(⌜Lee and Lou are brothers⌝)) = ⌜Lee and Lou are brothers⌝. (4)

From (2), (3), and (4) we may infer:

µ(q(⌜Lou and Lee are brothers⌝)) ≠ µ(q(⌜Lee and Lou are brothers⌝)). (5)

If we furthermore assume compositionality (see definition 4), there should be a semantic counterpart function µ_q for the syntactic operation q such that:

µ(q(⌜Lou and Lee are brothers⌝)) = µ_q(µ(⌜Lou and Lee are brothers⌝)). (6)

Substitution of identicals according to (1) yields:

µ(q(⌜Lou and Lee are brothers⌝)) = µ_q(µ(⌜Lee and Lou are brothers⌝)). (7)

After another application of compositionality we get:

µ(q(⌜Lou and Lee are brothers⌝)) = µ(q(⌜Lee and Lou are brothers⌝)). (8)

This contradicts (5). The hypothetical assumption that the language was compositional must be rejected. We have thus given a counterexample to Fodor's – and not only Fodor's – supposition that productivity presupposes compositionality. A language with holophrastic quotation is productive but, non-hyper-distinctness warranted, not compositional. Fodor's argument is not valid, and productivity is to be rejected as a reason for compositionality.
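The counterexample can be replayed mechanically. The following Python sketch is my own illustration – the sentences, the meaning assignments and the names are stipulated – and shows that no semantic counterpart function µ_q can exist once synonymy and the clause µ(q(s)) = s are both in place:

```python
# Toy replay of the holophrastic-quotation counterexample (my illustration).
# Meanings are modelled as Python values; the two synonymous sentences get
# the *same* meaning object, so the language is not hyper-distinct.

s1 = "Lou and Lee are brothers"
s2 = "Lee and Lou are brothers"

BROTHERS = "proposition: Lou and Lee stand in the brother relation"
mu = {s1: BROTHERS, s2: BROTHERS}  # synonymy, as in equation (1)

def q(s):
    """Holophrastic quotation: take a sentence as an unanalyzed whole."""
    return "'" + s + "'"

# The quotation of a sentence means the quoted sentence itself: mu(q(s)) = s.
mu[q(s1)] = s1
mu[q(s2)] = s2

# Compositionality would require a function mu_q with
#     mu(q(s)) = mu_q(mu(s))  for every sentence s.
# But mu(s1) == mu(s2), while mu(q(s1)) != mu(q(s2)); a function applied to
# one and the same argument cannot yield two different values.
assert mu[s1] == mu[s2]          # premise (1): synonymy
assert mu[q(s1)] != mu[q(s2)]    # conclusion (5): distinct quotation meanings
print("No semantic counterpart function mu_q can exist for q.")
```

Since q can be iterated and mu stays computable, the toy language remains productive in the sense of definition 8 even though it is not compositional.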
6 A Compositional Analysis of Quotation
Holophrastic quotation is not the only analysis of quotation in natural language. From the non-compositionality of holophrastic quotation in non-hyper-distinct languages, we therefore cannot infer that natural language fails to conform to compositionality because it comprises some syntactic operation of quotation. The inference would only go through if holophrastic quotation were the only possible way to account for quotation in natural language. To show that quotation in natural language can also be analyzed in a compositional way, I will here introduce the method of phonological quotation, which allows us to refer to expressions of natural language by means of a description of their sub-symbolic phonological structure. Unlike holophrastic quotation, phonological quotation cannot be conceived of as a function from expressions – taken as unstructured wholes – to their quotations. An earlier, less explicit account of phonological quotation can be found in Davidson (1984), where it is called the spelling theory of quotation.

Let us assume that we are given a productive, compositional and non-hyper-distinct language L, which does not yet allow for quotation and which has the following syntax (I will continue to use the same letter for the language and its syntax):

L = ⟨T, Σ⟩.

For reasons of simplicity, I will furthermore assume that the language is uniquely readable (see definition 7) and that the mereological surface property (see definition 6) is satisfied. Under these assumptions we need not distinguish between terms and utterances and may even assume that the surface function is the identity mapping.⁴ Let us finally assume that each utterance (and hence each term) is a sequence of primitive phonological parts, i.e., phonemes, or a sequence of any other primitive sub-symbolic parts, e.g., Roman letters or Chinese symbols. The utterance 'dog', e.g., is a sequence of the plosive voiced dental consonant 'd', the closed-mid back vowel 'o' and the plosive voiced velar consonant 'g'. A sequence is here understood as nothing but a temporally or spatially ordered assembly of matter. Since the elements of T – due to our assumptions – can be identified with material utterances, and since T, in a productive language, can be generated from a finite set of atomic terms, we can be certain that there always is a finite set of sub-symbolic mereological parts such that each and every term of the language can be uniquely produced as a sequence thereof.

Having said all this, how can we now proceed to extend the language L by some method of reference to expressions of the language by means of
⁴ The assumptions are not essential for the argument, but only facilitate the notation.
expressions of the language? As we have seen, it is important that the compositionality of the language not get lost on the way. The method of phonological quotation promises to accomplish this in that it postulates an additional set of atomic terms P that comprises names for all sub-symbolic parts necessary to generate each and every term of L. In the case of a spoken natural language, this set will comprise names for all phonemes of the language. In the case of a written language, a set of names for letters will do the job. We will here assume that P is a representation of the English alphabet plus the empty space symbol – I am aware that this only amounts to a crude approximation of the elements of phonology:

P = {⌜'a'⌝, ⌜'b'⌝, ⌜'c'⌝, ..., ⌜'z'⌝, ⌜' '⌝}.

Notice that the symbols quote-a-unquote, quote-b-unquote, etc. are supposed to be syntactically primitive. The quotation marks aren't themselves terms of the language, and neither are the letters in between them.⁵

In phonological quotation one constructs a means of reference to expressions of the language by giving a definite description of the phonological or other sub-symbolic structure of those expressions by syntactic means within the language. In order to do so, we additionally need to introduce a new syntactic operation into L that, on the level of syntax, reflects the operation of sequencing on the level of semantics. This syntactic operation is the binary operation of concatenation σ⌢; let P̄ be the closure of P with respect to σ⌢. The operation of concatenation

σ⌢: P̄ × P̄ → P̄, (X, Y) ↦ X⌢Y

maps pairs (X, Y) of phonological descriptions of sequences onto phonological descriptions of larger sequences such that the sequence denoted by X is the first and the sequence denoted by Y the second (and last) part of the larger sequence. Notice that some, but usually not all, sequences so described are terms of the language. The following mappings are examples of the operation of concatenation in the English language with a meaning function µ – italics are used to signify sequences in the meta-language and meaning is identified with denotation:

σ⌢(⌜'d'⌝, ⌜'o'⌝) = ⌜'d'⌢'o'⌝, which denotes the sequence do. That is,

µ(σ⌢(⌜'d'⌝, ⌜'o'⌝)) = µ(⌜'d'⌢'o'⌝) = do.
⁵ The sole exceptions are the first and ninth letters of the alphabet, which correspond to an indefinite article and a pronoun, respectively, in English.
And

σ⌢(⌜'d'⌢'o'⌝, ⌜'g'⌝) = ⌜'d'⌢'o'⌢'g'⌝,

denoting the sequence dog. That is, µ(⌜'d'⌢'o'⌢'g'⌝) = dog, where ⌜dog⌝ (consisting of the sequence dog) itself is a term of the language and denotes the set of dogs. In the algebraic picture, the syntax of the extended language, which is capable of phonological quotation, now becomes:

L* = ⟨T*, Σ*⟩.

Here, the extended set of terms is

T* = T ∪ P̄,

and the extended set of syntactic operations amounts to

Σ* = Σ ∪ {σ⌢}.

What about compositionality in L*, now? Is the operation of concatenation semantically compositional? Does it have a semantic counterpart function that maps the meanings of the arguments of concatenation to the meanings of its values? – Yes, it does. The function µ⌢ just is such a semantic counterpart function. It maps a pair of sequences onto a larger sequence such that the pair of sequences makes up the first and second part of the larger sequence. Here is an example:

µ(σ⌢(⌜'d'⌢'o'⌝, ⌜'g'⌝)) = µ⌢(µ(⌜'d'⌢'o'⌝), µ(⌜'g'⌝)) (9)
= µ⌢(do, g) (10)
= dog. (11)
Equation (9) exemplifies the compositionality condition. Notice that both the noun ⌜dog⌝ (which is identical to an utterance consisting of the sequence dog) and its phonological description ⌜'d'⌢'o'⌢'g'⌝ are terms of L*. We may conclude that the existence of quotation in natural language is completely consistent with the claim that natural language is compositional. Quotation is no exception to compositionality, provided it is analyzed appropriately as phonological quotation in the sense of this section. This does not infringe on the previous result that languages with holophrastic quotation pose a counterexample to the implication from productivity to compositionality.
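The compositionality of concatenation can likewise be checked mechanically. In the following sketch – my own modelling, with phonological descriptions rendered as tuples of letter names and the denoted sequences as strings – the semantic counterpart function µ⌢ is simply string concatenation, and equation (9) holds by construction:

```python
# Minimal sketch of phonological quotation (the encoding is my own choice):
# a phonological description is a tuple of letter names; it denotes the
# character sequence it describes.

def sigma_concat(X, Y):
    """Syntactic concatenation of two phonological descriptions."""
    return X + Y  # tuple concatenation, e.g. ("d",) + ("o",) = ("d", "o")

def mu(description):
    """Meaning function: a description denotes the sequence it describes."""
    return "".join(description)

def mu_concat(x, y):
    """Semantic counterpart of sigma_concat: sequencing of sequences."""
    return x + y  # string concatenation

d_o = sigma_concat(("d",), ("o",))    # the description 'd'-concat-'o'
d_o_g = sigma_concat(d_o, ("g",))     # the description 'd'-concat-'o'-concat-'g'

# The compositionality condition of equation (9):
#     mu(sigma_concat(X, Y)) == mu_concat(mu(X), mu(Y))
assert mu(d_o_g) == mu_concat(mu(d_o), mu(("g",))) == "dog"
print(mu(d_o_g))  # -> 'dog', a sequence that is itself a term denoting dogs
```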
7 Systematicity
If productivity fails to provide a justification for compositionality, what about the often cited justification by the claim that language and cognition are systematic (Fodor & Pylyshyn, 1988)? The underlying observation is that intentional and linguistic capacities do not come isolated, but in groups of systematic variants (cf. McLaughlin, 1993). The capacity to imagine a red square in a green circle, e.g., is nomologically correlated with the capacity to imagine a red circle in a green square. Likewise, the capacity to understand the sentence 'The red square is in a green circle' is nomologically correlated with the capacity to understand the sentence 'The red circle is in a green square'. Many authors cite the systematic correlation of linguistic and mental capacities as a reason for semantic compositionality. Minds must have the capacity to compose contents and meanings, so it is argued; otherwise, they would not show a systematic correlation among intentional and linguistic capacities. If a mind is capable of certain intentional states in a certain intentional mode, it most probably is also capable of other intentional states with related contents in the same mode. Mutatis mutandis: if a mind is capable of understanding certain linguistic expressions, it most probably is also capable of understanding other expressions with related meanings.

Does systematicity really presuppose compositionality? Why isn't mere syntactic recombination sufficient for systematicity? The systematic correlation among both contents and meanings seems, indeed, to imply more than mere syntactic recombination on the level of natural language or conceptual structure. The capacity to think that a child with a red coat is distracted by an old herring is not correlated with the capacity to think that a child with an old coat is distracted by a red herring. The thoughts ought to be correlated, though, if the fact that one is a syntactic re-combination of the other were sufficient for systematic correlation. For, both thoughts are syntactically combined from exactly the same primitives by exactly the same operations. One may, however, well have the capacity to think of red coats and old herrings even though one lacks the capacity to think of red herrings. That the two thoughts fail to be correlated follows from the fact that the concept [red herring] is idiomatic and thereby violates semantic compositionality. Likewise, the violation of compositionality by idioms might be held responsible for the fact that the capacity to understand the sentence 'A child with a red coat is distracted by an old herring' fails to be systematically correlated with the capacity to understand the sentence 'A child with an old coat is distracted by a red herring'. From the apparent conditional

violation of compositionality ⇒ violation of systematicity
one may be inclined to infer by contraposition that systematicity presupposes semantic compositionality, both in the case of cognition and in the case of language.

This inference may be too quick, though. It has often been overlooked that the phenomenon of systematic correlation is relatively unstable. An often cited pair of systematically correlated sentences (mutatis mutandis: thoughts) is:

(1) Mary loves John.
(2) John loves Mary.

But consider in contrast the following pair:

(3) Mary loves ice cream.
(4) *Ice cream loves Mary.

While (3) is grammatical, (4) is not. This is so despite the fact that the apparent syntactic structure of both sentences is the same: we have a noun phrase followed by a verb phrase that takes a direct object. The reason for the violation of grammaticality by (4) seems to be that the verb ⌜loves⌝ does not usually tolerate an inanimate substance in the subject position. There are numerous other examples of systematically correlated pairs of sentences that, by replacement of some term, may be transformed into a pair that is not systematically correlated. Take for example the correlated pair:

(5) The cock pecks the hen.
(6) The hen pecks the cock.

The replacement of the noun ⌜hen⌝ by the noun ⌜corn⌝ leads to a pair whose grammaticality is not systematically correlated:

(7) The cock pecks the corn.
(8) *The corn pecks the cock.

Another pair of correlated sentences is:

(9) The boy is reaching for the girl.
(10) The girl is reaching for the boy.

Replacing ⌜girl⌝ by ⌜cookie⌝ destroys the correlation:

(11) The boy is reaching for the cookie.
(12) *The cookie is reaching for the boy.
Why is it that in some cases a systematic correlation holds, whereas in others it does not? A suitable answer, I would suggest, is that systematic correlation is warranted only if the permuted words belong to the same Bedeutungskategorie – semantic category, or simply category – as defined by Husserl (1970). Husserl observed that the words and phrases of a language can be organized into classes – the semantic categories – so that (i) for any two expressions of the same class, one expression can replace the other in any non-ambiguous meaningful context without making the context nonsensical, and (ii) for any two expressions of different classes the replacement of one expression by the other will make at least some non-ambiguous meaningful contexts nonsensical.⁶

If we apply the notion of a semantic category to the examples (1)–(4), we can say that ⌜John⌝ and ⌜ice cream⌝ must belong to different categories. For, a replacement of ⌜John⌝ by ⌜ice cream⌝ transforms the meaningful sentence (2) into the meaningless sequence of words (4). ⌜Mary⌝ and ⌜John⌝, in contrast, probably belong to the same category. They can be substituted for each other in (1), which leads to (2). (I cannot imagine a context in which the two proper names cannot be substituted for each other. But this is not certain, and it hence remains uncertain whether they really belong to the same category.) Analogous comments apply to the rest of the above examples: ⌜cock⌝ and ⌜hen⌝ probably are in the same category, whereas ⌜cock⌝ and ⌜corn⌝ aren't. Likewise, ⌜boy⌝ and ⌜girl⌝ belong to the same category, while ⌜boy⌝ and ⌜cookie⌝ don't.

In the above cases, taking into account the sameness of category is necessary to decide whether the grammaticality of two sentences systematically correlates or not. Sameness of category, however, also suffices to predict whether a systematic correlation holds. For, if two words or phrases occurring in the same sentence belong to the same category, it is always warranted that they can be exchanged for each other without affecting the meaningfulness of the
⁶ Husserl takes this classification among expressions to be the consequence of some a priori constitution in the realm of meaning. He postulates a law that governs

the formation of unitary meanings out of syntactic materials falling under definite categories having an a priori place in the realm of meanings, a formation according to syntactic forms which are likewise fixed a priori, and which can be readily seen to constitute a fixed system of forms. (Husserl, 1970, p. 513)

For Husserl, the reason why some expressions cannot be substituted for each other in every context without destroying the meaningfulness of the context lies in the fact that the meanings of such expressions belong to different categories. In a remark on Marty he claims that any 'grammatical division rests on an essential division in the field of meaning' (Husserl, 1970, p. 500n).
sentence. This way, a new pair of systematically correlated sentences can always be generated. Given all this, what role does compositionality play for systematicity? On the one hand, knowledge that a language is compositional is not sufficient to predict systematic correlations; one can't do without judgements about the sameness of categories. On the other hand, judgements about the sameness of categories by themselves suffice to predict systematic correlations. Isn't this indication enough that systematicity is not a good reason for compositionality?⁷
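The predictive role of categories can be put into a small sketch. The category assignments below are my own stipulations for examples (1)–(12), not a worked-out theory of selectional restrictions:

```python
# Toy sketch of the category-based account of systematicity (the category
# assignments are my own stipulations for examples (1)-(12) above).

CATEGORY = {
    "Mary": "animate", "John": "animate",
    "cock": "animate", "hen": "animate",
    "boy": "animate", "girl": "animate",
    "ice cream": "inanimate", "corn": "inanimate", "cookie": "inanimate",
}

def systematically_correlated(w1, w2):
    """Permuting w1 and w2 in a sentence is predicted to yield a
    systematically correlated pair iff both words share a category."""
    return CATEGORY[w1] == CATEGORY[w2]

print(systematically_correlated("Mary", "John"))       # True:  (1)/(2)
print(systematically_correlated("Mary", "ice cream"))  # False: (3)/(4)
print(systematically_correlated("hen", "cock"))        # True:  (5)/(6)
print(systematically_correlated("corn", "cock"))       # False: (7)/(8)
print(systematically_correlated("girl", "boy"))        # True:  (9)/(10)
print(systematically_correlated("cookie", "boy"))      # False: (11)/(12)
```

On this picture, nothing about the compositionality of the meaning function enters into the prediction; sameness of category does all the work.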
8 Inferentiality
For centuries it had been considered a mystery how any syntactically specified manipulations of internal symbols could preserve semantic properties like truth. It was, among others, Gödel's and Turing's achievement to show how this is possible. What you need is a language (syntax plus semantics) with a logical calculus that is sound, i.e., syntactic derivability must secure semantic validity. If soundness is warranted for the logic of concepts, cognition is possible. Otherwise, it would remain a mystery how internal manipulations of concepts could secure truth-preservation, which is the main goal of cognition. Now, violations of compositionality, at least in some cases, lead to violations of soundness, and some authors have alluded to soundness or – as they sometimes call it – inferentiality as a reason for compositionality (Fodor & Pylyshyn, 1988; McLaughlin, 1993). Assume your logic specifies a rule of inference that may be called adjective elimination:

This NOUN₁ is a ADJ NOUN₂
∴ This NOUN₁ is a NOUN₂.

In accordance with the rule of adjective elimination one may syntactically derive 'This fruit is a pear' from 'This fruit is a red pear'. This derivation is semantically valid: the truth of the premise guarantees the truth of the conclusion. However, if we choose a syntactic [ADJ NOUN] construction that violates compositionality because it is idiomatic, applications of the rule of adjective elimination will no longer be semantically valid. Take
⁷ For a more elaborate discussion of categories in the context of systematicity and compositionality, see Johnson (2004) and Werning (2004).
304
Markus Werning
for example:

Bush's speech is a red herring
∴ Bush's speech is a herring.

This is a syntactic derivation in accordance with the rule of adjective elimination. It fails, however, to be a semantically valid inference: Bush's speech is not a herring, even if it is a red herring. Since in this case we have derivability without validity, soundness is violated. The reason for the violation of soundness seems to be that the semantic value of the syntactic [ADJ NOUN] operation as applied to the pair (⌜red⌝, ⌜herring⌝) is not a function of the semantic values of ⌜red⌝ and ⌜herring⌝.

But can we generalize? Is the soundness of a language's logic always undermined if compositionality is violated as in the case of idioms? If we could generalize, we could conclude, by contraposition, that the soundness of a logic presupposes that the language be compositional. Since cognition would remain a mystery unless its conceptual structure were to warrant soundness – we might conclude – any conceptual structure should be compositional.

Notice that the mere presence of idioms does not seem to undermine soundness. Take a propositional calculus, for instance. The derivation

Bush's speech is a red herring and (but) Mars will be populated anyway
∴ Mars will be populated anyway

is obviously semantically valid. Apparently, only those rules of inference are critical that break up phrases which have been syntactically combined in a semantically non-compositional way. But even here we find counterexamples. Take again a language with an operation of holophrastic quotation as defined in section 5. We already know that such a language is not compositional, provided that it is not hyper-distinct. Now assume, furthermore, that our language contains a truth-predicate ⌜is true⌝, and consider the following rule of inference:

'SENTENCE' is true
∴ SENTENCE.

An application of this rule of inference, e.g., is the derivation:

'Snow is white' is true
∴ Snow is white.

First, the rule of inference does break up phrases which have been syntactically combined in a semantically non-compositional way, viz. holophrastic quotations. This example is thus different from the one above, which employs
the propositional calculus and does not break up the critical [ADJ NOUN] operation. The individual term ⌜'Snow is white'⌝ – corner quotes again for quotation marks in the meta-language – contains the sentence ⌜Snow is white⌝ as a proper syntactic part. The syntactic operation is the q of section 5. Second, the derivation also is semantically valid: due to the definition of the meaning function with µ(q(s)) = s, it is impossible for the conclusion to be false if the premise is true, provided that the truth-predicate is interpreted in a common way (according to the deflationary theory of truth, for example). It looks as if soundness or inferentiality is unimpeded even in cases where a rule of inference breaks up a non-compositional syntactic structure. This might not be the last word, I am ready to concede. As we know, the coincidence of a holophrastic rule of quotation with a truth-predicate easily leads to paradoxes (the liar paradox is such a case). This issue certainly deserves further investigation. But so far, inferentiality does not serve as a better reason for compositionality than productivity and systematicity do.
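The soundness test for adjective elimination can be made explicit in a toy extensional model; the extensions below are my own stipulations. The rule is sound for a given [ADJ NOUN] phrase just in case the extension of the phrase is a subset of the extension of the bare noun:

```python
# Toy check of the adjective-elimination rule (extensions are my own
# stipulations). "This N1 is a ADJ N2, therefore this N1 is a N2" is
# semantically valid iff everything in the extension of the [ADJ NOUN]
# phrase is also in the extension of the bare noun.

extension = {
    "pear": {"pear1", "pear2"},
    "red pear": {"pear1"},            # compositional: red pears are pears
    "herring": {"fish1", "fish2"},
    "red herring": {"bushs_speech"},  # idiomatic: not a set of herrings at all
}

def adjective_elimination_sound(adj_noun, noun):
    """Is every ADJ NOUN also a NOUN in this model?"""
    return extension[adj_noun] <= extension[noun]  # subset test

print(adjective_elimination_sound("red pear", "pear"))        # True
print(adjective_elimination_sound("red herring", "herring"))  # False
```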
9 Compositionality and the Principle of Interchangeability of Synonyms
I don't want to close this paper without giving at least a hint of where to look for a reason for compositionality. Where we even have a proof is the mutual entailment of the principle of compositionality and the principle of interchangeability of synonyms salva significatione. The latter principle says that the substitution of synonyms for expressions in any linguistic context leaves the meaning of the context unchanged. The principle can be regarded as the meaning (or intensional) counterpart of the principle of extensionality, also called the principle of interchangeability of co-extensionals salva veritate, which claims that the substitution of co-extensional expressions for each other leaves unchanged the truth value of the embedding linguistic context. While the principle of extensionality is violated in intensional contexts – contexts like 'It is necessary that ...' or 'S believes that ...' – the principle of interchangeability of synonyms salva significatione pertains even to those cases.⁸

The following theorem, which is due to Hodges (2001), proves the equivalence between the principle of compositionality and the principle of interchangeability salva significatione. Meaning functions are called substitutional salva significatione if they abide by the principle that the substitution of
⁸ There is some discussion of the scope of the principle of interchangeability salva significatione. Kripke (1979) tries to construct some counterexamples. I do, however, think that the principle can be defended. When doing so, one has to take care not to individuate meanings too finely. Otherwise one is in danger of jeopardizing non-hyper-distinctness.
synonyms for expressions in any linguistic context leaves the meaning of the context unchanged (we write p ≡_µ q if and only if µ(p) = µ(q)).

Theorem 1. Let µ be a meaning function for a language with grammar G, and suppose that every syntactic part of a µ-meaningful term is µ-meaningful. Then the following are equivalent:

a) µ is compositional.

b) µ is substitutional salva significatione, i.e., if s is a term and p_0, ..., p_{n-1}, q_0, ..., q_{n-1} are grammatical terms such that s(p_0, ..., p_{n-1} | ξ_0, ..., ξ_{n-1}) and s(q_0, ..., q_{n-1} | ξ_0, ..., ξ_{n-1}) are both µ-meaningful and, for all m < n, p_m ≡_µ q_m, then s(p_0, ..., p_{n-1} | ξ_0, ..., ξ_{n-1}) ≡_µ s(q_0, ..., q_{n-1} | ξ_0, ..., ξ_{n-1}).

Proof. (a) ⇒ (b). Assuming (a), we prove (b) by induction on the complexity of s. In case n = 0, s is a µ-meaningful term and the conclusion s ≡_µ s is trivial. We now consider the case where s is the term α(t_0, ..., t_{m-1}). In this case we get s(p_0, ..., p_{n-1} | ξ_0, ..., ξ_{n-1}) by substituting the terms p_i for ξ_i, with 0 ≤ i < n, in all syntactic parts t_0, ..., t_{m-1} of the term s. We proceed analogously with s(q_0, ..., q_{n-1} | ξ_0, ..., ξ_{n-1}) and thus have:

s(p_0, ..., p_{n-1} | ξ_0, ..., ξ_{n-1}) = α(t_0(p_0, ..., p_{n-1} | ξ_0, ..., ξ_{n-1}), ..., t_{m-1}(p_0, ..., p_{n-1} | ξ_0, ..., ξ_{n-1})), and

s(q_0, ..., q_{n-1} | ξ_0, ..., ξ_{n-1}) = α(t_0(q_0, ..., q_{n-1} | ξ_0, ..., ξ_{n-1}), ..., t_{m-1}(q_0, ..., q_{n-1} | ξ_0, ..., ξ_{n-1})).

Since s(p_0, ... | ξ_0, ...) and s(q_0, ... | ξ_0, ...) are assumed to be µ-meaningful, their syntactic parts t_i(p_0, ... | ξ_0, ...) and t_i(q_0, ... | ξ_0, ...), respectively, are also µ-meaningful. By the induction hypothesis we may therefore presume that

t_i(p_0, ... | ξ_0, ...) ≡_µ t_i(q_0, ... | ξ_0, ...).

According to (a), the µ-meanings of s(p_0, ... | ξ_0, ...) and s(q_0, ... | ξ_0, ...), respectively, are a function of the meanings of their syntactic parts. Thus, the identity of the µ-meanings of the parts of both terms implies the identity of the µ-meanings of the terms themselves.
(b) ⇒ (a). (a) follows at once from the special case of (b) where s has the form α(ξ_0, ..., ξ_{n-1}). For, in that case (b) just claims the functionality of the relation

µ_α = {⟨(µ(ξ_0), ..., µ(ξ_{n-1})), µ(α(ξ_0, ..., ξ_{n-1}))⟩ | ξ_0, ..., ξ_{n-1} ∈ GT(G)}.
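Direction (a) ⇒ (b) of the theorem can be illustrated on a toy compositional language; the example and its encoding are mine. Once µ is defined homomorphically, substituting synonymous parts cannot change the meaning of the whole:

```python
# Toy illustration of theorem 1, direction (a) => (b) (example is mine):
# in a compositional language, substitution of synonyms proceeds
# salva significatione.

# Atomic meanings; "Lou" and "Louis" are stipulated to be synonyms.
mu_atom = {"Lou": "lou", "Louis": "lou", "Lee": "lee"}

def mu(term):
    """Compositional meaning function: the meaning of a complex term is a
    function of the meanings of its parts (here, an unordered pairing)."""
    if isinstance(term, tuple) and term[0] == "brothers":
        return ("brothers", frozenset({mu(term[1]), mu(term[2])}))
    return mu_atom[term]

t1 = ("brothers", "Lou", "Lee")
t2 = ("brothers", "Louis", "Lee")   # t2 results from t1 by substituting a synonym

assert mu("Lou") == mu("Louis")     # the substituted parts are synonymous ...
assert mu(t1) == mu(t2)             # ... so the embedding terms are too
print(mu(t1))
```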
10 Conclusion
We have seen that the justification of a principle like compositionality, central as it is to the semantic analysis of language and to any theory of cognition, is anything but an easy task. The first obstacle was to avoid vacuity. In addition to compositionality, the postulation of two further constraints on semantics, non-hyper-distinctness and the mereological surface property, was required. Both constraints have severe side effects, though. The mereological surface property massively hinders the compositional analysis of language in the light of empirical data, for it forbids the introduction of hidden terms. The requirement of non-hyper-distinctness is responsible for the fact that productivity must no longer be regarded as a reason for compositionality: there are productive languages that turn out to be non-compositional if hyper-distinctness is not an option. Systematicity, too, fails to provide a justification for compositionality, for systematicity is solely a matter of membership in semantic categories. Inferentiality falls short of being a reason for compositionality because the soundness of a calculus apparently does not presuppose that the syntactic combinations it breaks up be semantically compositional. The only reason we found was the principle of interchangeability of synonyms. It is logically equivalent to compositionality, but doesn't this imply that any appeal to it is likely to be a petitio? The prospects of a justification of compositionality are not entirely bleak, but less than comfortable.

References

Braisby, N. (1998). Compositionality and the modelling of complex concepts. Minds and Machines, 8, 479–508.

Davidson, D. (1984). Quotation. In Truth and interpretation (pp. 79–92). Oxford: Clarendon Press.

Fodor, J. (1997). Connectionism and the problem of systematicity (continued): Why Smolensky's solution still doesn't work. Cognition, 62, 109–119.

Fodor, J. (1998). Concepts: Where cognitive science went wrong. New York, NY: Oxford University Press.
Fodor, J., & Pylyshyn, Z. (1988). Connectionism and cognitive architecture: A critical analysis. Cognition, 28, 3–71.

Frege, G. (1979). Logic in mathematics (P. Long & R. White, Trans.). In H. Hermes, F. Kambartel, & F. Kaulbach (Eds.), Gottlob Frege: Posthumous writings (pp. 203–50). Oxford: Basil Blackwell. (Original work published 1914)

Hendriks, H. (2001). Compositionality and model-theoretic interpretation. Journal of Logic, Language and Information, 10(1), 29–48.

Hodges, W. (2001). Formal features of compositionality. Journal of Logic, Language and Information, 10, 7–28.

Husserl, E. (1970). Logical investigations (Vol. II; J. N. Findlay, Trans.). London: Routledge & Kegan Paul.

Janssen, T. (1986). Foundations and applications of Montague grammar. Part 1: Philosophy, framework, computer science. Amsterdam: Centrum voor Wiskunde en Informatica.

Johnson, K. (2004). On the systematicity of language and thought. Journal of Philosophy, 101(3), 111–39.

Kripke, S. (1979). A puzzle about belief. In A. Margalit (Ed.), Meaning and use (pp. 239–283). Dordrecht: Reidel.

Lewis, D. (1986). On the plurality of worlds. Oxford: Blackwell.

McLaughlin, B. P. (1993). The connectionism/classicism battle to win souls. Philosophical Studies, 71, 163–190.

Montague, R. (1974). Universal grammar. In R. H. Thomason (Ed.), Formal philosophy: Selected papers of Richard Montague (pp. 222–46). New Haven: Yale University Press. (Reprinted from Theoria, 1970, 36, 373–98)

Nunberg, G., Sag, I., & Wasow, T. (1994). Idioms. Language, 70, 491–538.

Partee, B. (1984). Compositionality. In F. Landman & F. Veltman (Eds.), Varieties of formal semantics (pp. 281–311). Dordrecht: Foris Publications.

Partee, B., ter Meulen, A., & Wall, R. (1990). Mathematical methods in linguistics. Dordrecht: Kluwer Academic Publishers.

Pinker, S. (1997). How the mind works. New York: Norton.

Smolensky, P. (1995). Connectionism, constituency and the language of thought. In C. Macdonald & G. Macdonald (Eds.), Connectionism (pp. 164–198). Cambridge, MA: Blackwell. (Original work published 1991)
van Benthem, J. (1984). The logic of semantics. In F. Landman & F. Veltman (Eds.), Varieties of formal semantics (pp. 57–80). Dordrecht: Foris Publications.

Werning, M. (2003). Synchrony and composition: Toward a cognitive architecture between classicism and connectionism. In B. Löwe, W. Malzkorn, & T. Raesch (Eds.), Applications of mathematical logic in philosophy and linguistics (pp. 261–78). Dordrecht: Kluwer.

Werning, M. (2004). Compositionality, context, categories and the indeterminacy of translation. Erkenntnis, 60, 145–78.

Werning, M. (2005). The temporal dimension of thought: Cortical foundations of predicative representation. Synthese, 146(1/2), 203–24.

Westerståhl, D. (2002). On the compositionality of idioms. In D. Barker-Plummer, D. Beaver, J. van Benthem, & P. Scotto di Luzio (Eds.), Proceedings of LLC8. CSLI Publications.