English Pages 494 Year 2007
Ontolinguistics
Trends in Linguistics Studies and Monographs 176
Editors
Walter Bisang (main editor for this volume)
Hans Henrich Hock Werner Winter
Mouton de Gruyter Berlin · New York
Ontolinguistics How Ontological Status Shapes the Linguistic Coding of Concepts
edited by
Andrea C. Schalley Dietmar Zaefferer
Mouton de Gruyter (formerly Mouton, The Hague) is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.
Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.
Library of Congress Cataloging-in-Publication Data
Ontolinguistics : how ontological status shapes the linguistic coding of concepts / edited by Andrea C. Schalley, Dietmar Zaefferer.
p. cm. - (Trends in linguistics. Studies and monographs ; 176)
Includes bibliographical references and index.
ISBN 978-3-11-018997-1 (hardcover : alk. paper)
1. Linguistics. 2. Ontology. 3. Concepts. I. Schalley, Andrea C., 1972- II. Zaefferer, Dietmar, 1947-
P121.O58 2007
410-dc22 2006035513
ISBN 978-3-11-018997-1
ISSN 1861-4302
Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.d-nb.de.
© Copyright 2007 by Walter de Gruyter GmbH & Co. KG, D-10785 Berlin
All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording or any information storage and retrieval system, without permission in writing from the publisher.
Cover design: Christopher Schneider, Berlin.
Printed in Germany.
Acknowledgements
The plans for this volume go back to a workshop organized by the editors as part of the 25th Annual Meeting of the German Linguistic Society entitled Ontological Knowledge and Linguistic Coding, which took place from the 26th to the 28th February 2003 at Ludwig-Maximilians-University in Munich. Some of the contributions to that inspiring workshop found their way, partly in considerably revised form, into the volume, others have been written specifically for it, still others had already appeared in other places, and given that they fit especially well into the overall picture, they are reprinted here. First, we would like to thank Helen Wilson, Rights Manager of Elsevier Ltd., for granting permission to reprint the article ‘The emergence of a shared action ontology: Building blocks for a theory’ by Thomas Metzinger and Vittorio Gallese, which had first appeared in Consciousness and Cognition 12: 549–571, Copyright 2003. Then we gratefully acknowledge the permission (with pleasure) by Chief Editor Dah-an Ho to reprint Leonard Talmy’s paper ‘The representation of spatial structure in spoken and signed language: A neural model’, which had first appeared in Language and Linguistics 4 (2): 207–250 (April 2003). We are grateful for the invaluable help and support we received from Alexander Borkowski and Dorothy Meyer while preparing the camera-ready copy of the manuscript. We benefited also from the comments of an anonymous referee whose suggestions have been extensively implemented, especially in the first two chapters. Last but not least, we would like to thank for her unfailing and cheerful support Birgit Sievert of Mouton de Gruyter. Andrea Schalley and Dietmar Zaefferer
Contents

Acknowledgements

I Introduction
Ontolinguistics – An outline
Andrea C. Schalley and Dietmar Zaefferer
Ontologies across disciplines
Matthias Nickles, Adam Pease, Andrea C. Schalley, and Dietmar Zaefferer

II Foundations, general ontologies, and linguistic categories
The emergence of a shared action ontology: Building blocks for a theory
Thomas Metzinger and Vittorio Gallese
Formal representation of concepts: The Suggested Upper Merged Ontology and its use in linguistics
Adam Pease
Linguistic interaction and ontological mediation
John A. Bateman
Semantic primes and conceptual ontology
Cliff Goddard
Using ‘Ontolinguistics’ for language description
Scott Farrar
Language as mind sharing device: Mental and linguistic concepts in a general ontology of everyday life
Dietmar Zaefferer

III Concepts with closed-class coding
The representation of spatial structure in spoken and signed language: A neural model
Leonard Talmy
Postural categories and the classification of nominal concepts: A case study of Goemai
Birgit Hellwig
Spatial ‘on’ – ‘in’ categories and their prepositional codings across languages: Universal constraints on language specificity
Marija M. Brala
Semantic categorizations and encoding strategies
Stavros Skopeteas

IV Categories with open-class coding
Taxonomic and meronomic superordinates with nominal coding
Wiltrud Mihatsch
Motion events in concept hierarchies: Identity criteria and French examples
Achim Stein
On the ontological, conceptual, and grammatical foundations of verb classes
Martin Trautwein
The ontological loneliness of verb phrase idioms
Christiane Fellbaum
Relating ontological knowledge and internal structure of eventity concepts
Andrea C. Schalley

About the contributors
Index of names
Language index
Subject index
Part I: Introduction
Ontolinguistics – An outline
Andrea C. Schalley and Dietmar Zaefferer

Current progress in linguistic theorizing is more and more informed by crosslinguistic investigation. Comparison of languages relies crucially on those concepts which are essentially the same across human minds, cultures, and languages, and which therefore can be activated through the use of any human language. These instances of mental universals join other less common concepts to constitute a complex structure in our minds, a network of crossconnected conceptualizations of the phenomena that make up our world. Following more and more widespread usage we call such a system of conceptualizations an ontology, and we submit that the most reliable basis for any cross-linguistic research lies in the common core of the different individual human ontologies. This is the basic tenet of all approaches that can properly be called ontology-based linguistics or ontolinguistics for short.

While concept activations depend on episodic linguistic and non-linguistic stimuli and therefore are subject to permanent change, recorded in short-term memory, the conceptual system itself, after its development, differentiation, and stabilization in the ontogeny of each agent,1 is assumed to be relatively stable and stored in long-term memory. Therefore, the emphasis of ontolinguistic research is less on processing than on structure.

The initial idea behind the present volume is to further instigate progress in linguistics by asking a rather underexplored question: What is the relation between the ontologies in our minds and the languages we participate in? Obviously, the relation is not arbitrary.
To mention just two examples, across different types, languages tend to provide monomorphemic lexical codings for ontologically central concepts like SUN or OLD or SAY and closed-class codings (grammatical morphemes, functional elements) for more general structural concepts such as POSSESSION or REPETITION or PAST.2 The working assumption underlying this volume is that the options provided by human languages for expressing a given concept are constrained by the ontological status of this concept. Ontological status is defined as the position a concept holds within a given ontology; this position is determined by the ontological relations the concept entertains with other concepts in the same system.
Ontological relations are interconceptual relations and most of them belong to one of the following two groups (A, B, and C stand for the related concepts; ‘iff’ abbreviates ‘if and only if’). There are five taxonomic relations:

1. Two weak orderings (transitive, reflexive, antisymmetric relations):
1.1. Conceptual subordination: Concept A is c-subordinated to concept B iff every instance of A is also an instance of B, and
1.2. Conceptual superordination, its converse.
2. Three symmetric relations:
2.1. Conceptual equivalence: Concept A is c-equivalent to concept B iff every instance of A is also an instance of B and vice versa. (C-equivalence is just the intersection of c-subordination and c-superordination; c-equivalent concepts are co-extensional.)
2.2. Conceptual compatibility: Concept A is c-compatible with concept B iff it is conceivable that some entity instantiates both A and B, and
2.3. Conceptual incompatibility, its complement.
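The five taxonomic definitions can be tried out in a small sketch. It rests on a simplifying assumption that is not part of the text: each concept is identified with the set of its conceivable instances in a tiny, hand-made universe, so that "conceivable" reduces to "present in the toy universe". The entity names and the `CONCEPTS` table are hypothetical illustrations.

```python
# Toy model (an assumption, not the authors' formalization): a concept is
# represented by the set of conceivable entities that instantiate it.
CONCEPTS = {
    "HUMAN BODY PART": {"left_foot_1", "right_foot_1", "injured_left_foot_2", "hand_1"},
    "HUMAN FOOT": {"left_foot_1", "right_foot_1", "injured_left_foot_2"},
    "HUMAN LEFT OR RIGHT FOOT": {"left_foot_1", "right_foot_1", "injured_left_foot_2"},
    "HUMAN LEFT FOOT": {"left_foot_1", "injured_left_foot_2"},
    "INJURED": {"injured_left_foot_2"},
    "FIN": {"fin_1"},
}


def c_subordinated(a: str, b: str) -> bool:
    """Every instance of A is also an instance of B."""
    return CONCEPTS[a] <= CONCEPTS[b]


def c_superordinated(a: str, b: str) -> bool:
    """The converse of c-subordination."""
    return c_subordinated(b, a)


def c_equivalent(a: str, b: str) -> bool:
    """Intersection of c-subordination and c-superordination."""
    return c_subordinated(a, b) and c_superordinated(a, b)


def c_compatible(a: str, b: str) -> bool:
    """Some conceivable entity instantiates both A and B."""
    return bool(CONCEPTS[a] & CONCEPTS[b])


def c_incompatible(a: str, b: str) -> bool:
    """The complement of c-compatibility."""
    return not c_compatible(a, b)
```

In this reading, c-subordination is set inclusion, which makes its reflexivity and transitivity immediate; antisymmetry holds only up to c-equivalence, since two differently named concepts may be co-extensional.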
Meronomic relations are much more diversified,3 but they show similar structures. Depending on the domain, they involve different notions of inclusion: those which go down to minimal included entities or atoms, those that go up to maximal including entities or wholes, those that do both and those that do neither. The following sample definitions concern only noncumulative concepts, concepts where the inclusion relation has an upper bound (is u-bounded) and where the distinction between a complete and an incomplete instance makes sense. There are five meronomic relations of this u-bounded kind, or rather families of relations, depending on the kind of part-of relation chosen (with x being a variable for the kind of part-of relation):

3. Two families of strict orderings (transitive, irreflexive, asymmetric relations):
3.1. Meronomic x-subordination: Concept A is m-x-subordinated to concept B iff every complete instance of B properly x-includes an instance of A.
3.2. Meronomic x-superordination, its converse.
4. Three families of symmetric relations:
4.1. Meronomic x-cosubordination to C: Concepts A and B are m-x-cosubordinated to concept C iff every complete instance of C properly x-includes both an instance of A and an instance of B.
4.2. Meronomic x-compatibility under C: Concepts A and B are m-x-compatible under concept C iff it is conceivable that a complete instance of C properly x-includes both an instance of A and an instance of B.
4.3. Meronomic x-incompatibility under C, its complement.
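As a rough illustration, the meronomic definitions can be sketched for a single, fixed kind of part-of relation. Everything below is a hypothetical toy model: the entity names, the `DIRECT_PARTS` table, and the `COMPLETE` set are assumptions made for the sketch, and "conceivable" is again approximated by "present in the toy model".

```python
# Toy model (assumed, not from the volume): entities, the concept each one
# instantiates, which entities count as complete, and their direct parts.
CONCEPT_OF = {
    "body_1": "HUMAN BODY", "body_2": "HUMAN BODY",
    "head_1": "HUMAN HEAD", "head_2": "HUMAN HEAD",
    "foot_1": "HUMAN FOOT", "foot_2": "HUMAN FOOT",
    "big_toe_1": "HUMAN BIG TOE", "big_toe_2": "HUMAN BIG TOE",
    "sixth_toe_2": "HUMAN SIXTH TOE",
}
COMPLETE = {"body_1", "foot_1", "foot_2"}  # body_2 has lost its feet
DIRECT_PARTS = {
    "body_1": {"head_1", "foot_1", "foot_2"},
    "body_2": {"head_2"},
    "foot_1": {"big_toe_1"},
    "foot_2": {"big_toe_2", "sixth_toe_2"},
}


def proper_parts(entity):
    """All entities properly included in `entity` (transitive closure)."""
    found, stack = set(), list(DIRECT_PARTS.get(entity, ()))
    while stack:
        part = stack.pop()
        if part not in found:
            found.add(part)
            stack.extend(DIRECT_PARTS.get(part, ()))
    return found


def complete_instances(concept):
    return [e for e, c in CONCEPT_OF.items() if c == concept and e in COMPLETE]


def includes(whole, concept):
    """True if `whole` properly includes some instance of `concept`."""
    return any(CONCEPT_OF[p] == concept for p in proper_parts(whole))


def m_subordinated(a, b):
    wholes = complete_instances(b)
    # Modelling choice: require at least one complete instance of B,
    # so that the universal condition is never satisfied vacuously.
    return bool(wholes) and all(includes(w, a) for w in wholes)


def m_cosubordinated(a, b, c):
    wholes = complete_instances(c)
    return bool(wholes) and all(includes(w, a) and includes(w, b) for w in wholes)


def m_compatible(a, b, c):
    return any(includes(w, a) and includes(w, b) for w in complete_instances(c))


def m_incompatible(a, b, c):
    return not m_compatible(a, b, c)
```

Restricting the quantification to complete instances is what keeps `body_2`, which has lost its feet, from falsifying the m-subordination of HUMAN FOOT to HUMAN BODY. The polydactyl `foot_2` makes HUMAN BIG TOE and HUMAN SIXTH TOE m-compatible under HUMAN FOOT without making them m-cosubordinated to it.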
As an illustration of these ontological relations we will consider two groups of concepts, the first one having to do with the concept of that part of the normal human body that touches the ground when the body is standing and which can form different angles with the body part it immediately attaches to. The second group of concepts has to do with the concept of that kind of motion which instances of the first concept perform when they are used to move the corresponding body forward. We shall refer to these concepts using the small caps letter sequences HUMAN FOOT and HUMAN STEP FORWARD, respectively, because they can be activated in the minds of English speaking people with the help of the nominal group human foot and uses of the nominal group step forward involving a human performer, respectively (note that neither concept has a lexical coding in English). Here are examples of the ten (families of) ontological relations defined above:

(1) Conceptual subordination: HUMAN FOOT is c-subordinated to the concept HUMAN BODY PART because it is inconceivable that an instance of the former is not an instance of the latter, too, and HUMAN STEP FORWARD is c-subordinated to the concept HUMAN STEP for the same reason.

(2) Conceptual superordination: HUMAN FOOT is c-superordinated to the concept HUMAN LEFT FOOT because it is inconceivable that an instance of the latter is not an instance of the former, too, and HUMAN STEP FORWARD is c-superordinated to the concept HUMAN LEFT FOOT STEP FORWARD for the same reason.

(3) Conceptual equivalence: HUMAN FOOT is c-equivalent to the concept HUMAN LEFT OR RIGHT FOOT because it is inconceivable that an entity instantiates only one of these two concepts (humans have only two feet, one left and one right one), and HUMAN STEP FORWARD is c-equivalent to the concept HUMAN LEFT OR RIGHT FOOT STEP FORWARD for the same reason.

(4) Conceptual compatibility: HUMAN FOOT is c-compatible with the concept INJURED because it is conceivable that some entity instantiates both of these two concepts (there may be injured human feet), and HUMAN STEP FORWARD is c-compatible with the concept QUICK MOTION because humans may make quick steps forward.

(5) Conceptual incompatibility: HUMAN FOOT is c-incompatible with the concept FIN because it is inconceivable that some entity instantiates both of these two concepts, and HUMAN STEP FORWARD is c-incompatible with the concept NOD for the same reason (no instance of a human step can be an instance of a nod at the same time).

So much for the taxonomic and therefore purely conceptual ontological relations. Meronomic ontological relations are also conceptual, but in contrast to the purely conceptual taxonomic relations they are rooted in a relation between their instances, or rather a whole range of different relations, namely part-of or inclusion relations between their instances. A case of an autonomous, i.e. non-attached, part-of relation holds for instance between a left shoe and the corresponding complete pair of shoes, which is therefore said to a-include the left shoe (for a discussion of collections like clothes and their linguistic coding, cf. Mihatsch this vol.). By contrast, the following examples all involve a relation called i-inclusion which is defined as that relation that holds between two entities a and b if and only if b properly includes a and a is integrated with b, i.e., a is an integral part of b and hence not easily detachable from it.

(6) Meronomic i-subordination: HUMAN FOOT is m-i-subordinated to the concept HUMAN BODY because every complete instance of the latter properly i-includes an instance of the former. Similarly, HUMAN STEP FORWARD is m-i-subordinated to the concept HUMAN DOUBLESTEP FORWARD for the same reason.

(7) Meronomic i-superordination: HUMAN FOOT is m-i-superordinated to the concept HUMAN BIG TOE because every complete instance of the former properly i-includes an instance of the latter. Similarly, HUMAN STEP FORWARD is m-i-superordinated to the concept FINAL STAGE OF HUMAN STEP FORWARD for the same reason (a complete token of a human step forward includes its final stage, which is not easily detachable).

(8) Meronomic i-cosubordination to C: The concepts HUMAN FOOT and HUMAN HEAD are m-i-cosubordinated to the concept HUMAN BODY because every complete instance of the latter properly i-includes both an instance of the first and an instance of the second concept. Similarly, the concepts INITIAL STAGE OF HUMAN STEP FORWARD and FINAL STAGE OF HUMAN STEP FORWARD are m-i-cosubordinated to the concept HUMAN STEP FORWARD for the same reason.

(9) Meronomic i-compatibility under C: The concepts HUMAN BIG TOE and HUMAN SIXTH TOE are m-i-compatible under HUMAN FOOT because there are complete instances of the latter that properly i-include both an instance of the first and an instance of the second concept (a condition called polydactyly).
Similarly, the concepts INITIAL STAGE OF HUMAN STEP FORWARD and HESITATING STAGE OF HUMAN STEP FORWARD are m-i-compatible under the concept HUMAN STEP FORWARD for the same reason.

(10) Meronomic i-incompatibility under C: The concepts HUMAN BIG TOE and NUMERICAL DIGIT are m-i-incompatible under HUMAN FOOT because it is inconceivable that a complete instance of the latter properly i-includes both an instance of the first and an instance of the second concept (that would be a meronomic version of a category mistake). Similarly, the concepts INITIAL STAGE OF HUMAN STEP FORWARD and FINAL STAGE OF HUMAN STEP BACKWARD are m-i-incompatible under HUMAN STEP FORWARD for the same reason.

Illustrations of the meronomic relations based on a-inclusion could be given in the same way; a-inclusion is defined as that relation that holds between two entities a and b if and only if b properly includes a and a is autonomous within b, i.e., a is a part of b, but not necessarily attached to it: LEFT SHOE is m-a-subordinated to the concept PAIR OF SHOES, etc.

As indicated above, despite the structural similarities there are also major differences between taxonomic and meronomic relations. First, taxonomic relations exist only on the conceptual level and have no counterpart at the instance level. If we consider for example the right foot of Edward Teller (known as “the father of the hydrogen bomb”) at the instance level and compare its possible conceptualizations as TELLER’S RIGHT FOOT, RIGHT FOOT, and FOOT, respectively, this corresponds to different photographs of increasingly coarse grain of the same entity, but not to different entities. This is why the relation of conceptual subordination that holds among them has been called ‘purely conceptual’ above.

By contrast, if we consider in addition to Edward Teller’s right foot his right leg and his body and the concepts TELLER’S RIGHT FOOT, TELLER’S RIGHT LEG, and TELLER’S BODY, respectively, this gives rise to the conceptual relation between the concept of some entity and the concepts of other entities it is a part of, which is a special case of meronomic subordination. If we compare these concepts to different photographs, they are not photographs of the same entity, but of different entities which stand in a material relation which is not that of identity, and they may be of exactly the same grain (although they need not).

Second, by contrast with taxonomic relations, there are different meronomic relations depending on the (upper, lower, double or un-) boundedness
of the relevant inclusion relation and these in turn are parametric in the sense that the x in the expression ‘x-part’ has to be given a value, before the definitions can be made to work. For instance, the parts that are required for a functionally complete whole are fewer, in many cases, than those that are required for a culturally complete whole: no man needs a beard for survival, but a Hajj pilgrim is incomplete without one. And anybody who wears strong glasses knows how incomplete he is without them, so the integral-part-of relation considered above is different from the autonomous-part-of relation with respect to aggregates formed by humans and their equipment. The example of Teller’s right foot shows also the importance of relativizing the relation of meronomic subordination, where possible, to complete instances of the superordinated concept, because Teller lost his right foot in 1928 as a student at the University of Munich when he jumped from a moving streetcar. Still, his right foot is part of his right leg (conceived as a complete entity), and the loss simply entails that afterwards his right leg was not complete anymore.

The above relations would be incomplete in a crucial way without a further central relation which is presupposed by all of them, the relation of instantiating a concept or falling under it. The fundamental and in fact foundational part of this relation links non-conceptual, concrete, spatiotemporally located parts of the world with the corresponding concepts in an ontology. Linking extra-conceptual and conceptual entities, this part of the instantiation relation is non-ontological in nature; it is the one that ‘grounds’ ontologies. The remainder of the instantiation relation connects concepts with meta-concepts, or, to put it differently, lower-level concepts with the higher-level concepts they fall under (such as the concept SUN, which instantiates or falls under the concept CONCEPT).
This other part of the instantiation relation, being an interconceptual relation, is an ontological relation. Even though they are seldom explicitly mentioned, the definitions of ontological relations given above play a crucial role in the present volume. The relations themselves are important because they stabilize – much more than explicit definitions could do – the concepts at their nodes, and the definitions show in principle what a formalization could look like. (For a first outline of a formal representation, cf. Schalley this vol.) Next we have to take a stand on another core issue: What are the limits of the ontological knowledge we assume as a prerequisite of linguistic knowledge? We submit that ontological knowledge is not the same as encyclopedic knowledge or world knowledge, because the latter two include knowledge about how the world happens to be, whereas ontological knowledge is about
how the world has to be, given the way we conceptualize it. Assuming that definitions may be implicit one can also characterize ontological knowledge as definitional knowledge or analytic knowledge. The fact that Kant’s analytic-synthetic distinction has been heavily attacked by some philosophers, notably Quine,4 does not require special attention here since a Quinean would reject the whole ontolinguistic research program anyway for its heavy reliance on concepts and other mental entities. But even Quine himself admitted: “Analyticity undeniably has a place at a common-sense level . . . ” (Quine 2004: 59), and when he writes: “I recognize the notion of analyticity in its obvious and useful but epistemologically insignificant applications” (Quine 2004: 61), this shows that apart from the epistemological significance his position is not too far from the ontolinguistic view. The common-sense relevance of the analytic-synthetic distinction shows up for instance in discourse: Ontological knowledge can be safely presupposed in normal use of language, other knowledge cannot. It is rather difficult to explain the different judgments of appropriateness of (2) below as a reaction to (1a) and (1b) respectively, if we blur the distinction, and it is rather trivial, if we keep it. (1)
a. As a student in Munich Teller lost his right foot.
b. He had to learn to walk with a prosthesis.

(2) Oh, I didn’t know he had one.
Similarly, consider the question-answer exchange in (3): (3)
a. Who was Edward Teller’s wife?
b. The woman he was married to.
Explaining the fact that such an answer is considered insulting is hard without the notion of analyticity and very straightforward with it: If the concept of being a person’s wife is c-equivalent to the concept of being the woman this person is married to, then (3b) as an answer to (3a) is analytically true (against the backdrop of the presupposition of (3a) that he was married exactly once), it thus gives no additional information to anybody familiar with the relevant parts of English, and giving tautological answers can be considered an insult. Replacing (3b) above by (3c) below gives a completely different result: (3)
c. Augusta Maria Harkanyi, the sister of a longtime friend.
Answer (3c) can be looked up in any good encyclopedia, it represents a prototypical case of encyclopedic knowledge. World knowledge is sometimes used as a synonym of encyclopedic knowledge, but it can also be used to denote knowledge that properly includes the latter and therefore leaves space for knowledge that will never make it into an encyclopedia, like the one expressed in the following answer to (3a): (3)
d. The woman John Smith saw on Christmas eve 1953 at Sather Gate.
The lines between the different kinds of knowledge are of course sometimes hard to draw, but the clear cases show that they are nevertheless indispensable. The fact that the line between green and blue is hard to draw too and that sometimes people disagree about its proper location is no reason to give up the green-blue distinction altogether. Having clarified the relation between ontological knowledge, encyclopedic knowledge and episodic world knowledge it is now time to turn to the question of the relation between these kinds of knowledge and linguistic knowledge. From the ontolinguistic point of view, linguistic knowledge is a special kind of ontological knowledge since, (i) to know a language implies to know the signs of this language, (ii) they are what they are by definition, and (iii) definitional knowledge is ontological knowledge. Here is an illustration: Part of knowing French is to know (a sufficient proportion of) the words of French, and part of knowing the French word canard is to know that its meaning relates it (among other things) to the concept DUCK, and this is part of its ontological identity: If there is a bisyllabic acoustic pattern that sounds exactly the same, but whose meanings do not include the concept DUCK, then this cannot possibly be the French word canard. So to know a language means to have a special kind of ontological knowledge. But ontological knowledge enters the picture another time, namely with respect to the semantic relations of the language. 
To continue with the example from French, part of knowing the concept DUCK is to know that it is subsumed under (and hence entails) the concept BIRD, and therefore for a person to know French implies knowing that the relevant part of the meaning of canard also includes the concept BIRD.5 So semantic knowledge, being part of knowledge of language, not only is a special case of ontological knowledge, it also presupposes non-linguistic and therefore cross-linguistically invariant ontological knowledge.6
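The canard example can be restated as a small sketch: a lexicon links a form to the concepts it codes, and language-independent ontological knowledge supplies what those concepts entail. The table names `LEXICON` and `C_SUPERORDINATE` are hypothetical, and the data contain only the DUCK and BIRD facts mentioned above.

```python
# Language-independent ontological knowledge: immediate c-superordinates.
C_SUPERORDINATE = {"DUCK": "BIRD"}

# Language-specific knowledge: which concepts a form of a language codes.
LEXICON = {("French", "canard"): {"DUCK"}}


def superordinates(concept):
    """Follow the c-superordination chain upwards from `concept`."""
    chain = []
    while concept in C_SUPERORDINATE:
        concept = C_SUPERORDINATE[concept]
        chain.append(concept)
    return chain


def entailed_concepts(language, form):
    """Concepts coded by the form plus everything they are subsumed under."""
    result = set()
    for concept in LEXICON[(language, form)]:
        result.add(concept)
        result.update(superordinates(concept))
    return result
```

The split into two tables mirrors the claim in the text: only `LEXICON` would have to change from one language to another, while `C_SUPERORDINATE` stands for the cross-linguistically invariant part.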
It seems safe to assume that the common core of ontological knowledge, the shared conceptual knowledge all the different individual linguistic competences and all the different cultures are based on, contributes crucially to limiting the diversity of both human cultures and human languages. This entails that translation requires cultural explication only if the limits of this common core of ontological knowledge are transgressed, whereas it is straightforward as long as they are respected. Without abandoning the idea of linguistics as a coherent scientific discipline, recent advances in this field have been characterized by an emphasis on explanatory adequacy and thus on relating linguistic to non-linguistic phenomena.7 The approach taken in this volume, dubbed ontolinguistics, adds a decisively distinct and new aspect to this emerging new contextualization of the field by focusing not only on the conceptual contents of linguistic signs, but also on their decompositional structures and on the network of interconceptual relations they are embedded in. This focus is both narrow enough to delineate a research program and a new exciting subfield of linguistics and more generally cognitive science, and wide enough to encompass different more specific theories. Investigating the linguistic means that exist for the coding of related concepts yields results that are interesting among other things because they contribute to constraining the arbitrariness of linguistic signs. C-subordinated concepts, for instance, tend to have a ‘heavier’ (more syllables, more morphemes) minimal coding than their immediate c-superordinates, if the latter are basic level elements in the sense of Rosch (1978):8 left foot has two syllables, foot only one. Similarly for m-i-subordinated concepts, when their instances are derivatively conceptualized, as compared to their basic level m-i-superordinates: Spanish dedo del pie (lit. ‘finger of the foot’) has four syllables, pie (foot) only one. 
By contrast, the minimal English codings of the same concepts, toe and foot, do not reflect a derived conceptualization, so they have the same coding cost, one syllable each. With respect to the distinction between onomasiological and semasiological investigations, the former asking for the different forms a given concept can be coded by in and across languages and the latter establishing the concepts that are coded by a given form of a given language, the ontolinguistic approach is helpful in both enterprises, but in view of its extralinguistic anchoring it is especially promising in circle-free cross-linguistic onomasiology.
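The coding-weight observation can be tabulated in a short sketch. The syllable counts are the ones given in the text; the `expects_heavier` flag is an assumed encoding of the text's conditions (a basic level superordinate, or a derivatively conceptualized subordinate), and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodingPair:
    language: str
    subordinate: str
    sub_syllables: int
    superordinate: str
    super_syllables: int
    relation: str
    expects_heavier: bool  # assumed encoding of the conditions named in the text


PAIRS = [
    CodingPair("English", "left foot", 2, "foot", 1, "c-subordination", True),
    CodingPair("Spanish", "dedo del pie", 4, "pie", 1, "m-i-subordination", True),
    CodingPair("English", "toe", 1, "foot", 1, "m-i-subordination", False),
]


def extra_weight(pair: CodingPair) -> int:
    """Additional syllables of the subordinate coding over its superordinate."""
    return pair.sub_syllables - pair.super_syllables


def tendency_holds(pair: CodingPair) -> bool:
    """Where a heavier coding is expected, the subordinate must be heavier."""
    return extra_weight(pair) > 0 if pair.expects_heavier else True


for pair in PAIRS:
    print(pair.language, pair.subordinate, extra_weight(pair), tendency_holds(pair))
```

Three pairs only restate the examples and cannot test the tendency; a real check would need a cross-linguistic sample and a principled measure of coding weight.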
The methodology of ontolinguistic research, however, seems to be characterized by a vicious circle: Investigating those relations between linguistic forms that are induced by the relations between the concepts they encode presupposes an ontology. On the other hand, it is precisely the analysis of linguistic signs which offers the best access to the reconstruction of everyday ontologies. Fortunately, there are other ways of getting at conceptual knowledge than through language (such as psychological experiments, for instance) and so the circle can be avoided by tapping such non-linguistic resources.

Having defined and illustrated the major ontological relations, delineated our concept of ontological knowledge and outlined the research program of ontolinguistics we are now in a good position to introduce the contributions to this volume. Two primary foci of interest are represented on the following pages:

1. The first and central focus is the genuinely linguistic question of how the ontological status of a concept restricts the ways it can be coded in the world’s languages.
2. The second and more encompassing focus is a question of cognitive science in general, the question which concepts figure in everyday ontologies of humans and how they are structured.
The majority of the papers in this volume, the predominantly linguistic ones, contribute to the first focus. The other papers, those that are mainly concerned with the second focus, reflect and add to the cognitive science aspects of the ontolinguistic enterprise in that they integrate insights from other disciplines such as artificial intelligence, philosophy, and neuroscience. The volume consists of 17 articles, which are arranged into four parts. The first introductory part comprises, apart from this outline, an overview of the conceptual space occupied by the different notions associated with the word ontology and related terms, of the different fields in which they occur and especially of the ways they are used in linguistics, in computer science in general, and in artificial intelligence (and distributed AI) in particular. This contribution by Nickles et al. is intended to pave the way to a cross-disciplinary understanding of the other contributions to the volume. The second part is concerned with foundational issues, general ontologies, and linguistic categories. It deals with the basic, underlying issues of the ontolinguistic enterprise, comprising a discussion of neuroscientific philosophical foundations, an outline of formal approaches to upper ontologies that are intended to build the basis for general ontological enterprises such as
the Semantic Web, ontologies for linguistic descriptions, and discussions of the ontological status of linguistic signs. Certainly, the papers can only highlight certain aspects of the ontolinguistic enterprise, but they also demonstrate rather different approaches to it – looking at it from different disciplines’ backgrounds. The resulting picture may not be a comprehensive one, but the remaining gaps will certainly stimulate further research in ontolinguistics in general and its different applications in particular. The first contribution to this foundational part, The emergence of a shared action ontology: Building blocks for a theory, develops basic ingredients for a both neuroscientifically and philosophically founded theory of ontology sharing in the linguistically crucial domain of action. The philosopher Thomas Metzinger and the neuroscientist Vittorio Gallese argue in this paper that the brain, in a figurative sense, possesses an ontology too, creates primitives, and makes existence assumptions. By reviewing a series of neuroscientific results and integrating these within a wider philosophical perspective, they discuss specifically the contribution of certain areas of the premotor and parietal cortex, the likely homologue of the monkey brain areas in which mirror neurons were first detected, where the human mirror neuron system9 seems to construct goals, actions, and intending selves as basic components of the world it interprets. This paper can be seen as a foundational paper of this volume, in that it shows how ontologies exist in the human mind – that the brain itself constructs such systems of conceptualizations. It thus provides a basis for inquiries into ontolinguistics. At the same time it offers some pointers towards answering the question why it is reasonable to assume that there are systems of conceptualization that are likely to evolve in a particular way and hence why we as humans are able to share such systems in our use of language. 
The second paper, Formal representation of concepts: The Suggested Upper Merged Ontology and its use in linguistics by Adam Pease, presents the outcome of a recent AI effort to provide a semantic underpinning for enterprises such as the Semantic Web and compares it to other proposals. It is based on the assumption that human language can be meaningfully mapped to a formal ontology for use in computational processing of natural language expressions. A corresponding formal ontology was developed, which is widely used today. The paper gives an overview of this ontology, the Suggested Upper Merged Ontology (SUMO). The SUMO is specified in a first order logic language, and an index has been created, which links WordNet (Fellbaum 1998) synsets, i.e. sets of synonymous readings of English words, to terms in SUMO, thereby linking SUMO to linguistic endeavors. The latter link makes
Andrea C. Schalley and Dietmar Zaefferer
it possible to look up the ontological status of most concepts coded by words of English. For instance the concept HUMAN FOOT used to illustrate taxonomic relations above shows up as one reading of ‘foot’ which is a subclass of BodyPart and the concept HUMAN STEP appears under ‘stepping’ as a subclass of BodyMotion. Similarly, all ontological relations discussed in this volume can be compared with their treatment in SUMO. In Linguistic interaction as ontological mediation, John Bateman deals with relating two basic ontological strata, linguistic information on the one hand and contextual/conceptual information on the other. The question of how these are to be related is a central problem and task for any research into ontolinguistics. Bateman argues that in work that has addressed this traditional question so far, a crucial component has been overlooked – linguistic interaction. The construction of the relation between the two strata is suggested to essentially be one of negotiated mediation, as instantiated by linguistic interaction. Thus, Bateman favors a more flexible account of the discussed relation, concluding that such an account results in a greater adequacy when faced with real linguistic behavior. Specifically, he claims that much can be learnt if one looks at natural human dialogue, an approach that has not been taken so far in the domain of ontologies – and ontolinguistics. Semantic primes and conceptual ontology discusses the relationship between semantic primes and their underlying ontological categories. Specifically, Cliff Goddard’s paper is targeted at the ontology and hence the system of conceptualizations that is implicit in the NSM approach to semantics (Wierzbicka 1972, 1996; Goddard and Wierzbicka 1994, 2002). The NSM system does not distinguish between linguistic and conceptual entities, and therefore does not distinguish between the two strata that are emphasized in other contributions to this volume. 
Presupposing the meta-semantic adequacy of natural languages – a subset of the lexemes (comprising the exponents of semantic primes) and grammatical constructions of any language are assumed to suffice for semantic descriptions of this language – the paper focuses on those semantic primes that code ‘thing-like’ entities, referred to as ‘substantives’ in the paper. Based on empirical work done over decades within the NSM framework, the resulting ontology (reflected by specific semantic primes) is understood as being universal. Due to this universality claim the ontology could thus be seen as constituting part of the system of conceptualizations that are shared by all humans.
The fifth contribution in this part, Using ‘Ontolinguistics’ for language description by Scott Farrar, describes the General Ontology for Linguistic
Description (GOLD), a general framework for linguistic knowledge and nonlinguistic knowledge (the latter is also referred to as ‘world knowledge’). Apart from the fact that Farrar decidedly distinguishes between linguistic and non-linguistic elements, he explores the ontological nature of the linguistic sign, and he argues that claims of how world (or non-linguistic, for that matter) knowledge affects linguistic structures presuppose a good theory of the world. In other words, before one investigates the first focus mentioned above and discusses how the conceptual system behind linguistic coding affects just this linguistic coding, one needs to be able to account for this conceptual system and hence for non-linguistic knowledge as well. This is what this contribution sets out to tackle. The last contribution in the second part of the volume, Dietmar Zaefferer’s Language as mind sharing device: Mental and linguistic concepts in a general ontology of everyday life, explores the consequences of the neuroscientifically supported assumptions about shared action ontologies outlined by Metzinger and Gallese for the conceptualization of a specifically human kind of action, namely linguistic action. According to Zaefferer, human language is a highly elaborated mind sharing device that facilitates dramatically not only goal sharing, but also emotion sharing and, above all, knowledge sharing. Based on the assumption that the major dividing lines between ontological categories of everyday life are reflected in the semantically anchored grammatical categories of human languages, it outlines a general ontology that subsumes both the thing-related distinctions between count and mass entities and the event-related distinctions between the familiar aspectualities. It contains several subontologies of concrete, i.e. spatio-temporally located, and abstract entities and includes a domain ontology of mental entities and a domain ontology of external-mental hybrids. 
Prime among the former are different kinds of propositional contents and of propositional attitudes, both being connected by a domain ontology of modal categories. Putting together the building blocks provided by these ontologies, a domain ontology of linguistic phenomena is constructed that defines among other things linguistic action tokens as external-mental hybrids which consist of an external transient process together with the mental transition caused by it, and that characterizes the major speech act types in terms of their potential for the sharing of goals, knowledge, emotions, and combinations thereof.
The third part and the fourth part of the book primarily address the central focus of how the ontological status of a concept restricts the ways it can be coded in the world’s languages. The papers are sorted into the parts according to semasiological criteria: the third part deals with categories that can be coded by closed-class elements in most languages, whereas the fourth part focuses on concepts that are typically coded by open-class elements. The tendency expressed above, that closed-class codings tend to be available for more general or elementary concepts, is reflected in the papers of the third part. Here, such elementary conceptual categories – spatial structure and relations, postures and existence – and the ways they can be linguistically expressed, namely by closed-class or grammatical elements, are investigated. The contributions to this part are thus in-depth studies targeted at the first ontolinguistic question posed above.
The first paper of this part, Leonard Talmy’s The representation of spatial structure in spoken and signed language: A neural model, compares how spatial structure is coded in spoken and signed language. Following an analysis of spatial schemas in spoken language, Talmy catalogues basic elements out of which these spatial schemas are made up. Such basic elements are generally coded by closed-class elements. Then he moves on to compare the results on spoken language with the coding of spatial structure in signed language and discusses the classifier subsystem of signed language (again closed-class elements). Finally, as a result of the comparison of these two modalities, he proposes a new neural model of the language capacity. From the ontolinguistic perspective, the closed-class elements in both modalities, their underlying conceptual categories, and a comparison of these are extremely interesting and revealing objects of research.
The next article, Postural categories and the classification of nominal concepts: A case study of Goemai, investigates a very interesting way of classifying objects, namely via their default posture.
Birgit Hellwig discusses the coding of postural information in the nominal classifier system found in Goemai, a West Chadic language of Nigeria. Usually, classifier systems are based on inherent properties, which makes Goemai and its system based on postural information appear as an exception. However, Goemai uses the default or canonical position and hence a time-stable property to set up disjoint nominal classes, and thus is not that extraordinary after all. Nevertheless, a classifier system based on postural information (with the additional possibility to override the default classifier assignment and to trigger pragmatic implicatures) is ontolinguistically very attractive, because it ‘adds’ a rather unexpected (yet not totally exceptional) concept to a potential catalog of general or elementary concepts that are reflected in classifiers.
Marija M. Brala’s contribution Spatial ‘on’–‘in’ categories and their prepositional codings across languages: Universal constraints on language specificity investigates those spatial relations that are expressed by English on and in (the ‘on-in range’) and their cross-linguistic prepositional codings. In line with the ontolinguistic idea, she asks how the cognitive organization or ‘ontology’ of these spatial relations is reflected in linguistic codings across languages. The study demonstrates that the underlying ontology plays a crucial role in constraining the linguistic coding and that it is in particular possible to systematize this ‘on-in’ range in terms of different combinatorial feature patterns. These systematic patterns are based on only three feature domains or dimensions of variation (with different values) of the spatial realm, namely DIMENSIONALITY (point, axis/line, volume), ATTACHMENT (absence, presence, and quantity of contact) and ORIENTATION (orthogonal vs. parallel and directionality).
Like the previous contribution, Semantic categorizations and encoding strategies is concerned with spatial relations. In this paper, the last one of the third part, Stavros Skopeteas discusses the cross-linguistic coding of spatial relations in which an entity is localized on or above the topside of another entity. Languages differ in the coding of such SUPERPOSITION relations. Either they lexicalize the concepts ON vs. ABOVE, which are subordinate to SUPERPOSITION, and thereby distinguish whether contact is established between the two related entities, or they directly encode the superordinate SUPERPOSITION concept. Skopeteas also presents languages that mix these two types of coding in different ways, including the case of a constructional split. The codings are not only studied in terms of lexicalization patterns, but also in relation to the handling of discourse situations and in terms of pragmatic inference patterns.
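As a rough illustration of the two coding strategies for SUPERPOSITION just described, a minimal Python sketch may help. It is not taken from Skopeteas's paper: the names `Relation`, `Strategy`, and `encode` are hypothetical, and reducing the distinction to a single boolean `contact` feature is a simplifying assumption.

```python
from enum import Enum


class Relation(Enum):
    """Concept labels as used in the text (small caps there)."""
    SUPERPOSITION = "SUPERPOSITION"  # superordinate concept
    ON = "ON"                        # subordinate concept: contact
    ABOVE = "ABOVE"                  # subordinate concept: no contact


class Strategy(Enum):
    """Hypothetical labels for the two coding types."""
    SUBORDINATE = "lexicalizes ON vs. ABOVE"
    SUPERORDINATE = "encodes SUPERPOSITION directly"


def encode(strategy: Strategy, contact: bool) -> Relation:
    """Return the concept a language of the given type would code."""
    if strategy is Strategy.SUPERORDINATE:
        # The coding itself leaves contact unspecified.
        return Relation.SUPERPOSITION
    return Relation.ON if contact else Relation.ABOVE


if __name__ == "__main__":
    for strategy in Strategy:
        for contact in (True, False):
            print(strategy.name, contact, encode(strategy, contact).name)
```

Mixed systems, including the constructional split mentioned above, are deliberately not modeled here; they would need a per-construction choice of strategy rather than one choice per language.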
As indicated above, the fourth part of this volume concentrates on concepts that are typically encoded by open-class elements. The first paper in this part focuses on nominally coded concepts, whereas the other papers are concerned with concepts of events or similar entities (for short: eventities, cf. Zaefferer 2002), i.e. with categories that have prototypically verbal coding. They deal with the temporal and spatial structure of eventities and the structure and ontological status of these concepts. The contribution Taxonomic and meronomic superordinates with nominal coding is the only paper in this volume that decidedly concentrates on concepts that in general are only coded by nouns. Wiltrud Mihatsch discusses
the nominal coding of both taxonomic and meronomic superordinates with respect to basic level concepts. She demonstrates that in everyday language the conceptual relations underlying noun taxonymies are not uniform inclusion relations. Specifically above the basic level, other conceptual organization principles are at work, and these are well reflected in the linguistic coding. Her results show that codings of superordinates are either placeholders with pragmatic and grammatical functions, or are lexical items based on the conceptual conjunction or disjunction of basic-level schemata. In particular, many collective nouns can be found on the superordinate levels, because conjunctions are easier to process than disjunctions. This explains an interesting change in the career of some codings of c-superordinated concepts of basic level elements, namely those which evolved from labels for m-superordinated concepts by means of dropping an originally required mass-count shifter like piece of. The English noun garment, for instance, originally meant a collection of body covering elements.
Motion events in concept hierarchies: Identity criteria and French examples is also concerned with taxonomies, but Achim Stein focuses on eventity taxonomies, specifically on conceptual hierarchies of motion eventities. On the basis of a detailed semantic analysis which conceives of verb meanings as having a composite internal structure, the paper both gives a critical account of the representation of verb meanings in existing conceptual hierarchies and makes suggestions for a more consistent representation of verb meanings in such hierarchies. Inspired by work on ontological engineering and by approaches in lexical semantics, the status of argument structure in lexicon and conceptual representation and the relations between verbally coded concepts within hierarchies are discussed in considerable detail.
As a result, a lean meaning representation for French pousser ‘to push’ is argued for, which corresponds to a concept that is c-subordinated both to CONTACT and MOTION, but treats the displacement element as some kind of non-rigid property.
Martin Trautwein’s contribution On the ontological, conceptual, and grammatical foundations of verb classes also deals with verbal meanings. A modular approach to natural-language semantics is developed, which essentially presumes that underspecification of lexical content is a major factor responsible for natural language’s high adaptability to changing contexts and hence for its expressiveness. The paper discusses the relation between aspectual information and temporal interpretation, exploring how much temporal information verbs actually lexicalize, to what ontological kinds of temporal structure we refer when uttering verbal constructions, and the interpretations we are able to come up with when processing verbal constructions. In investigating the coding of temporal structure in both lexical and phrasal linguistic expressions, foundational issues of ontolinguistics are addressed. Furthermore, an explanation for the existence of semantic classes such as the aspectual classes of verbs is offered, which utilizes the proposed modular approach.
In The ontological loneliness of verb phrase idioms, Christiane Fellbaum looks at German verb phrase idioms from an ontolinguistic perspective. Given that there are generally regular lexicalization patterns for the linguistic expression of concepts, she asks whether and how such verb phrase idioms can be suitably embedded in the structure of the lexicon and represented in WordNet (Fellbaum 1998). In order to answer this and thus to determine the verb phrase idioms’ status in the lexicon, the nature of these complex concepts encoded by verb phrase idioms is studied. It turns out that in addition to idioms whose underlying concepts entertain straightforward conceptual-semantic relations to concepts coded by simple verbs, there are several distinct classes of ‘lonely’ German verb phrase idioms. These ‘lonely’ verb phrase idioms have an interesting ontological status in that they are only remotely connected by conceptual-semantic links to the rest of the lexicon.
The last paper in the present volume, Relating ontological knowledge and internal structure of eventity concepts, primarily addresses – in contrast to the preceding papers – the cognitive science focus mentioned above. It concentrates on the role ontological knowledge plays in the conceptualization of eventities and asks which meaning components are needed for decompositional representations of verbal meanings. Andrea C. Schalley is in particular concerned with the structural categories such meaning components belong to and hence with the structural aspects of eventity conceptualizations.
As representational framework, the Unified Eventity Representation (UER, Schalley 2004) is deployed, which features a way of explicitly distinguishing different structural categories in semantic representations. The contribution discusses and represents parts of the ontology natural languages rely on. In addition, ideas and aspects of the other contributions to the volume and thus from different approaches to ontolinguistics are taken up in the course of the paper. The integration of these aspects demonstrates their relevance and importance both specifically for the domain of eventity conceptualizations and generally for the broader ontolinguistic enterprise.
Notes
1. Evidence for a top-down differentiation of the conceptual system during the development of infants can be found, e.g., in Mandler 2004.
2. We use small caps to indicate when we are not talking about English words, but the concepts that can be activated by an appropriate use of these words.
3. For the different subtypes of meronomic relations cf. Winston, Chaffin, and Herrmann (1987), Schalley (2004: 136–147, and this vol.), Bateman (this vol.), and Zaefferer (this vol.).
4. For a recent overview of the debate on the analytic-synthetic distinction compare Rey 2003.
5. Marija Brala (personal communication) has rightly pointed out that analogous cases like PENGUIN entailing BIRD and WATER MELON entailing either FRUIT or VEGETABLE seem to be much less obvious (the reader is invited to ask herself which of the latter entailments holds): The fact that “many of my younger students are not sure whether a penguin is a bird or not, and I have personally thought until a few months ago that a watermelon was a fruit” does not entail “that my students do not know the meaning of the word penguin, or that I do not know the meaning of the word watermelon”, she remarks. We agree, because the problem is at the other end of the ontological relation: The concept the relevant students associate with the syllable bird, let’s call it BIRD_s, seems to be narrower than the more common concept BIRD, and the everyday concept connected with the four syllables of vegetable can be both c-compatible and c-incompatible with the concepts evoked by the syllable fruit, because vegetable is not a botanical term and therefore the everyday concept VEGETABLE is compatible with the botanical concept FRUIT_b, whereas the everyday concept FRUIT_e may exclude VEGETABLE. (Compare also the famous U.S. Supreme Court case Nix v. Hedden, where the botanical classification of tomatoes as fruits was ruled irrelevant in view of the fact that according to common language tomatoes are vegetables: http://laws.findlaw.com/us/149/304.html.)
6. The consequences this has for instance for the grammar of coordination are illustrated in Nickles et al. (this vol.).
7. Here explanatory adequacy is of course not to be taken in the technical sense of Chomsky 1964 (explaining language acquisition), but in a generic sense that includes properties of the conceptual-intentional system.
8. This is not at variance with Zipf’s law (Zipf 1949), according to which the length of a word is inversely related to its frequency, because basic level elements are probably the most frequently used ones. Note that there may be some tension between general and special group use: What is only moderately frequent in general use may be highly frequent in an in-group. In its standard form, the German word Universität ‘university’ has five syllables and final stress; by contrast, the form almost exclusively used by students in peer talk has two syllables and the prototypical trochaic stress pattern of German: Uni.
9. The most recent literature shows that the existence of a human mirror system is not controversial anymore; the discussion seems to focus on other issues like the question of how closely it is related to human language (Aziz-Zadeh et al. 2006).
References
Aziz-Zadeh, Lisa, Lisa Koski, Eran Zaidel, John Mazziotta, and Marco Iacoboni
2006 Lateralization of the human mirror neuron system. The Journal of Neuroscience 26 (11): 2964–2970.
Bateman, John
this vol. Linguistic interaction as ontological mediation.
Chomsky, Noam
1964 Current Issues in Linguistic Theory. (Janua linguarum. Series minor 38.) The Hague: Mouton.
Fellbaum, Christiane (ed.)
1998 WordNet: An Electronic Lexical Database. (Language, Speech, and Communication.) Cambridge, MA/London: MIT Press.
Goddard, Cliff, and Anna Wierzbicka (eds.)
1994 Semantic and Lexical Universals: Theory and Empirical Findings. (Studies in Language Companion Series 25.) Amsterdam/Philadelphia: John Benjamins.
2002 Meaning and Universal Grammar. Theory and Empirical Findings. Vol. 1 & 2. (Studies in Language Companion Series 60, 61.) Amsterdam/Philadelphia: John Benjamins.
Mandler, Jean M.
2004 The Foundations of Mind. Origins of Conceptual Thought. Oxford/New York: Oxford University Press.
Mihatsch, Wiltrud
this vol. Taxonomic and meronomic superordinates with nominal coding.
Nickles, Matthias, Adam Pease, Andrea C. Schalley, and Dietmar Zaefferer
this vol. Ontologies across disciplines.
Quine, Willard V.
2004 Reprint. Two dogmas in retrospect. In Quintessence: Basic Readings from the Philosophy of W. V. Quine, Roger F. Gibson, Jr. (ed.), 54–63. Cambridge, MA: Belknap. Original edition, Canadian Journal of Philosophy, 21 (3), 1991: 265–274.
Rey, Georges
2003 The analytic/synthetic distinction. The Stanford Encyclopedia of Philosophy (Fall 2003 Edition), Edward N. Zalta (ed.), http://plato.stanford.edu/archives/fall2003/entries/analytic-synthetic/.
Rosch, Eleanor
1978 Principles of categorization. In Cognition and Categorization, Eleanor Rosch and Barbara B. Lloyd (eds.), 27–48. Hillsdale, NJ: Lawrence Erlbaum.
Schalley, Andrea C.
2004 Cognitive Modeling and Verbal Semantics. (Trends in Linguistics. Studies and Monographs 154.) Berlin/New York: Mouton de Gruyter.
this vol. Relating ontological knowledge and internal structure of eventity concepts.
Wierzbicka, Anna
1972 Semantic Primitives. Translated by Anna Wierzbicka and John Besemeres. (Linguistische Forschungen 22.) Frankfurt: Athenäum.
1996 Semantics: Primes and Universals. Oxford/New York: Oxford University Press.
Winston, Morton E., Roger Chaffin, and Douglas J. Herrmann
1987 A taxonomy of part-whole relations. Cognitive Science 11 (4): 417–444.
Zaefferer, Dietmar
2002 Polysemy, polyvalence, and linking mismatches. The concept of RAIN and its codings in English, German, Italian, and Spanish. DELTA – Documentação de Estudos em Lingüística Teórica e Aplicada 18 (spe.): 27–56. Special Issue: Polysemy.
this vol. Language as mind sharing device: Mental and linguistic concepts in a general ontology of everyday life.
Zipf, George K.
1949 Human Behavior and the Principle of Least-Effort. Cambridge, MA: Addison-Wesley.
Ontologies across disciplines
Matthias Nickles, Adam Pease, Andrea C. Schalley, and Dietmar Zaefferer
The English word ontology, together with its counterparts in many languages, has had a breathtaking career during the last decades, especially in information science, but also in other disciplines. Since its definitions vary considerably within and especially across disciplines, and since this volume, although clearly focused on linguistic matters, is conceived as tying together several disciplines, it seems appropriate to provide a short survey of these uses in order to make the different contributions and their interconnections more accessible for those readers who are not familiar with all the fields that are represented between the covers of this book (presumably the majority).
1. Readings of ontology: Carving up a conceptual space
1.1. The major dimensions of variation
The traditional notion of ontology has a long and venerable history in philosophy. The most usual word for it, however, the compound built from the Greek forms onto- ‘of being’ and logos ‘speech, reason’ in the guise of its later derivative logia ‘science’, is a comparatively recent invention. It seems to have originated in the context of the early Enlightenment, since its first attested appearance in print is in 1606 on the front page of the textbook Ogdoas Scholastica (‘Scholastic Eightfold’) by Jacob Lorhard,1 and it became popular about one century later when Christian Wolff used it in the title of his 1729 book Philosophia Prima sive Ontologia (First Philosophy or Ontology). There, Wolff gives the following definition: Ontologia seu Philosophia Prima est scientia entis in genere, seu quatenus ens est (Ontology or First Philosophy is the science of Being in general or as Being). But the study of Being as Being goes back at least to Aristotle’s Metaphysics. To quote the philosopher Nino Cocchiarella: Aristotle was the founder not only of logic in western philosophy, but of ontology as well, which he described in his Metaphysics and the Categories as
a study of the common properties of all entities, and of the categorial aspects into which they can be analyzed. The principal method of ontology has been one or another form of categorial analysis, depending on whether the analysis was directed upon the structure of reality, as in Aristotle’s case, or upon the structure of thought and reason, as, e.g., in Kant’s Critique of Pure Reason. (Cocchiarella 2001: 117)
Despite its conciseness this characterization already makes it possible to distinguish two dimensions of variation along which notions of ontology vary. One is opened up by the common properties of all entities on the one hand (being as being) and their categorial aspects on the other (categories and kinds of entities), let us call this the dimension of generality (as opposed to specificity). The other one is opened up by the distinction between external reality and the contents of thought and reason. Let us call this the dimension of objectivity (as opposed to subjectivity). Since the two are orthogonal, we can imagine them as spanning a vertical plane where generality extends from its maximum at the top through increasing degrees of specificity to the lower bound of generality at the bottom, and where objectivity extends in the depth with its maximum at the foreground and increasing degrees of subjectivity towards the back (cf. Figure 1).
Figure 1. The generality-objectivity plane. (Axes: Generality, from general to specific; Objectivity, from objective to subjective.)
So when the Dictionary of Philosophical Terms and Names defines ontology as “Branch of metaphysics concerned with identifying, in the most general terms, the kinds of things that actually exist” (http://www.philosophypages.com/dy/o.htm#onty), we can say that this notion of ontology places it close to the upper foreground in our picture. Similarly when ontology is variously described as being concerned with the ultimate furniture, or the basic furniture or simply the furniture of the world, this corresponds to increasingly large regions on our plane from the top to the bottom. And when Aristotle is concerned more with the structure of reality itself, this notion of ontology is located in the foreground of our plane, whereas Berkeley’s ([1710] 1999: §6) idealist view that “All the choir of heaven and furniture of the earth . . . have not any subsistence without a mind” has its place considerably further back.
But how does this relate to Gruber’s often-quoted definition of ontology for the purposes of Artificial Intelligence2 as “an explicit specification of a conceptualization,” where a “conceptualization is an abstract, simplified view of the world that we wish to represent for some purpose” (Gruber 1993: 199)? Is there a connection at all or is this a case of homonymy, an entirely unrelated different use of the same orthographical form? Gruber adds: “The term is borrowed from philosophy, where an ontology is a systematic account of Existence.” This is of course neither the notion of ontology we have just discussed (science of being as being; study of the common properties of all entities; branch of metaphysics concerned with the kinds of things that exist) nor something completely different; instead, it is something closely related: The concept coded by the mass noun (no article, no plural) is that of a field of investigation, of a discipline (‘science’, ‘study’, ‘branch’), the concept expressed by the count noun (taking articles and plural form) is that of a specific outcome of that kind of investigation (‘account’): Aristotle’s ontology, e.g., is different from Kant’s, but both contribute to (the field of) Ontology. In order to visualize the distinction we will use an uppercase initial for the name of the field (domain of issues, etc.) – i.e.
‘Ontology’ – and a lowercase initial for the different views that are produced in the field – i.e. ‘ontology’. This kind of field-product polysemy is familiar from linguistics: Syntax as a mass noun means a field, a certain branch of linguistics; its different outcomes – like say Haider’s syntax of German (Haider 1993) – are coded by the corresponding count noun. In fact, in linguistics there is a third use of the term syntax (and a second use of the count noun), one that relates to the subject matter of the second and first use, i.e., that subsystem of a language that constrains the building of phrases from word forms. So there is an object-level use of this term (syntax as language subsystem), a meta-level use (syntax as theoretic account of this subsystem), and in a sense a transmeta-level use (syntax as subfield or branch of linguistics). Is there a similar three-level distinction with ontology?
Figure 2. Conceptual space of the notion ontology. (Dimensions: Generality; Objectivity; Level, with the segments field, specific account, and purview.)
The answer is ‘yes’ and ‘no’. ‘Yes’ in the sense that of course there are three levels as well: What there is and its categories (object-level), specific accounts thereof (meta-level), and a field concerned with this subject matter and therefore with the production and discussion of specific accounts of being (yet another level). ‘No’ in the sense that only the last two levels are properly called ontology, the second one by transparent metonymic extension (and count noun formation) from the name for the third one, whereas the first one requires different means of expression3 such as the real world (as opposed to possible counterparts) or simply reality or rather its (ultimate or basic) furniture. Whether we include the object-level in this overview or not (and in fact we should include it for the sake of being systematic), we now have introduced a third dimension of variation and thus created a conceptual space where variants and relatives of the notion ontology can be localized (cf. Figure 2): The vertical dimension reflects generality with the most general matters at the top; the depth dimension reflects objectivity with the most objectivist view at the front; and the horizontal dimension has three segments with the world and its aspects and parts at the right, the different accounts of it in the middle and the field(s) of Ontology at the left. Note that the horizontal relations between the three blocks are somewhat heterogeneous because the field of Ontology is not located on a meta-level with respect to the different ontologies in the same way the latter are on a meta-level with respect to what they account for. That would be the level of a metaontology, an ontological account of different object ontologies. The field of Ontology is rather something that is concerned with being and its kinds by
producing and discussing ontologies. Obviously, the present book is a product of this (ever evolving) field, and what the present subsection tries to outline is a clarification of different concepts that have come to be called ‘ontology’. It is now time to come back to the second dimension of variation, objectivity. The AI-reading of the notion of ontology mentioned above seems to be rather clear in this respect: “An ontology is an explicit specification of a conceptualization.” (Gruber 1993: 199) This points to a subjective notion of ontology: Not reality per se is its object, but reality under a given conceptualization. But the latter notion is explained by Gruber as follows: “conceptualization: the objects, concepts, and other entities that are presumed to exist in some area of interest and the relationships that hold [among] them” (Gruber 1993: 199). This is, to say the least, a little confusing: If your current area of interest is your desktop and if you presume that an object exists there, namely a pile of unread papers, is this object then part of a conceptualization? Certainly not. Gruber must mean a conceptualization of this object. But then what are the concepts that you presume to exist on your desktop? Probably there aren’t any, because wherever concepts exist, in Plato’s heaven or in people’s minds, they certainly do not exist on desktops. On the other hand, concepts fit much better in a conceptualization than objects. So the quoted definition is not very helpful, and we wouldn’t have bothered mentioning it at all if this kind of confusion of object- and meta-level didn’t seem to be quite widespread in the field.4 What Gruber must mean by conceptualization of a given domain is a system of concepts that adequately characterize everything that exists in that domain: individual concepts for individuals, property concepts for properties, relation concepts for relations, second order concepts for first order concepts, etc. 
But it is an open question whether these concepts are meant to be objective and the characterizations they provide are thought of as realistic (located at the foreground of our conceptual space), or more to the back in the sense of representing some other view. And maybe this is good as it is, because the depth dimension of the conceptual space of ontologies is the most difficult and philosophically most demanding one. It has to do with the independence or interdependence of Ontology and Epistemology, with questions of realism and opposing views, in short with the “most hotly debated issues in contemporary metaphysics” according to philosopher Alexander Miller (2005). Then what is realism? “Realism is the thesis that the objects, properties and relations the world contains exist independently of our thoughts about them or our perceptions of them. Anti-realists either doubt or deny the existence of the entities the realist believes in or else doubt or deny their independence from our conceptions of them.” (Khlentzos 2004) Realism is rarely held across the board; philosophers rather tend to be realist about one domain and non-realist about another. John Searle, for instance, in his recent paper What is an institution? says:
[I]t is essential to distinguish between those features of the world that are totally independent of human feelings and attitudes, observer independent features, and those features of the world that exist only relative to human attitudes. . . . It is important to see that one and the same entity can have both observer independent features and observer dependent features, where the observer dependent features depend on the attitudes of the people involved. (Searle 2005: 3–4)
Searle’s aim in the paper just quoted from is “to explain how the ontology of institutions fits into the more basic ontology of physics and chemistry” (Searle 2005: 1) and the explanation he offers is the following: [O]ne and the same phenomenon (object, organism, event, etc.) can satisfy descriptions under which it is non-institutional (a piece of paper, a human being, a series of movements) and descriptions under which it is institutional (a twenty dollar bill, the president of the United States, a football game). An object or other phenomenon is part of an institutional fact, under a certain description of that object or phenomenon. (Searle 2005: 12)
For Searle an institutional fact is something that has been collectively assigned a status function, and since this collective assignment presupposes some representation of it as having this function, institutions require (at least some primitive form of) language. Although Searle devotes one section of his paper to Language as the Fundamental Social Institution, he is mainly concerned there with showing that language is a prerequisite for social institutions and not with discussing the status of language as a social institution itself. The reason may be this: If language were a social institution just as the others it would have to be a prerequisite of itself. The way out of this seeming circle is not hard to find: Each higher form of language requires only some more primitive form of language and so there is space for the evolution of language out of more primitive forms of representation sharing. Therefore an extension of Searle’s (or some similar) concept of an institution to include language as the fundamental institution along these lines is consistent and will be taken for granted in the following sections.5
Coming back to the definition of realism above (“the thesis that the objects, properties and relations the world contains exist independently of our thoughts about them or our perceptions of them”) we now see that the devil is in a certain detail, namely the reference of the possessive pronoun form our. Here we have two options. If we read our to include any rational subject, then we are forced to assume an at least partially non-realist position if we assume (as we probably should) that the world contains among other things institutions and institutional facts, since we have just subscribed to the view that their existence depends on thoughts about and perceptions of them. If we read it to include only the persons who are involved in the current reflection process, then we can maintain a completely realist position which has the interesting property of allowing two kinds of really existing phenomena, those that exist independently of anyone’s thoughts or perceptions and those that exist independently of our (in the narrow sense), but not independently of others’ thoughts or perceptions.6 Human languages, being very fundamental forms of institutions, are of the second kind. Fortunately, in the context of the present volume the issue of realism and its different opposing views does not really constitute a problem. Most of the authors seem to be realists about the world, although there may be considerable disagreement with respect to the degree to which different views on this world (alias ontologies) can diverge.7
1.2. Further dimensions of variation
Our short review of the depth dimension of the conceptual space that embeds different notions of ontology has brought to the fore the importance of the agents who have to do with ontologies (the plural makes it clear that the metalevel concept is meant). Here at least two roles have to be distinguished: The author of an ontology and its user. It is a trivial fact that they need not coincide, but an especially compelling illustration of this fact is provided by Rolf Pfeifer’s (2000) Didabots, simple robots for didactic purposes, who didn’t author but use an ontology which consists of only three situation categories: (a) no obstacle, (b) obstacle to the left, (c) obstacle to the right. Didabots have wheels and sensors and an algorithm that, based on their ontology, lets them avoid obstacles.8
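The Didabot case can be made concrete with a minimal sketch. It is not Pfeifer's implementation: the two proximity readings, the threshold, the tie-breaking rule, and the wheel speeds are all assumptions introduced here for illustration. What the sketch is meant to show is only that the robot's ontology consists of exactly three situation categories and that its behavior is defined over those categories rather than over raw sensor values.

```python
from enum import Enum


class Situation(Enum):
    """The three situation categories of the robot's ontology."""
    NO_OBSTACLE = "no obstacle"
    OBSTACLE_LEFT = "obstacle to the left"
    OBSTACLE_RIGHT = "obstacle to the right"


def classify(left_sensor: float, right_sensor: float,
             threshold: float = 0.5) -> Situation:
    """Map two hypothetical proximity readings (0 = far, 1 = touching)
    to exactly one situation category."""
    if max(left_sensor, right_sensor) < threshold:
        return Situation.NO_OBSTACLE
    # Assumption: ties count as 'left', so the categories stay exclusive.
    if left_sensor >= right_sensor:
        return Situation.OBSTACLE_LEFT
    return Situation.OBSTACLE_RIGHT


# Assumed wheel commands (left speed, right speed), one per category.
WHEEL_SPEEDS = {
    Situation.NO_OBSTACLE: (1.0, 1.0),      # drive straight ahead
    Situation.OBSTACLE_LEFT: (1.0, -1.0),   # rotate to the right
    Situation.OBSTACLE_RIGHT: (-1.0, 1.0),  # rotate to the left
}


def step(left_sensor: float, right_sensor: float) -> tuple:
    """Choose a wheel command based only on the category, not the raw values."""
    return WHEEL_SPEEDS[classify(left_sensor, right_sensor)]
```

The robot in this sketch uses the ontology (via `classify`) without having authored it, which is the point of the author/user distinction drawn above.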
Whereas the user of an ontology can always be identified, its author may well be unknown or even nonexistent. Take Caenorhabditis elegans, the nematode or roundworm that became famous among other things for its nervous system which consists of only 302 neurons. One could say that the ontology of C. elegans comprises at least the following categories and their complements: (a) increasing concentration of an attractant, (b) decreasing concentration of a repellent, (c) increasing closeness to the preferred temperature. This makes sense insofar as the behavior of this nematode is geared towards situations of category (a), (b) and (c), and not their complements; the former two forms of behavior are called chemotaxis, the latter thermotaxis. So if one is ready to speak of the world view or ontology of a robot or a roundworm, the identity of the user is clear, but in the latter case the identity of the author is problematic. It therefore seems reasonable to assume that oftentimes ontologies have simply evolved, without any specific author available. Alongside the roles of the author and the user, the role of the object or domain of an ontology (in the right-hand column of our conceptual space) is of prime importance. It has already been briefly addressed in terms of the horizontal dimension of variation of our conceptual space, so a short reminder will suffice here. Ontologies vary with respect to their ‘aboutness’, i.e., what they are ontologies of. This can be, with decreasing generality, (i) all possible worlds, (ii) one world only, especially the one we live in (or seem to live in; cf. the depth dimension), (iii) subdomains of this world (or others) of increasing degrees of specificity. Is there an upper bound for the specificity in (iii)? In other words: What is the minimum degree of generality that is required for an ontology? Does it make sense to speak of the ontology of this nasty fly that keeps circling your head as you are reading this?
We submit that the answer should be negative. There is something like the ontology of C. elegans, or at least it is in the making,9 but this does not mean that this ontology provides a systematic account in all relevant aspects of a single exemplar of this species; rather, it accounts for all exemplars that come from the same kind of genome. Similarly, it certainly makes sense to develop an ontology not only of aircraft, but also one of aircraft accidents. But it does not make sense to create an ontology of the Airbus A 340 crash at Toronto on August 2nd, 2005,10 at least not without another significant extension of the concept. So far, even the most specific domain ontologies like the one of the famous roundworm have a generic object. They are intrinsically intensional insofar as they entail predictions about new entities that instantiate the generic entity. In this respect, ontological knowledge about a domain is definitional knowledge about it, not episodic knowledge about its states and fates.
The last and certainly not least aspect of ontologies that has to be addressed in this overview is the evaluation aspect. If there are two competing ontologies of the same domain, is it possible that one is true and the other one false? Or are there other criteria for the evaluation of ontologies? If, as we have assumed, ontologies in the simple meta-sense are conceptualizations, then they themselves cannot be true or false, they can only be more or less adequate and more or less useful. And of course they are completely inadequate if they entail false statements. If for instance someone conceptualizes human languages as being either red or green, this entails that English is either red or green, which is false, because it involves a category mistake. Category mistakes, especially less blunt ones, are a serious source for inadequacies in ontologies. But even if an ontology does not entail false statements it can still be inadequate or less useful for various reasons. Usefulness is a relational concept which requires a purpose. Of two competing ontologies one can be more useful than the other for one purpose and less useful for another. Of two classifications of aircraft, e.g., one according to the kind of propulsion and the other one according to the status of the owner, the first one will outrank the latter in helpfulness when the purpose is spare parts, and the opposite will be the case when the purpose is legal matters of air traffic. The usefulness of adequate conceptualizations and hence ontologies is a key issue in all scientific disciplines,11 but it is especially important in disciplines that involve evolutionary accounts, such as biology, and even more so in fields where historical transfer plays a role, such as linguistics. Ontologies for linguistics are the topic of the contributions by Farrar and Zaefferer to this volume. Other aspects of the assessment of ontologies like provenance and credibility are discussed in Section 2.4.2.4. below. 
Here is a summary of the findings of this section. In its most general reading, the article-less term Ontology, which lacks a proper plural, has turned out to refer to a rather controversial and indeed puzzling subfield of philosophy and more precisely of metaphysics. Thomas Hofweber, a philosopher of language, metaphysics, and mathematics, speaks of a “puzzle about ontology”, which he identifies as “the puzzle that there seem to be two contrary but equally good answers to the question (Q) How hard is it to answer ontological questions?”, namely “Answer I: Very hard” and “Answer II: Trivial.” (Hofweber 2005: 259). Still, the definition of the philosophical discipline as being concerned with “what entities make up reality” (Hofweber 2005: 256) seems to be relatively uncontroversial.
This is not the case with the notions the count noun ontology, with a proper and frequently used plural, is used to encode. Ontologies in this sense, specific answers to the question of what entities there are, may – but need not – be the outcome of philosophical endeavors, they may come from other disciplines as well or they may be no human artifacts at all, as in the case of the ontology of a macaque monkey brain (Metzinger and Gallese this vol.). In that third case the hardness question quoted above does not even arise (after all, it is the author of an ontology who has to answer ontological questions) and in the second case there is a clear tendency towards Answer II (authoring an aircraft spare parts ontology will rarely benefit from an ambition to answer deep questions). With respect to ontologies from non-philosophical disciplines, the controversy about the definition is mostly restricted to computer science (cf. Sections 2.3. and 2.4. below). The reason has to do with the prevailing conceptualization (or domain ontology) of language within that field. In principle, answering the question of what there is in a given domain may take any of various forms the outcome of a process of stocktaking may take: term lists, thesauri, glossaries or what have you. According to the traditional view, the linguistic items in these data structures are but strings of bytes and not full-fledged linguistic signs with form, structure, and content (cf. Farrar this vol.). So computer scientists tend to emphasize that a real ontology has to be much more than a mere inventory of items, i.e. lists of strings (cf. Section 2.3. below), and the controversy is mostly about what additional ingredients are required for an enriched inventory to count as an ontology. 
The fact that philosophers have never thought about these kinds of intricacies shows only that they have always taken the inferential potential of concepts for granted, whereas in computer science it takes a whole machinery to get it going. We will conclude this section with Figure 3 that illustrates some of the dimensions of variation we have presented above. In the rows from top to bottom it shows very general, more special and very special variants of Ontology and ontologies, in the columns from left to right (i) the corresponding fields, (ii) examples for specific cases of accounting for what there is in a domain, and (iii) the purview or intended domain of such a specific account, and in the depth from front to back realist-objectivist and more idealist-subjectivist approaches. Not shown in the figure are the following aspects of ontologies we have also addressed briefly above: origin (author or evolutionary process), user, purpose and quality.
Figure 3. Example of variation dimensions and aspects of ontologies
[Three-dimensional diagram. Vertical axis: generality (maximal, submaximal, intermediate, minimal). Depth axis: objectivity. Horizontal axis (level): field; specific account; purview. Entries from top to bottom:]
Ontology; ontologies (objectivist); entities in themselves
Ontology; ontologies (subjectivist); entities for us
Possible World Theory; Lewis' possible world realism; all possible worlds
Classical Ontology; Aristotle's categories; the world in itself
Idealist Ontology; Kant's categories; the world for us
Special Ontology; ontology for mental categories; the human mind
Nematodologic Ontology; ontology for C. elegans; the species of C. elegans
2. Notions of ontology in different disciplines
2.1. Ontology and ontologies in other fields
Having carved up the conceptual space underlying the term ‘ontology’, we will now turn to a discussion of notions of ontology in different disciplines. This includes briefly touching on disciplines that are not directly relevant with respect to linguistics and thus not in the focus of this volume. Following this, we will concentrate on the disciplines and areas that are connected to the ontolinguistics enterprise, specifically on linguistics, computer science, and artificial intelligence.
Ontological questions are metaphysical questions by definition, and etymologically speaking the latter come after the physical questions, so they can be turned back on physical issues. Contemporary theoretical physics is full of ontological questions of the hard kind like the particle-wave duality, quantum ontology or the ontological implications of string theory, so Ontology as a discipline has interesting issues to deal with that come from physics. But physicists do not seem to make much use of ontologies of their domain. The situation is different with the other hard sciences. One of the most successful ontologies of modern times seems to be the periodic table of elements in chemistry. This can be regarded as a paradigm case of an ontology not only because it is highly systematic and useful, but also for its predictiveness: At least two elements, those with atomic numbers 117 and 118, are claimed to exist, but have not yet been successfully attested. Another paradigmatic feature of this ontology is that it is about natural kinds. The same has been assumed for a long time for the species of biology. But it turned out that although there is no doubt that there are different individual species (species taxa as biologists call them) like Homo sapiens, our species, or Canis familiaris, that of the domestic dog, it is doubtful that the species, taken together, form a natural kind or category because there is no criterion that unites them to the exclusion of other taxa (interbreeding competes with common ecological niche and phylogenetic unity, cf. Ereshevsky 2002). Nevertheless, one of the domains that are currently characterized by an incredible boom in ontologies is that of the life sciences.
Bioinformatics and related areas are teeming with web services like Open Biomedical Ontologies (“an umbrella web address for well-structured controlled vocabularies for shared use across different biological and medical domains,” http://obo.sourceforge.net/) or Gene Ontology (“a controlled vocabulary to describe gene and gene product attributes in any organism,” http://www.geneontology.org/). Since the social sciences are only starting to discuss ontological issues (cf. our discussion of Searle’s contribution), it does not come as a surprise that social ontologies are not considered to be very mature so far, with a notable exception: Legal systems are perhaps the most well-developed ontologies in the social world. Most laws are categorizations of objects at some level and most legal disputes turn on distinctions among categories. Because legal systems often comprise the most well-developed ontologies of the social world, they are a good reference for philosophers and social scientists seeking to study social objects. (Koepsell 1999: 219)
To conclude the general overview presented in this section, the increasing use of ontologies in business applications, e.g., workflow ontologies, should not remain unmentioned.
2.2. Notions of ontology in linguistics
Compared to other disciplines, contemporary linguistics assigns the term ontology a rather peripheral role in its domain. So far, only two subfields seem to make systematic use of one or the other member of the family of concepts coded by this term and a third one is just beginning to do so. The younger one is that branch of computational linguistics that systematically takes advantage of ontologies in the AI sense of the term. One of the most recent and most extensive outcomes of this field is the book Ontological Semantics (Nirenburg and Raskin 2004). Its ontology lists the definitions of concepts for describing the meanings of lexical items of natural languages, but also for the specification of the meanings of the text-meaning representations that serve among others the function of an interlingua for machine translation. Other intended applications are information extraction, question answering, human-computer dialog systems, and text summarization. The older linguistic domain that uses the term ontology is model-theoretic formal semantics, which in one of its simplest guises uses the ontology of first order logic: individuals, sets of tuples of individuals and truth values. But progress in the semantic analysis of natural language made it soon obvious that a less parsimonious ontology is required for this purpose and so the number of ontological categories started to grow. 
To point out just a few milestones: Montague (1973) added possible worlds and moments of time to the basic ontology and projected from there, using a recursively defined set of types, an infinite ontology of possible denotations; Link (1983) devised an integrated ontology for individuals and substances by providing both with a semilattice structure; Davidson’s (1967) proposal to grant events the status of a basic ontological category has been welcomed and fruitfully employed in linguistics, and Barwise and Perry’s idea (1983) of enriching the ontology by admitting situations as ‘first-class citizens’ had a similar impact. In short, many contributed to the task of freeing linguistic ontology from the constraints of philosophical ontology. In the abovementioned paper, Godehard Link advocated the view that “reductionist ontological considerations” are “quite alien to the purpose of logically analyzing the inference structures
of natural language” and went on to state the maxim: “Our guide in ontological matters has to be language itself, it seems to me.” (Link 1983: 303–304) In the same spirit Emmon Bach coined the term ‘Natural Language Metaphysics’ (cf. Bach 1986; “. . . what I am doing here is not metaphysics per se but natural language metaphysics,” Bach 1989: 98). He is also to be credited with the most succinct characterization of the difference between the two:12 Whereas the philosopher is interested in answering question (1), the job of the linguist is to find convincing answers to question (2):
(1) What kinds of things are there?
(2) What kinds of things do people talk as if there are?
We have called above the field that revolves around (1) Ontology, consequently we will use the term Language Ontology (with two capital initials) for the endeavors around (2), if language in general is concerned, and language Ontology, e.g., Korean language Ontology, when the focus is on an individual language. It should be clear by now that the subject-matter of Language Ontology does not concern only model-theoretic semantics (it simply becomes visible there most clearly), but should interest every linguist who subscribes to the view that linguistic signs associate perceivable forms with conceptual contents, because these conceptual contents are never isolated in human language users, but integrated into the way they conceptualize their world, their individual ontology. Individual ontologies contain one or more language ontologies, but also something else which is often called commonsense ontology. We submit that Commonsense Ontology, the study of commonsense ontologies, should be defined as being about answering the question (3):
(3) What kinds of things do agents behave as if there are?
There are two reasons for distinguishing Language Ontology from Commonsense Ontology (and the same holds for their lower-case counterparts). The first is that it makes sense to ascribe commonsense ontologies also to subjects that do not have language (like robots, macaque brains, and roundworms), and the second is that the question of the relation between the two is too interesting to be begged by blunt stipulation of their identity. The pertinent keyword is linguistic relativity and the challenge consists in factoring the ontogenesis of individual ontologies into (a) the conceptual default settings babies are born with, (b) the culturally induced development the conceptual
system is subject to, and (c) the effects of the individual language on this development. This points to another discipline that uses the term ontology, not entirely within the confines of linguistics proper, but overlapping with it: developmental cognitive science. A look into the child development literature shows lively research activities in this field. Imai and Gentner, e.g., tested whether the distinction between object names and substance names is based on a pre-linguistic ontological distinction or is driven by language (Japanese and English). Their results lead them to the following speculation: Children begin learning word meanings building on their pre-linguistic ontological knowledge about individuation. Language learning leads children to pay attention to those aspects of the world that are habitually used in their own language, and this influence begins very early. Finally, children’s sensitivity to linguistically-relevant aspects of the world may come to extend beyond the context of language use. (Imai and Gentner 1997: 196–197)
There is no doubt that language ontologies and commonsense ontologies are closely related since every language ontology is “a conceptualization or categorization of what normal everyday human language can talk about” (Zaefferer 2002: 33–34) and this is largely determined by the requirements of everyday life. In other words, both primarily contain concepts of entities encountered in everyday life (for an overview of what that could and should comprise, cf. Zaefferer this vol.) and their relations. These are the concepts for the expression of which natural languages tend to readily provide codings, be they simple or complex. Examples include concepts such as CAR or RUN , the most compact codings of which in English are the noun car and the verb run (more complex codings like motorized vehicle with wheels or go faster than a walk are reserved for special purposes), and relations such as the conceptual subordination of CAR under VEHICLE and of RUN under MOVE. Systems for the representation of word semantics such as WordNet (Fellbaum 1998, cf. also Fellbaum this vol.) are based on sense relations and thus reflect the underlying language ontology, since sense relations are relations between words (in a reading) based on ontological relations between the concepts that constitute the meanings of these words (in that reading). Lexical semantics is not the only example of linguistic research that needs to take the corresponding language ontology into account. Another case in point is work on classifiers – be it in the context of classifier systems in different spoken languages (cf. again Imai and Gentner 1997 and Hellwig this
vol.) or of classifier predicates in signed languages (Talmy this vol.). Here, the underlying conceptual categorization of entities is responsible for the use of different classifier morphemes or predicates. A further example is the almost trivial observation that different word classes tend to reflect different conceptualizations in that verbs usually code eventities whereas non-derived nouns most often code ‘thing-like’ entities (called ‘inventities’ in Zaefferer this vol., and ‘ineventities’ in Schalley 2004). Based on this, adjectives code mostly characteristics or attributes of the latter, whereas adverbs do the same for the former. A fourth kind of linguistic studies where language ontology plays a role is anaphora resolution. In a sentence like When you try to catch a lizard, the reptile may drop the tail and escape both definite noun phrases are anaphorically related to the indefinite a lizard, but the relationship is mediated by ontological relations of different kinds (for the definitions compare Schalley and Zaefferer this vol.): Since LIZARD is c-subordinated to REPTILE (every lizard is a reptile), the reptile may have its antecedent in a lizard, and since LIZARD is m-i-superordinated to TAIL (every complete lizard has a tail as integral part), the tail may be interpreted as including a possessor slot which again has its antecedent in a lizard. Given the ontolinguistic framework the former case could be called conceptual subanaphor and the latter meronomic superanaphor. Our final example showing the relevance of ontological knowledge for the proper use of language is the grammar of coordination. 
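The lizard example of anaphora resolution above can be sketched as a lookup over two kinds of ontological relations. This is a toy illustration only: the concept inventory, the dictionaries `IS_A` and `HAS_INTEGRAL_PART`, and the function names are hypothetical, and the inheritance of integral parts from c-superordinates is an added assumption. The sketch shows how a c-subordination link licenses the reading labelled conceptual subanaphor and a meronomic link licenses the reading labelled meronomic superanaphor.

```python
# Hypothetical toy ontology; the concepts and links are illustrative only.
IS_A = {                      # c-subordination: key is c-subordinated to value
    "LIZARD": "REPTILE",
    "REPTILE": "ANIMAL",
    "DOG": "ANIMAL",
}
HAS_INTEGRAL_PART = {         # m-i-superordination: key has value as integral part
    "LIZARD": {"TAIL"},
    "DOG": {"TAIL"},
}


def c_superordinates(concept):
    """Return all concepts the given concept is c-subordinated to."""
    result = []
    while concept in IS_A:
        concept = IS_A[concept]
        result.append(concept)
    return result


def anaphoric_link(antecedent, anaphor):
    """Name the ontological relation that mediates the anaphoric link,
    or return None if the toy ontology provides none."""
    supers = c_superordinates(antecedent)
    if anaphor in supers:
        return "conceptual subanaphor"
    # Assumption: integral parts are inherited from c-superordinates.
    for concept in [antecedent] + supers:
        if anaphor in HAS_INTEGRAL_PART.get(concept, set()):
            return "meronomic superanaphor"
    return None
```

Under these assumptions, `anaphoric_link("LIZARD", "REPTILE")` finds the c-subordination link behind the reptile, and `anaphoric_link("LIZARD", "TAIL")` finds the meronomic link behind the tail.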
A precondition for the coordination of phrases as well as sentences is that the conjuncts are parallel with respect to syntax, semantics, and prosody (Lang 1984), where semantic parallelism is defined by two constraints: (a) the concepts coded by the coordinated elements have to be semantically independent, i.e., neither of them is c-subordinated to the other, and (b) there has to be a non-trivial subordinator, a third concept that is c-superordinated to both. So my dog and my animal and a walk and an integer are both semantically bad noun phrases, the first for violating (a), since DOG is c-subordinated to ANIMAL, and the second for violating (b), since the strongest common c-superordinate of WALK and INTEGER is probably ENTITY and so it could not be more trivial.
As mentioned at the beginning of this section there is a third subfield of linguistics that makes use of the term ontology or rather is just beginning to do so. It is the field of foundations of linguistic theory together with the field that deals with linguistic terminology. Whereas the former is concerned with questions various other disciplines like philosophy of language and philosophy of science are also interested in, the latter has among other things to keep pace with all the terminological innovations that keep growing in the different schools of linguistics around the globe. And still they are united by a common interest in what will be called ‘Ontology for linguistics’ (or ‘ontologies for linguistics’) here in order to distinguish it from Language Ontology and its kin. Continuing our strategy we will characterize also this field by its leading question:
(4) What kinds of things linguists talk about are there?
Strictly speaking, since the study of linguistic terminology is about linguistic metalanguage, a variation of leading question (2) above, characterizing Language Ontology, would seem to be more adequate for this field, namely (5):
(5) What kinds of things do linguists talk as if there are?
However, given the scientific ambition of linguistics, a separate investigation of (5) without consideration of (4) will not be satisfactory. So the notion ‘ontology for linguistics’ refers to those conceptualizations of the domain of language and languages that are used to ‘talk linguistics’, to express and describe linguistic phenomena with the help of the corresponding concepts and the relations between them. The linguistic codings of these concepts are often, but by no means exclusively, technical terms of linguistics. Examples of such concepts include WORD CLASS, SPEECH ACT or EVENTITY (coded, e.g., in German linguistic terminology by the nouns Wortart, Sprechakt and Eventität, respectively), and CONCEPT (coded by the English word concept as used in this chapter), but also relations such as the conceptual incompatibility between ARTICLE and VERB (i.e., it is not conceivable that some linguistic entity is both an article and a verb). Interestingly, a certain ontology of linguistics is also part of natural language ontology, since the codings of corresponding concepts such as SAY, WORD or QUESTION are presumably part of every natural language (cf. also Goddard this vol.; Zaefferer this vol.). Some of these concepts have a special status in any language ontology as they are instantiated by linguistic signs, which in turn have instantiations that refer to a concept. The concept WORD, e.g., is instantiated among others by the English word word, whose tokens have the potential to activate in hearers mental representations of the concept WORD.¹³ Since the same holds for many concepts coded by technical terms of linguistics, it is obvious that any ontology for linguistics has to include a meta-language ontology, that is, a language ontology for the language that is used to describe linguistic phenomena (cf. Farrar this vol. and Zaefferer this vol.).
There are at least two reasons why both explicitly spelled-out language ontologies and well-defined ontologies for linguistics are urgent desiderata in current linguistics. First, it is a truism that precise descriptions of linguistic phenomena without precisely defined technical terms are impossible. And second, only with the help of these tools can linguists reliably compare and compile different descriptions within a language and across languages. However, there are still many areas in linguistics that are characterized by confusion and disagreement on the terminology used – to pick just one example, the area of information structuring with all kinds of uses for words such as ‘topic’, ‘given’ or ‘background’ – and therefore it is often far from clear if different authors and schools presuppose different ontologies for linguistics or if only the labels vary. Given these circumstances it is highly welcome that projects like GOLD (‘General Ontology for Linguistic Description’, cf. Farrar this vol.) or DOLPhen (‘Domain Ontology for Linguistic Phenomena’, cf. Zaefferer this vol.) are under way.
Quite a few terminological problems that arise in linguistics are due to a lack of awareness of ontological differences. Consider sense relations for instance. Sense relations structure the lexicon in that they reflect conceptual relations that hold between the readings of lexemes. In talk about sense relations the distinction between conceptual relations and relations between the corresponding linguistic signs is often blurred or not drawn at all. Whereas the relation between CAR and CHASSIS is a conceptual one – a meronomic relation – the corresponding relation between any linguistic codings of those concepts, e.g. between the English nouns car and chassis, is a semantic relation – a meronymic one.
Another example would be the hyponymy relation between the English nouns car and vehicle, which holds because of the conceptual subordination of CAR under VEHICLE. Yet, the role of the corresponding language ontology is typically left implicit, in that criteria for the sense relations are formulated in terms of linguistic characteristics or ‘meaning’ (cf. Cruse 1986, for instance, or Schalley 2004: 27–29), but not in ontological terms. If meanings are just concepts that happen to be coded by a given meaning-bearing entity of a given language, for instance a word, then of course meaning relations are just ontological relations. But linguists are rarely aware of the fact that relations of this kind hold irrespective of how the related concepts are coded and irrespective of whether they are coded (and therefore meanings) at all. It is as if people spoke of manned space capsule A being in love with manned space capsule B when what they mean is that the man aboard A is in love with the woman aboard B.
In summary, it appears that, in order to improve the analysis of linguistic facts, linguists would need to give yet more attention and weight to the study of underlying conceptualizations (including their interconceptual relations) both in the users of their object languages and in themselves as users of the linguistic metalanguage. This area seems to have the potential for considerable progress through explicit and systematic investigation of the language-ontology interface. The present volume aims to put corresponding current efforts into a broader context and to instigate a more systematic approach to ontologies in general, and to language ontologies as well as ontologies for linguistics in particular, by promoting an ontology-driven approach to linguistics and thus by arguing for and exemplifying what we are calling ontolinguistics. Given that the construction and maintenance of ontologies by hand quickly becomes cumbersome with increasing size, it seems reasonable to consider using corresponding tools from computer science, and therefore the next section presents an outline of the state of the art in the relevant subsections of this thriving discipline.
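The distinction drawn in this section between conceptual relations and relations between their linguistic codings, together with the two coordination constraints attributed to Lang (1984), can be illustrated with a small sketch. It is only a toy model under stated assumptions: the concept inventory, the German codings, and all function names are invented for illustration and are not taken from any of the ontologies discussed in this volume.

```python
# Toy language ontology; every concept and link below is an illustrative assumption.
C_PARENT = {  # c-subordination: concept -> its direct c-superordinate
    "CAR": "VEHICLE", "VEHICLE": "ARTIFACT", "CHASSIS": "ARTIFACT",
    "ARTIFACT": "ENTITY", "DOG": "ANIMAL", "CAT": "ANIMAL",
    "ANIMAL": "ENTITY", "WALK": "ENTITY", "INTEGER": "ENTITY",
}
PART_OF = {("CHASSIS", "CAR")}  # meronomic relation: (part, whole)

# Codings: (language, word) -> concept. The relations above hold
# independently of whether and how the concepts are coded here.
LEXICON = {
    ("en", "car"): "CAR", ("en", "vehicle"): "VEHICLE",
    ("en", "chassis"): "CHASSIS",
    ("de", "Auto"): "CAR", ("de", "Fahrzeug"): "VEHICLE",
}


def superordinates(concept):
    """Return all c-superordinates of a concept, nearest first."""
    chain = []
    while concept in C_PARENT:
        concept = C_PARENT[concept]
        chain.append(concept)
    return chain


def c_subordinated(a, b):
    """True if concept a is c-subordinated to concept b."""
    return b in superordinates(a)


def hyponym(lang, word_a, word_b):
    """Sense relation between words, derived from the conceptual relation."""
    return c_subordinated(LEXICON[(lang, word_a)], LEXICON[(lang, word_b)])


def meronym(lang, part_word, whole_word):
    """Meronymic relation between words, derived from the meronomic one."""
    return (LEXICON[(lang, part_word)], LEXICON[(lang, whole_word)]) in PART_OF


def coordinable(a, b, trivial="ENTITY"):
    """Check constraint (a), independence, and constraint (b), a
    non-trivial common c-superordinate, for two concepts."""
    if c_subordinated(a, b) or c_subordinated(b, a):
        return False  # violates (a)
    sups_b = set(superordinates(b))
    common = [c for c in superordinates(a) if c in sups_b]
    return bool(common) and common[0] != trivial  # (b)


print(hyponym("en", "car", "vehicle"), hyponym("de", "Auto", "Fahrzeug"))
print(coordinable("DOG", "ANIMAL"), coordinable("WALK", "INTEGER"),
      coordinable("DOG", "CAT"))
```

The hyponymy test gives the same answer for the English and the German codings because it consults only the conceptual layer, which is the point made above about relations holding irrespective of coding.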
2.3. Ontologies in computer science: A survey
As indicated above, the term ontology is widely used by the computer science community and there it refers broadly speaking to the construction of information models. In computer science an information model is an abstract formal representation of entities that includes their properties and the relations that hold between them. By contrast with data models, information models represent their entities without any specification of implementation issues or protocols for data transportation. Among computer scientists the word ontology has received such a broad use that it has been employed to refer to any information model. It is necessary for the purpose of the following to constrain that usage. More specifically, we will understand by ontology in the computer science sense a specification in a formal language of terms and definitions describing things that make up the world.
A key component of this definition of ontology is the phrase ‘a formal language’: Different degrees of formality are exhibited by different information modeling languages. Figure 4 presents a set of such modeling languages along a continuum (excerpted from Ray 2004). We will also briefly describe them in order to delineate what counts as ontology in computer science.
[Figure 4. Information modeling languages. Levels along the continuum, from top to bottom: Terms; ‘Ordinary’ Glossaries; Ad hoc Hierarchies (Yahoo!); Data Dictionaries (EDI); Thesauri; Structured Glossaries; XML DTDs; Principled, informal hierarchies; Database Schema; XML Schema; Formal Taxonomies; Frames (OKBC, Protege); Data and Process Models (UML, ORM); Description Logic-based (DAML+OIL); KIF, OCL, OWL. The lower end of the continuum is labeled ‘formal ontologies’.]
Terms refers to a controlled and usually domain specific vocabulary. ‘Ordinary’ Glossaries are terms with natural language definitions, as one finds in many textbooks. Ad hoc Hierarchies such as Yahoo! are sets of terms with a relationship between terms, but where no formal semantics for that relationship is defined. Data Dictionaries are more formal models of information, often of relational databases, where each term and relation has an associated natural language definition (EDI stands for standardized Electronic Data Interchange). Since Roget’s pioneering work a Thesaurus is a word list that is not ordered alphabetically but according to conceptual relations. Structured Glossaries may include relationships among the terms in the glossary. XML DTDs are Document Type Definitions in eXtensible Markup Language (Yergeau et al. 2004), used for communication among software systems. XML supports nested, or hierarchical information structures, but is a language for defining syntax that has no associated constraints on semantics. Principled Informal Hierarchies are those which do not have a formal, logical model of relations between terms, but at least have an informal, common-sense explanation of relationships. DB Schemas are Data Base structures that have more formal definitions of the meaning of terms and relations, usually by employing statements in a database constraint language.
XML Schema (with a capital S) is a further development of the XML DTD and is now the way to specify XML-based communication that is recommended by the World Wide Web Consortium. Formal Taxonomies are those which have a formal, logical semantics for the relations among terms. Frames include a range of standard AI languages that have terms, relations, and inheritance of properties. Examples are the Open Knowledge Base Connectivity (OKBC) protocol (Chaudhri et al. 1998) and the ontology editor and knowledge acquisition system Protégé. Data and Process Models couple taxonomies and defined relationships with a semantics for representing process and action. UML, the Unified Modeling Language for specifying the design of object-oriented systems (Object Management Group 1997–2006, cf. also Schalley this vol.) and Object-Role Modeling (ORM) exemplify this kind of information model. Description Logic (Baader et al. 2003) languages like the DARPA Agent Markup Language (DAML), the Ontology Inference Layer (OIL), and their merger DAML+OIL combine the knowledge representation elements of a frame system with the ability to define rules; they are sublanguages of predicate logic. The expressiveness of rules is limited in order to ensure that inference on the rules is tractable. KIF, OCL, and OWL refer to three very expressive languages. Knowledge Interchange Format (KIF) (Genesereth 1991) is a first order logic, for which there have been several versions that differ in their details. Object Constraint Language (OCL) is part of UML (see above). The Web Ontology Language (OWL) (McGuiness and van Harmelen 2004) is a formal logical language, with similar expressiveness to KIF, that conforms to XML syntax. It should be noted that there are languages with far higher expressiveness including modal logic and various higher-order logics (Nadathur and Miller 1998) that exist further down along the continuum shown in Figure 4.
They present significant challenges for practical inference however, and to date they have been used primarily for research in theorem proving rather than specification of ontologies. It is only models that make use of the full features of the languages in the bottom of the diagram, from ‘Frames’ onward, that can be called ontologies in a way consistent with our definition. In order to distinguish ontologies in computer science (and artificial intelligence) – which are based on a formal language – from ontologies in a more general sense, we will refer to the former as ‘formal ontology’ hereafter.
A formal ontology is distinct from the most common instance of a set of terms and definitions: the dictionary. A dictionary does not employ a formal language, but rather an informal one: a human natural language. A dictionary is meant to be read and interpreted by humans. No machine is currently capable of understanding a dictionary in any realistic sense of the word ‘understanding’. Furthermore, a dictionary is descriptive. It provides definitions which are presumably appropriate at a point in time, often with annotations about the usage of words up to the time of publication. Language evolves through an organic process and all attempts to render language static, such as the efforts of the Académie Française, have failed and will continue to fail to a significant degree. In contrast, a formal ontology is prescriptive or normative. It states definitively what a given term means in a formal language. A term in an ontology is not a word but a concept, although the concept will normally be given a name which is a word or combination of words in order to support human understanding of the ontology. A true formal ontology however could have all its term names replaced with arbitrary codes and still have the same formal properties. The only issue would be how such an ontology relates its terms to linguistic items in order to make its results of processing intelligible and useful to humans.
One of the issues within the ontolinguistics enterprise is indeed this: how can relations between ontologies – as they are used in computer science – and linguistic expressions be established? Although any such relation will be imperfect, the degree of precision of relation and scope of coverage has been improving since greater bodies of formal ontologies, lexical resources, and corpora became available. The most prominent lexical database is WordNet (Fellbaum 1998), and there are efforts to create similar resources in languages other than English, often with relation to the English WordNet. Such resources focus on the smallest lexical units, which are usually words, although multi-word units are also present in small numbers (cf. Fellbaum this vol.).
Collections of larger, phrasal units have been proposed (Pease and Fellbaum 2004) and the collection of lexical functions proposed by Mel’čuk (1998) – universal relations between lexical items including the standard sense relations – has been studied. There is significant potential for the ontology community to make use of work undertaken to catalog closed-class elements of language. Such elements may be considered to have a significant place in communication due to their presence as structural features in languages, as opposed to the elements of the open-class or lexical subsystem (cf. also Talmy this vol.). A more recent effort to relate a formal ontology to WordNet (Niles and Pease 2003) is also described in this volume (Pease this vol.).
However, lexical resources and formal ontologies are very different artifacts. For instance, over the past few years there have been many publications that describe ‘fixes’ to the WordNet taxonomy according to ontological principles (Gangemi et al. 2002b). Fundamentally, these are misguided, since language, as an organic system, does not conform to ontological principles. Once this distinction is recognized, however, there is great value in relating language to ontology for use in a broad range of applications and research endeavors. One innovative effort in this volume (Farrar this vol.) makes use of a formal ontology to describe structural linguistic information itself, an approach that has been touched on from a linguistic perspective in the previous section.
There is an important distinction in Formal Ontology (and also Ontology more generally, as has been discussed extensively in Section 1.) between the language in which an ontology is expressed and the content or semantics of the ontology itself. A much larger proportion of effort in the Ontology community in computer science has gone into the development of languages as well as tools and methods, compared to the level of effort that has gone into the creation of content. One aspect connected to the creation of content is the semantic scope and degree of generality an ontology exhibits (cf. also the discussion in Section 1.). This naturally applies to formal ontologies as well – which are characterizable by whether they cover very general concepts, as in an upper ontology, or very specific topics, as in a domain specific ontology. Most extant formal ontologies pertain to fairly narrow topics or domains (Casati and Varzi 1995; Grüninger and Menzel 2003), although these can be of great interest and value. The authors are aware of only three formal ontologies that have attempted to define the broadest and most general notions, which collectively may be termed an upper ontology. These are the Suggested Upper Merged Ontology (Niles and Pease 2001), Cyc (Lenat 1995), and DOLCE (Gangemi et al. 2002a).
We should note that the distinction between upper ontology and domain specific ontology, however, is a continuum without a clear dividing line. While the relation temporallyBefore is certainly an upper ontology concept and the class Carburator is certainly a domain specific one, there are many concepts in between such extremes that do not have such an obvious membership.
Another way of characterizing formal ontologies is the number of terms in the ontology. Terms may be classified into, amongst others,
– instances, like KofiAnnan and Germany;
– classes, like Human and Country;
– relations, like agentOf and mother;
– function terms, like GovernmentOf and AdditionFunction.
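As a rough illustration of this classification, the following sketch records the example terms just listed together with their kind and produces a raw count per kind. The `Term` and `TermKind` types and the `count_by_kind` helper are hypothetical names introduced only for this sketch; they do not belong to any particular ontology toolkit.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class TermKind(Enum):
    INSTANCE = "instance"
    CLASS = "class"
    RELATION = "relation"
    FUNCTION = "function"


@dataclass(frozen=True)
class Term:
    name: str
    kind: TermKind


# The example terms named in the list above.
TERMS = [
    Term("KofiAnnan", TermKind.INSTANCE),
    Term("Germany", TermKind.INSTANCE),
    Term("Human", TermKind.CLASS),
    Term("Country", TermKind.CLASS),
    Term("agentOf", TermKind.RELATION),
    Term("mother", TermKind.RELATION),
    Term("GovernmentOf", TermKind.FUNCTION),
    Term("AdditionFunction", TermKind.FUNCTION),
]


def count_by_kind(terms):
    """Raw count of terms per kind; says nothing about how well they are defined."""
    return Counter(term.kind for term in terms)


counts = count_by_kind(TERMS)
print(len(TERMS), {kind.value: n for kind, n in counts.items()})
```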
However, a count of terms can only be a meaningful metric of the size of an ontology if the terms counted include significant associated definitions (through which the terms are sufficiently interrelated). Yet another metric is the number of axioms, which are indispensable for inference and expressive power. An axiom is any formal statement. Such statements may be
– simple ground statements, like ‘Kofi Annan is Secretary General of the UN’, and ‘Germany is a country’;
– quantified statements, like ‘there exists some farmer who beats his donkey’;
– rules, like ‘Every good boy loves his mother’.
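One possible first-order rendering of these three kinds of statement is sketched below. The predicate and function symbols (for instance owns, standing in for the possessive ‘his’, and motherOf) are illustrative assumptions and are not drawn from any of the ontologies cited in this chapter.

```latex
\begin{align*}
&\mathit{SecretaryGeneralOf}(\mathit{KofiAnnan}, \mathit{UN}), \qquad
 \mathit{Country}(\mathit{Germany})
 && \text{(ground statements)}\\
&\exists x\, \exists y\, \bigl(\mathit{Farmer}(x) \land \mathit{Donkey}(y)
 \land \mathit{owns}(x, y) \land \mathit{beats}(x, y)\bigr)
 && \text{(quantified statement)}\\
&\forall x\, \bigl(\mathit{GoodBoy}(x) \rightarrow
 \mathit{loves}(x, \mathit{motherOf}(x))\bigr)
 && \text{(rule)}
\end{align*}
```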
Note that we have stated here, in informal English, examples that would be expressed in a formal language in a formal ontology.
2.4. Ontologies and artificial intelligence
According to a prevalent definition by Luger and Stubblefield (1993), Artificial Intelligence (AI) is the branch of computer science that is concerned with the automation of intelligent behavior. Regardless of whether this effort is based on the abstract concept of rationality or involves mimicking the human mind, any truly intelligent computer system surely requires the capability to acquire, process and use knowledge about the domain it is situated in or concerned with. Whereas until the early 1980s this capability was usually associated with the presence of some knowledge storage facility and logical reasoning (as in so-called expert systems), there is no longer a consensus in AI research that information processing requires an explicit collection of computational knowledge (a so-called knowledge base), possibly including some domain conceptualization in the form of a formal ontology. Approaches in some contemporary AI subfields such as connectionism or situated intelligence usually lack any symbolic knowledge representation, and certain popular AI methods such as Q-learning get along without any (explicit) world model. Nevertheless, the presence of a knowledge base (whether based on logic or another knowledge representation format such as Bayesian networks) can strongly increase the flexibility and adaptability of an intelligent system by virtue of the separation of knowledge collection and knowledge-based reasoning on the one hand, and planning, searching and acting capabilities on the other, as pointed out in detail in Section 2.4.1. In addition, the ability to represent and process information symbolically can be a prerequisite for the exchange of information with others (see below). Consequently, the majority of AI frameworks still comprise some kind of more or less powerful knowledge base, and an increasing number also provide some facility specifically dedicated to concept knowledge. The latter is the case for basically the same reasons why ontologies are used in other disciplines and in ‘ordinary’ computer science too (e.g., to be able to reuse rather general, abstract and persistent domain theories in different tasks of knowledge processing). But in contrast to other areas of computer science, in AI the foci are on the computational reasoning about/using (ontological) knowledge, on the (computationally) intelligent acquisition and revision of new knowledge, and on the computational use of (ontological) knowledge for decision making. In addition, ontologies in AI are a very useful and often indispensable means for the communication and collaboration of intelligent systems, including the interaction of humans with machines and the machine-supported interaction of humans. In such settings a common informational ground needs to be found in order to facilitate understanding and cooperation (and even conflicts), and agreed-upon ontologies partially provide this common ground in terms of conceptualizations of the respective common domain. This is one reason for the eminent importance of ontologies in Distributed AI, a fast-growing subfield of AI concerned with the interaction of intelligent systems, as described in detail in Section 2.4.2.
2.4.1. Knowledge bases
The notion of a knowledge base, or a collection of facts and rules, along with inference procedures to make use of those rules, has a long history in the field of AI (Pease, Liuzzi, and Gunning 2001). The point of this research area has been to decouple declaratively specified knowledge from procedural code, allowing a software system to behave more intelligently, and less mechanically, by dynamically combining small chunks of knowledge to reach an answer. Whereas a conventional software system would have specified a series of operations to be performed in a certain order, a knowledge base system has a generic inference process that can opportunistically apply a range of declaratively specified knowledge in order to reach different answers to different queries.
Knowledge base systems are also called expert systems in part because this work was primarily undertaken to provide a software-based expert in some field. One of the earliest expert systems was Mycin (Buchanan and Shortliffe 1984), which was designed to diagnose infectious blood diseases. It achieved a level of competence that was better than that of most human experts. Notably, despite its competence, it was not put into commercial use because of social and human factor issues. This, in part, spawned a whole new field of research to address these ‘soft’ considerations in the successful application of software systems. Mycin did not originally have a completely clear separation of knowledge from inference procedures. The Emycin project (van Melle 1982) was an attempt to make that separation clear by creating a more general expert system shell that could be used on a more widely varied set of knowledge. Many companies created and sold expert system shells, some derived directly from the Emycin work.
As expert system projects proliferated in the 1980s, it became clear that although inference processes could be reused, knowledge often had to be recreated for each new application. The problem was that simplifying assumptions were often built into the knowledge. These assumptions were invariably appropriate for one domain, but not another. One way of interpreting this larger issue of assumptions was called the frame problem (McCarthy and Hayes 1969). The real world is large and complex. It is not practical to model every feature of the world, especially when a project has a focused goal such as diagnosing blood diseases. This tension between the need to focus knowledge creation on the task at hand, and to make that knowledge as reusable as possible, has spawned the Ontology sub-field of AI.
One head-on approach to this problem has been to attempt to encode all the common sense knowledge of the world (Lenat 1995). More modest efforts have been to create knowledge that is at a level of generality and reusability that is simply greater than most applications. Creating such knowledge however has proven difficult, and there are only three major attempts to create formal upper level ontologies (cf. above). Most research in this area has focused on tools such as Protégé (Eriksson et al. 1999), languages such as KIF (Genesereth 1991) and OWL (McGuiness and van Harmelen 2004), and processes as described in Guarino and Welty (2002), rather than on the knowledge itself.
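The decoupling of declaratively specified knowledge from a generic inference process, as described in this section, can be made concrete with a minimal sketch. It uses naive forward chaining over one-place predicates, which is a deliberately simple stand-in and not the inference method of Mycin, Emycin, or any other system mentioned here; the facts, rules, and function names are all invented for illustration.

```python
# Declarative knowledge: facts are (predicate, individual) pairs, and each
# rule reads "if all premise predicates hold of x, the conclusion holds of x".
# All facts and rules are invented toy examples.
ZOOLOGY_FACTS = {("lizard", "rex")}
ZOOLOGY_RULES = [
    (("lizard",), "reptile"),
    (("reptile",), "animal"),
]


def forward_chain(facts, rules):
    """Generic inference: keep applying rules until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        individuals = {individual for _, individual in derived}
        for premises, conclusion in rules:
            for individual in individuals:
                holds = all((p, individual) in derived for p in premises)
                if holds and (conclusion, individual) not in derived:
                    derived.add((conclusion, individual))
                    changed = True
    return derived


def ask(query, facts, rules):
    """Answer a yes/no query against everything derivable from the knowledge."""
    return query in forward_chain(facts, rules)


# The same procedure is reused unchanged with a different body of knowledge.
VEHICLE_FACTS = {("car", "herbie")}
VEHICLE_RULES = [(("car",), "vehicle")]

print(ask(("animal", "rex"), ZOOLOGY_FACTS, ZOOLOGY_RULES))      # True
print(ask(("vehicle", "herbie"), VEHICLE_FACTS, VEHICLE_RULES))  # True
```

Only the two data structures change between the two queries, which mirrors the idea of an expert system shell: the inference code is reused, while the knowledge has to be supplied anew for each domain.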
2.4.2. Ontologies in distributed AI: Issues and selected approaches
The last few years have seen a tremendous rise of interest in ontologies for use in distributed settings with multiple, interacting participants – especially so-called open environments like the Semantic Web, open multiagent systems and peer-to-peer systems. Such environments can be characterized by the following properties, which might be more or less distinct depending on the concrete application domain:
– Heterogeneous set of autonomous participants, with only few restrictions for participation. The participants operate basically self-interested towards their individual and often hidden goals. Neither the concrete set of participants nor their capabilities, beliefs, and intentions are known beforehand.
– The knowledge domain is highly dynamic and heterogeneous.
– Initial and possibly persistent nonexistence of a commonly agreed, single ‘truth’ (and thus of ‘knowledge’ in an objectivist sense), absence of a central instance for the enactment of behavioral and informational norms.
Applications in such environments require a shared domain semantics in order to support a mutual understanding among the distributed participants. Computational ontologies constitute a popular response to this need. The use of ontologies in such settings focuses thus mainly on the enabling of knowledge communication, sharing, and reuse by means of the generation and provision of a conceptual common ground among the interacting parties. In this regard, we distinguish two probably overlapping classes of participants, namely ontology sources and users, both human as well as artificial agents, but also ‘passive’ ontology sources like web documents. As stated earlier, ontologies are usually defined as formal representations of domain conceptualizations, focusing on consented and stable concepts. In the following, we will describe issues which arise in open environments from applying this traditional understanding of ontologies, and present selected research efforts in response to these issues. For lack of space, we cannot give an exhaustive overview of the research field. Rather we would like to provide a short, concise description of key properties of a relatively small group of formal and technical frameworks, selected from a large list (with often equally useful other approaches), in order to provide a starting point for further reading.
2.4.2.1. Issues
While the compilation, integration, and sharing of information among heterogeneous information sources and users becomes more and more important, a large number of contemporary frameworks and tools for the modeling and usage of ontologies still proceed on the assumption of static, fully consented and authoritative ontologies. This reflects a sort of dilemma: On the one hand, ontologies should ease the reuse of knowledge and the sharing of knowledge among distributed parties, and thus should be stable and agreed; on the other hand, ontologies (being a special kind of knowledge) are themselves subject to difficulties known from the field of knowledge sharing, e.g. arising from controversial viewpoints. The following problems are considered to be the most prominent ones arising in distributed settings with heterogeneous ontology sources. Further potential issues and examples can be found in Tamma (2002) and Staab and Studer (2003).
1. Ontology sources operating with mutually incompatible representation languages. This issue is likely to become less severe in the near future, since the standardization of representational aspects progresses rapidly, mainly driven by the Semantic Web effort. But there still is no consensus regarding the adequate degree of expressiveness required for a common formal ontology language (taking into account aspects like logical decidability). This is one reason why the W3C’s description logic based ontology language OWL (cf. Section 2.3.) comes in three variations – OWL Lite, OWL-DL (the most often used variant of OWL which is equivalent to a certain prevalent description logic), and OWL Full – differing in expressiveness, not to mention languages such as SWRL which extend OWL with the ability to represent rules. OWL-DL can be considered as the current quasi-standard for the representation of web ontologies, since it is considerably more expressive than OWL Lite while corresponding (in contrast to OWL Full) to a decidable variant of description logic.
2. Homonymy and synonymy. E.g., i) the same name is used for different concepts (because of context-dependency of the name, for instance, as with the word wood which denotes both a collection of trees and their material), or ii) different names are used for the same concept, like car and automobile (Tamma 2002).
3. Incompatible concept coverage, scopes and modeling granularities: E.g., i) multiple concept definitions appear to describe the same concepts, but overlap only partially in fact (for instance, in one ontology, CHAMPAGNE might be a sub-class of WINE, in another it is not), or ii) concepts are modeled in a fine-grained way (i.e., with many sub-classes and/or attributes) in one ontology, but only coarsely in another (for instance, RED WINE and WHITE WINE as the only sub-classes of WINE vs. CHARDONNAY, CHIANTI, BEAUJOLAIS, ... [Tamma 2002]).
4. Incompatible representation paradigms and top-level concepts. E.g., ontology 1 might use the elementary concept EVENT as a top node, whereas ontology 2 subsumes everything under MATTER.
5. Semantic inconsistencies due to stable goal or belief conflicts of the participants. E.g., the first ontology source treats RELIGION as a direct aspect of CULTURE, whereas the second contains an additional concept SUPERSTITION which is used to classify RELIGION. While this is an extreme example, one can easily find further examples from various controversial fields like politics.
6. Communication problems, preventing agreement on and coordination of ontology sources; unreliable ontology sources.
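Issues 2 ii) and 3 from the list above can be illustrated with two toy ontologies, each represented simply as a set of sub-class links. This is a sketch under strong simplifying assumptions: the ontology contents, the hand-made synonym table, and the helper names are invented for illustration, and real ontology sources would of course carry much richer definitions.

```python
# Two toy ontologies as sets of (sub-class, super-class) pairs; contents are
# illustrative assumptions loosely based on the examples in the list above.
ONTOLOGY_A = {
    ("CHAMPAGNE", "WINE"), ("RED_WINE", "WINE"), ("WHITE_WINE", "WINE"),
    ("CAR", "VEHICLE"),
}
ONTOLOGY_B = {
    ("CHARDONNAY", "WINE"), ("CHIANTI", "WINE"),
    ("CHAMPAGNE", "BEVERAGE"), ("AUTOMOBILE", "VEHICLE"),
}

# Hand-made alignment for synonymous names (issue 2 ii): name in B -> name in A.
SYNONYMS_B_TO_A = {"AUTOMOBILE": "CAR"}


def normalize(ontology, renaming):
    """Rename concepts so that synonymous names coincide."""
    return {(renaming.get(sub, sub), renaming.get(sup, sup))
            for sub, sup in ontology}


def concepts(ontology):
    """All concept names mentioned in an ontology."""
    return {name for pair in ontology for name in pair}


def subclass_disagreements(a, b):
    """Links asserted by only one ontology although both know both concepts
    (issue 3 i: definitions that overlap only partially)."""
    shared = concepts(a) & concepts(b)
    return {(sub, sup) for sub, sup in a ^ b
            if sub in shared and sup in shared}


def subclasses(ontology, sup):
    """Direct sub-classes of a concept; differing sets hint at issue 3 ii."""
    return {sub for sub, s in ontology if s == sup}


b_aligned = normalize(ONTOLOGY_B, SYNONYMS_B_TO_A)
print(subclass_disagreements(ONTOLOGY_A, b_aligned))
print(sorted(subclasses(ONTOLOGY_A, "WINE")), sorted(subclasses(b_aligned, "WINE")))
```

Unifying names removes the car/automobile mismatch mechanically, whereas the CHAMPAGNE link is merely detected: deciding which source is right is not something the comparison itself can settle.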
Whereas some of these issues can be resolved at least in principle (e.g., incompatible ontologies could be made compatible just by unifying the names of certain concepts, as with issue 2 ii) above), some other issues might be technically very difficult to resolve (e.g., issues 1 and 3), or the dissolution is impossible on the level of ontology processing at least without disproportional measures that would lead to severe restrictions on the software applicability. For example, semantically inconsistent definitions of the same concept might have their origin in divergent world views of the ontology sources (issue 5); an alignment of these world views, establishing an agreement, would – if practicable at all – lead to a loss in source autonomy and therefore decrease the flexibility and robustness of the application. In addition to the possibility that some of the above issues cannot be resolved, and the possible undesired loss of ontology source autonomy when attempting to get rid of them, there are further considerations to be taken into account when it comes to the integration of ontologies in open environments which have been largely neglected in traditional approaches to ontology integration and sharing. First, stable semantic conflicts are not just something one should get rid of by any means. Instead, conflict knowledge (M¨uller and Dieng 2002) (i.e., meta-knowledge about conflicts) can provide valuable information about the attitudes, world views and goals of the respective knowledge sources. More generally, a set of distributed ontology sources and users forms a social layer
Matthias Nickles et al.
consisting of provenance information and information about the social relationships among ontology contributors and users. The explication and evaluation of this layer can provide the knowledge users with valuable meta-knowledge, and – if made explicit and visible – can be prerequisites for a subsequent resolution of conflicts regarding controversial knowledge. In this regard, it is important to bear in mind that subjective intentions and goals do not just exist for intelligent agents, but also indirectly for other kinds of ontology sources (like web documents), simply by virtue of their human creators.

Second, in the absence of normative meaning governance, and due to the inherently dynamic nature of knowledge in open environments, such mechanisms for ontology integration as filtering, use of trust relationships and most traditional ways of ontology merging can only provide preliminary decisions about the reasonable modeling of domains, because within a heterogeneous group of autonomous ontology sources and users, in the end each user can only decide for himself about the meaning, relevance and correctness of the given information, and these decisions might need to be revised in the course of time.

2.4.2.2. Ontology integration
There are several ways to integrate data from multiple ontologies, namely ontology merging, mapping, and matching (Gruber 1993; Tamma 2002; Staab and Studer 2003). Whereas merging describes the process of creating a new single coherent ontology that includes the information of all merged ontologies, mapping describes a process in which the original ontologies remain separate but are made consistent and coherent with each other, by finding tuples of related concepts and/or by defining mappings that relate concepts within the source ontologies. ‘Matching’ in particular deals with the problem of finding equalities or at least similarities among several ontologies. The central problem for the process of mapping/matching is to identify and compare the meaning of the respective concepts. Subsequently, a merging process can unite equivalent concept descriptions and remove redundant ones. Several clues are available for this task, ranging from the comparison of concept names to the evaluation and comparison of the sub-concepts of a concept and their relations. Even today such processes are still largely conducted by hand, which is time-consuming and often leads to mistakes. In particular, the fast-growing number of distributed ontologies in the Semantic Web will therefore
Ontologies across disciplines
increase the need for (semi-)automated tools in order to support ontology integration. There already exist a number of tools – such as, for instance, OntoMerge (Dou, McDermott, and Qi 2002), OntoMorph (Chalupsky 2000), or Observer (Mena et al. 2000) – supporting automatic or semi-automatic ontology merging or mapping. Their main purpose is to guide the ontology designer (possibly interactively) through the merging/mapping process, to identify inconsistencies or other problems, and to present suggestions on how to proceed.

Advanced ontology and knowledge modeling environments which support multiple users are, e.g., OntoEdit (http://www.ontoknowledge.org/tools/ontoedit.shtml), Protégé (Eriksson et al. 1999), and Sigma (Pease 2003). Analogously, tools exist for the multi-user creation of ontology instances by means of the manual annotation of documents and other data with ontology-based meta-data. Of course, such approaches are of limited value when it comes to the integration of information in large-scale environments like the web.

Software frameworks have been designed in order to support both the expert development of ontologies and the virtual and/or transformational integration and dissemination of heterogeneous ontologies and instance knowledge. A pioneering approach in this regard has been OntoBroker (Decker et al. 1999). A more recent, ambitious example of such a framework is KAON (Bozsak et al. 2002), which integrates available resources and provides tools for the acquisition, engineering, management, and presentation of distributed ontologies and meta-data. If agents are involved in integration frameworks, they are often ‘only’ part of the technical middleware (serving as matchmakers, for instance), rather than being intelligent ontology sources themselves.
One of the few exceptions in this regard is InfoSleuth (Nodine, Fowler, and Perry 1999), where information agents provide ontologies that are used as media for the integration of heterogeneous knowledge contributions. Besides other features, InfoSleuth allows the annotation of knowledge contributions with information about their provenance and the long-term monitoring of knowledge domains.

One of the most important application fields for distributed computational ontologies is Organizational Knowledge Management, which addresses the ontology-supported creation, representation, usage, and distribution of knowledge within complex organizations, such as (large) companies or the government. Most organizational knowledge management systems still aim for the creation of monolithic, centralized, and homogeneous knowledge bases for the collection of corporate knowledge, according to a
single ontology-based organization schema, in order to enable communication and knowledge sharing across the organization. An example of an organizational, agent-based knowledge management framework which in contrast explicitly acknowledges the distributed and social nature of knowledge in large organizations is FRODO (van Elst et al. 2004). FRODO can be characterized as a large-scale meta-knowledge system with ontology-based organizational structures and support for workflow-based knowledge contexts. It makes use of social agents for the management of ontologies, workflow, and personal information assistance in order to relate individuals and organizational concerns.

2.4.2.3. Ontology emergence and uncertainty
A characteristic of open environments is that knowledge domains are dynamic and can often be modeled only with some uncertainty. Probabilistic ontologies (e.g., Giugno and Lukasiewicz 2002) provide the possibility to describe concepts and instances with variable degrees of belief, denoting uncertainty of description logic’s terminological axioms (as opposed to vagueness in fuzzy logic). E.g., in a probabilistic ontology, the modeler can assign the probability 0.3 to the claim that ‘Tomatoes are fruits’. Probabilistic ontologies usually build upon description logic, which can also be used to describe assertional knowledge (i.e., knowledge about concept instances). Besides this, there are many other approaches for the probabilistic enhancement of knowledge bases which could in principle also be used to model uncertain ontologies (e.g., stochastic logic).

The Simple HTML Ontology Extensions (SHOE) (Heflin and Hendler 2000) is a fundamental approach to dynamic ontologies, acknowledging that knowledge on the internet is not static but evolves in time, and that ontologies do not exist in monolithic isolation. SHOE provides a formal framework and an ontology-based knowledge representation language intended for information embedded within web pages (via semantic annotations). It supports ontology revision as a change in the components of an ontology (i.e., the addition or removal of categories and their relationships), and the versioning of subsequently revised ontologies. In this regard, it supports formal techniques like those described above for aligning and integrating multiple ontologies.

A wide research field within that of ontology emergence is that of ontology learning from large unstructured or semi-structured data sets and natural-language documents. Concrete techniques go by names like concept mining and clustering, and build upon well-explored approaches in the areas of data mining and natural language processing. Mostly, though, they are limited to the automatic generation of taxonomical ontologies or word lists. Since this topic is beyond the scope of this article, we refer the interested reader to Staab et al. (2002) and Misskof, Navigli, and Velardi (2002) for details.

Almost all approaches to ontology sharing require an agreement on the respective concepts. If such an agreement is not given, a process of establishing a common ontological ground among the parties has to be executed. If these parties are intelligent software agents, such a process can be performed automatically: Ontology negotiation (Bailin and Truszkowski 2003) enables intelligent agents to cooperate in performing a task, even if their domain knowledge is based on different ontologies. Ontology negotiation allows agents to discover ontological conflicts and to establish a common ground for further communication through incremental mutual requests and interpretations, clarifications, and explanations regarding concept meanings. For this purpose, practical approaches to ontology negotiation usually provide an ontology negotiation protocol and a software infrastructure in order to support the negotiation tasks. Within this protocol, the agents can perform certain speech acts corresponding to the negotiation tasks, like ‘Request Clarification’ (of an unknown concept name) and ‘Confirmation of Interpretation’ (of a given concept definition). In the course of the negotiation process, ideally, the agents come to an agreed categorization that can subsequently be used in knowledge-based communication. Due to its high communicational overhead it is questionable whether this approach can be applied on a large scale, but it is surely a very flexible way of aligning ontologies in dyadic micro-scenarios.
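The negotiation round described above can be illustrated with a minimal sketch. It is not the protocol of Bailin and Truszkowski (2003); the class `Agent`, the function `negotiate`, the feature-set representation of concept definitions, the overlap measure and the threshold are all assumptions made purely for illustration. Only the two speech act names are taken from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical agent whose ontology maps concept names to feature sets."""
    name: str
    concepts: dict[str, frozenset[str]]
    aliases: dict[str, str] = field(default_factory=dict)  # foreign name -> own name

    def knows(self, term: str) -> bool:
        return term in self.concepts or term in self.aliases

    def clarify(self, term: str) -> frozenset[str]:
        """Answer a 'Request Clarification' with the definition of the term."""
        return self.concepts[term]

    def interpret(self, definition: frozenset[str]) -> tuple[str, float]:
        """Pick the own concept that overlaps most with a foreign definition."""
        def overlap(own: frozenset[str]) -> float:
            union = own | definition
            return len(own & definition) / len(union) if union else 0.0
        best = max(self.concepts, key=lambda c: overlap(self.concepts[c]))
        return best, overlap(self.concepts[best])


def negotiate(sender: Agent, receiver: Agent, term: str, threshold: float = 0.5):
    """One negotiation round about an unknown term; returns the message log."""
    log = []
    if receiver.knows(term):
        return log  # nothing to negotiate
    log.append((receiver.name, "Request Clarification", term))
    definition = sender.clarify(term)
    log.append((sender.name, "Clarification", sorted(definition)))
    candidate, score = receiver.interpret(definition)
    log.append((receiver.name, "Confirmation of Interpretation", candidate))
    if score >= threshold:  # assumed acceptance rule
        receiver.aliases[term] = candidate
        log.append((sender.name, "Confirm", candidate))
    else:
        log.append((sender.name, "Reject", candidate))
    return log


a = Agent("A", {"automobile": frozenset({"vehicle", "wheels", "engine"})})
b = Agent("B", {"car": frozenset({"vehicle", "wheels", "engine", "doors"}),
                "boat": frozenset({"vehicle", "hull"})})
log = negotiate(a, b, "automobile")
```

In this toy run, agent B ends up treating ‘automobile’ as an alias of its own concept ‘car’. Even for a single term the exchange takes four messages, which hints at the communicational overhead mentioned above.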
Ontology negotiation acknowledges that different agents might have different world views that need to be aligned communicatively in order to facilitate meaningful further communication. Although this is a technical process steered by a rather simple protocol, this approach is somewhat related to that of linguistic ontological mediation (Bateman this vol.), which contends that besides perception-related ‘world ontologies’ of commonsense concepts there also exists a ‘linguistic ontology’ domain resulting from the construction of ‘reality’ by means of language. It is also related to the concept of purely socially (i.e., communicatively) constructed open ontologies (cf. below).
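To make the notion of a probabilistic ontology from the beginning of this section slightly more concrete, the following sketch attaches degrees of belief to terminological axioms. It is not the semantics of P-SHOQ(D); the dictionary `axioms`, the function `chain_belief`, the second axiom and in particular the independence assumption are hypothetical and serve illustration only.

```python
from math import prod

# Terminological axioms "sub is a kind of super", each with a degree of belief.
axioms = {
    ("Tomato", "Fruit"): 0.3,        # the example claim from the text
    ("Fruit", "PlantProduct"): 0.9,  # hypothetical second axiom
}


def chain_belief(chain, axioms):
    """Belief that every link of a subsumption chain holds.

    Assumes (for illustration only) that the axioms are independent,
    so that the individual degrees of belief can simply be multiplied.
    """
    return prod(axioms[(sub, sup)] for sub, sup in zip(chain, chain[1:]))


belief = chain_belief(["Tomato", "Fruit", "PlantProduct"], axioms)
```

Under these assumptions the belief that tomatoes are plant products by way of being fruits is roughly 0.27; a proper probabilistic description logic would constrain such values through its model-theoretic semantics rather than by naive multiplication.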
2.4.2.4. Ontology assessment
Ahead of any acquisition of ontological knowledge from an external source comes the rating of this knowledge regarding criteria like its credibility. This is especially important if a selection is required from a set of inconsistent ontologies which cannot be integrated, or if ontologies are found in an open environment where their providers cannot be trusted per se. In order to assess the usability of an ontology, ontology-inherent properties like the containment of a certain category and its topical appropriateness have to be examined (which is not dealt with here, since it is not specific to distributed settings), but also external characteristics. Certainly the most important meta-knowledge of this kind is that of the provenance of an ontology (or a part of it).

The co-presence of syntactically and semantically heterogeneous and even inconsistent ontological knowledge by virtue of provenance annotations is supported for instance by the Web Knowledge Base 2 (WebKB-2) (Martin and Eklund 2001). The WebKB-2 server permits web users to add ontological and assertional knowledge to a shared, central base; syntactic and semantic heterogeneity is embraced in order to permit the comparison and mutual completion of knowledge proposed by heterogeneous ontology sources and users. The WebKB-2 has been initialized with WordNet (Miller 1995; Fellbaum this vol.) and a top-level ontology in order to provide initial content and guidance for its users. But the WebKB-2 still needs to prove its usefulness for real-world applications, and currently does not have significant facilities for the comparison of its ‘inhabiting’, possibly highly heterogeneous part-ontologies.
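The idea of keeping inconsistent contributions side by side by means of provenance annotations can be sketched as follows. This is not the data model or interface of WebKB-2; the classes `Statement` and `SharedBase`, the relation names and the example sources are hypothetical and only illustrate the principle.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    subject: str
    relation: str  # here only "subclass_of" or "not_subclass_of"
    obj: str
    source: str    # provenance annotation: who contributed the statement


class SharedBase:
    """Hypothetical shared base that never rejects conflicting contributions."""

    def __init__(self):
        self.statements = set()

    def add(self, subject, relation, obj, source):
        self.statements.add(Statement(subject, relation, obj, source))

    def by_source(self, source):
        """All statements contributed by one source."""
        return {s for s in self.statements if s.source == source}

    def conflicts(self):
        """Subclass links that some sources assert and other sources deny."""
        asserted, denied = defaultdict(set), defaultdict(set)
        for s in self.statements:
            if s.relation == "subclass_of":
                asserted[(s.subject, s.obj)].add(s.source)
            elif s.relation == "not_subclass_of":
                denied[(s.subject, s.obj)].add(s.source)
        return {link: (asserted[link], denied[link])
                for link in asserted.keys() & denied.keys()}


base = SharedBase()
base.add("SparklingWine", "subclass_of", "Wine", source="source1")
base.add("SparklingWine", "not_subclass_of", "Wine", source="source2")
base.add("Chianti", "subclass_of", "Wine", source="source2")
conflicts = base.conflicts()
```

Because every statement carries its source, the disagreement about the first link remains visible and attributable instead of being silently resolved in favor of one contributor.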
Provenance information of ontologies is also provided by Swoogle (http://swoogle.umbc.edu/), which is an internet search engine specializing in OWL and RDF encoded formal ontologies and meta-data (RDF, the Resource Description Framework, is currently the most widely used language for the representation of information about resources on the web). It uses the OntoRank algorithm to rate multiple ontologies containing certain keywords. OntoRank is basically an adaptation of Google’s famous PageRank algorithm.

Starting from provenance information, it is often possible to assign a measure for the reliability of ontology sources. Trust is a very broad and common notion in Distributed AI, usually expressing whether some ‘positive’ behavior can be anticipated, and to which degree. As for ontologies, trust information can be used to calculate degrees of belief in ontological statements (or any
other kind of claims) from the degree of trust assigned to the sources of these statements (Richardson, Agrawal, and Domingos 2003). As an alternative to the assignment of degrees of trust to knowledge sources, trust information can also be assigned directly to categorial and assertional statements (Fischer and Nickles to appear), which might allow for a more fine-grained, context-aware trust management.

Although the concept of trust is widely used in knowledge modeling, and has an intuitive appeal, it should be used with care. Very often, ‘trust’ in a certain information source is plainly identified with the belief in every statement from this source, which is surely much too simple an understanding. Another issue, especially in large application scenarios like the Semantic Web, is that an efficient trust-based rating of knowledge would require an existing trust infrastructure like trust networks (Golbeck, Parsia, and Hendler 2003). In general, any kind of ranking of information in terms of quality and credibility reaches its limits if there is not yet enough information about trust or recommendations to identify and filter out ‘inappropriate’ or ‘wrong’ contributions, or if there does not even exist an abstract concept of global inappropriateness or correctness at all.

An alternative approach, in contrast to the identification and removal of semantic heterogeneity within or among ontologies, is to maintain inconsistencies while integrating them, and to reify them using meta-knowledge about their provenance, their degrees of agreement by various knowledge sources, users, and groups (by virtue of voting on knowledge), and their personal and social contexts.
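How degrees of agreement obtained by voting might be recorded without removing the underlying inconsistency can be sketched as follows. The class `OpenOntology`, its methods, the voters and the statements are hypothetical; the sketch is not the formalism of the cited work and only illustrates the principle.

```python
from collections import defaultdict


class OpenOntology:
    """Hypothetical store that keeps conflicting statements and their votes."""

    def __init__(self):
        # statement -> {voter: True (accepts) / False (rejects)}
        self.votes = defaultdict(dict)

    def vote(self, statement, voter, accepts):
        self.votes[statement][voter] = accepts

    def agreement(self, statement, group=None):
        """Share of voters, optionally restricted to a group, who accept a statement.

        Returns None if nobody in the selected group has voted.
        """
        ballots = self.votes.get(statement, {})
        selected = [a for v, a in ballots.items() if group is None or v in group]
        if not selected:
            return None
        return sum(selected) / len(selected)


onto = OpenOntology()
fruit = ("Tomato", "subclass_of", "Fruit")
vegetable = ("Tomato", "subclass_of", "Vegetable")
onto.vote(fruit, "botanist", True)
onto.vote(fruit, "cook", False)
onto.vote(vegetable, "botanist", False)
onto.vote(vegetable, "cook", True)
onto.vote(vegetable, "grocer", True)
```

Both statements stay in the store. A user can ask for the overall agreement on each of them, or for the agreement within a particular group such as `{"botanist"}`, and decide individually which view to adopt.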
Such open ontologies (Froehner, Nickles, and Weiss 2004) account for the fact that in open environments even ontologies (traditionally assumed to be more ‘objective’ and stable than other kinds of knowledge) are subject to social acceptance or rejection, and thus need to accommodate possibly divergent preferences and multiple points of view (e.g., controversial opinions) (Nickles et al. 2005) and optional mechanisms for the leveled fusion of heterogeneous ontological and ‘ordinary’ knowledge. Therefore, the focus here is not on the emergence of an agreement on a conceptualization, but on the provision of meta-knowledge about the personal and social circumstances steering the generation, propagation and usage of ontological knowledge.

In summary, distributed settings pose various challenges for the acquisition, representation, and usage of ontologies, caused by factors like information source autonomy, information heterogeneity, and the possible absence of commonly agreed conceptual knowledge in such settings. Approaches to computational ontologies need to address potential problems arising from
these factors, like inconsistent and untrustworthy information, in order to be useful in open environments like the Semantic Web. Some common research approaches to the described issues have been outlined in this section.

Very recent developments in research on ontologies for open environments investigate meta-modeling as a technique to allow context-sensitivity of concept specifications (Motik 2005) (in the tradition of higher-order logic and context logic [McCarthy 1987]), a logic for the representation of possibly disagreeing opinions and other public attitudes (Fischer and Nickles to appear), and reasoning with inconsistent ontologies (Huang, van Harmelen, and ten Teije 2005). Some of these techniques have been known in Artificial Intelligence for quite a long time, but their practical use in very large and complex environments like the Semantic Web requires major adaptations, e.g., to ensure the decidability of the logic.
Notes

1. The full frontispiece reads OGDOAS SCHOLASTICA CONTINENS Diagraphen Typicam Artium: Grammatices (Latinae. Graecae.) Logices. Rhetorices. Astronomices. Ethices. Physices. Metaphysices, seu Ontologiae. Ex Praestantium hujus temporis virorum lucubrationibus, Pro Doctrinae & virtutum studiosa juventute: CONFECTA; Iacobo Lorhardo, Gymnasij Sangallensis Rectore, & in Ecclesia Christi servo: APUD GEORGIUM STRAUB Sangalli: ANNO. 1606. It is reproduced on Raul Corazzon’s deservedly award-winning website Ontology. A resource guide for philosophers (http://www.formalontology.it).

2. It seems that John McCarthy is to be credited not only for inventing the term ‘Artificial Intelligence’ (on the occasion of the Dartmouth conference 1955), but also for introducing en passant the term ‘ontology’ in AI when he wrote in the context of a discussion of the Missionaries and Cannibals puzzle: “Using circumscription requires that common sense knowledge be expressed in a form that says a boat can be used to cross rivers unless there is something that prevents its use. In particular, it looks like we must introduce into our ontology (the things that exist) a category that includes something wrong with a boat or a category that includes something that may prevent its use.” (McCarthy 1980: 33–34)

3. One cannot exclude, however, that such a second metonymic extension will occur in the near future. Compare the analogously structured noun psychology, which already has all three readings. The main entry of the Merriam-Webster Online Dictionary (http://www.m-w.com/cgibin/dictionary?book=Dictionary&va=psychology&x=21&y=16) reflects this, although in a somewhat unsystematic way: “1 : the science of mind and behavior 2 a : the mental or behavioral characteristics of an individual or group b : the study of mind and behavior in relation to a particular field of knowledge or activity 3 : a treatise on psychology”. 2 a is the object-level reading, 3 is the meta-level reading, and 1 and 2 b represent the original meta-meta-level reading.

4. This is why publications like “Evaluating Ontological Decisions with OntoClean” (Guarino and Welty 2002) with the nice subtitle “Explosing [sic! Presumably: exploring] common misuses of the subsumption relationship and the formal basis for why they are wrong” are so important for the field.

5. Cf. also Zaefferer (this vol.).

6. Namely the thoughts or perceptions of those who enact the institution. The fact that (a) only the thoughts or perceptions of the institution-enacting agents, and (b) not even all of their thoughts and perceptions but only some of them are relevant for the existence of the institution in question, points to a possible revision of the notion of realism that obviates the recourse to partial non-realism even for the case of an observer who essentially co-enacts a given (for instance two-person) institution: If we redefine realism as the thesis that the objects, properties and relations the world contains exist independently of an observer’s reflecting thoughts about or perceptions of them, then I can be a realist for instance about money if I reflectingly think that this is just a disk-shaped piece of metal, but at the same time pragmatically think that this is a quarter and use it for payment, thereby enacting the institution of money. Note that the absence of commonly agreed knowledge in the objectivist sense ascribed in Section 2.4.2. of this article to open environments like the Semantic Web does not imply a non-realist position, it rather characterizes an epistemological situation.

7. The point of the experiment is that as a side-effect of their specific ontology (there is no category for ‘obstacle ahead’) the Didabots clean an area cluttered with Styrofoam blocks by pushing them into clusters (Pfeifer 2000: 3), illustrating thus that an extremely simple configuration can result in complex (and useful) behavior, but this aspect cannot be elaborated on in our context.

8. Or more precisely a cell and anatomy ontology of this roundworm (Lee and Sternberg 2003). An excerpt from the abstract: “We are endowed with a rich knowledge about Caenorhabditis elegans. . . . To make the information more accessible to sophisticated queries and automated retrieval systems, WormBase has begun to construct a C. elegans cell and anatomy ontology.”

9. It does of course make sense to compile a corresponding database. But exactly such a database would nicely illustrate the difference between an ontological (and therefore generic) database, and a database for a specific entity token.

10. Cf., e.g., Nirenburg and Raskin (2001: 15–16): “The following components in an agent’s model are relevant for its language processing ability: . . . Knowledge about the world, which we find useful to subdivide into: – an ontology, which contains knowledge about types of things (objects, processes, properties, intentions) in the world; and – a fact database, an episodic memory module containing knowledge about instances (tokens) of the above types and about their combinations”.

11. “The success of a categorization can be measured by the degrees of prediction and control which the categories produced afford other scientists. Good theories are built upon successful categorizations of nature.” (Koepsell 1999: 217)

12. In the context of a talk given at the University of Munich on July 18, 1983, entitled A Chapter of English Metaphysics, according to the notes taken by Dietmar Zaefferer.

13. It is of course a simplification to speak of the concept of WORD, since there are several concepts we could refer to in this way, both in everyday language and in linguistic terminology, but for the present purposes these differences do not matter.
References

Baader, Franz, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter Patel-Schneider (eds.)
2003 The Description Logic Handbook: Theory, Implementation and Applications. Cambridge: Cambridge University Press.

Bach, Emmon
1986 Natural language metaphysics. In Logic, Methodology, and Philosophy of Science VII, Ruth Barcan Marcus, Georg J. W. Dorn and Paul Weingartner (eds.), 573–595. Amsterdam: North Holland.
1989 Informal Lectures on Formal Semantics. Albany, NY: State University of New York Press.

Bailin, Sidney C., and Walt Truszkowski
2003 Ontology negotiation: How agents can really get to know each other. In Proceedings of the First International Workshop on Radical Agent Concepts (WRAC 2002), Walt Truszkowski, Christopher Rouff, and Michael G. Hinchey (eds.), 320–334. (Lecture Notes in Computer Science 2564.) Berlin: Springer.

Barwise, Jon, and John Perry
1983 Situations and Attitudes. Cambridge, MA: MIT Press.

Bateman, John A.
this vol. Linguistic interaction and ontological mediation.

Berkeley, George
1999 Reprint. A Treatise Concerning the Principles of Human Knowledge. In Principles of Human Knowledge and Three Dialogues, Howard Robinson (ed.). Oxford: Oxford University Press. Original edition, Dublin: Aaron Rhames/Jeremy Peptat, 1710.

Bozsak, Erol, Marc Ehrig, Siegfried Handschuh, Andreas Hotho, Alexander Maedche, Boris Motik, Daniel Oberle, Christoph Schmitz, Steffen Staab, Ljiljana Stojanovic, Nenad Stojanovic, Rudi Studer, Gerd Stumme, York Sure, Julien Tane, Raphael Volz, and Valentin Zacharias
2002 KAON – Towards a large scale Semantic Web. In Proceedings of the Third International Conference on E-Commerce and Web Technologies (EC-Web), Erol Bozsak, A. Min Tjoa, and Gerald Quirchmayr (eds.), 304–313. (Lecture Notes in Computer Science 2455.) Berlin: Springer.
Buchanan, Bruce G., and Edward H. Shortliffe (eds.)
1984 Rule-based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project. Reading, MA: Addison-Wesley.

Casati, Roberto, and Achille C. Varzi
1995 Holes and Other Superficialities. Cambridge, MA: MIT Press.

Chalupsky, Hans
2000 OntoMorph: A translation system for symbolic knowledge. In Proceedings 7th International Conference on Principles of Knowledge Representation and Reasoning (KR’2000), Fausto Giunchiglia and Bart Selman (eds.), 273–284. Breckenridge, CO: Morgan Kaufmann.

Chaudhri, Vinay, Adam Farquhar, Richard Fikes, Peter Karp, and James Rice
1998 OKBC: A programmatic foundation for knowledge base interoperability. In Proceedings of the 15th National Conference on Artificial Intelligence (AAAI’98), 600–607.

Cocchiarella, Nino B.
2001 Logic and ontology. Axiomathes 12 (1–2): 117–150.

Cruse, D. Alan
1986 Lexical Semantics. (Cambridge Textbooks in Linguistics.) Cambridge: Cambridge University Press.

Davidson, Donald
1967 The logical form of action sentences. In The Logic of Decision and Action, Nicholas Rescher (ed.), 81–95. Pittsburgh, PA: University of Pittsburgh Press.

Decker, Stefan, Michael Erdmann, Dieter Fensel, and Rudi Studer
1999 Ontobroker: Ontology based access to distributed and semistructured information. In Semantic Issues in Multimedia Systems. Proceedings of DS-8, Robert Meersman, Zahir Tari, and Scott M. Stevens (eds.), 351–369. Boston: Kluwer.

Dou, Dejing, Drew McDermott, and Peishen Qi
2002 Ontology translation by ontology merging and automated reasoning. In Proceedings EKAW 2002 Workshop on Ontologies for Multi-Agent Systems, Rose Dieng (ed.), 3–18.

Elst, Ludger van, Andreas Abecker, Ansgar Bernardi, Andreas Lauer, Heiko Maus, and Sven Schwarz
2004 An agent-based framework for distributed organizational memories. In Proceedings Coordination and Agent Technology in Value Networks, Multikonferenz Wirtschaftsinformatik (MKWI-2004), Martin Bichler, Carsten Holtmann, Stefan Kirn, Jörg P. Müller, and Christof Weinhardt (eds.), 181–196. Berlin: GITO-Verlag.

Ereshefsky, Marc
2002 Species. In The Stanford Encyclopedia of Philosophy (Fall 2002 Edition), Edward N. Zalta (ed.), http://plato.stanford.edu/archives/fall2002/entries/species/.
Eriksson, Henrik, Raymond W. Fergerson, Yuval Shahar, and Mark A. Musen
1999 Automatic generation of ontology editors. In Proceedings of the Twelfth Banff Knowledge Acquisition for Knowledge-based Systems Workshop, Mark A. Musen and Brian R. Gaines (eds.), Banff, Alberta, Canada. Technical report, University of Calgary/Stanford University.

Farrar, Scott
this vol. Using ‘Ontolinguistics’ for language description.

Fellbaum, Christiane
this vol. The ontological loneliness of verb phrase idioms.

Fellbaum, Christiane (ed.)
1998 WordNet: An Electronic Lexical Database. (Language, Speech, and Communication.) Cambridge, MA/London: MIT Press.

Fischer, Felix, and Matthias Nickles
to appear Computational opinions. In Proceedings of the 17th European Conference on Artificial Intelligence (ECAI-06), Gerhard Brewka (ed.). Amsterdam: IOS Press.

Froehner, Tina, Matthias Nickles, and Gerhard Weiss
2004 Towards modeling the social layer of emergent knowledge using open ontologies. In Proceedings of The ECAI 2004 Workshop on Agent-Mediated Knowledge Management (AMKM-04), Andreas Abecker, Ludger van Elst, and Virginia Dignum (eds.), 10–19.

Gangemi, Aldo, Nicola Guarino, Claudio Masolo, Alessandro Oltramari, and Luc Schneider
2002a Sweetening ontologies with DOLCE. In Proceedings of the 13th International Conference on Knowledge Engineering and Knowledge Management, 166–181. Berlin/Heidelberg: Springer.

Gangemi, Aldo, Nicola Guarino, Alessandro Oltramari, and Stefano Borgo
2002b Cleaning-up WordNet’s top-level. In Proceedings of the First Global WordNet Conference. Mysore: Central Institute of Indian Languages.

Genesereth, Michael
1991 Knowledge Interchange Format. In Proceedings of the Second International Conference on the Principles of Knowledge Representation and Reasoning, James F. Allen, Richard Fikes, and Erik Sandewall (eds.), 238–249. San Mateo, CA: Morgan Kaufmann.

Giugno, Rosalba, and Thomas Lukasiewicz
2002 P-shoq(d): A probabilistic extension of shoq(d) for probabilistic ontologies in the semantic web. In JELIA’02: Proceedings of the European Conference on Logics in Artificial Intelligence, Sergio Flesca, Sergio Greco, Nicola Leone, and Giovambattista Ianni (eds.), 86–97. Berlin: Springer.

Goddard, Cliff
this vol. Semantic primes and conceptual ontology.
Golbeck, Jennifer, Bijan Parsia, and James A. Hendler
2003 Trust Networks on the Semantic Web. In Proceedings of the Seventh International Workshop CIA-2003 on Cooperative Information Agents, Matthias Klusch, Sascha Ossowski, Andrea Omicini, and Heimo Laamanen (eds.), 238–249. (Lecture Notes in Computer Science 2782.) Berlin: Springer.

Gruber, Thomas R.
1993 A translation approach to portable ontology specification. Knowledge Acquisition 5: 199–220.

Grüninger, Michael, and Christopher Menzel
2003 Process Specification Language: Principles and applications. AI Magazine 24 (3) (Fall 2003): 63–74.

Guarino, Nicola, and Christopher Welty
2002 Evaluating ontological decisions with OntoClean. Communications of the ACM 45 (2): 61–65.

Haider, Hubert
1993 Deutsche Syntax – generativ. Tübingen: Narr.

Heflin, Jeff, and James A. Hendler
2000 Dynamic ontologies on the web. In Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence (AAAI2000), 443–449. Menlo Park, CA: AAAI/MIT Press.

Hellwig, Birgit
this vol. Postural categories and the classification of nominal concepts: A case study of Goemai.

Hofweber, Thomas
2005 A puzzle about ontology. Nous 39 (2): 256–283.

Huang, Zhisheng, Frank van Harmelen, and Annette ten Teije
2005 Reasoning with inconsistent ontologies. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI-2005), Leslie Pack Kaelbling and Alessandro Saffiotti (eds.), 454–459. Edinburgh: Professional Book Center.

Imai, Mutsumi, and Dedre Gentner
1997 A cross-linguistic study of early word meaning: Universal ontology and linguistic influence. Cognition 62: 169–200.

Khlentzos, Drew
2004 Semantic challenges to realism. In The Stanford Encyclopedia of Philosophy (Winter 2004 Edition), Edward N. Zalta (ed.), http://plato.stanford.edu/archives/win2004/entries/realism-semchallenge/.

Koepsell, David R.
1999 Introduction to applied ontology: The philosophical analyses of everyday objects. American Journal of Economics & Sociology 58 (2): 217–220.
Lang, Ewald
1984 The Semantics of Coordination. Amsterdam: John Benjamins.

Lee, Raymond Y. N., and Paul W. Sternberg
2003 Building a cell and anatomy ontology of Caenorhabditis elegans. Comparative and Functional Genomics 4 (1): 121–126.

Lenat, Douglas B.
1995 CYC: A large-scale investment in knowledge infrastructure. Communications of the ACM 38 (11): 33–38.

Link, Godehard
1983 The logical analysis of plurals and mass terms: A lattice-theoretical approach. In Meaning, Use, and the Interpretation of Language, Rainer Bäuerle, Christoph Schwarze, and Arnim von Stechow (eds.), 303–323. Berlin/New York: Walter de Gruyter.

Luger, George F., and William A. Stubblefield
1993 Artificial Intelligence: Structures and Strategies for Complex Problem Solving. Redwood City, CA: Benjamin/Cummings.

Martin, Philippe, and Martin Eklund
2001 Large-scale cooperatively-built heterogeneous KBs. In Proceedings of the 9th International Conference on Conceptual Structures, Gerd Stumme and Guy W. Mineau (eds.). (Lecture Notes in Artificial Intelligence 2120.) Berlin: Springer Verlag.

McCarthy, John
1980 Circumscription – A form of non-monotonic reasoning. Artificial Intelligence 13: 27–39.
1987 Generality in Artificial Intelligence. Communications of ACM 30 (12): 1030–1035.

McCarthy, John, and Pat Hayes
1969 Some philosophical problems from the standpoint of artificial intelligence. Machine Intelligence 4: 463–502.

McGuinness, Deborah, and Frank van Harmelen
2004 OWL web ontology language: Overview. (W3C Recommendation 10, February 2004.) http://www.w3.org/TR/owl-features/.

Mel’čuk, Igor
1998 Collocations and lexical functions. In Phraseology. Theory, Analysis, and Applications, Anthony P. Cowie (ed.), 23–53. Oxford: Clarendon Press.

Melle, William van
1982 System Aids in Constructing Consultation Programs: EMYCIN. Ann Arbor, MI: UMI Research Press.

Mena, Eduardo, Arantza Illarramendi, Vipul Kashyap, and Amit Sheth
2000 OBSERVER: An approach for query processing in global information systems based on interoperation across pre-existing ontologies. On Distributed and Parallel Databases – An International Journal 8 (2): 223–271.
Ontologies across disciplines
65
Metzinger, Thomas, and Vittorio Gallese this vol. The emergence of a shared action ontology: Building blocks for a theory. Miller, Alexander 2005 Realism. In The Stanford Encyclopedia of Philosophy (Fall 2005 Edition), Edward N. Zalta (ed.), http://plato.stanford.edu/archives/ fall2005/entries/realism/. Miller, George A. 1995 WordNet: A lexical database for English. Communications of the ACM 38 (11): 39–41. Misskof, Michele, Roberto Navigli, and Paola Velardi 2002 Integrated approach to web ontology learning and engineering. IEEE Computer 35 (11): 60–63. Montague, Richard 1973 The proper treatment of quantification in ordinary English. In Approaches to Natural Language, K. Jaakko J. Hintikka, Julius M. E. Moravcsik, and Patrick Suppes (eds.), 221–242. Dordrecht: Reidel. Motik, Boris 2005 On the properties of metamodeling in OWL. In Proceedings of the 4th International Semantic Web Conference (ISWC 2005), Yolanda Gil, Enrico Motta, V. Richard Benjamins, and Mark A. Musen (eds.), 548–562. (Lecture Notes in Computer Science 3729.) Berlin: Springer. M¨uller, Heinz-J¨urgen, and Rose Dieng (eds.) 2002 Computational Conflicts. Berlin: Springer. Nadathur, Gopalan, and Dale Miller 1998 Higher-order logic programming. In Handbook of Logic in Artificial Intelligence and Logic Programming, Vol. 5, Dov M. Gabbay, Christopher J. Hogger, and John A. Robinson (eds.), 499–590. Oxford: Oxford University Press. Nickles, Matthias, Tina Froehner, Ruth Cobos, and Gerhard Weiss 2005 Multi-source knowledge bases and ontologies with multiple individual and social viewpoints. In Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI’05), Andrzej Skowron, Rakesh Agrawal, Michael Luck, Takahira Yamaguchi, Pierre Morizet-Mahoudeaux, Jiming Liu, and Ning Zhong (eds.), 62–65. Los Alamitos, CA: IEEE Computer Society Press. Niles, Ian, and Adam Pease 2001 Toward a Standard Upper Ontology. 
In Proceedings of the 2nd International Conference on Formal Ontology in Information Systems (FOIS-2001), Christopher Welty and Barry Smith (eds.), 2–9. New York: ACM Press. 2003 Linking lexicons and ontologies: Mapping WordNet to the Suggested Upper Merged Ontology. In Proceedings of the IEEE International
66
Matthias Nickles et al. Conference on Information and Knowledge Engineering, Nazli Goharian (ed.), 412–416. Las Vegas, NV: CSREA Press.
Nirenburg, Sergei, and Victor Raskin 2004 Ontological Semantics. Cambridge, MA: MIT Press. Nodine, Marian, Jerry Fowler, and Brad Perry 1999 An overview of active information gathering in InfoSleuth. In Proceedings of the Second International Symposium on Cooperative Database Systems for Advanced Applications (CODAS’99). Berlin: Springer. Object Management Group 1997–2006 UMLTM Resource Page. Available online at http://www.uml.org/ (accessed 30 May 2006). Pease, Adam 2003
this vol.
The Sigma ontology development environment. In Working Notes of the IJCAI-2003 Workshop on Ontology and Distributed Systems, Fausto Giunchiglia, Asucion Gomez-Perez, Adam Pease, Heiner Stuckenschmidt, York Sure, and Steven Willmott (eds.). (CEUR Workshop Proceeding Series 71.) http://ceur-ws.org/Vol-71/. Formal representation of concepts: The Suggested Upper Merged Ontology and its use in linguistics.
Pease, Adam, and Christiane Fellbaum 2004 Language to logic translation with PhraseBank. In Proceedings of the Second International WordNet Conference (GWC 2004), Petr Sojka, Karel Pala, Pavel Smrz, Christiane Fellbaum, and Piek Vossen (eds.), 187–192. Brno: Masaryk University. Pease, Adam, Raymond A. Liuzzi, and David Gunning 2001 Knowledge bases. In Encylopedia of Software Engineering, John J. Marciniak (ed.). 2nd ed. New York: John Wiley and Sons. Pfeifer, Rolf 2000
Ray, Steven 2004
On the role of morphology and materials in adaptive behavior. In From Animals to Animats. Proceedings of the 6th International Conference on Simulation of Adaptive Behavior, Jean-Arcady Meyer, Alain Berthoz, Dario Floreano, Herbert L. Roitblat, and Stewart W. Wilson (eds.), 23–32. Cambridge, MA: MIT Press. NIST’s semantic approach to standards and interoperability. Unpublished presentation.
Richardson, Matthew, Rakesh Agrawal, and Pedro Domingos 2003 Trust management for the Semantic Web. In Proceedings of the International Semantic Web Conference (ISWC-03), Dieter Fensel, Katia Sycara, and John Mylopoulos (eds.), 351–368. (Lecture Notes in Computer Science 2870.) Berlin: Springer.
Ontologies across disciplines
67
Schalley, Andrea C. 2004 Cognitive Modeling and Verbal Semantics. A Representational Framework Based on UML. (Trends in Linguistics. Studies and Monographs 154.) Berlin/New York: Mouton de Gruyter. Schalley, Andrea C., and Dietmar Zaefferer this vol. Ontolinguistics – An outline. Searle, John R. 2005 What is an institution? Journal of Institutional Economics 1 (1): 1– 22. Staab, Steffen, Alexander M¨adche, Frank Nack, Simone Santini, and Luc Steels 2002 Emergent Semantics. IEEE Intelligent Systems, Trends & Controversies 17 (1): 78–86. Staab, Steffen, and Rudi Studer (eds.) 2003 Handbook on Ontologies in Information Systems. Berlin: Springer. Talmy, Leonard this vol. The representation of spatial structure in spoken and signed language: A neural model. Tamma, Valentina A. M. 2002 An ontology model supporting multiple ontologies for knowledge sharing. Ph.D. diss., Department of Computer Science, University of Liverpool. Yergeau, Franc¸ois, Tim Bray, Jean Paoli, C. Michael Sperberg-McQueen, and Eve Maler 2004 Extensible Markup Language (XML) 1.0 (Third Edition). (W3C Recommendation 4th February 2004.) http://www.w3.org/TR/2004/ REC-xml-20040204/ Zaefferer, Dietmar 2002 Polysemy, polyvalence, and linking mismatches. The concept of RAIN and its codings in English, German, Italian, and Spanish. DELTA – Documentac¸a˜ o de Estudos em Ling¨u´ıstica T´eorica e Aplicada 18 (spe.): 27–56. Special Issue: Polysemy. this vol. Language as mind sharing device: Mental and linguistic concepts in a general ontology of everyday life.
Part II: Foundations, general ontologies, and linguistic categories
The emergence of a shared action ontology: Building blocks for a theory

Thomas Metzinger and Vittorio Gallese

To have an ontology is to interpret a world. In this paper we argue that the brain, viewed as a representational system aimed at interpreting our world, possesses an ontology too. It creates primitives and makes existence assumptions. It decomposes target space in a way that exhibits a certain invariance, which in turn is functionally significant. We will investigate the functional regularities guiding this decomposition process by answering the following questions: What are the explicit and implicit assumptions about the structure of reality that at the same time shape the causal profile of the brain’s motor output and its representational deep structure, in particular of the conscious mind arising from it (its “phenomenal output”)? How do they constrain high-level phenomena like conscious experience, the emergence of a first-person perspective, or social cognition? By reviewing a series of neuroscientific results and integrating them with a wider philosophical perspective, we will emphasize the contribution the motor system makes to this process. As will be shown, the motor system constructs goals, actions, and intending selves as basic constituents of the world it interprets. It does so by assigning a single, unified causal role to them. Empirical evidence demonstrates that the brain models movements and action goals in terms of multimodal representations of organism-object-relations. Under a representationalist analysis, this process can be conceived of as an internal, dynamic representation of the intentionality-relation itself. We will show how such a complex form of representational content, once it is in place, can later serve as a functional building block for social cognition and for a more complex, consciously experienced representation of the first-person perspective as well.
1. Introduction
Actions in the external world can be experienced as such, recognized, and understood. Simultaneously, the intentional content correlated with them (i.e., their satisfaction conditions or their intended goal-state) is interpreted by the observer as playing a causal role in determining the behavior of the observed other individuals. From a first-person perspective, the dynamic social environment appears as populated by volitional agents capable of entertaining, like the observer, an intentional relation to the world. We experience other agents as directed at certain target states or objects. We are “intentionality-detectors”: As human beings, we can not only mentally build an “objective”, third-person account of the behaviors constituting the events of our social world. Beyond phenomenally experiencing the objective nature of a witnessed action, we can also experience its goal-directedness or intentional character, similarly to when we experience ourselves as the willful conscious agents of an ongoing behavior.

In the present paper we will provide and integrate some empirical and conceptual building blocks for a theory of the emergence of a common ontology between members of a group. We will examine, from a third-person scientific perspective, the fundamentally relational character of actions in the world. In particular, we want to look at the “ontological commitments” the brain makes when representing actions and goals. It will be further shown that the brain builds an ontology, an internal model of reality, which – on a very fundamental level within its representational architecture – incorporates the relational character of inter-actions between organism and environment, and that this architecture can actually be traced at the microfunctional level implemented in the brain’s neural networks. The same subpersonal ontology then guides organisms when they are epistemic agents in a social world: Interpersonal relations become meaningful in virtue of a shared action ontology. An action ontology can only be shared and successfully used by two systems if there is a sufficient degree of functional overlap between them, that is, if they decompose target space in similar ways.
We will posit that the cognitive development of social competence capitalizes upon such a shared ontology to trigger the timely onset of behaviors such as gaze following, shared attention, and mind reading, which will eventually give rise to a full-blown capacity to entertain mental accounts of the behavior and goal states of other agents. We will also propose that what makes humans special is the fact that their functional ontology is much richer in socially individuated goal representations and that their model of reality is not only rich and flexible, but that they can actively expand their own functional ontology by mentally ascribing distal goals to conspecifics. Neuroscientific results discussing the functional properties of mirror neurons, a class of premotor neurons in the monkey brain, will be introduced. It will be proposed that mirror neurons can be conceived of as the dawning of
what the equivalent matching systems in our human brains are the fully developed realization of: a fundamental and mostly unconscious representational structure capable of building a shared action ontology. However, in closing we will also provide an example of a late and high-level utilization of this structure: enabling beings like ourselves to mutually acknowledge each other as persons, and to consciously experience this very fact at the same time.
2. The neural underpinnings of social understanding
Primates, and particularly human beings, are social animals whose cognitive development capitalizes upon the interaction with conspecifics (adults, siblings, etc.). During social interactions we manifest our inner intentions, dispositions, and thoughts by means of overt behavior. We reciprocate this by trying to figure out what the intentions, dispositions, and thoughts of others are when witnessing their behavior. Detecting another agent’s intentions, or other inner states, helps to anticipate this agent’s future actions, which may be cooperative, non-cooperative, or even threatening. Accurate understanding and anticipation enable the observer to adjust his responses appropriately. Some recent neuroscientific results seem to suggest that a common neural representation underpins the specification of action end-states, independently of whose end-states are to be specified.

About 10 years ago a class of premotor neurons discharging not only when the monkey executed goal-related hand actions but also when it observed other individuals (monkeys or humans) executing similar actions was discovered in the macaque monkey brain. These neurons were designated as “mirror neurons” (Gallese et al. 1996, 2002; Rizzolatti et al. 1996a; see also Fogassi and Gallese 2002; Gallese 2000, 2001; Rizzolatti, Fogassi, and Gallese 2000, 2001; Rizzolatti, Craighero, and Fadiga 2002). Mirror neurons require, in order to be activated by visual stimuli, an interaction between the agent (be it a human being or a monkey) and its target object. The visual presentation of objects alone does not evoke any response. Similarly, actions that, although achieving the same goal and looking similar to those performed by the experimenter’s hand, are made with tools such as pliers or pincers have little effect on the response of mirror neurons (Gallese et al. 1996).
Neurons with similar properties were later discovered in a sector of the posterior parietal cortex reciprocally connected with area F5, area PF or 7b (PF mirror neurons, see Gallese et al. 2002).
Of course, on the level of theories of mental representation, the idea of an “ideomotor principle” and the empirical hypothesis that perception and empathy engage motor representations are much older and go back at least to William Carpenter and William James. The discovery of mirror neurons, nevertheless, has changed our views on the neural mechanisms at the basis of interpersonal relations. It has been proposed that the mechanism instantiated by mirror neurons could be at the basis of an implicit form of action understanding (Gallese et al. 1996; Rizzolatti et al. 1996a). The observation of an action leads to the activation in the brain of the observer of the same neural network active during its actual execution: action observation causes action simulation, and the automatic simulation of the motor plan leads to the same end-state in the observer. The shared, overlapping computational space leads to the implicit detection of the same end-state in the observed behavior of the agent (Gallese 2003a; Gallese, Ferrari, and Umiltà 2002).

The relationship between action understanding and action simulation is even more evident in the light of the results of two more recent studies. In the first series of experiments, Umiltà et al. (2001) tested F5 mirror neurons in two conditions: in the first condition the monkey could see the entire action (e.g., a hand grasping action); in the second condition, the same action was presented, but its final critical part, that is, the hand-object interaction, was hidden. In the hidden condition the monkey only “knew” that the target object was present behind the occluder. The results showed that more than half of the recorded neurons responded also in the hidden condition (Umiltà et al. 2001) (see Figure 1). Behavioral data have shown that, like humans, monkeys can also infer the goal of an action even when the visual information about it is incomplete (Filion, Washburn, and Gulledge 1996). The data by Umiltà et al.
(2001) reveal the likely neural mechanism at the basis of this cognitive capacity. The inference about the goals of the behavior of others appears to be mediated by the activity of motor neurons coding the goal of the same action in the observer’s brain. Out of sight is not “out of mind” just because, by simulating the action, the gap can be filled.

The results of a more recent series of neurophysiological experiments make this hypothesis even more plausible. Let us see why. Some transitive actions are characteristically accompanied by a sound. Imagine hearing the sound produced by the footsteps of a walking person approaching you. This sound will lead you to think that someone is getting closer to you. That particular sound enables you to understand what is going on, even if you
have no visual information about what is currently happening out of your visual field. The action’s sound has the capacity to make an invisible action inferred, and therefore present and understood.
Figure 1. Example of an F5 mirror neuron responding to action observation in Full vision and in Hidden condition. The lower part of each panel illustrates schematically the experimenter’s action as observed from the monkey’s vantage point: the experimenter’s hand starting from a fixed position, moving toward an object and grasping it (panels A and B), or mimicking grasping (panels C and D). The behavioral paradigm consisted of two basic conditions: Full vision condition (A) and Hidden condition (B). Two control conditions were also performed: Mimicking in full vision (C), and Mimicking hidden (D). In these last two conditions the monkey observed the same movements as in A and B, but without the target object. The black frame depicts the metallic frame interposed between the experimenter and the monkey in all conditions. In panels B and D the gray square inside the black frame represents the opaque sliding screen that prevented the monkey from seeing the experimenter’s action performed behind it. The asterisk indicates the location of a marker on the frame. In hidden conditions the experimenter’s hand started to disappear from the monkey’s vision when crossing the marker position. The upper part of each panel shows raster displays and histograms of ten consecutive trials recorded during the corresponding experimenter’s hand movement illustrated in the lower part. Above each raster, kinematic recordings (black traces) of the experimenter’s hand movements, obtained with a motion analysis system, are shown. The illustrated neuron responded to the observation of grasping and holding in Full vision (A) and in the Hidden condition (B), in which the interaction between the experimenter’s hand and the object occurred behind the opaque screen. The neuron response was virtually absent in the two conditions in which the observed action was mimed (C and D). Histogram bin width = 20 ms. Ordinates: spikes/s; abscissae: time (modified from Umiltà et al. 2001).
To investigate the neural mechanism possibly underpinning this capacity, F5 mirror neurons were recorded from two monkeys during four different experimental conditions: when the monkey executed noisy actions (e.g. breaking peanuts, tearing sheets of paper apart, and the like); when the monkey saw and heard, or just saw, or just heard the same actions performed by another individual. The results showed that a consistent percentage of the tested mirror neurons fired when the monkey executed the action, just observed or just heard the same action performed by another agent (see Kohler et al. 2001, 2002; Keysers et al. 2003) (Figure 2). These neurons were defined as “audio-visual mirror neurons”. They not only respond to the sound of actions, but also discriminate between the sounds of different actions. The actions whose sounds are preferred are also the actions producing the strongest responses when observed or executed. It does not matter at all for the activity of this neural network if the actions are specified at the motor, visual or auditory level. The activation of the premotor neural network controlling the execution of action A in the presence
of sensory information related to the same action A can be characterized as simulating action A.
Figure 2. Example of an F5 audio-visual mirror neuron. (A) Lateral view of the macaque brain with the location of area F5 shaded in gray. Major sulci: a, arcuate; c, central; ip, intraparietal; s, sylvian sulcus. (B) Experimental setup. (C) Response of a neuron discriminating between two actions in all modalities. Rastergrams are shown together with spike density functions for the best (black) and the less effective action (gray). V + S, V, S, and M stand for vision-and-sound, vision-only, sound-only, and Motor conditions, respectively. The vertical lines indicate the time at which the sound occurred (V + S, S) or would have occurred (V). The traces under the spike-density functions in the sound-only conditions are oscillograms of the sounds played back to test the neurons. This neuron discharged when the monkey broke a peanut (row ‘M’) and when the monkey observed the experimenter making the same action (rows V and V + S). The same neuron also responded when the monkey only heard the sound
of a peanut being broken without seeing the action (row ‘S’). When the monkey grasped a ring (‘M’), Neuron 1 responded much less, demonstrating the motor specificity of the neuron. Likewise, both the vision and the sound of an experimenter grasping the ring elicited much smaller responses. A statistical criterion yielded both auditory and visual selectivity for this neuron (modified from Keysers et al. 2003).
The multimodally driven simulation of action goals instantiated by neurons situated in the ventral premotor cortex of the monkey exemplifies properties that are strikingly similar to the symbolic properties so characteristic of abstract human thought. The similarity with conceptual content is quite appealing: the same mental content (“the goal of action A”) results from a multiplicity of states subsuming it, such as sounds, observed and executed actions. These states, in turn, are subsumed by differently triggered patterns of activation within a population of “audio-visual mirror neurons”. Depending on one’s theory of concepts, all this may of course also be interpreted as a process of non-conceptual generalization. The important point, however, is that we have evidence for an abstract, allocentric, and multimodal level of goal representation (see Gallese 2003b). The general picture conveyed by these results is that the premotor-parietal F5-PF mirror matching system instantiates simulations of actions utilized not only to generate and control goal-related behaviors, but also to provide a meaningful account of the goals and purposes of others’ actions, by means of their implicit and automatic simulation.

What is the relation between these data and our understanding of human social cognition? Several studies using different experimental methodologies and techniques have demonstrated the existence of a similar mirror system, matching action observation and execution, in humans as well (see Buccino et al. 2001; Cochin et al. 1998; Decety et al. 1997; Fadiga et al. 1995; Grafton et al. 1996; Hari et al. 1998; Iacoboni et al. 1999; Rizzolatti et al. 1996b). In particular, it is interesting to note that brain imaging experiments in humans have shown that during action observation there is a strong activation of premotor and parietal areas, the likely human homologue of the monkey areas in which mirror neurons were originally described (Buccino et al.
2001; Decety and Grèzes 1999; Decety et al. 1997; Grafton et al. 1996; Iacoboni et al. 1999; Rizzolatti et al. 1996b). From an empirical point of view, it is now highly plausible to assume that in standard configurations for humans, as well as for monkeys, action observation automatically triggers action simulation on a rather abstract, multimodal level of neural representations. Before moving upwards from the neuroscientific and functional to the representationalist level of analysis, let us define more precisely what we mean by “simulation”.
3. Embodied simulation
So far we have characterized the function of mirror neurons in terms of implicit action understanding, by means of embodied simulation. The notion of simulation is at present employed in many different domains, often with different, not necessarily overlapping meanings. Simulation is a functional process that possesses a certain representational content, typically focussing on the temporal evolution or on possible states of its target object. For example, an authoritative view on motor control characterizes simulation as the mechanism employed by forward models to predict the sensory consequences of impending actions (see Wolpert, Doya, and Kawato 2003). According to this view, the predicted consequences are the simulated ones. In philosophy of mind, on the other hand, the notion of simulation has been used by the proponents of Simulation Theory of mind reading to characterize the pretend state adopted by the attributer in order to understand others’ behaviour (see Goldman 1989, 1992; Gordon 1986, 1995). We will use the term “simulation” as the core element of an automatic, unconscious, and pre-reflexive control functional mechanism, whose function is the modelling of objects, events, and other agents to be controlled. Simulation, as conceived of in the present paper, is therefore not necessarily the result of a willed and conscious cognitive effort, aimed at interpreting the intentions hidden in the overt behavior of others, but rather a basic functional mechanism of our brain. However, because it also generates representational content, this functional mechanism seems to play a major role in our epistemic approach to the world. It represents the outcome of a possible action one could take, and serves to attribute this outcome to another organism as a real goal-state it tries to bring about. Indeed, successful perception requires the capacity of predicting upcoming sensory events. 
Similarly, successful action requires the capacity of predicting the expected consequences of action. As suggested by an impressive and coherent amount of neuroscientific data (see Gallese 2003a), both types of predictions seem to depend on the results of unconscious and automatic simulation processes. In this sense, simulation is not simply confined to the domain of motor control, but it seems to be a more general and basic endowment of our brain. It is mental because it
has a content, but it can be motor, because its function may be realized by the motor system. We also call it “embodied” – not only because it is neurally realized, but also because it uses a pre-existing body-model in the brain, and therefore involves a prerational form of self-representation. In this context, low-level action simulation in social cognition can also be seen as an exaptation: There has never been any “special design” for the function we describe here. It is an extended functionality that was later co-opted from a distinct original adaptational functionality (namely, motor control), and it did not arise as a consequence of progressive adaptations via natural selection, because the underlying anatomical structure was never directly selected for.

A further point that deserves some comment is the relationship between a given observed action and the simulation process it induces. How can it be possible that when witnessing a given action, exactly the same action is simulated in the observer’s brain? How is action identity internally represented (see Knoblich and Flach 2003)? How can the appropriate simulation be computed? Two hypothetical scenarios can be sketched. According to the first hypothesis, the selection process is determined on the basis of prior visual identification of the observed action. Once the action is visually identified, its simulation can ensue. On the basis of this scenario, though, the simulation process appears to be redundant for understanding what the observed action really means, because it implies its prior visual identification. The capacity of mirror neurons to be activated by hidden actions, provided that their outcome can be predicted (see above, Umiltà et al. 2001), seems to suggest, however, that action simulation does not require a comprehensive visual representation of the witnessed action in order to be triggered.
According to the second hypothesis – the one we endorse – the selection process of which action is to be simulated in response to the observed one is automatic and pre-determined (see the “direct matching hypothesis”, Rizzolatti, Fogassi, and Gallese 2001). Now, the crucial point is to explain what enables the direct matching between the visual representation of an action and its motor representation. The answer to such a question can only be speculative. On the one hand, early forms of automatic imitation of mouth movements in newborn babies (Meltzoff and Moore 1977) seem to suggest a hardwired mechanism coupling action observation with action execution. Perhaps it is no coincidence that these phenomena mostly involve the action of body parts, such as the mouth, for which we have no direct visual control. In the case of hand actions, on the other hand, the ontogenetic development of the direct matching mechanism could instead be the result of a Hebbian association between the agent’s own action and self-observation, and a further generalization of this association to others.
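The Hebbian route mentioned for hand actions can be made concrete with a toy sketch. What follows is only an illustration under strong simplifying assumptions, not a model proposed in this chapter: the two-unit visual and motor patterns, the learning rate, and the function names `hebbian_train` and `motor_response` are all hypothetical. During "self-observation" a motor pattern and the sight of the resulting hand movement are active together, so the connection between them is strengthened; afterwards the same visual pattern, now caused by another agent's hand, drives the matching motor unit.

```python
# Toy Hebbian association between visual and motor units (illustrative assumptions only).

def hebbian_train(pairs, n_visual, n_motor, rate=0.1):
    """Return weights w[m][v], strengthened whenever visual unit v and motor unit m are co-active."""
    w = [[0.0] * n_visual for _ in range(n_motor)]
    for visual, motor in pairs:
        for m in range(n_motor):
            for v in range(n_visual):
                # Hebb's rule: the weight change is proportional to the product of both activities.
                w[m][v] += rate * motor[m] * visual[v]
    return w

def motor_response(w, visual):
    """Motor activation evoked by a purely visual input (no action is executed)."""
    return [sum(w_mv * x for w_mv, x in zip(row, visual)) for row in w]

# Hypothetical visual patterns: what a grasp and a tearing movement look like.
GRASP_SEEN = [1.0, 0.0]
TEAR_SEEN = [0.0, 1.0]

# Self-observation: executing motor unit 0 (grasp) co-occurs with seeing a grasp,
# executing motor unit 1 (tear) co-occurs with seeing a tearing movement.
pairs = [(GRASP_SEEN, [1.0, 0.0]), (TEAR_SEEN, [0.0, 1.0])] * 10
weights = hebbian_train(pairs, n_visual=2, n_motor=2)

# Observing someone else's grasp now activates the observer's own grasp unit.
response = motor_response(weights, GRASP_SEEN)
```

In this sketch the "generalization to others" is trivial, because the other agent's grasp is assumed to produce the same visual pattern as one's own; how such visual equivalence could arise is exactly what the sketch leaves open.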
4. Sharing an action ontology: The general idea
One useful way to look at the brain is to describe it as a dynamic representational system. Every representational system presupposes an ontology: A set of assumptions about what the elementary building blocks of the reality to be represented actually are. By necessity, it constructs primitives. For example, many natural languages, viewed as representational systems, typically assume that extralinguistic reality is constituted by objects, properties, and relations. Their underlying ontology then frequently carries over into folk-psychological contexts, for instance, by influencing the way in which we naively describe our own mental states in everyday life situations. A quantum mechanical description of reality in physics, on the other hand, may successfully operate under a completely different set of metaphysical assumptions. It represents reality under a different ontology, because it uses different primitives. An interesting, but frequently overlooked fact is that the human brain possesses an ontology too, because it makes assumptions about what the relevant and what the irreducible building blocks of external reality are. By metaphorically speaking of the brain as “making assumptions” (or as “arriving at conclusions”, etc., see below), of course, we do not mean to imply that the brain is a cognitive or epistemic agent, or even that it internally uses propositional formats and symbolic coding principles. Persons make assumptions, and brains are only subpersonal functional modules of whole persons. In addition, the representational dynamics of brains defies the distinction between syntax and semantics. It has to be conceived as an “agent-free” type of subpersonal self-organization. Personal-level predicates, or what philosophers sometimes call “intentionalist idioms”, have to be avoided on all subpersonal levels of description. 
There is no little man in the head interpreting quasi-linguistic representations, and, for the approach sketched here, the phenomenal self is something that has no transcendental subject or conscious agent “behind it”. However, in generating a coherent internal world-model, brains decompose target space in a certain way. The functional regularities guiding this decomposition process are by no means arbitrary. We are interested in the causally relevant invariance guiding it, and this is what we mean by investigating the brain’s “ontological assumptions”. Please note that we will here take no position at all on the ontological commitments of cognitive neuroscience or the metaphysics generated by scientific theories in general. Furthermore, although we are interested in how this process constrains and enables high-level phenomena like subjectivity and phenomenal experience, we do not want to imply a solution to the problem of consciousness as such. As will become obvious, the representational dynamics we describe cuts across the border between phenomenal and non-phenomenal states in an interesting way. Our main point is that there exist unconscious functional precursors of what can later also be phenomenally represented as a goal, an acting self or an individual first-person perspective. This is a point where some conceptual tools provided by philosophy of mind can shed more light on the general problem. Two research targets are of particular interest in this context, because they enrich the system’s functional ontology in a decisive way: The phenomenal self-model (PSM) and the phenomenal model of the intentionality-relation (PMIR). The PSM is the integrated conscious model an organism may have of itself as a whole. The PMIR is the integrated conscious model an organism may have of itself as standing in a relation to certain perceptual objects, other agents, and, in particular, the goals of other agents. In folk-psychological terms, the content of the PSM is what we sometimes naively call “the” self. The content of the PMIR is what we call “our” perspective on the world and other subjects (see Metzinger 2003 for details). Our claim is that the mirror system makes a centrally relevant microfunctional contribution to both of these high-level phenomena, because, on the neural level, it helps to code the representational content in question: First, the ongoing activity of the motor system obviously is a major contributor to any system’s current self-model; second, there now is strong empirical evidence for mirror neurons as specifically coding organism-object relations on various levels of abstraction. We will come back to this point at the end of our paper. Before we can do so, it will be necessary to further clarify our conceptual framework and show how it is integrated with the data upon which we draw.
5. Goals
We think that the neuroscientific data so far presented make it necessary to conceptually restructure the questions that have to be asked. Let us begin by defining some basic notions. What is a goal? From a strict scientific point of view, no such things as goals exist in the objective world. All that exists are goal-representations, for instance, as we have seen in Section 2., those activated by biological nervous systems. A goal-representation is, first, formed by the representation of a certain state of the organism, of the world, or by the holding of a certain relation between the organism and a part of the world, e.g., another organism. Goal-representations are representations of goal-states. Second, what functionally makes such a state a goal-state is the fact that its internal representation is structured along an axis of valence: It possesses a value for the system. Broadly speaking, a value is anything that is conducive to preserving an organism’s integrity (e.g., homeostasis), to maintaining integration on higher levels of complexity (e.g., cognitive development and social interaction), and to procreative success. Therefore, the reward system will be a second important element of the way in which a goal-representation can be implemented in a causally effective way. Goal-states imply values on the level of the individual organism, and values are made causally effective through the reward system. It is interesting to note how infants differently construe goal-relatedness when witnessing the intentional actions of other individuals as opposed to physical events not involving human agents. When 18-month-old infants see a person slip and fail to complete an intended action, they imitate the intended action and not the actual movements that the actor made. However, if the action is displayed by a mechanical device, they fail to successfully reproduce it (Meltzoff 1995). A further argument favoring the hypothesis that goal-relatedness is differently perceived by infants in social and physical event configurations is provided by some recent findings by Amanda Woodward and collaborators (Woodward, Sommerville, and Guajardo 2001). These researchers have shown that 6-month-old infants react differently to observed grasping actions according to the biological (human hand) or artificial (mechanical claw) nature of the grasping agent. Only the former are considered as goal-directed actions. It appears therefore that infants’ early propensity to attend to goals is specific to human actors.
Biological systems are systems under evolutionary pressure, systems having to predict future events that possess a high survival value. Therefore, the prehistory of representational goal-states is likely to be found in the reward system. Why is this so? Reward is the payoff of the self-organizing principles that functionally govern and internally model the organization of an open system such as a living body. Every organism has an internal, likely predetermined and genetically imprinted “drive” pushing toward homeostasis. A reward system is necessary to tell the organism that it is doing right, that it is heading along the right track, the track leading through multiple stages to the achievement of higher and more complex levels of integration. Higher integration means greater flexibility, which in turn means fitness, better adaptation to changing environments, better chances to pass on genes, and the like. In healthy individuals, the architecture of their goal-state hierarchy expresses their individual “logic of survival” (Damasio 1999). Therefore, the reward system can be conceived of as generating a new, non-symbolic kind of representational content, namely the internal value assigned to a certain state. It is important to note that even if “value” (just like “goal”) in the normative sense is not a public property observable from a scientific third-person perspective, an internal simulation of value within an individual organism can be causally effective, and adaptive. A conscious representation of value, as for instance expressed in a subjectively experienced emotional state, has the additional functional advantage of making survival value-related information globally available for the selective and flexible control of action, attention, and cognition within a virtual window of presence. Global availability means that this information is accessible to many different processing systems at the same time (Baars 1988; Metzinger 2003). Presence means that it is functionally integrated into short-term memory and therefore phenomenally experienced as holding now. A further interesting aspect of the relation between goal-state representations and reward consists in the fact that very soon the infant also learns to rely more and more on external causes for the internal activation of the reward system. The positive reactions (or their lack) induced in the adult caregivers by the infant’s behavior provide her almost on-line with a very useful guide about how to act in a given context. It has indeed been shown that around six months of age infants visually “check back” to the mother’s emotional reaction in order to disambiguate ambiguous or uncertain events. This phenomenon has been called “social referencing” (see Thompson 1999).
Once the value of the adults’ reaction has been evaluated, a given goal-state representation can reliably be consolidated (or put into question, in case of negative reactions), making it a stable part of the infant’s behavioral repertoire. Neurally realized goal-state representations possess a number of interesting features. First, they change the system’s functional ontology: It now acts as if something like goals actually were a part of reality, as if maximizing complexity and integration were a good in itself. As a representational system, it embodies an assumption, and this assumption – a representational construction of valence – becomes causally active. To use a metaphor coined by Francisco Varela, teleology is now “enacted” (Varela, Thompson, and Rosch 1991). Second, on the level of conscious goal-representation, goal representations frequently are phenomenally transparent. This is to say that the organism “looks through” the actual representational mechanism and “directly” assigns value to perceived target objects, to specific subject-object-relations characterizing certain goal-states, or even to another agent. Phenomenal transparency turns an organism into a naive realist. Because earlier processing stages are not accessible to introspective attention, their content appears as directly and immediately given. On the level of conscious experience, the internal construct of “valence” therefore frequently becomes an objective property perceived in external parts of reality. However, at least in humans, phenomenally opaque goal-representations do exist as well: In these cases we consciously experience them as representations. We experience ourselves as beings that actively construct their own goals, and then operate with internal representations of the corresponding goal-states. Conscious goal representations are positioned on a continuum between transparency and opacity, with a strong bias towards transparency (for more on phenomenal transparency, see Metzinger 2003, Sections 3.2.7. and 6.4.2.). A third feature makes goal-representations interesting: Goal-representations do not possess truth-conditions, but conditions of satisfaction. Because no such things as goals exist in the objective order of things, a goal-representation cannot be true or false. However, the goal-states as represented can either hold or not hold, or they can hold to various degrees. Therefore, an internally represented goal-state can continuously be matched against sensory input (or even against ongoing internal simulations like memories, or the results of planning operations). Fourth, goal-representations are intimately linked to an organism’s self-representation: In standard situations they typically depict an individual logic of survival. Only on higher levels of complexity do we find mentally represented first-person plural and socially individuated goal-states.
As representations are also dynamical entities, it is best to conceive of the special form of representational content formed by goal-states as an ongoing process that allows an organism to functionally appropriate the fact that certain states of itself, of the world and its social environment are valuable states to itself. The possession of goal-states, if integrated into a coherent self-model, allows an organism to own the logic of its own survival – functionally, representationally, and sometimes even consciously. Goal-states are states the organism wants to bring about, by means of goal-representations. Goals – as representational constructs – are fundamental elements of the brain’s model of the world. In particular they are building blocks of behavioral space (as represented by the brain). They turn its functional ontology into a teleological ontology, or, in short, into a “teleontology”.
On the other hand, the building blocks of behavioral space seem to be abstract representations of possible actions, as seen from no particular perspective, because the empirical evidence (e.g., on mirror neurons, see above) now clearly seems to point to the existence of agent-independent goal detectors. Therefore, a particularly interesting way of analyzing the content of goal-representations is as states that portray a successful, completed action from no particular/individual perspective. The new theoretical problem to be solved is to explain how precisely these allocentric entities later get bound to the conscious perspective of a full-blown agent. But what is an “action”? And what is a “perspective”?
6. Actions
On a non-conceptual level, actions are elementary building blocks of reality for certain living organisms: Some kinds of organisms have developed agent-detecting modules, and some of them also conceive of themselves as agents. They have an extended ontology, because their reality has been considerably enriched. We can define such functionally enriched systems as possessors of an “action ontology”. Let us now define what an action is on a conceptual level. Let us begin by distinguishing movements, behaviors, and actions. Bodily movements are simple physical events, and they can be represented accordingly. Behaviors are movements that are goal-directed, i.e., which can meaningfully be described as directed towards a set of satisfaction conditions, but without necessarily being linked to an explicit and conscious representation of such conditions. As simple motor acts, they also do not have a consciously experienced reward-producing component (Rizzolatti, Fogassi, and Gallese 2001: 668). In particular, behavior is something that can take place in a semi-automatic fashion, as when brushing our teeth. Actions are a specific subset of goal-directed movements: A series of movements that are functionally integrated with a currently active representation of a goal-state as leading to a reward constitutes an action. Therefore, an action is not isomorphic to a particular movement or specific behavioral pattern, because many different movements can constitute the same goal-directed action. What individuates an action is the set of satisfaction conditions defining the representational content of its goal-component as leading to a reward plus the special way in which it is causally linked to the actual event of overt movement generation. In particular, an action results from a selection process (which may or may not be conscious) and a representation of the system as a whole as standing in a certain relation to a specific goal-state (which is phenomenally represented, e.g., globally available via short-term memory). The second defining characteristic is that an action in the true sense not only involves an explicit and conscious self-representation, but also a representation of the perspective the system now takes on the world. That is, the selection process may well be unconscious, but it inevitably leads to a more global final stage resulting in a conscious representation of the system as a whole – as having an intention, as initiating and executing its own bodily movements. In other words, on the phenomenal level we always find a corresponding global state in which the system as a whole is itself represented as an agent. This leads us to the third, and most intriguing, aspect of actions in the strong sense we are here proposing: Actions are first-person phenomena, and they are carried out by a conscious self. It is important to understand that all of this does not necessarily involve reflexive self-consciousness, the possession of concepts, or the mastering of a language: In animals such as monkeys an attentional and a volitional perspective could suffice to establish the first-person character of actions. In order to be appropriately related to an action goal it is enough to be able to (non-cognitively, sub-symbolically) attend to it or to (non-cognitively, sub-symbolically) select a specific motor pattern. At least these are two further assumptions underlying the way in which the brain, as a representational system, typically decomposes the reality it models into sub-components. For full-blown actions, we can only understand the special causal linkage between a goal-representation and the overt motor output, if we also understand how both are mediated through a specific representational structure, namely the “phenomenal model of the intentionality relation” (PMIR, see above).
This structure will be explored in the next section.
7. The PMIR: A short representationalist analysis
7.1. What is a PMIR?
Before addressing the problem of the phenomenal aspect of the shared action ontology, we must first focus on the way the agent-object relationship can be phenomenally represented. What is the phenomenal model of the intentionality relation (for details see Metzinger 2003, Chapter 6.)? It is a conscious mental model, and its content is an ongoing, episodic subject-object-relation.
Here are four different examples, in terms of typical phenomenological descriptions of the class of phenomenal states at issue: “I am someone, who is currently visually attending to the color of the book in my hands”, “I am someone currently grasping the content of the sentence I am reading”, “I am someone currently hearing the sound of the refrigerator behind me”, “I am someone now deciding to get up and get some more juice”. The central defining characteristic of phenomenal models of the intentionality-relation is that they depict a certain relationship as currently holding between the system, as transparently represented to itself, and an object-component. Phenomenologically, a PMIR creates the experience of a self in the act of knowing or of a self in the act of intending and acting. In the latter case, a goal-representation is transiently integrated into an ongoing action-simulation. This class of phenomenal mental models is particularly rich, because the number of possible object components is almost infinitely large. Phenomenal models of the intentionality-relation typically consist of a transparent subject component and varying object components. Those can be transparent as well as opaque, transiently being integrated into an overarching, comprehensive representation of the system as standing in a specific relation to a certain part of the world. The overall picture emerging is that of the human self-model continuously integrating the mechanisms of attentional, cognitive, and volitional availability against a stable background, which is formed by the transparent representation of the bodily self (Metzinger 2000, 2003, Chapter 6.). 
If one now takes the step from a representationalist level of description to the actual phenomenological changes inherent in the emergence of a full-blown conscious first-person perspective, it is easy to see how for the first time it allows a system to consciously experience itself as being not only a part of the world, but of being fully immersed in it through a dense network of causal, perceptual, cognitive, attentional, and agentive relations.
7.2. The phenomenal representation of volition
Given the notion of a PMIR, we can begin to develop a more fine-grained analysis for the phenomenological target properties of volitional subjectivity and agency. Conscious volition is generated by integrating abstract goal representations – constituted by allocentric motor-simulations – into the current model of the intentionality-relation as object-components, in a process of decision or selection. Let us differentiate a number of cases. If we contemplate a certain action goal, i.e., when we ask ourselves whether we should get up and walk over to the refrigerator, we experience ourselves as cognitive subjects. This kind of phenomenally represented subject-object relationship can be analyzed as one in which the object-component is phenomenally opaque: Experientially, we know that we take a certain attitude toward a self-generated representation of a goal. A completely different situation ensues if we integrate a goal representation into the phenomenally transparent self-model, thereby making it a part of ourselves by identifying with it. Obviously, goal representations and goal hierarchies can also be important components of self-models that are not based on transient subject-object relations, but on enduring internal reorganizations of the self-model, of its emotional and motivational structure, etc., and which may possibly last for a lifetime. A volitional first-person perspective – the phenomenal experience of practical intentionality – emerges if two conditions are satisfied. First, the object-component must be constituted by a particular self-simulatum, by a neural simulation of a concrete behavioral pattern, e.g., getting up and walking toward the refrigerator. Second, the relationship depicted on the level of conscious experience is one of currently selecting this particular behavioral pattern, as simulated. Again, we can usefully illustrate this by describing it as “representational identification”: The moment following volition, the moment at which concrete bodily behavior actually ensues, is the moment in which the already active motor simulation is integrated into the currently active bodily self-model, and thereby causally coupled to the rest of the motor system and the effectors.
It is precisely the moment in which we identify with a particular action, transforming it from a possible into an actual pattern of behavior and thereby functionally as well as phenomenologically embodying it. Embodiment leads to enaction. Interestingly, the moment of agency seems to be the moment when the phenomenal model of the intentionality relation collapses. We can now describe the experience of being a volitional subject and the experience of being an agent more precisely, using the simple tools already introduced. Phenomenal volition is a form of conscious content. It can be analyzed as representational content as follows: [I myself ( = the currently active transparent model of the self) am currently ( = the de-nunc-character of the overall phenomenal model of the intentionality relation, as integrated into the virtual window of presence) present in a world ( = the transparent, global model of reality currently active) and I am just about to select ( = the type of relation depicted in the phenomenal model of the intentionality relation) a possible way to walk around the chairs toward the refrigerator ( = the object-component, constituted by an opaque simulation of a possible motor pattern in an egocentric frame of reference)]. The experience of agency follows in the moment in which the internal “distance” created between phenomenal self-representation and phenomenal self-simulation in the previously mentioned structure collapses to zero: I realize a possible self-state, by enacting it. As I experience myself walking around the chairs and toward the refrigerator, proprioceptive and kinesthetic feedback allows me to feel the degrees to which I have already identified with the sequence of bodily movements I have selected in the previous moment. Please recall how phenomenally transparent representations are precisely those representations the existence of whose content we cannot doubt. They are those we experience as real, whereas opaque representations are those which we experience as thoughts, as imagination, or as hallucinations. Realizing a simulated self-state means developing a functional strategy of making it the content of a transparent self-model, of a self that really exists – on the level of phenomenal experience. Ongoing agency, the conscious experience of sustained executive control, can therefore be representationally analyzed according to the following pattern: [I myself (the content of the transparent self-model) am currently ( = the de-nunc-character of the phenomenal model of the intentionality relationship as integrated into the virtual window of presence) present in a world ( = the transparent, global model of reality) and I am currently experiencing myself as carrying out ( = continuously integrating into the transparent self-model) an action which I have previously imagined and selected (the opaque self-simulation forming the object-component, which is now step by step assimilated into the subject-component)].
Of course, there can be all sorts of additional functional and representational complications, e.g., if the proprioceptive and kinesthetic feedback integrated into the internal model of the body does not match with the forward model still held active in working memory. It is interesting to see how agency conceived of as executive consciousness (“Vollzugsbewusstsein” in the sense of Karl Jaspers) can be analyzed as an ongoing representational dynamics collapsing a phenomenal model of the practical intentionality relationship into a new transparent self-model. Again, as the whole structure is embedded into a virtual window of presence, the transparent, intranscendable experiential state for the system as a whole now is one of being a full-blown volitional subject, currently being present in a world, and acting in it. Finally, the PMIR has a phenomenally experienced direction: PMIRs are like arrows pointing from self-model to object-component. As soon as one has understood the arrow-like nature of the PMIR, two special cases can be much more clearly described. First, the arrow can point not only outwards, but also inwards, namely in cases where the object component is formed by the PSM itself (as in introspective attention or in thinking about oneself). Here, the second-order PMIR internally models a system-system-relationship instead of a system-object-relationship. Furthermore, in consciously experienced social cognition the object-component can now be formed either by a phenomenal model of another agent or by an arrow in the other agent’s head (as in observing another human being observing another human being). Such ideas are appealing, because they show how the relevant representational domain is an open domain: In principle, many layers of complexity and intersubjective metacognition can be added through a process of social/psychological evolution. As the elementary representational building block, the PMIR, gets richer and more abstract, an ascending and cognitively continuous development from the simple portrayal of body-centered subject-object-relations to full-blown self-other modeling becomes conceivable. What changes from monkeys to humans is only the complexity of the self-modeling process.
8. Intentionality phenomenalized
The philosophical step just taken consists in phenomenalizing intentionality. Phenomenalizing intentionality, we submit, may be an indispensable first step in the project of naturalizing intentionality tout court. Meaning and the conscious experience of meaningfulness have to be separated. Of course, this does not mean that no such thing as intentional content exists. Mental representations possess at least two kinds of content: Phenomenal content (PC) and intentional content (IC). As a large majority of researchers in philosophy of mind currently agrees, PC can be characterized by the Principle of Local Supervenience (PLS):
(PLS): By nomological necessity, once all internal and contemporaneous properties of an organism’s nervous system are fixed, the PC of its mental states is fixed as well.
Very obviously, the same is not true of IC: What a representational state is a representation of also depends on the environment of the organism, e.g., on what it actually represents and whether it does so correctly or adequately. The PC, on the other hand, is something a veridical perception and a hallucination may share. The phenomenal content of your mental representations is that aspect which, being independent of their veridicality, is available for conscious experience from the first-person perspective, while simultaneously being determined by exclusively internal properties of your brain. (Metzinger 2000: 3; for more on supervenience, see Kim 1993). However, what is not yet determined is whether this experiential character also goes along with actual knowledge about the world, with intentionality in terms of a biological information-processing system generating states possessing representational – or intentional – content. IC, in many cases, is determined by external and non-local factors. PC, on the other hand, is what stays invariant regardless of whether a perception is veridical or a hallucination. The frequently overlooked point, to which we here draw attention, is that intentionality (on a pre-rational level, probably starting with the sensorimotor system and early levels of attentional processing) is itself depicted on the level of phenomenal content. And it is precisely this kind of conscious content which has guided theoreticians for centuries in developing their own, now classic theories of intentionality. Due to the principle of local supervenience (see above), it has today become highly plausible that this aspect of intentionality can be naturalized. The phenomenal experience of being an intentional agent, of being a perceiving, attending and cognizing subject can be naturalized. Of course, this does in no way preclude the possibility that intentional content as such can never, and maybe even for principled reasons, be naturalized.
However, getting the first obstacle out of the way may greatly help in gaining a fresh access to intentionality as such, because it frees us from the burden of false intuitions generated by our own transparent model of reality and because it helps us to set aside the issue of how we come to consciously experience our mental states as meaningful and directed towards an object-component. We can separate the issue of consciously experienced intentionality from the more general problem of how something like representational content could evolve in the minds of human beings and other animals at all. In this context, it is interesting to note how the originally philosophical concept of a conscious model of the intentionality-relationship (Metzinger 1993, 2000, 2003) currently surfaces at a number of places in the cognitive
neurosciences. Jean Delacour, in an excellent review of current ideas about possible neural correlates of conscious experience, explicitly introduces the notion of an “intentionality-modeling structure” (Delacour 1997: 138). LaBerge (1997: 150, 172) elucidates how important an understanding of the self-representational component present in attentional processing will have to be for a full blown theory of conscious attention. Craik and colleagues (1999) point out how episodic memory, of course, is a process of reconstructing what was here termed a PMIR, because one necessary constituent of memory retrieval is not simply the simulation of a past event, but an association of this simulation with a self-representation (Craik et al. 1999: 26). Building an autobiographic memory is a process of self-related encoding, and conscious, episodic memory retrieval is a process necessarily involving the self-model, because reactivating a PMIR inevitably means reactivating a PSM. Most notable, of course, is Antonio Damasio’s conception of a “juxtaposition” of self and object (see Damasio and Damasio 1996a: 172, 1996b: 24) and the general framework of a fully embodied “self in the act of knowing” (Damasio 1994, 1999). The theory we are trying to build is at present mute about the question if anything like “real” intentionality does exist. A possible speculation is that philosophical models of the intentionality of the mental have ultimately resulted from a naive realistic interpretation of the process of visual attention, of the phenomenal self directing its gaze at a visual object, thereby making it more salient, and simply elevating this interpretation to the level of epistemology. The concept of the intentionality of the mental may simply be a mistaken attempt to theoretically model epistemic relations in accordance with the consciously experienced model of the intentionality relation. Pursuing this point is outside the scope of the present article. 
However, we will briefly point to two interesting aspects of the overall issue, which are of considerable theoretical interest. Of particular interest is the fact that the brain models the relationship between subject and object as an asymmetric relationship. It is the consciously experienced “arrow of intentionality”, paradigmatically experienced in having the feeling of “projecting” visual attention outwards, as it were, or in attentionally “tracking” objects in the environment. Intendere arcum, to bend the bow of the mind and point the arrow of knowledge toward parts of the world is an intuitively plausible and popular philosophical metaphor, in particular in combination with the idea of “direct”, magical intentionality.
We can now understand why such an idea will strike beings like us as intuitively plausible: It is phenomenally possible, because there is a directly corresponding structural element in our conscious model of reality. Many theoretical models of the representational relationship are implicitly oriented toward the phenomenal experience of visual attention, of the directedness inherent in the phenomenal model of the intentionality relation. Frequently, the theoretical model we design about ourselves as cognitive agents is one of organisms which, ad libitum, direct the beam of their “epistemic flashlight” at parts of the world or their own internal lives, of beings which generate the representational relation as subjects of experience. This can lead to the kind of fallacy that Daniel Dennett has described as “Cartesian materialism” (Dennett 1991: 333). As Dennett has pointed out, many of the different forms of Cartesian materialism, the assumption of a final inner stage, can also be generated in the context of representationalist theories of mind by mistakenly transporting what he called the “intentional stance” (Dennett 1987a) into the system (see Dennett 1991: 458). The model here proposed, of course, does not make this mistake; it is much more closely related to an extension of the idea of a “second-order intentional system” (Dennett 1987b), a system that applies the intentional stance to itself – but in a phenomenally transparent manner. A related hypothesis is that philosophical theorizing about the intentionality relation has generally been influenced by that aspect of our phenomenal model of reality that is generated by our strongest sensory modality. If the process of mental representation in general is modeled in accordance with our dominant sensory modality (namely, vision), we will automatically generate distal objects – just as we do in our transparent, visual model of reality.
If the object-component of a PMIR is of an opaque nature, as in genuinely cognitive contents or in goal representation, a philosophical interpretation of these mental contents as non-physical, “intentionally inexistent” objects in the sense of Brentano ([1874] 1973) becomes inevitable.
9. Consciously sharing an action ontology: A phenomenological multilevel analysis
Phenomenal mental models are instruments used to make a certain subset of information currently active in the system globally available for the control of action, for focal attention and for cognitive processing. A phenomenal model of transient subject-object relations makes an enormous amount of new information available for the system: All information related to the fact that it is currently perturbed by perceptual objects, that certain cognitive states are currently occurring in itself, e.g., to the fact that certain abstract goal representations are currently active, that there are a number of concrete self-simulations connecting the current system-state with the state the system would have if this goal state were realized; allowing for selective behavior and the information that it is a system capable of manipulating its own sensory input, e.g., by turning its head and directing its gaze to a specific visual object. A first-person perspective allows a system to conceive of itself as being part of an independent objective order, while at the same time being anchored in it and able to act on it as a subject (see, e.g., Grush 2000). Let us now move to the social dimension. Once a system is capable of representing transient subject-object relations in a globally available manner it becomes possible for the object-component in the underlying representational structure to be formed by the intentions of other beings. Once again the brain’s ontology is expanded, and, as we have learned in the preceding sections, the motor system plays a crucial role in this functional expansion: A phenomenal first-person perspective allows for the mental representation of a phenomenal second-person perspective. Therefore, it is of central relevance to find the subpersonal building blocks implementing this capacity. A representationalist analysis of the matching mechanism instantiated by mirror neurons clearly shows that the object component of an other-agent PMIR simulated by F5 mirror neurons does not have to be visible in order for the full-blown neural response to occur (Umiltà et al. 2001). Mirror neurons code not object-presence, but rather the relational fact that a certain external agent is currently directed at an object-component.
The natural default assumption, therefore, is that if the other agent is conscious they code the existence of a PMIR in another agent, and interestingly they do so in virtue of being a central part of the system that, in other situations, constructs an internal PMIR in the monkey herself. To have a PMIR inevitably means to have a phenomenal self-model. But to be self-conscious does not imply having language, concepts, being able to mentally form a concept of oneself, etc. Body image and visceral feelings are enough. Because the monkey’s motor system allows for prediction of actions/occluded target objects, neurons responsive to the observation of goal-related behaviors might actually be triggered not by a visual representation alone, but by the “embedded motor schema”, that is the same motor schema which is at work when the monkey itself is acting in a similar way.
We posit that the representational deep structure of this schema is what later evolved into the full-blown PMIR in human beings. It should, again, be stressed that the notion of a PMIR as here introduced does not yet imply being able to mentally form a concept of oneself as oneself, consciously experiencing the selection process for goal-states, etc. An elementary self-model in terms of body image and visceral feelings plus the existence of a low-level attentional mechanism is quite enough to establish the basic representation of a dynamic subject-object relation. The non-cognitive PMIR is thus what builds the bridge into the social dimension. Once a rudimentary subjective perspective has been established with the help of the motor system, inter-subjectivity can follow. The dynamics of low-level intersubjectivity then helps to further develop, enrich, and stabilize the individual first-person perspective in each participating agent. If a functional mechanism for discovering and phenomenally representing the unobservable goal states of conspecifics is in place, the observed behavior of other systems in the organism’s environment can lead to the activation of a goal-representation, which in turn can be represented as belonging to someone else. This would be a decisive enrichment of the organism’s functional ontology, because these goal states would then irrevocably become a part of its reality, leading to a change in its behavioral profile. As we have seen, it is empirically plausible that such a mechanism is actually in place in human beings. Therefore, representations of the intentions of external agents can now become the object-component of the phenomenal model of the intentionality relation as well. This enables further high-level phenomena.
Behavior-reading is transformed into mind-reading, because the representational tools for action representation allow for action simulation as well (see Gallese 2003a, 2003b; Gallese and Goldman 1998), including the simulation of goal-states (i.e., of specific, successfully terminated actions, of new and successfully achieved subject-object-relations) in other agents. If this happens on the level of conscious experience, a completely new and highly interesting form of information is integrated into a virtual window of presence and becomes globally available for the system: The information of actually now standing in certain relations to the goals of other conspecifics. We would claim that it is precisely the conscious availability of this type of information, which turned human beings from acting, attending and thinking selves into social subjects. In this paper we have mainly been concerned with the investigation of subpersonal elements of the brain’s “action ontology”. Let us close with a
very specific high-level capacity that could arise from such elements. If the fact that you are constantly not only standing in perceptual and behavioral relations to your environment, but that you are frequently realizing subject-subject relationships becomes globally available, it also becomes available for cognition. This, in turn, will allow those systems capable of concept formation to cognitively model social relations from a third-person perspective. Such beings can mentally represent social relationships between other individuals depicted as intentional agents, even if they are not involved themselves. Furthermore, if such systems possess or develop the concept of a “person”, they can now mutually apply it to each other. Through our extended PSMs we were able to simultaneously establish correlated cognitive PMIRs of the type just described. Two or more human beings could now at the same time activate cognitive PMIRs mutually pointing to each other under a representation as rational individuals. And the correlated nature of these two mental events, their mutuality and interdependence, could itself be represented on the level of global availability. In this way we were able to mutually acknowledge each other as persons, and to consciously experience this very fact. We will not pursue this point at length here, but it is obvious how this particular representational capacity is of high relevance for social cognition and the pooling of cognitive resources. Social cognition, empathy, and cognition in general need a minimal level of complexity in the brain’s functional ontology, and in human beings and primates the premotor cortex seems to be a substantial part of the physical substrate underpinning this level.
Acknowledgements
We thank Wolfgang Prinz for a number of very helpful comments on an earlier version of this paper. Vittorio Gallese was supported by MIURST and the Eurocores Program of the European Science Foundation. Thomas Metzinger was supported by the McDonnell Project in Philosophy and Neuroscience.
References
Baars, Bernard J. 1988 A Cognitive Theory of Consciousness. Cambridge: Cambridge University Press.
Brentano, Franz 1973 [1874] Psychologie vom empirischen Standpunkt, Vol. 1, Oskar Kraus (ed.). Hamburg: Meiner. [Brentano, F. [1874] (1973). Psychology from an Empirical Standpoint, Linda McAlister (ed.). Translated by Antos C. Rancurello, D. B. Terrell, and Linda McAlister. London: Routledge and Kegan Paul/New York: Humanities Press.]
Buccino, Giovanni, Ferdinand Binkofski, Gereon Fink, Luciano Fadiga, Leonardo Fogassi, Vittorio Gallese, Rüdiger Seitz, Karl Zilles, Giacomo Rizzolatti, and Hans-Joachim Freund 2001 Action observation activates premotor and parietal areas in a somatotopic manner: An fMRI study. European Journal of Neuroscience 13: 400–404.
Cochin, Stéphanie, Catherine Barthelemy, B. Lejeune, Sylvie Roux, and Joëlle Martineau 1998 Perception of motion and qEEG activity in human adults. Electroencephalography and Clinical Neurophysiology 107: 287–295.
Craik, Fergus I. M., Tara M. Moroz, Morris Moscovitch, Donald T. Stuss, Gordon Winocur, Endel Tulving, and Shitij Kapur 1999 In search of the self: A positron emission tomography study. Psychological Science 10: 26–34.
Damasio, Antonio R. 1994 Descartes’ Error. New York: Putnam/Grosset.
1999 The Feeling of What Happens: Body and Emotion in the Making of Consciousness. New York: Harcourt, Brace and Jovanovich.
Damasio, Antonio R., and Hanna Damasio 1996a Images and subjectivity: Neurobiological trials and tribulations. In The Churchlands and Their Critics, Robert N. McCauley (ed.), 163–175. Cambridge, MA: Blackwell.
1996b Making images and creating subjectivity. In The Mind-Brain Continuum: Sensory Processes, Rodolfo R. Llinás and Patricia Smith Churchland (eds.), 19–28. Cambridge, MA: MIT Press.
Decety, Jean, and Julie Grèzes 1999 Neural mechanisms subserving the perception of human actions. Trends in Cognitive Sciences 3: 172–178.
Decety, Jean, Julie Grèzes, Nicolas Costes, Daniela Perani, Marc Jeannerod, E. Procyk, Franco Grassi, and Ferrucio Fazio 1997 Brain activity during observation of actions. Influence of action content and subject’s strategy. Brain 120: 1763–1777.
Delacour, Jean 1997 Neurobiology of consciousness: An overview. Behavioural Brain Research 85: 127–141.
Dennett, Daniel C. 1987a The Intentional Stance. Cambridge, MA: MIT Press.
1987b Intentional systems in cognitive ethology: The Panglossian Paradigm defended. In Daniel C. Dennett 1987a, 237–268. First published in Behavioral and Brain Sciences 6: 343–390.
1991 Consciousness Explained. Boston/Toronto/London: Little Brown.
Fadiga, Luciano, Leonardo Fogassi, Gianni Pavesi, and Giacomo Rizzolatti 1995 Motor facilitation during action observation: A magnetic stimulation study. Journal of Neurophysiology 73: 2608–2611.
Filion, Christine M., David A. Washburn, and Jonathan P. Gulledge 1996 Can monkeys (Macaca mulatta) represent invisible displacement? Journal of Comparative Psychology 110: 386–395.
Fogassi, Leonardo, and Vittorio Gallese 2002 The neural correlates of action understanding in non human primates. In Stamenov and Gallese (eds.), 13–36.
Gallese, Vittorio 2000 The acting subject: Towards the neural basis of social cognition. In Metzinger (ed.), 325–333.
2001 The “Shared Manifold” Hypothesis: From mirror neurons to empathy. Journal of Consciousness Studies 8 (5–7): 33–50.
2003a The manifold nature of interpersonal relations: The quest for a common mechanism. Philosophical Transactions of the Royal Society London B 358: 517–528.
2003b A neuroscientific grasp of concepts: From control to representation. Philosophical Transactions of the Royal Society London B 358: 1231.
Gallese, Vittorio, Luciano Fadiga, Leonardo Fogassi, and Giacomo Rizzolatti 1996 Action recognition in the premotor cortex. Brain 119: 593–609.
2002 Action representation and the inferior parietal lobule. In Common Mechanisms in Perception and Action: Attention and Performance XIX, Wolfgang Prinz and Bernhard Hommel (eds.), 334–355. Oxford: Oxford University Press.
Gallese, Vittorio, Pier F. Ferrari, and M. Alessandra Umiltà 2002 The mirror matching system: A shared manifold for intersubjectivity. Behavioral and Brain Sciences 25: 35–36.
Gallese, Vittorio, and Alvin Goldman 1998 Mirror neurons and the simulation theory of mind-reading. Trends in Cognitive Sciences 2 (12): 493–501.
Goldman, Alvin 1989 Interpretation psychologized. Mind and Language 4: 161–185.
1992 In defense of the simulation theory. Mind and Language 7: 104–119.
Gordon, Robert M. 1986 Folk psychology as simulation. Mind and Language 1: 158–171.
1995 Simulation without introspection or inference from me to you. In Mental Simulation: Evaluations and Applications, Martin Davies and Tony Stone (eds.), 53–67. Oxford: Blackwell.
Grafton, Scott T., Michael A. Arbib, Luciano Fadiga, and Giacomo Rizzolatti 1996 Localization of grasp representations in humans by PET: 2. Observation compared with imagination. Experimental Brain Research 112: 103–111.
Grush, Rick 2000 Self, world and space: The meaning and mechanisms of ego- and allocentric spatial representation. Brain and Mind 1: 59–92.
Hari, Riitta, Nina Forss, Sari Avikainen, Erika Kirveskari, Stephan Salenius, and Giacomo Rizzolatti 1998 Activation of human primary motor cortex during action observation: A neuromagnetic study. Proceedings of the National Academy of Sciences of the United States of America 95: 15061–15065.
Iacoboni, Marco, Roger P. Woods, Marcel Brass, Harold Bekkering, John C. Mazziotta, and Giacomo Rizzolatti 1999 Cortical mechanisms of human imitation. Science 286: 2526–2528.
Keysers, Christian, Evelyne Kohler, M. Alessandra Umiltà, Leonardo Fogassi, Luca Nanetti, and Vittorio Gallese 2003 Audio-visual mirror neurons and action recognition. Experimental Brain Research 153: 628–636.
Kim, Jaegwon 1993 Supervenience and Mind. Cambridge: Cambridge University Press.
Knoblich, Guenther, and Rüdiger Flach 2003 Action identity: Evidence from self-recognition, prediction, and coordination. Consciousness and Cognition 12: 620–632.
Kohler, Evelyne, Christian Keysers, M. Alessandra Umiltà, Leonardo Fogassi, Vittorio Gallese, and Giacomo Rizzolatti 2002 Hearing sounds, understanding actions: Action representation in mirror neurons. Science 297: 846–848.
Kohler, Evelyne, M. Alessandra Umiltà, Christian Keysers, Vittorio Gallese, Leonardo Fogassi, and Giacomo Rizzolatti 2001 Auditory mirror neurons in the ventral premotor cortex of the monkey. Social Neuroscience Abstract XXVII (129): 9.
LaBerge, David 1997 Attention, awareness, and the triangular circuit. Consciousness and Cognition 6: 149–181.
Meltzoff, Andrew N. 1995 Understanding the intentions of others: Re-enactment of intended acts by 18-month-old children. Developmental Psychology 31: 838–850.
Meltzoff, Andrew N., and M. Keith Moore 1977 Imitation of facial and manual gestures by human neonates. Science 198: 75–78.
Metzinger, Thomas 1993; ²1999 Subjekt und Selbstmodell. Paderborn: Mentis.
2000 The subjectivity of subjective experience: A representationalist analysis of the first-person perspective. In Metzinger (ed.), 285–306.
2003 Being No One. The Self-Model Theory of Subjectivity. Cambridge, MA: MIT Press.
Metzinger, Thomas (ed.) 2000 Neural Correlates of Consciousness. Empirical and Conceptual Questions. Cambridge, MA: MIT Press.
Rizzolatti, Giacomo, Laila Craighero, and Luciano Fadiga 2002 The mirror system in humans. In Stamenov and Gallese (eds.), 37–62.
Rizzolatti, Giacomo, Luciano Fadiga, Leonardo Fogassi, and Vittorio Gallese 1996a Premotor cortex and the recognition of motor actions. Cognitive Brain Research 3: 131–141.
Rizzolatti, Giacomo, Luciano Fadiga, Massimo Matelli, Valentino Bettinardi, Eraldo Paulesu, Daniela Perani, and Ferrucio Fazio 1996b Localization of grasp representation in humans by PET: 1. Observation versus execution. Experimental Brain Research 111: 246–252.
Rizzolatti, Giacomo, Leonardo Fogassi, and Vittorio Gallese 2000 Cortical mechanisms subserving object grasping and action recognition: A new view on the cortical motor functions. In The Cognitive Neurosciences, Michael S. Gazzaniga (ed.), 539–552. 2nd ed. Cambridge, MA: MIT Press.
2001 Neurophysiological mechanisms underlying the understanding and imitation of action. Nature Reviews Neuroscience 2: 661–670.
Stamenov, Maxim, and Vittorio Gallese (eds.) 2002 Mirror Neurons and the Evolution of Brain and Language. Amsterdam/Philadelphia, PA: John Benjamins.
Thompson, Ross A. 1999 Empathy and its origins in early development. In Intersubjective Communication and Emotion in Early Ontogeny, Stein Bråten (ed.), 145–157. Cambridge: Cambridge University Press.
Umiltà, M. Alessandra, Evelyne Kohler, Vittorio Gallese, Leonardo Fogassi, Luciano Fadiga, Christian Keysers, and Giacomo Rizzolatti 2001 I know what you are doing: A neurophysiological study. Neuron 32: 91–101.
Varela, Francisco J., Evan Thompson, and Eleanor Rosch 1991 The Embodied Mind. Cambridge, MA/London: MIT Press.
Wolpert, Daniel M., Kenji Doya, and Mitsuo Kawato 2003 A unifying computational framework for motor control and social interaction. Philosophical Transactions of the Royal Society London B 358: 593–602.
Woodward, Amanda L., Jessica A. Sommerville, and José J. Guajardo 2001 How infants make sense of intentional action. In Intentions and Intentionality: Foundations of Social Cognition, Bertram F. Malle, Louis J. Moses, and Dare A. Baldwin (eds.), 149–169. Cambridge, MA: MIT Press.
Formal representation of concepts: The Suggested Upper Merged Ontology and its use in linguistics
Adam Pease
We believe that human language can be meaningfully mapped to a formal ontology for use in computational understanding of natural language expressions. We have created a formal ontology in a first order logic language called the Suggested Upper Merged Ontology (SUMO) (Niles and Pease 2001) composed of roughly 1000 terms and several thousand formal statements about those terms. We have also created an index (Niles and Pease 2003) linking all 66,000 noun synsets, 12,000 verb synsets and 18,000 adjective synsets from WordNet (Fellbaum 1998) to terms in SUMO. The links have been made to version 1.6 and then ported to 2.0. The links were made one synset at a time, by hand, over the course of a year, rather than by an automatic process. We are working on using this index for natural language understanding tasks. In this chapter we describe SUMO, its WordNet mappings and use in natural language understanding and inference. We also contrast SUMO to other ontologies.
1. Other ontologies
The only other publicly available axiomatized ontology is the Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE) (Masolo et al. 2003). SUMO is a formal ontology, in that it is not simply a collection of terms and English definitions, but rather a fully axiomatized ontology, with definitions for terms provided in first order logic. Note that although the terms in SUMO were initially created as English labels (Pease 2000), they have no inherent linguistically dependent content. The labels are simply convenient mnemonics for the human ontologist, much like the names of variables in procedural software code. Each term name could be replaced with a meaningless unique code and still retain its meaning, since the meaning of a term is given solely by its formal axioms. SUMO is very different from taxonomies,
lightweight ontologies developed in frame systems like Protégé, or early versions of semantic web languages such as DAML+OIL, all of which either lack axioms entirely or have only very restricted axioms, which limits the use of such representations for inference. The most notable proprietary formal ontology is Cyc (Lenat 1995), for which the top one percent of the taxonomy, but not the rules, has been released publicly under the name OpenCyc. DOLCE has a similar purpose and business process to SUMO in that it is a free research project for use in both natural language tasks and inference. DOLCE has been carefully crafted with respect to strong principles. It is reported that DOLCE is also being mapped to a portion of WordNet, although this content has not been released at the time of this writing. DOLCE is described by its authors as an “ontology of particulars”, which they explain as meaning an ontology of instances, rather than an ontology of universals or properties. DOLCE does in fact have universals (classes and properties), but the claim is that they are only employed in the service of describing particulars. In contrast, SUMO could be described as an ontology of both particulars and universals. It has a hierarchy of properties as well as classes. This is a very important feature for practical knowledge engineering, as it allows common features like transitivity to be applied to a set of properties, with an axiom that is written once and inherited by those properties, rather than having to be rewritten for each property. Other differences include DOLCE’s use of a set of meta-properties as a guiding methodology, as opposed to SUMO’s use and formal definition of such meta-properties directly in the ontology itself. Currently, DOLCE is much smaller than SUMO, with 103 terms and a similar number of axioms, and it lacks such items as a hierarchy of process types, physical objects, organisms, units and measures, and event roles.
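The point about writing a transitivity axiom once and letting properties inherit it can be illustrated with a small sketch. It is not SUMO's actual machinery: the dictionary `relation_classes`, the toy facts, and the function `close_transitive` are hypothetical names chosen for illustration, and the choice of which relations are declared transitive here is an assumption of the example.

```python
from itertools import product

# Hypothetical declarations: which relation belongs to which class of relations.
relation_classes = {
    "subclass": {"TransitiveRelation"},
    "part": {"TransitiveRelation"},
    "agent": set(),  # not declared transitive
}

# Toy facts of the form (relation, first argument, second argument).
facts = {
    ("subclass", "Canine", "Mammal"),
    ("subclass", "Mammal", "Animal"),
    ("part", "Finger", "Hand"),
    ("part", "Hand", "Arm"),
    ("agent", "Bite1", "Dog1"),
}

def close_transitive(facts, relation_classes):
    """Apply one generic transitivity rule to every relation declared transitive."""
    closed = set(facts)
    changed = True
    while changed:
        changed = False
        for (r1, a, b), (r2, c, d) in product(list(closed), repeat=2):
            is_transitive = "TransitiveRelation" in relation_classes.get(r1, set())
            if r1 == r2 and b == c and is_transitive:
                new_fact = (r1, a, d)
                if new_fact not in closed:
                    closed.add(new_fact)
                    changed = True
    return closed

closed = close_transitive(facts, relation_classes)
```

The rule is stated once and applies to `subclass` and `part` alike; adding another transitive relation would only require one more declaration, not another rule.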
2. Mapping WordNet to SUMO
WordNet is described in more detail in Fellbaum (this vol.), but for the purposes of this effort, we can describe WordNet as a very large electronic dictionary, where synonymous word senses are grouped together and called “synsets”. When the mapping to WordNet was performed with WordNet version 1.6, there were approximately 100,000 synsets. There are four primary data files, covering nouns, verbs, adjectives, and adverbs (called NOUN.DAT etc.), in which each set of synonymous word senses is accompanied by a
brief definition and example usage. We went through each file, looking at each synset and attempting to find the formal SUMO term that most closely captures the meaning of the synset. These mappings were completed and released in 2002. Our initial concentration has been on simple mappings, where there is a direct correspondence between an English noun and a class membership statement in logic, or where a verb can be mapped directly to a SUMO event type. Note that in the examples below, words that are the object of discussion in this paper appear in an italic font. Words that are reified terms in the SUMO and logical formulas are given in monospaced bold font. For example, “The dog bites the man” shows both simple noun mappings and a simple verb mapping:
    (exists (?D ?M ?E)
      (and
        (instance ?E Biting)
        (instance ?D Canine)
        (instance ?M MalePerson)
        (agent ?E ?D)
        (patient ?E ?M)))
The verb bite maps directly to the statement that there is an event of type Biting. The noun dog maps directly to the statement that there is an instance of the class Canine that participates in the event, and so on. Note that we have adopted a basically Davidsonian approach to action semantics (Davidson 1967). One of the challenges of that approach is the risk of over-generalizing it to all verbs. Many verbs refer to states rather than actions, and only actions are appropriately modeled with the Davidsonian approach. Copula sentences, for example, are treated differently. The above example shows a simple mapping from a word to a logical concept. Many possible sentences, however, do not have this sort of simple mapping to concepts in an ontology. The next simplest case is where a word does not have a reified equivalent in the ontology, primarily due to the need for clarity and simplicity in the ontology, independent of the concerns of natural language understanding. Note that this is an important factor, since we intend that the ontology be appropriate for theorem proving tasks, and each of the terms in the ontology that is used to state the formal equivalent of English sentences must have an associated logical definition. Those terms and definitions must be stated in a manner that makes logical inference possible and efficient.
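The simple noun and verb mappings in the dog-bites-man example can be sketched in a few lines of code. This is only an illustration under stated assumptions: the two dictionaries stand in for the hand-built WordNet-to-SUMO index, the function name `davidsonian_form` and the variable names `?S` and `?O` are hypothetical, and no parsing is attempted (subject, verb and object are passed in directly).

```python
# Hypothetical miniature lexicon; only the entries used in the running example.
NOUN_TO_CLASS = {"dog": "Canine", "man": "MalePerson"}
VERB_TO_PROCESS = {"bites": "Biting"}

def davidsonian_form(subject, verb, obj):
    """Build a KIF-style formula for a simple subject-verb-object sentence.

    The event variable ?E is introduced explicitly, following the
    Davidsonian treatment of action verbs described in the text.
    """
    clauses = [
        f"(instance ?E {VERB_TO_PROCESS[verb]})",
        f"(instance ?S {NOUN_TO_CLASS[subject]})",
        f"(instance ?O {NOUN_TO_CLASS[obj]})",
        "(agent ?E ?S)",
        "(patient ?E ?O)",
    ]
    return "(exists (?S ?O ?E) (and " + " ".join(clauses) + "))"

formula = davidsonian_form("dog", "bites", "man")
print(formula)
```

A state verb or a copula sentence would need a different template, which is exactly the limitation of the Davidsonian pattern noted above.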
There are several other considerations that may result in an ontology not having a term that directly corresponds to a word in a lexicon. Logic has a great degree of flexibility for expressing the same information, much in the same way that human natural languages do. However, while people are adept at understanding the similarity between semantically equivalent but syntactically different statements, machines are not. Proving the equivalence of logical formulas is a process that is not guaranteed to terminate in the general case. For that reason, it is highly beneficial for a knowledge engineer to state semantically equivalent statements in syntactically identical ways. Another consideration is the reusability of knowledge-based content. In Pease et al. (2000), we discussed the need for having compositional expressions in order to facilitate the reuse of knowledge, as well as for efficiency concerns. Compositional expressions are those that state a concept by employing a set of more basic expressions, rather than encapsulating a single notion in a reified term. This factor may also encourage the use of expressions rather than single terms to express a particular concept. In “The man began walking to the market” there is no need for a BeginWalking concept. The information contributed by the word began is only the narrative temporal information that the following text is likely to refer to events after the beginning and before the end of the walk. Finally, many expressions in natural languages may have very minimal semantic content. In many contexts, the notion of starting or continuing to do something is equivalent to simply performing the action. There is no need to reify the concept of a starting action, although there is the need to express parts of processes or relate a process to its starting time or the time of another process.
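The earlier remark about stating semantically equivalent statements in syntactically identical ways can be made concrete with a small sketch. It addresses only one trivial source of variation, the order of conjuncts and disjuncts, and is in no way a general equivalence check (which, as noted, is not guaranteed to terminate). The tuple encoding of formulas and the function `canonical` are assumptions of this example.

```python
def canonical(formula):
    """Recursively sort the arguments of the commutative connectives and/or.

    Formulas are nested tuples whose first element is the operator or
    relation name; atoms are plain strings.
    """
    if not isinstance(formula, tuple):
        return formula
    head, *args = formula
    args = [canonical(arg) for arg in args]
    if head in ("and", "or"):
        # repr gives a total order over mixed strings and tuples.
        args = sorted(args, key=repr)
    return (head, *args)

# Two encodings of the same content that differ only in conjunct order.
f1 = ("and", ("instance", "?E", "Biting"), ("agent", "?E", "?D"))
f2 = ("and", ("agent", "?E", "?D"), ("instance", "?E", "Biting"))

same_before = f1 == f2
same_after = canonical(f1) == canonical(f2)
```

Once both formulas are normalized, a plain equality test suffices, which is the practical benefit of agreeing on one syntactic form up front.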
A typical case of the lack of direct correspondence between English words and terms in the ontology is where the ontology models roles that agents play as relationships between the agent and a type of role, rather than reifying a subclass of agent filling the role. For example, “The pilot lands the plane” results in
    (exists (?P ?PL ?E)
      (and
        (instance ?E Landing)
        (instance ?P Person)
        (attribute ?P Pilot)
        (instance ?PL Airplane)
        (agent ?E ?P)
        (patient ?E ?PL)))
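The treatment of pilot as an attribute of a Person, rather than as a subclass, can be sketched as a lexicon whose entries carry an optional attribute. The dictionary `NOUNS` and the helper `noun_clauses` are hypothetical and serve only to illustrate the pattern in the formula above.

```python
# Hypothetical lexicon: each noun maps to a class and, optionally, an attribute.
NOUNS = {
    "pilot": ("Person", "Pilot"),  # the role is an attribute, not a subclass
    "plane": ("Airplane", None),   # plain class membership
}

def noun_clauses(word, var):
    """Return the KIF-style clauses that a noun contributes for variable var."""
    cls, attr = NOUNS[word]
    clauses = [f"(instance {var} {cls})"]
    if attr is not None:
        clauses.append(f"(attribute {var} {attr})")
    return clauses

pilot = noun_clauses("pilot", "?P")
plane = noun_clauses("plane", "?PL")
```

With this shape, a single English noun can expand to more than one clause without any new class being added to the ontology.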
We are also dealing with more complex relationships where an entire phrase forms a pattern that has an equivalent template structure for a logical expression. While many nouns map to instances of reified classes, and many verbs map to instances of Process types, the situation for mapping adjectives is more often problematic. Some adjectives can map to instances of reified classes, for example dying → Death, and potential → Possibility. Many adjectives map to a particular class of subjective assessments in SUMO, which can be related to the noun being modified by the relation attribute. Some examples are absolute and pure. A more detailed and specific ontology could profitably extend SUMO’s existing set of Attributes. Some adjectives map to implied values in relations, with spongy, thirsty and repellent all expressing a certain value for the second argument of the SUMO capability relation. A spongy object is something that is relatively easy to compress. That is, it has the capability of participating as a patient in a Compression process. Some adjectives map to logical values for a sentence that may be implicit in the common sense information related to the sentence being interpreted. Faulty and unfaithful both relate to the falsehood of a particular proposition. Some adjectives indicate that a process has been applied at some point in the past to the noun being modified. Mounted, paneled and studded all imply the occurrence of a Putting event. One topic suggested for the workshop from which this volume partly results is another case of this issue: it reflects problems of literal word-for-word correspondence rather than problems with the correspondence between a language and an ontology, or between one language and another. The stated problem is that the German verbs stellen, setzen, legen, and stecken do not each simply map to a single English word, much as the single English word ‘pilot’ does not map to a single term in the ontology above.
English, however, is rich enough to communicate the notion of putting an object in an upright position without having a single word for it. In the same way, an ontology can be rich enough to express a concept like ‘pilot’ without having to reify that notion as a named term in the ontology. Natural languages and human-constructed ‘languages’ like ontologies both exhibit a certain kind of organic growth in response to the pressures of use, although on a different scale. One should neither expect nor require them to parallel each other, nor should the degree of coincidence be a mark of quality, much in the same way that one would not assess the value of a human language on the basis of its similarity to, for example, a Romance language.
3. SUMO and domain ontologies
SUMO consists of eleven modules with the dependency structure given in Figure 1.

[Figure: dependency diagram of the eleven modules: Structural Ontology, Base Ontology, Set/Class Theory, Graph, Numeric, Measure, Temporal, Mereotopology, Processes, Objects, Qualities]
Figure 1. Hierarchy of SUMO theories
The Structural Ontology consists of fundamental relations, with associated formal definitions, such as instance, subclass and disjoint, that are used to define most other concepts in SUMO. The Base Ontology consists of fundamental classes such as Abstract and Object from which all other classes inherit, as well as classes of relations, such as TransitiveRelation, and very frequently used relations which are instances of CaseRole(s). The Numeric ontology contains common numeric operators such as AdditionFn. The Set/Class Theory contains set operators such as UnionFn and IntersectionFn. The Graph ontology has basic graph theoretic notions such as DirectedGraph. The Measure ontology consists of classes of measures such as LengthMeasure and TemperatureMeasure as well as specific units such as Meter. The Temporal ontology contains notions such as duration, as well as the standard 13 Allen relations (Allen 1984) (which are: before, meets, overlaps, during, starts, finishes and their inverses plus equals). The Mereotopology, or theory of parts and places, contains relations such as partiallyOverlaps and height. The ontology of Processes is quite extensive, consisting of roughly 175 process types, some of which are shown in Figure 2. The ontology of Objects contains a biological taxonomy with associated formal definitions, along with concepts such as Nation. The Qualities ontology has concepts such as compass directions (e.g. North), SocialRole and NormativeAttribute(s).

DualObjectProcess: Substituting, Transaction, Comparing, Attaching, Detaching, Combining, Separating
InternalChange: BiologicalProcess, QuantityChange, Damaging, ChemicalProcess, SurfaceChange, Creation, StateChange, ShapeChange
IntentionalProcess: IntentionalPsychologicalProcess, RecreationOrExercise, OrganizationalProcess, Guiding, Keeping, Maintaining, Repairing, Poking, ContentDevelopment, Making, Searching, SocialInteraction, Maneuver
Motion: BodyMotion, DirectionChange, Transfer, Transportation, Radiating
Figure 2. Top two levels of the Process hierarchy

There are a number of ontologies that extend SUMO. Together, they number some 20,000 terms and 60,000 axioms, making the SUMO family the largest publicly available formal ontology. These additional ontologies include a mid-level ontology comprising some 5000 terms with associated formal definitions, for concepts that are still quite general, but are arguably too low-level for SUMO itself. Domain ontologies include theories of geography, terrorism, political systems, economic systems, as well as the GOLD linguistic ontology described in Farrar (this vol.).
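As a minimal sketch of the thirteen Allen relations listed for the Temporal ontology above, the following Python function classifies the relation that holds from one interval to another. The function name `allen_relation`, the representation of intervals as number pairs, and the relation labels (Allen's conventional names, not necessarily SUMO's own identifiers) are assumptions made for illustration; this is not SUMO's axiomatization.

```python
def allen_relation(a, b):
    """Return the Allen relation holding from interval a to interval b.

    Intervals are (start, end) pairs of numbers with start < end.
    The labels are illustrative and are not SUMO term names.
    """
    (a1, a2), (b1, b2) = a, b
    if not (a1 < a2 and b1 < b2):
        raise ValueError("intervals must satisfy start < end")
    # Disjoint or merely touching intervals.
    if a2 < b1:
        return "before"
    if a2 == b1:
        return "meets"
    if b2 < a1:
        return "after"
    if b2 == a1:
        return "met-by"
    # From here on the two intervals share more than a single point.
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Only proper partial overlap remains.
    return "overlaps" if a1 < b1 else "overlapped-by"


if __name__ == "__main__":
    monday = (0, 24)
    meeting = (9, 10)
    print(allen_relation(meeting, monday))
```

Each of the six basic relations appears together with its inverse, and equals is its own inverse, which gives the count of thirteen.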
4. Phrases
We have recently proposed (Pease and Fellbaum 2004) creating a new corpus of phrase elements with mappings to the template logical expressions that are entailed by each phrase. A good example of what is required in this area is the semantics of the English possessive. “John’s car” refers to a car that is owned by John and should be represented with the SUMO relation
possesses. “John’s nose”, however, refers to a physical part of John and is best represented by the SUMO relation part. “John’s company” is ambiguous: while it can mean a company that is owned by John, it more often refers to a company of which John is an employee, which would use the term member. In each case, the relation used depends not only on the grammatical construction but also on the class membership of the grammatical elements. A possessive relation between an Agent and a BodyPart entails part. A relation between an Agent and an Object (other than an Organization) entails possesses. Another case of semantic ambiguity involves prepositions. One gets “in a car” but “on a bus”. However, in both cases, the subject is inside the transportation device. The determining factor for the use of “on” is whether the vehicle is either not enclosed or is a group transportation vehicle. Only enclosed, personal transportation vehicles, such as a car, require the use of “in”. Any system that attempts a deep semantic translation of such phrases must recognize the class of vehicle in order to generate the correct semantics of the spatial relation between subject and object. A somewhat more straightforward issue with “on” is its metaphorical employment with regard to times. “John arrived on the patio” means that his destination was the patio. “John arrived on Monday” means that the time at which the action was completed was Monday. The first sentence relates the event to the location with the SUMO destination relation. The second sentence relates the event to its temporal overlap with Monday using the function EndFn. Human language is not regular in its employment of surface linguistic features: the same feature can carry significantly different semantic intent. Any software system that attempts to generate a deep semantic equivalent for natural language will need a large corpus of language-specific rules that map surface features to deep semantics.
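The class-dependent choice of relation for the English possessive can be sketched as a small rule table. This is only an illustration under stated assumptions: the hierarchy fragment (including the class names Human, Nose, Car and Corporation), the function names and the ordering of candidates are hypothetical and hand-written here. Only the relation names part, possesses and member and the classes Agent, BodyPart, Object and Organization come from the discussion above.

```python
# Hypothetical fragment of a class hierarchy (child -> parent); it is NOT
# extracted from SUMO and exists only to make the rules executable.
SUBCLASS_OF = {
    "Human": "Agent",
    "Nose": "BodyPart",
    "BodyPart": "Object",
    "Car": "Object",
    "Corporation": "Organization",
}


def is_a(cls, ancestor):
    """True if cls equals ancestor or inherits from it in SUBCLASS_OF."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def possessive_relations(possessor_cls, possessed_cls):
    """Candidate relations for "<possessor>'s <possessed>", likeliest first."""
    if not is_a(possessor_cls, "Agent"):
        return []  # the rules above are only stated for Agent possessors
    if is_a(possessed_cls, "BodyPart"):
        return ["part"]
    if is_a(possessed_cls, "Organization"):
        # Ambiguous: employment is the more frequent reading, ownership the other.
        return ["member", "possesses"]
    if is_a(possessed_cls, "Object"):
        return ["possesses"]
    return []


if __name__ == "__main__":
    for noun_cls in ("Car", "Nose", "Corporation"):
        print(noun_cls, possessive_relations("Human", noun_cls))
```

The BodyPart test is placed before the general Object test because, in this toy hierarchy, a body part is also an object; the more specific rule has to win.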
Additionally, such a system will need a large ontology which specifies the semantics of the terms used in the target representation.

5. Translation
We are also doing some early and arguably simplistic exploration in using SUMO for language translation. We can translate from many simple kinds of English sentences into logic. We are also able, using simple templates, to translate logic into somewhat awkward natural language sentences in English, Chinese, Czech, German, Hindi and Italian. Because the ontology serves as a “hub” for language translation we have a situation that is simpler than
if we had to create specific structures for translations between each particular pair of human languages. “Understanding” language is much harder than the problem of generating merely grammatical language (although the problem of generating natural and idiomatic language may be equally difficult). A colleague has implemented a browser (Sevcenko 2003) that employs the templates to express statements from SUMO in the languages for which we have translation templates.

6. Language understanding
We are currently creating applications that use a restricted English language grammar as the input form for generating knowledge expressed in logic (Pease and Murray 2003). Our expectation is that these applications will test our hypothesis that we can develop a grammar and translation rules that are powerful enough to allow humans to express most thoughts, while not encountering the overwhelming problems found in machine understanding of unrestricted human language. Our approach is similar in spirit to the tradeoff made by the Palm Pilot as opposed to the Apple Newton. The Apple Newton attempted to recognize unrestricted cursive English handwriting. Even ten years later, this problem is still too hard for a handheld computer. The Newton simply did not work enough of the time to be useful. The Palm Pilot shifted the problems that are hardest for the machine onto the human user. In exchange for a relatively small and easily learned change in behavior, a hard problem became easy, and the Palm is able to function at a high enough rate of handwriting recognition to be useful and commercially successful. As specific examples of how we approach this tradeoff, we require user assistance for word sense disambiguation in many cases. Metaphor is not allowed. Anaphoric reference is decided by a simple algorithm. Verbs must be present tense and nouns must be singular. The use of modifiers and dependent clauses is limited to simple cases. We have started with an extremely simple grammar, which we handle with deep understanding, and are building up the degree of sophistication little by little. This is in contrast to a typical approach to understanding, which attempts to cover full natural language at a shallow degree of understanding. While our approach is not suitable for existing documents, it is perhaps more suitable than other approaches for understanding commands or assertions made by a user through a software interface.
7. Inference
We are currently conducting experiments with formal, logical inference. While the primary goal of the effort is to develop a powerful framework for formal reasoning, a second goal is to support information retrieval and question answering with a natural language interface. This framework allows us to test our hypotheses about translation of language to logic and more specifically about efficient inference on automatically generated logical content. First-order inference presents many computational challenges. A common and general approach to addressing performance concerns is to trade space for time. That is to say, by using greater amounts of memory, it is often possible to reduce the time it takes for a computational process to return results. Specifically, we are pre-computing various inferences and storing them in our knowledge base. By storing certain kinds of results, most often those having to do with reasoning about class membership, we have seen speedups of several orders of magnitude. The degree of speedup can mean the difference between tractable inference times and no results at all. By performing formal reasoning on logic expressions generated automatically from natural language, we will be able to test our hypotheses about language semantics in a way that is objective and verifiable.

8. Conclusion
We intend that the products mentioned in this paper be used as widely as possible. Most have already been released under the GNU open source license. All products will be released to the research community as they mature. We encourage other researchers either to verify our results, or develop innovative approaches to applying them differently. We equally welcome experiments that refute our hypotheses and help point us in the right direction.

References

Allen, James 1984 Towards a general theory of action and time. Artificial Intelligence 23: 123–154.
Davidson, Donald 1967 The logical form of action sentences. In The Logic of Decision and Action, Nicholas Rescher (ed.), 81–95. Pittsburgh, PA: University of Pittsburgh Press.
Farrar, Scott this vol. Using ‘Ontolinguistics’ for language description.
Fellbaum, Christiane this vol. The ontological loneliness of verb phrase idioms.
Fellbaum, Christiane (ed.) 1998 WordNet: An Electronic Lexical Database. (Language, Speech, and Communication.) Cambridge, MA/London: MIT Press.
Lenat, Douglas B. 1995 CYC: A large-scale investment in knowledge infrastructure. Communications of the ACM 38 (11): 33–38.
Masolo, Claudio, Stefano Borgo, Aldo Gangemi, Nicola Guarino, Alessandro Oltramari, and Luc Schneider 2003 The WonderWeb Library of Foundational Ontologies Preliminary Report, WonderWeb Deliverable D17. Version 2.1 dated 29–05–2003. ISTC-CNR Technical Report.
Niles, Ian, and Adam Pease 2001 Toward a Standard Upper Ontology. In Formal Ontology in Information Systems. Proceedings of the 2nd International Conference (FOIS-2001), Christopher Welty and Barry Smith (eds.), 2–9. New York: ACM Press. See also http://www.ontologyportal.org.
Niles, Ian, and Adam Pease 2003 Linking lexicons and ontologies: Mapping WordNet to the Suggested Upper Merged Ontology. In Proceedings of the IEEE International Conference on Information and Knowledge Engineering, Nazli Goharian (ed.), 412–416. Las Vegas, NV: CSREA Press.
Pease, Adam 2000 Standard Upper Ontology Knowledge Interchange Format. Web document http://suo.ieee.org/suo-kif.html. This is a condensed version of the language described in Genesereth, Michael R. (1991). Knowledge interchange format. In Proceedings of the Second International Conference on the Principles of Knowledge Representation and Reasoning (KR–91), James Allen, Richard Fikes, and Erik Sandewall (eds.), 238–249. San Mateo: Morgan Kaufmann. See also http://logic.stanford.edu/kif/kif.html.
Pease, Adam, Vinay Chaudhri, Fritz Lehmann, and Adam Farquhar 2000 Practical knowledge representation and the DARPA High Performance Knowledge Bases project. In KR–2000: Proceedings of the Conference on Knowledge Representation and Reasoning, Breckenridge, CO, USA, 12–15 April 2000, Anthony G. Cohn, Fausto Giunchiglia, and Bart Selman (eds.), 717–724. San Mateo, CA: Morgan Kaufmann.
Pease, Adam, and Christiane Fellbaum 2004 Language to logic translation with PhraseBank. In Proceedings of the Second International WordNet Conference (GWC 2004), Petr Sojka, Karel Pala, Pavel Smrz, Christiane Fellbaum, and Piek Vossen (eds.), 187–192. Brno: Masaryk University.
Pease, Adam, and William Murray 2003 An English to logic translator for ontology–based knowledge representation languages. In Proceedings of the 2003 IEEE International Conference on Natural Language Processing and Knowledge Engineering, Beijing, China, 777–783.
Sevcenko, Michal 2003 Online Presentation of an upper ontology. In Proceedings of Znalosti 2003, Ostrava, Czech Republic, February 19–21, 2003. See also http://virtual.cvut.cz/kifb/en/.
Linguistic interaction and ontological mediation

John A. Bateman

In this chapter, we explore a novel configuration of some traditional issues – issues that relate centrally to the nature of linguistic ontologies, non-linguistic ontologies and the appropriate means of constructing relations between them. Relating linguistic information to non-linguistic information, traditionally termed world-knowledge in linguistics and AI, is a central problem for any body of research that terms itself “Ontolinguistics”. It is also a crucial task for any intended application of ontological engineering to natural language processing. It is natural, therefore, that several of the contributions to this volume have either discussed or used particular approaches to relating non-linguistic and linguistic information. In our discussion here we probe this further by focusing on the particular role of linguistic interaction. We argue that the kinds of phenomena inherent in interaction, and particularly those of meaning negotiation, support a fundamental division between linguistic and non-linguistic ‘ontological’ organisations that can be usefully employed for more effective treatments.

The chapter is organised as follows. First, we introduce the notion of ontological stratification and ontological perspectivism crucial to our account. This provides us with the theoretical space necessary for making sense of the various kinds of levels of description that we deal with subsequently. We then focus particularly on the basic ontological strata that we are assuming and some of the approaches that have been taken to their description. Our main concern here is to show some of the problems that typically arise in accounts which insufficiently distinguish linguistic and non-linguistic information.
We conclude that linguistic and non-linguistic information need to be maintained separately, both to open up sufficient leeway for the flexibility observed in meaning negotiation during interaction and to do full justice to the individual contributions made by each stratum.

Second, we present evidence, primarily from our ongoing work involving communication in the spatial domain (cf. Bateman and Farrar 2004a; Tenbrink 2005), that constructing a single static relationship between the two required strata is problematic in several ways and does not in any case offer a completely accurate picture of how the domains of information work together
during real language use.1 In contrast to this, we propose ‘parameterised’ mappings between two essentially distinct ontological domains. Much of this parameterisation is anchored to particular stages within unfolding linguistic interactions and is, therefore, inherently variable. This flexibility is typically downplayed in favour of a more static account of the relation between words and meanings. Such a static account allows linguistic properties to be predicted on the basis of ontological decisions, which is the essence of the Ontolinguistic approach. While we are also committed to exploring a tight relationship between ‘ontological’ and ‘linguistic’ concerns, we do so within a framework that allows considerable variability across strata. While removing the restriction of a fixed relationship allows us to explore a broader range of actual language use, it also makes it essential that we are able to determine boundaries that restrict variability, so that more reliable statements of ‘ontolinguistic’ import can still be grounded. We therefore conclude the chapter with a consideration of the extent to which flexibility can be restricted and generalised mappings can still be preserved. We do this by relating these issues to the broader task of relating ontologies in general. If we see the two strata of linguistic and non-linguistic information as two ontologies, then the task of finding mappings between them is similar to the general task of relating or aligning ontologies of any kind – which is itself a central goal for achieving interoperability between computational components by means of ontologies. We suggest that the incorporation of insights drawn from linguistic interaction in this process may contribute a useful extension of the techniques available.
1. The need for two ontological domains: Linguistic and non-linguistic
We introduce here the general issue of ontological stratification, and follow this by particular arguments for stratifying out linguistic and non-linguistic ontologies.
1.1. Ontological stratification: Layering reality
Earlier work on ontology generally adopted what we will describe here as a broadly ‘mono-perspectival’ approach. This was compatible with the basic goal of ontology as such, i.e., to lay bare the necessary structures of reality
and existence. With assumptions of realism, it is perhaps also natural to assume that reality cannot be a question of ‘perspective’: what is there does not depend on how we choose to look at it. There is, then, only room for one perspective – hence ‘mono-perspectival’. The apparent co-existence of very different kinds of entities, such as, for example, a table and the material out of which the table is made, or money and the physical coins which are exchanged when buying something, or thought and language, then requires that they still be brought together in one overarching perspective, or ontology. While physical reductionism clearly provides one mechanism for approaching this task, it raises several more problems of its own. These problems can, in general, be grouped under the heading of ‘causal responsibility’. If we pay 300 Euros for something using 100 Euro notes, is it the physical exchange of pieces of paper that is crucial or something else? Is chemistry just physics with larger atomic configurations? Is the linguistic expression ‘to the left of’ simply a verbalisation of an underlying conceptual configuration that is itself ‘wired’ somehow in neural networks? To deal with these kinds of issues, several current approaches to ontology propose that causal responsibility needs to be shared more democratically among the entities participating. Borgo, Guarino, and Masolo (1996), for example, argue that objects need to be considered from a material level of constitution, a morphological level concerning properties of that material’s distribution in space, and a level of the ‘objects themselves’, which is distinct from, but (for physical objects) supervenient on, their material substrate. This is strongly motivated ontologically by the principles of ‘rigidity’ and ‘identity’ (Guarino and Welty 2004).
We can change the materials involved and still have the ‘same’ table: the identity conditions for the two ontological entities involved are simply different and collapsing the two together leaves a variety of unattractive difficulties. Such ‘stratification’ is also proposed, in a completely different area, by Donnelly and Smith (2003), who argue for ‘layering’ ontologies into different areas of concern in order to separate out ontologically distinct aspects of reality. They give as an example a layering in the geographic domain concerning a lake containing mercury-contaminated fish where the fish, the mercury, the lake, and the spatial regions involved are all assigned to different layers. This allows the generalised ‘summing’ operation usually found in ontologies formalised in terms of mereology to be restricted so as only to operate within layers. It then becomes possible to say sensible things concerning the partial co-locations of mercury and fish, and the fish and the lake, without also being
able to derive ontologically considerably less convincing mereological sums such as the water and the mercury in the fish. These entities in no sense share parts and combining them is seen as the ontological equivalent of adding apples and oranges. Multiperspectivalism in ontology is currently taken to its logical extremes in the approach of Smith and colleagues, where it is promoted as a fundamental organising design decision for ontology as such (Smith and Grenon 2004).2 We will return to some of the consequences of this position for our own work in our conclusions below, but for now we refocus on the particular ontological strata involved in treatments of language. Our position is that, just as objects and their material substrates, or fishes and mercury, need to have their ontological contributions considered separately, this should also be the case with linguistic information – particularly that at the linguistic stratum of semantics – and non-linguistic information. As we shall see, this is also to reassert the argument developed in Bateman (1995) for a stratification, or layering, into physical reality, conceptual/perceptive reality, and a semiotic reality constituted by meanings constructed within a social context through language and linguistic interaction. These relationships are summarised in graphical form in Figure 1. We have two ontological strata involved: one that is essentially semiotic and one that is conceptual/perceptual. The model for the semiotic stratum (that shown on the left of the figure) is essentially that taken from the Hallidayan view of social semiotics (Halliday 1978; Halliday and Matthiessen 1999). This semiotic organisation is itself extremely complex and is receiving considerable attention in its own right. On the righthand side of the diagram is the organisation motivated by conceptual and cognitive considerations. 
The model contrasts with that often seen in ‘cognitively’ motivated approaches by virtue of its claim that the move from linguistic form to linguistic semantics to social interpretation does not involve a move from linguistic to conceptual. All of these moves remain within a semiotically defined ontological stratum. This is then where we can see the most direct realization of the Ontolinguistic assumption. The relations across levels of semiotic abstraction are taken to be highly regular, constraining and non-arbitrary. All of these semiotically constructed constructs must, however, find some anchoring in the conceptual/perceptual stratum in order to exist. This is similar to tables requiring the material out of which they are constituted. But there is no requirement – indeed, it would be quite unlikely – that the relationship is simple. It is quite possible, for example, that the conceptual/perceptual ‘image’ of the semiotic constructs does not maintain the modularities observed on the semiotic side. It is for this reason, we suggest, that claims appear of the kind that there is a continuum between linguistic and conceptual constructs, or that there is no essential difference between them. Within the conceptual/perceptual ontological stratum, there may well be no difference: all of the constructs will be ‘psychological’, ‘conceptual’, etc. to an equal degree; but this is by no means the end of the story. We shall see that for most of the phenomena at issue for us here, it is precisely those organisational configurations on the left-hand side that are most active.
Figure 1. Graphical representation of the relationships between the two ontological strata primarily relevant for language
1.2. Separating out ontologies according to layers
As mentioned above, we assume that in order to define sound ontologies, it is necessary to apply principles of ontology design such as identity and rigidity. This can be related usefully to the kinds of evidence that are employed for motivating and constructing portions of an ontology. Our claim is that the two ontological strata proposed, the semiotic and the conceptual, draw on distinct forms of evidence. For the former, we turn mostly to linguistics; for the latter, to psycholinguistics and cognitive psychology. Problematic, as we shall now see, are accounts where this division is not sufficiently respected. This results in ontology specifications whose contents draw on different ontological domains or which make claims for one domain on the basis of evidence collected in another. The consequences are similar to those mentioned above
concerning physical objects and the materials from which they are made: i.e., a confusion of entities with divergent identity conditions leads to confused representations. We refer to ontologies where this has occurred as mixed ontologies. Such ontologies are very frequent, particularly in research involving computational approaches to language processing. They are characterised by a combination of linguistic evidence and so-called ‘conceptual’ configurations. To begin, we will show why we believe such ontologies to be problematic. Following this discussion, we will propose that one solution to the problems identified is to adopt very strict methodologies which finely control the kinds of evidence that are admitted for motivating the ontological distinctions to be drawn in any particular ontology. Here we illustrate our discussion with approaches that have attempted this, including the verb alternation classes of Levin (1993) and our own Generalized Upper Model (Bateman, Henschel, and Rinaldi 1995).
1.2.1. Some problems of mixed ontologies
We find mixed ontologies to be problematic primarily because there are few grounds to believe a priori that organisations derived on linguistic grounds and organisations derived on conceptual, commonsense or domain-based grounds will converge to a single coherent description. In particular, there is no guarantee that domain-motivated categories will choose lexically-motivated categories that belong to a consistent more general linguistic ontological type. Examples of this abound. To take one example more or less at random, consider the category precipitation from the Suggested Upper Merged Ontology (SUMO: Pease this vol.). The description of this category is ‘liquid in any state falling from the air’. Two chains of subsumption links relate this category to its supercategories: one leads through WeatherProcess and InternalChange; the other leads through LiquidMotion and Motion. Both chains come together as a kind of process. Thus precipitation may correspond in English to finer linguistic selections such as ‘it is snowing’, ‘it is raining’, etc. – a very particular class of lexicogrammatical constructions. The kinds of entities picked out along the two supercategory chains are, however, extremely diverse. The former chain covers general weather systems and events where there is some change of internal state (e.g., hurricanes and storms, high pressure areas). The latter covers ‘the flowing of water’,
such as ‘the river runs to the sea’ and perhaps also ‘high tide’. There is very little here that we can say about these collections of constructs linguistically: that is, there is no particular reason for hypothesizing a particular linguistic form – apart, perhaps, from the incorrect (for English) hypothesis that how to express ‘it is raining’ might be ‘water is falling from the sky’ or that when there is an area of high pressure one might say ‘it is high pressuring’. Although these items may belong to similar domain classes, it is clear that they are treated linguistically in quite different ways: the organisation proposed in SUMO does not lead to usable predictions of linguistic form. This is not in itself a criticism, since the goal of this particular ontology was not to cover linguistic form but to model the world. Problems only arise when the two tasks, modelling the world and predicting linguistic forms, are mixed. In fact, representing these two, often conflicting, perspectives adequately in a single ontology would require that categories be consistently classified along the two dimensions simultaneously, which complicates the formal properties of the resulting ontology considerably. It also does not do justice to the notion that it is possible to draw useful semantic generalisations on the basis of regularities in grammatical expression, as argued, for example, by Levin (1993). This threatens to reduce the problem of mapping conceptual-semantic configurations to linguistic forms to lists of more or less arbitrary templates. Many of the generalizations that hold across grammatical constructions, and which provide the semantic motivations for selecting certain forms rather than others, are not then available. To characterise the problems that arise more finely, we can draw on the succinct discussion of Lang (1991) concerning the Lilog ontology described in detail by Klose and von Luck (1991).
Lang subjects this ontology to extensive criticism with respect to its confusion of linguistic and non-linguistic information, explaining unclear modelling decisions in terms of several classes of ‘mixing’. Although, for reasons of space, our account here will focus predominantly on the Lilog ontology, similar statements can be made of its intellectual successors, e.g., the ontologies of the VerbMobil and SmartKom projects (Quantz et al. 1994; Gurevych, Porzel, and Malaka 2006), as well as of the lexicographic (semantic?, conceptual?) approaches of WordNet (Miller 1990) and FrameNet (Baker, Fillmore, and Lowe 1998). In all these cases, the ontologies involved have been explicitly designed according to criteria which do not recognise, or which even reject, the substantial differences existing between domain and linguistic ontologies. This position naturally aligns with the idea that lexicogrammatical and semantic configurations transparently reflect cognitive or physical configurations, which is what we argue below to be incorrect.

The Lilog ontology was designed to support knowledge-rich dialogue behaviour in the domain of tourist information. It was part of a large project funded by IBM involving research groups all over Germany. The ontology, despite building on some accounts of linguistic meaning that do clearly separate linguistic and non-linguistic, or conceptual, representations, came during its development to merge the two strata. This is shown in the ontology’s free combination within a single formal language of conceptual forms and semantic forms derivable from the lexicon and from grammatical analysis. The semantic language is then a subset of the conceptual language (cf. Trautwein (this vol.) for further discussion of this point and other examples of accounts where this approach is taken). This free combination leads to an ontology of sorts where the motivations for, and hence the consequences of, particular sorts being present or not are unclear. Lang in fact finds no fewer than four differentially motivated kinds of sorts accepted alongside one another in the type lattice of the ontology without any formal distinctions being made:
– ‘Conceptually based sorts’ which are included on extra-linguistic (conceptual) grounds.
– ‘Text base specific sorts’ which are concepts corresponding to special vocabulary items required by the particular domain and text with which LILOG as a project was concerned.
– ‘Sorts projected from the grammar’ which are notions found in the grammar, such as preposition, transferred to the ontology.
– ‘Sorts of mixed origin’ which are concepts where both extra-linguistic and linguistic criteria are involved.
The first case causes additions to the ontology to be made on the basis of what one ‘knows’ to be the case: which means that classes appear that may or may not behave linguistically homogeneously with other subclasses. The second case reifies any lexical item in a domain so that it appears as a conceptual element also. The third case is a particular example of the very frequent situation where: . . . the categorization of lexical items into nouns, verbs, etc., provides an apparently natural grid for establishing corresponding sorts of entities in the ontology. . . (Lang 1991: 464)
Assuming that single grammatical categories have a single home in a corresponding ontology is often dangerous, and for prepositions particularly so – precisely because they function as an integral part of the lexicogrammatical ‘glue’ for building a wide range of constructions. It is unlikely that there is some ‘corresponding abstract meaning’ that might correspond to ‘preposition’ ontologically.
Lang suggests that this mixing of motivations organises itself loosely according to the ‘vertical’ and ‘horizontal’ dimensions in the hierarchy. This means that a rough vertical skeleton is sketched drawing on the availability of linguistic items and the lexical hyponymy relations between them, and this is then made more ‘bushy’ (horizontally) by adding sorts regardless of their linguistic properties but which are felt to be ‘conceptually’ close. The result is that one can never be quite sure whether the sorts found are going to have predictable linguistic consequences or not, similarly to the precipitation case above. Moreover, since extra-linguistic or conceptual criteria are less than well understood, there is a degree of arbitrariness in the categorizations that do appear: theories of roles, purpose, function and other domains would be necessary for ontologically sound modelling but these are still active research areas.
The co-existence of distinct kinds of categories within a single sort lattice also means that the precise meaning of ‘subsumption’ with respect to particular cases is underspecified – different kinds of categories have different relations between their various sub-classes. Until this is clarified, it is unclear what kind of subsumption actually holds. Similarly, a supposedly general ‘part-whole’ relation will sometimes need to be interpreted as ‘is a component of’, sometimes as ‘is spatially included in’, sometimes as ‘pertains to’, sometimes as ‘inalienable possession’, sometimes as ‘alienable possession’, etc. The ontology is, in short, confused.
1.2.2. The Penman Upper Model
Whereas one approach to handling the complexity of a resulting mixed ontology is to increase in turn the complexity of the mechanisms and axioms defined with respect to the ontology, an alternative is to attempt to avoid the combinations of motivations in the first place. This was attempted from the outset in a linguistically motivated ontology called the Penman Upper Model, an organisation of information begun in 1985 as cooperative work involving Halliday, Mann, and Matthiessen as an attempt to construct an ‘ontology’ for supporting natural language generation.3 Here, one of the essential problems to be solved for re-usability is to provide a generalised means of relating
domain information to the concrete abilities of an automatic generator. That is, it is not realistic to assume that some application system necessarily organises its information in a form amenable to linguistic expression and so mediated access to that information must be provided. The Upper Model has subsequently been used in order to investigate many questions concerning how best to interface between linguistic components and the knowledge of particular domains.
As an example of its use, we can examine the following simple semantic specification that could be used as an input expression for the automatic generation of one of the sentences discussed in Pease (this vol.): “The pilot lands the plane.”4 This is given here in the standard ‘Sentence Plan Language’ (SPL: Kasper 1989) form that we use for interacting with our natural language generator and which has since been used in several automatic natural language generation systems:
    (ev0 / land
         :actor (x0 / pilot)
         :actee (x1 / plane))
This form of specification is a relatively flat logical form based around events in the manner proposed by Davidson (1967) and extended with named roles (actor, actee) to give a so-called Neo-Davidsonian representation. It may be glossed approximately as follows: there is an entity ev0 of type land that has two filled roles, actor and actee. The actor is an entity x0 of type pilot and the actee is an entity x1 of type plane. What enables such an expression to be ‘understood’ directly by the generation system is that the types used, which are categories of the domain application – ‘pilot’, ‘land’, ‘plane’ – are classified previously according to the sorts of the Upper Model.
Thus, when the following subsumption relationships are defined, where the left-hand categories are drawn from the domain and the right-hand categories from the Upper Model:
    land  ⊂ doing-and-happening (an event)
    pilot ⊂ person (an object)
    plane ⊂ decomposable-object (an object)
then an appropriate range of clauses (and related constructions, such as nominalizations and so on) will be produced. This is independent of any more detailed modelling of the flight-application domain that may (or may not) be supported.
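The classification step just described can be illustrated with a minimal sketch. It is not the Penman implementation: the `Term` class, the helper names, and the two top-level labels `event` and `object` (taken from the glosses in parentheses above) are assumptions made purely for illustration. The sketch represents the SPL expression as nested terms and looks up, for each variable, the chain of Upper Model sorts that subsume its domain type.

```python
from dataclasses import dataclass, field

# Upper Model sorts named in the text, each mapped to an assumed parent sort.
# The top-level labels "event" and "object" come from the glosses in the text.
UPPER_MODEL = {
    "doing-and-happening": "event",
    "person": "object",
    "decomposable-object": "object",
    "event": None,
    "object": None,
}

# Domain-to-Upper-Model subsumption links as given in the text.
DOMAIN_LINKS = {
    "land": "doing-and-happening",
    "pilot": "person",
    "plane": "decomposable-object",
}


@dataclass
class Term:
    """One SPL term of the form (variable / type :role filler ...)."""
    var: str
    type: str
    roles: dict = field(default_factory=dict)


def upper_sorts(domain_type):
    """Return the chain of Upper Model sorts subsuming a domain type."""
    chain = []
    sort = DOMAIN_LINKS[domain_type]
    while sort is not None:
        chain.append(sort)
        sort = UPPER_MODEL[sort]
    return chain


def classify(term):
    """Map every variable in an SPL term to its most general Upper Model sort."""
    result = {term.var: upper_sorts(term.type)[-1]}
    for filler in term.roles.values():
        result.update(classify(filler))
    return result


# The SPL expression for "The pilot lands the plane."
spl = Term("ev0", "land", {
    "actor": Term("x0", "pilot"),
    "actee": Term("x1", "plane"),
})
```

Calling `classify(spl)` identifies `ev0` as an event and `x0` and `x1` as objects, which is all the information the sketch assumes a generator would need before choosing among clause types.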
For this to work, the categories included in the ‘Upper Model’ are selected according to a very different set of assumptions to those that became embodied within the Lilog and similar ontologies. It provides an ontological structure for classifying any particular instances of facts, states of affairs, situations, etc. that occur in an application or domain model. This classification is carried out in terms of a set of general objects and relations of specified types that behave systematically with respect to their possible linguistic realizations. The Upper Model therefore decomposes the mapping problem by establishing a level of linguistically motivated knowledge organisation specifically constructed for the task of constraining linguistic realizations. This separates out the kinds of possible motivations for inclusion in an ontology seen in the case of the Lilog ontology and commits to just one of them: lexicogrammatical reflexes. Moreover, particular weight is given to the grammatical end of the lexicon-grammar continuum. Broadly following Whorf’s notions of grammatical cryptotypes and reactances (Whorf 1956), general semantic significance is attributed to patterns of reactances within the grammar rather than to particular lexical distinctions, which are considered to be much more subject to idiosyncratic historical development. According to this methodology, it is considered far more significant that, for example, English has a regular grammatical alternation relating pairs such as:
(1) The dog walked. vs. He walked the dog.
    The plane landed. vs. She landed the plane.
    The cup broke. vs. He broke the cup.
than some lexically-based observation of the form that English may allegedly have fewer (or more) words for snow than some other languages, etc. Nevertheless, this does mean that an upper model is necessarily language-particular: different languages, or rather the grammars of different languages, might well motivate different ‘upper model’ organisations. In our work to date, we find in fact that the differences are not so great as one might expect but the reasons why this is the case are beyond the scope of the current chapter (for more discussion on this issue and some cross-linguistic examples, see Halliday and Matthiessen 1999).
A good illustration of the distinctions actually drawn in the Upper Model is provided by the subtypes of events. There are several ways in which an ontology might focus on distinct types of events. The primary way highlighted by the Upper Model builds on a complex array of grammatical phenomena that establishes four broad categories conforming to quite different grammatical patterns: these categories are labelled doing-and-happening processes, mental processes, saying processes and being-and-having processes.5 Now, while these categories are similar (but also different) in some respects to categories one finds in non-linguistic ontologies, what must always be emphasised with Upper Model distinctions is that they are motivated in the first place solely on grammatical grounds. That is, we recognise these four categories because when they are expressed through, for example, the lexicogrammar of English, distinct grammatical reactances can be observed for them. These and other reactances are described at length in discussions of grammar such as Halliday and Matthiessen (1999, 2004), and Frawley (1992). We can see, for example, that doing-and-happening processes behave differently with respect to their unmarked present tense options:
(2) I am building a house. vs. ?? I am believing that I will go.
    I believe that I will go. vs. ?? I build a house.
Mental and saying processes give rise to the only clause types that take that-clauses (‘I think/say that I will go’). This type of relation, called projection, is a semantic motif that is also shown by virtue of reactances in certain classes of nominal phrases (e.g., ‘the fact that. . . ’, etc.) and adjectival phrases (e.g., ‘certain that. . . ’). Finally, being-and-having processes give rise to clause types that, again, do not like present progressive tenses, but which also can distinguish themselves further as either disallowing passivization:
(3) * Red is been by the box.
or allowing ‘reversals’ of the form:
(4) Mary is the leader. vs. The leader is Mary.
The more such grammatical indications found for a category – i.e., the larger the set of reactances that can be collected – then the more confident one can be that a linguistically-motivated cryptotype has been found.6 Clearly, there are also useful similarities to be drawn here with other discussions of distinguishable grammatical patterns such as those of Levin (1993), although in the Hallidayan approach one is looking for grammatical indications anywhere within the grammar and not just in particular areas, such as in diathesis alternations.
The point of adopting this strong methodological constraint on the motivation of categories for the upper model is that it is very clear what the status of the resulting ontology is. It is a linguistically motivated ontology, called here a linguistic ontology. There is no assumption that this corresponds to any other layer of organisation. This avoids the potentially conflicting arguments for placement that can be found in the Lilog ontology as described above by Lang. We see the Upper Model organisation as a view of the world as revealed to us through language; this is then entirely analogous to a view of the world revealed to us through perception. Both may yield ontological insights: but whether or not these insights converge, and to what extent, we consider still to be an open question.
1.3. Some consequences
We can now return to the main focus of the present chapter: the relationship between a level of organisation such as that of the Upper Model and particular bodies of domain or conceptual knowledge. We have seen that (lexical) semantico-conceptual ontologies such as Lilog or SUMO take on the domain or world modelling task directly, whereas the Upper Model only takes on the task of characterising states of affairs and objects in terms of their potential for linguistic expression. The initial use of the Upper Model, and it is this that gave it its name, was then as the top fragment of a single hierarchically organised knowledge base. The Upper Model provided the more general categories at the top of the hierarchy; the domain model was fitted in below this, relating individual concepts of the domain by subsumption to concepts of the Upper Model. This supported the straightforward re-use of a general natural language generation component for a variety of domains, as well as the simplification of communication between an application and the generator as we saw above with the simple SPL example. While simple in conception, and still often used in natural language processing systems, this use of the Upper Model is problematic in that it requires that domains organised in its terms follow distinctions that are linguistically well behaved. This is then the reverse situation of that for SUMO described above: whereas SUMO enforces conceptual relatedness, the Upper Model enforces linguistic relatedness. Each approach may serve to distort the material being represented when carried to extremes.
We can see this straightforwardly by taking a very practical and concrete task that has been addressed many times. Assuming that we wish to use one of our natural language generation systems to generate natural sentences in a language, here English, and we wish to do this by providing semantic specifications in terms of a formalization such as the SPL introduced above, just how do we go about determining the necessary form of these SPL specifications? We have found this methodology very beneficial for probing the limits of proposed formalizations as it forces us to confront areas where our theoretical understanding is less detailed than might have been hoped. It also insists that we find appropriate places in an overall framework to locate the problems that arise. This has led to changes in the details of our account, which we will characterise here as we step through our discussion.
If we are attempting to generate texts concerning traffic and transport systems, for example, then the problem is immediately thrown into sharp relief by Hobbs’ observation concerning an everyday spatial object relevant for this domain, a road:
When we are planning a trip, we view it as a line. When we are driving on it, we have to worry about our placement to the right or left, so we think of it as a surface. When we hit a pothole, it becomes a volume for us. (Hobbs 1995: 820)
This gives rise to varying linguistic expressions such as ‘on the road’, ‘in the road’, ‘at the road’, ‘along Route 66’, etc. If we are generating these expressions using an SPL expression informed by the first versions of the Upper Model (Bateman et al. 1990), then our input specifications look as follows:
Table 1. Input specifications for spatial expressions
Desired expression    SPL
at the road           :spatial-locating (road/zero-d-location)
on the road           :spatial-locating (road/one-or-two-d-location)
in the road           :spatial-locating (road/three-d-location)
That is, each of the usages requires that the entity that is used in order to locate an object or event, the road, receives different assignments to concepts within the Upper Model. In one case as a one-or-two-d-location; in another as a three-d-location, etc. This is because, in that version of the Upper Model, the generation of prepositional phrases involving the prepositions ‘on’, ‘in’,
‘at’, etc. aligned with a set of cryptotypes constructing dimensionality. Linguistically the usage of these prepositions was committing to entities that behaved as if they had the specified dimensionalities.7 If we use the subsumption method illustrated above for linking linguistic and non-linguistic ontologies, where we have a domain model concept such as road and need to make this inherit from a particular Upper Model category, then we have a problem. Differing phrasings require different assignments. But it is quite unlikely that we want to entertain the possibility that the ontological classification of a ‘road’ can change simply because we choose a different form of expression. It is, from an ontological engineering perspective, quite unpleasant to have a basic category such as dimensionality behaving in such a non-rigid fashion. Subsumption is then clearly not an appropriate ontological solution: enforcing it leads to inconsistent linguistic behaviour and inappropriate ‘conceptual’ models. The currently standard escape route here is to talk not of roads, but of our ‘conceptualization’ of roads; this is also inherent in Hobbs’ phrasing of the phenomenon and appears in the guise of ‘roles’ in ontologies such as that of SmartKom. The subsumption of the concept road to one of the dimensionalities of the Upper Model then depends on a selected conceptualization rather than inherent properties of the concept in question. But for our purposes of achieving generation, this ‘solution’ does not yet help. All it does is say that we say something different because we have thought of it in a different way. This is quite a subtle point; it is still too often regarded as a solution rather than simply pushing the problem from one place to another. 
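The difficulty with subsumption can be made concrete with a small sketch. It assumes only the three pairings of Table 1; the function names are hypothetical and the lookup stands in for what would, in a real generator, be a grammatical decision. The sketch collects the location sorts that the single domain concept road would have to inherit from in order to yield all three phrasings.

```python
# Preposition choice keyed on the Upper Model location sort (pairings from Table 1).
PREPOSITION_FOR = {
    "zero-d-location": "at",
    "one-or-two-d-location": "on",
    "three-d-location": "in",
}


def locating_phrase(noun, location_sort):
    """Build the prepositional phrase that a given location sort gives rise to."""
    return f"{PREPOSITION_FOR[location_sort]} the {noun}"


def sorts_required(noun, phrases):
    """Collect the location sorts a noun would need in order to yield all phrases."""
    return {
        sort for sort in PREPOSITION_FOR
        if locating_phrase(noun, sort) in phrases
    }


needed = sorts_required("road", {"at the road", "on the road", "in the road"})

# A single fixed subsumption link can supply only one of these sorts,
# so more than one required sort signals the conflict discussed in the text.
conflict = len(needed) > 1
```

Here `needed` contains all three location sorts, so no single parent category for road can account for the attested phrasings.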
We suggest that a better formulation is to allow both a linguistic and a non-linguistic construction of dimensionality and to say that the linguistic and non-linguistic align naturally, but not necessarily.8 This leads to a more ontological characterisation of what is involved in ‘conceptualization’. Our starting point is provided by Hobbs, who convincingly relates the selection of dimensionality to purposes. That is, the conceptualization is not random but relates directly to an adopted purpose. Linguistically, the basic selection of entity types (via lexemes such as “road”) appears to follow classifications of physical objects, whereas the linguistic reactances that are activated (e.g., dimensionality) instead follow the configurations of purposes in which those physical objects are playing a role. This is again consonant with the view that grammatical patterns express a more generic kind of meaning that abstracts across specific situations and entities (Halliday and Matthiessen 1999). Thus, when constructing our SPL specifications for generation, we cannot look at the physical objects involved and straightforwardly select for dimensionality; we must instead look at a representation of the purpose-configuration involved and use the dimensionalities of the entities that play a role in that configuration. Only the lexical selection remains, at least with this example, a label for the physical object involved; but its meaning in context is not just that physical object but instead, in the ‘1-dimensional line’ case:
(5) ‘a link in a navigation system that is in this case being realised by a road’
Such a link is quite properly and rigidly 1-dimensional. This kind of semantic relationship has recently been studied (Kuhn and Raubal 2003) as a generalisation of the geographic notion of ‘projection’. We need here to be able to model the facts that certain objects may be used for certain purposes within certain social contexts. For example, that for some community of agents, some physical object (such as a road) is used for the purposes of a link within some navigation or path system: e.g.,
(6) purpose (Agent: X, PhysObj: Road, NavigationSystem: Link)
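One way to picture the configuration in (6) is as a record that links two separate entities without merging them. The following is a minimal sketch under stated assumptions: the class and field names are hypothetical, and treating a physical object as 3-dimensional by default is an illustrative simplification rather than a claim of the account. The lexeme is read off the physical object, while the dimensionality relevant for grammatical choices is read off the entity filling the purpose role.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalObject:
    """A physical endurant; assumed here to be 3-dimensional."""
    name: str
    dimensions: int = 3


@dataclass(frozen=True)
class Link:
    """A link in a navigation system or graph; rigidly 1-dimensional."""
    name: str
    dimensions: int = 1


@dataclass(frozen=True)
class Purpose:
    """purpose(Agent, PhysObj, NavigationSystem), following example (6)."""
    agent: str
    phys_obj: PhysicalObject
    nav_role: Link


def lexeme_for(purpose):
    """Lexical selection follows the physical object."""
    return purpose.phys_obj.name


def dimensionality_for(purpose):
    """Grammatical dimensionality follows the entity filling the purpose role."""
    return purpose.nav_role.dimensions


config = Purpose(agent="X", phys_obj=PhysicalObject("road"), nav_role=Link("link"))
```

With this `config`, `lexeme_for` yields the label of the road while `dimensionality_for` yields the dimensionality of the link, and neither entity's own properties change.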
Road has rigidly all the ontological properties of a physical object. Link has rigidly all the ontological properties of a link in a navigation system or graph. There are no ontologically curious combination objects, such as road-as-link, but instead an ontologically more sophisticated configuration of formally linked entities.
A graphical impression of the relationships necessary for a handful of selected categories is set out in Figure 2. The upper part of the diagram shows in simplified form three ontological ‘regions’ from the abstract, non-linguistically bound stratum of the socio-semiotic: that of social agents, that of physical endurants (PED: roughly physical objects as defined in the DOLCE ontology of Masolo et al. (2003)), and that of a particular kind of abstract entities, those of transportation networks. The lower part of the diagram shows categories taken from the Upper Model and which are, therefore, directly aligned with linguistic form and grammatical reactances. The socio-semiotic regions are bound together via individual site-of relations, extending loosely a suggestion of Smith and Varzi (1999) concerning some special kinds of spatial entity.9 Essentially, a physical endurant, i.e., something existing in the real physical world, can serve as a site for activities of various kinds. The examples in the diagram concern the much discussed example of a ‘school’ being either a physical building or an institution and the various ways of discussing ‘roads’.
Figure 2. Illustrative selection of some concrete relationships between the two ontological strata primarily relevant for language
We then can see that the lexeme ‘school’ is indeed ambiguous with respect to which particular ontological category is being picked out: those categories have, however, completely different identity and rigidity properties and are thus ontologically distinct. The move between using the lexeme ‘school’ to pick out an institution or the physical endurant that is its site is, of course, a completely general and frequently occurring reference strategy in languages: it has rather little ontological import. When we examine the grammatical clauses in which the lexeme may appear, such as, e.g.:
(7) The school has decided to accept you. vs. The school is being demolished next week.
we can readily ascertain which Upper Model category is being invoked (i.e., mental-process vs. material-process) and, accordingly, which attribution is appropriate for the nominal “the school” (i.e., conscious-being vs. nonconscious-being).10 Again, it is the grammatical reactances that work to position the category in the linguistic ontology. The inter-stratum ontological
links in Figure 2 then let us trace back to appropriate socio-semiotic categories, such as physical endurant or social agent. This demonstrates two central properties of the organisation considered. First, lexical selections serve to position us within possible ‘vertically’ related chains within the socio-semiotic ontological stratum – e.g., are we dealing with the class of all buildings or with just those buildings that are serving as schools, etc.? – while grammatical selections serve to position us ‘horizontally’ – e.g., are we dealing with a social agent or the physical endurant that serves as its site? And second, the sources for evidence for the organisations shown in the socio-semiotic stratum can be quite different. This restates the point made more abstractly above with respect to Figure 1. Whereas our embodied physical experiences and perceptual/cognitive systems may be expected to have a substantial influence on the precise characterisation of the physical endurant region of the ontology, this will not be the case for more abstract regions, such as graphs – unless these are related metaphorically to physical endurants. Despite these differing sources, the socio-semiotic stratum still combines different realms of experience in a coherent whole.11
The main consequence that then needs to be drawn is that two ontological domains motivated by different concerns are recognised:
– Conceptual or world ontology: providing intended models of the commonsense world, motivated by perception and action.
– Linguistic ontology: providing a construction of reality as revealed through language, i.e., language as world-construing, and motivated by linguistic regularities.
The organisational motivations of the two kinds of ontology are quite different and a mapping between them may not be straightforward. In the next section, we show this concretely, drawing on the example of treatments of ‘space’ and language about space in more detail.
2. Relating linguistic and non-linguistic spatial ontologies: The role of interaction
There are many non-linguistic ontological treatments of space (cf., for an overview: Bateman and Farrar 2004b). Despite many differences, a fundamental role is commonly accorded to mereotopology (Casati and Varzi 1999) and to connection (Cohn and Hazarika 2001). When we try to bring these kinds of ontological organisations into relation with natural language, problems of a similar nature to those shown above for mixed ontologies arise. Essentially, it is of questionable value insisting that the linguistically and non-linguistically motivated organisations necessarily align. In this section, we sketch this source of problems in the interpretation of spatial terms and show why the incorporation of linguistic interaction is necessary both for bringing the linguistic and non-linguistic together and for revealing more clearly the ontological commitments that uses of language reveal.
2.1. Geometric and functional views of space
The different levels of phenomena at issue for us here can be related directly to the central role that, as we argued in the previous section, purpose plays in assigning correspondences between the linguistic and non-linguistic ontological strata. This centrality persists for even the most basic of spatial relationships: that is, not only are lexemes such as ‘school’ or ‘road’ to be considered functionally variable but also simple spatial terms, such as ‘in’, ‘on’, ‘in front of’, etc. The need for this can be seen in standard discussions of spatial expressions, such as, for example, the well-known treatment offered by Herskovits (1986). When considering problematic cases such as:
(8) “the fly is in the glass” vs. “the fly is in amber”
    “the bulb is in the socket” vs. “the bulb is in the box”
it is clear that there is something more than purely ‘spatial’ information involved. The question is just how that ‘extra’ is to be worked into the account. Herskovits’ proposal is a kind of ‘coercion’. Drawing on non-linguistic mereotopology and the notion of containment within the convex closure of an entity, she allows various ‘pragmatic deviations’ from a geometrically defined ‘ideal meaning’. Unfortunately, this means that much of normal language usage has to be declared as ‘exceptions’ or a bending of the rules within unclear tolerances. In the terms being developed in this chapter, what we have is a case of mixing linguistic and non-linguistic characterisations. An alternative to this can be pursued from the ontological side as follows. The fine-grained and sophisticated accounts of mereotopology and geometry developed by non-linguistic ontologies of space allow complex spatial configurations to be described ontologically but do not, as yet, take us far enough to explain how ‘in’ comes to be used in each of the examples above. As Casati and Varzi (1999) note:
It is apparent that these cases reveal the limits of the approach insofar as it is purely geometric: a full account calls for a step into other territories where pragmatics, or functional and causal factors at large, must be taken into account. Our point is that explicit reference to holes can mark an improvement as far as the geometric part of the story goes. True, only some holes count for the purpose of representing containment. But which holes do count is not a question for the geometric analysis of the problem. (Casati and Varzi 1999: 141)
The ‘holes which count’ can, however, be explored in approaches where far more weight is given to the functional properties that are involved. This alternative tradition of explanation, begun in work such as that of Vandeloise (1985), is now coming to occupy a central place in the exploration of space – see, for example, the discussion in Brala (this vol.), where considerably more detail is given. Researchers differ on just how complete an explanation in terms of functional categories can be, but there is a growing conviction that the functional categories are central for explaining the phenomena of spatial language. Detailed studies of the respective contributions of geometric and functional components for a variety of prepositions are discussed, for example, by Coventry and Garrod (2004). Under this account, the uses of ‘in’ above would be characterised as instances not primarily of geometric ‘spatial inclusion’ but of functional control. The bulb is “in” the socket because its behaviour is then totally controlled by what happens to the socket. One is “in” an institution because the institution defines what is acceptable or unacceptable in some respect and determines behaviour. This kind of flexibility in use and interpretation is extremely widespread, perhaps even foundational for language as such. Consider, for example, the following uses of the spatial relationship ‘in front of’: (9)
    a. in front of the TV
    b. in front of the fridge
If we wish to identify the precise spatial region that is being referred to here (as we do in our more recent work on human-robot interaction: Ross et al. (2005)), then a purely spatial account of the relationship involved is again insufficient. Being ‘in front of the TV’ might include sitting on a sofa some distance away from the TV if you are watching it but would be a few centimeters in front and on the floor if you were looking for something that had been dropped there. Therefore, someone ‘sitting in front of the TV’ and someone
‘sitting in front of the fridge’ might translate to very different spatial areas. This is also not just a matter of spatial refinement of more or less vaguely defined areas, as the following examples show: (10)
    a. Plant the flowers in front of the house.
    b. Park your car in front of the house.
    c. They built a tower-block right in front of our house!
The spatial regions referred to in each of these cases can quite easily be distinct and non-overlapping. Is the front of the house that is being referred to the front of the building, and the flowers are then planted in the flowerbed running along the front wall, or the front of the property, and the car is then parked in the road that runs along that edge of the property, or the general area that can be seen when one looks out of a window at the front of the house: so that anything big enough that is built in this direction (presumably at least on the other side of the road) obscures the view? The abstract meaning of ‘in front of’ is therefore generic rather than specific and functional rather than geometric.
In order to account for such meanings, we are currently reworking our Upper Model linguistic ontology so as to re-interpret spatial relationships in favour of cryptotypes of functional control, functional attachment, functional positioning, functional orientation (‘front’, ‘back’, etc.) and so on. This seems to us to be the only way of dealing with the very flexible interpretations of spatial terms that occur. But, when doing this, we are also thrown back to a consideration of just how ‘function’ is to be anchored ontologically. If the occurrence of lexemes such as ‘school’, ‘road’, etc. can no longer be taken as indicative of particular ontological classes, and simple prepositions such as ‘in’, ‘on’, etc. reveal themselves to be similarly slippery, just what ontological import can language expressions be taken to have at all? To place some bounds on the variability, we need to take our final step and to include linguistic interaction within the account.
2.2. Purpose and interaction
We have argued that it is the task of a linguistic ontology to describe precisely what particular selections of lexicogrammatical material are committing to when they are used. These commitments, and the organisation that they presuppose, may differ substantially from organisations motivated by accounts drawn on the basis of cognition/perception.
A much more active role can be allocated to language by following the line of argument concerning ‘vagueness’ proposed by Smith and Brogaard (2003). Taking as an illustrative example the statement:
(11) This glass is empty.
they explore the vagueness of the category ‘empty’ in terms of how it can be made ontologically precise. Depending on the particular kind of use at issue, the statement may be true or it may be false. A use that defines ‘empty’ in terms relevant for a thirsty drinker may well ignore – indeed, following Smith and Brogaard, may not even perceive – small drops of water attaching to the sides. But a use that defines ‘empty’ in terms relevant for a government health inspector will certainly not ignore the, possibly contaminated, drops of water, and so will have a different attribution of truth value. Smith and Brogaard’s first main advance is therefore to add a notion of context to the resolution of vagueness and to bind that notion of context into their ontological account. Portions of reality are accordingly demarcated ontologically in terms of ‘granular partitions’ (Bittner and Smith 2001). Partitions place a selected degree of resolution on the ontology of the world and, crucially, that degree of resolution is purpose-bound rather than merely involving some specification of ‘size’. They form an essential part of multiperspectival ontology. For a partition to do its work, it needs to have cells large enough to contain the objects that are of interest in the portion of reality which concerns the judging subject, but at the same time these cells must somehow serve to factor out the details which are of no concern. A partition . . . is accordingly a device for focusing upon what is salient and for masking what is not salient. (Smith and Brogaard 2003: 77)
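As a rough illustration of how a purpose-bound partition can change the truth value of (11), the following Python sketch treats a partition as a filter over the contents of the glass. The class names, the two boolean features and the two example purposes are assumptions made for this illustration only; they are not part of Smith and Brogaard's formal apparatus.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Item:
    """Something inside the glass (hypothetical representation)."""
    name: str
    drinkable: bool             # enough liquid to count as a drink
    possible_contaminant: bool  # relevant to a hygiene check


@dataclass(frozen=True)
class Partition:
    """A purpose-bound partition: `salient` decides what is made visible."""
    purpose: str
    salient: Callable[[Item], bool]

    def visible(self, contents: List[Item]) -> List[Item]:
        # Keep only the items that matter for this purpose; mask the rest.
        return [item for item in contents if self.salient(item)]


def is_empty(contents: List[Item], partition: Partition) -> bool:
    """'This glass is empty' is judged relative to a partition."""
    return not partition.visible(contents)


# The glass holds nothing but a few drops clinging to its side.
glass = [Item("drops on the side of the glass",
              drinkable=False, possible_contaminant=True)]

drinker = Partition("quenching thirst", lambda item: item.drinkable)
inspector = Partition("health inspection",
                      lambda item: item.drinkable or item.possible_contaminant)

print(is_empty(glass, drinker))    # True: the drops are masked
print(is_empty(glass, inspector))  # False: the drops are salient
```

The same state of the world yields two different verdicts because the salience test, not the glass, differs between the two partitions.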
Ontologies are then, according to Smith, Brogaard and Bittner, seen as necessarily parameterised according to granular partitions that define, or rather, make visible or accessible, those aspects of reality at issue for current purposes. This access to reality is then crucially related to the kinds of discourses into which interacting subjects enter:
A context, for our purposes, is a portion of reality associated within a given conversation or perceptual report and embracing also the beliefs and interests and background knowledge of the participants, their mental set, patterns of language use, ambient standards of precision, and so forth. Above all it is a matter of fact of what is paid attention to by participant speakers and hearers on given occasions. (Smith and Brogaard 2003: 53)
This link between portions of reality and interaction is central for our account. Although Smith and Brogaard’s primary concern is that of ontology and the foundations of theories of truth and they do not, as a consequence, specify in any detail just how such a link is to be constructed, we in fact find ample evidence pointing in a similar direction from both psycholinguistic and social semiotic research. Psycholinguistically, there is now a considerable body of empirical results concerning the process of alignment, whereby participants in an interaction commonly develop locally stable description styles. In research on spatial communication, for example, particular ways of naming spatial configurations as well as particular schemes for locating particular positions are implicitly negotiated during interaction (e.g., Garrod and Sanford 1988; Tversky, Lee, and Mainwaring 1999). Such negotiated linguistic expressions are not, in general, re-usable across different participants or even across different interactions – their meaning is dynamically fixed within individual interactions. The representations employed by interlocutors appear to be drawn together during interaction at all levels. We can see this quite literally as reflecting the fact that particular granular partitions are being constructed, activated or de-activated as Smith and Brogaard propose. Research on spatial communication has shown this process of ‘alignment’ to be essential for all levels of linguistic description. Similarly, from the social semiotic perspective, the view of context of Smith and Brogaard also makes immediate contact with the linguistic theory of register as developed within Hallidayan social semiotics (Halliday 1978). Context within this view has precisely the function of demarcating what is salient and what not. 
The notion of register goes further, however, in attempting in addition to map out the precise relationships between linguistic forms and particular social contextual purposes of language use. The assumption is that linguistic forms co-vary with social purpose in precisely specifiable ways. In Bateman (1986), we related this behaviour to an extension of the register notion: individual interactions are said to construct microregisters – i.e., particular social activities that are being carried out by participants at a given point in time. This then brings the kind of relationship seen above between social situations in general and the language used in those situations (the traditional notion of register) in line with the fine-grained relationship between ongoing social processes and their accompanying language. Microregisters draw on the generic background provided by the registers of a language, but allow further negotiation of fine-grained linguistic usage within particular histories of interaction. We therefore link an unfolding account of social activities and relationships (the register) with the growth and destruction of granular partitions, and hence with the primary ontological mechanism for making accessible aspects, properties and objects of the world. Thus, when we are concerned with the field of navigation, then we may also talk about a road as getting from A to B in our trip; similarly, when we are concerned with the actual driving along a road that is in less than good repair, then we may talk of a road as being full of potholes and bumps. The corresponding granular partitions pick out different aspects of the world, and these are in addition quite different entities as we saw in Figure 2 above; we are then quite simply talking about different things.
3. Conclusion: From language to ontology via interaction
In the discussion so far, we have drawn attention to some of the arguments in favour of stratifying ontological description into a linguistically-motivated area of concern and a contextual/conceptual area of concern. The linguistically-motivated area aligns with what has traditionally been called (linguistic) semantics, the contextual/conceptual area with ‘ontology proper’. We have suggested that, despite much evidence against mixing the two, it is still common to find suggestions for ontologies where the stratification is not respected; we have picked out examples of how this, on the one hand, weakens a resulting ontological organisation, compromising its use in both linguistic description/processing and ontological generalisation and, on the other hand, makes useful modularisations of the information maintained more difficult. One of the questions an appropriate stratification then places on the research agenda is that of addressing more fully the nature of the relation between these two ontological areas, since, having separated them, we still need to show how they combine in order to support complete descriptions of linguistic and other behaviour. Here we suggest that this relation is essentially one of negotiated mediation. The extent to which it is possible to pre-calculate a permanent alignment between ontologies from the two domains is limited. A consideration of negotiated mappings related both to social purpose and to interactional development appears necessary.
The model is thus one in which linguistic patterns, particularly lexicogrammatical reactances, motivate a linguistic ontology whose primary function is to define or activate granular partitions. In a sense, linguistic semantic selections serve as instructions for the construction of partitions. The process of alignment during interaction then brings together the partitions employed by interlocutors. The kinds of functional differentiation of spatial terms that we saw above are then seen as demarcating partitions of space according to purpose rather than reflecting conceptualisation directly. This model of alignment may also be related more broadly to the general problem of how to relate differing ontologies. We suggest that just as human interactants deploy negotiation in order to settle on a functioning medium of communication – captured in a microregister – so may negotiation be necessary for aligning distinct ontologies in general. Such mediation is very powerful, but needs to be carried out with respect to specific tasks and communicative purposes. In order to address this basic problem of ontology-based interoperability, we suggest that much is to be learnt from how meanings are exchanged in natural human dialogue. For the present, however, we see a research agenda where the constructivist role of linguistic semantics needs to receive considerably more attention than it has received hitherto.
Notes
1. Space precludes a more detailed presentation of the broad terms of this debate concerning distinctions or lack thereof between linguistic and nonlinguistic information; for a much more extensive literature review, however, see Brala (this vol.).
2. For a more detailed historical overview of the use of stratification within ontology, see Poli (2001).
3. The earlier development of the Upper Model ontology, from its inception as the Upper structure of the JANUS project of ISI and BBN, up to its inclusion as a standard component of several NLP systems is discussed in Bateman (1992). The most recent development, called the Generalised Upper Model, is documented in Bateman, Henschel, and Rinaldi (1995) and is currently being revised and extended within the Web Ontology Language, OWL.
4. We omit from the discussion here the additional components of the specification that control textually significant options such as those that guide determiner choice, voice, information focus, etc., as well as the temporal semantics that determine tense selections and the speech function semantics that determine clause type. These do not affect the characterization in terms of the Upper Model.
5. These categories are taken from the latest version of the Generalised Upper Model and so their names differ somewhat from those found in earlier Upper Model papers.
6. Note also that the grammatical patterns are extremely powerful: if a speaker uses the reactances for a particular cryptotype, then that is the interpretation that will win. This can therefore force recodings of the, in many respects far weaker, lexical readings of items. If a speaker says “I am being tall”, this is not ‘ungrammatical’; rather, it suggests an alignment with the doing-and-happening reactance. It means that effort is being put into the maintenance of the property – i.e., it is more like an action. This therefore explains several ‘creative’ uses of language in a straightforward fashion and is why we have generally favoured marking examples with “?” rather than an outright “*” in the examples.
7. This turns out to be quite a reliable construal in English; other languages, even quite close languages such as German, require differing treatments.
8. For example, I have to say in English ‘on the bus’ – linguistically a surface – but this does not seem sufficient grounds for maintaining that buses are necessarily conceptualised as surfaces.
9. This would need to be set out in far more technical detail to show the actual use made of niches and sites here and how they differ from the primarily biological and ecological examples discussed by Smith and colleagues. The loose connection suggested here will suffice for present purposes, however.
10. Naturally there are also cases where the precise contextualisation must remain underspecified on the basis of the grammatical information alone.
11. Traditionally and philosophically, the socio-semiotic stratum used here is to be related to the construct of the lifeworld (cf. Bateman 1995).
Acknowledgements
Much of the work reported in this paper was undertaken as part of the Collaborative Research Center for Spatial Cognition (Sonderforschungsbereich/Transregio SFB/TR8) of the Universities of Bremen and Freiburg. The SFB/TR8 is funded by the Deutsche Forschungsgemeinschaft (DFG), whose support we gratefully acknowledge. Particular thanks go to the members of the OntoSpace and SharC projects.
References
Baker, Collin F., Charles J. Fillmore, and John B. Lowe
1998 The Berkeley FrameNet Project. In Proceedings of the ACL/COLING-98, Montreal, Quebec, Christian Boitet and Pete Whitelock (eds.), 86–90. San Francisco: Morgan Kaufmann.
Bateman, John A.
1986 Utterances in context: Towards a systemic theory of the intersubjective achievement of discourse. Ph.D. diss., School of Epistemics, University of Edinburgh. Available as Edinburgh University, Centre for Cognitive Science In-House Publication EUCCS/PhD-7.
1992 The theoretical status of ontologies in natural language processing. In Text Representation and Domain Modelling – Ideas from Linguistics and AI (Papers from the KIT-FAST Workshop, Technical University Berlin, October 9th–11th 1991), Susanne Preuß and Birte Schmitz (eds.), 50–99. (KIT-Report 97, Technische Universität Berlin, Berlin, Germany). http://xxx.lanl.gov/cmp-lg/9704010.
1995 On the relationship between ontology construction and natural language: a socio-semiotic view. International Journal of Human-Computer Studies 43 (5/6): 929–944.
Bateman, John A., and Scott Farrar
2004a Towards a generic foundation for spatial ontology. In Formal Ontology in Information Systems: Proceedings of the Third International Conference (FOIS-2004), Achille C. Varzi and Laure Vieu (eds.), 237–248. Amsterdam: IOS Press.
2004b Spatial ontology baseline. SFB/TR8 Internal Report I1-[OntoSpace]: D2, Collaborative Research Center for Spatial Cognition, University of Bremen, Germany. http://www.ontospace.uni-bremen.de.
Bateman, John A., Renate Henschel, and Fabio Rinaldi
1995 Generalized upper model 2.0: documentation. Technical Report, GMD/Institut für Integrierte Publikations- und Informationssysteme, Darmstadt, Germany. http://purl.org/net/gum2.
Bateman, John A., Robert T. Kasper, Johanna D. Moore, and Richard A. Whitney
1990 A general organization of knowledge for natural language processing: the PENMAN upper model. Technical Report, USC/Information Sciences Institute, Marina del Rey, California.
Bittner, Thomas, and Barry Smith
2001 A taxonomy of granular partitions: Ontological distinctions in the geographic domain. In Proceedings of the Conference on Spatial Information Theory – COSIT 2001, Daniel Montello (ed.), 28–43. (Lecture Notes in Computer Science 2205.) Berlin: Springer. http://people.ifomis.uni-leipzig.de/thomas.bittner/tp.pdf.
Borgo, Stefano, Nicola Guarino, and Claudio Masolo
1996 Stratified ontologies: The case of physical objects. In Proceedings of the Workshop on Ontological Engineering at ECAI’96, Budapest, Hungary, Paul E. van der Vet (ed.), 5–15. Budapest: ECAI. http://citeseer.ist.psu.edu/borgo96stratified.html.
Brala, Marija M.
this vol. Spatial ‘on’ – ‘in’ categories and their prepositional codings across languages: Universal constraints on language specificity.
Casati, Roberto, and Achille C. Varzi
1999 Parts and Places: The Structures of Spatial Representation. Cambridge, MA/London: MIT Press (Bradford Books).
Cohn, Anthony G., and Shyamanta M. Hazarika
2001 Qualitative spatial representation and reasoning: An overview. Fundamenta Informaticae 43: 2–32.
Coventry, Kenny R., and Simon C. Garrod
2004 Saying, Seeing and Acting. The Psychological Semantics of Spatial Prepositions. (Essays in Cognitive Psychology Series.) Hove/New York: Psychology Press.
Davidson, Donald
1967 The logical form of action sentences. In The Logic of Decision and Action, Nicholas Rescher (ed.), 81–95. Pittsburgh, PA: University of Pittsburgh Press.
Donnelly, Maureen, and Barry Smith
2003 Layers: A new approach to locating objects in space. In Spatial Information Theory: Foundations of Geographic Information Science, Werner Kuhn, Michael F. Worboys, and Sabine Timpf (eds.), 50–65. (Lecture Notes in Computer Science 2825.) Berlin: Springer. http://ontology.buffalo.edu/geo/Layers.pdf.
Frawley, William
1992 Linguistic Semantics. Hillsdale, New Jersey: Lawrence Erlbaum.
Garrod, Simon C., and Anthony J. Sanford
1988 Discourse models as interfaces between language and the spatial world. Journal of Semantics 6: 147–160.
Guarino, Nicola, and Christopher Welty
2004 An overview of OntoClean. In Handbook on Ontologies, Steffen Staab and Rudi Studer (eds.), 151–172. Berlin: Springer.
Gurevych, Iryna, Robert Porzel, and Rainer Malaka
2006 The SmartKom ontology. In SmartKom – Foundations of Multimodal Dialogue Systems, Wolfgang Wahlster (ed.). (Cognitive Technologies.) Berlin: Springer.
Halliday, Michael A. K.
1978 Language as Social Semiotic. London: Edward Arnold.
Halliday, Michael A. K., and Christian M. I. M. Matthiessen
1999 Construing Experience Through Meaning: A Language-Based Approach to Cognition. London: Cassell.
2004 An Introduction to Functional Grammar. 3d ed. London: Edward Arnold.
Herskovits, Annette
1986 Language and Spatial Cognition: An Interdisciplinary Study of the Prepositions in English. (Studies in Natural Language Processing.) London: Cambridge University Press.
Herzog, Otthein, and Claus-Rainer Rollinger (eds.)
1991 Text Understanding in LILOG: Integrating Computational Linguistics and Artificial Intelligence. Final Report on the IBM Germany LILOG-Project. (Lecture Notes in Artificial Intelligence 546.) Berlin: Springer.
Hobbs, Jerry R.
1995 Sketch of an ontology underlying the way we talk about the world. International Journal of Human-Computer Studies 43 (5/6): 819–830.
Kasper, Robert T.
1989 A flexible interface for linking applications to PENMAN’s sentence generator. In Proceedings of the DARPA Workshop on Speech and Natural Language, Lynette Hirschman (ed.), 153–158. San Mateo, CA: Morgan Kaufmann.
Klose, Gudrun, and Kai von Luck
1991 The background knowledge of the LILOG system. In Herzog and Rollinger (eds.), 455–463.
Kuhn, Werner, and Martin Raubal
2003 Implementing semantic reference systems. In AGILE 2003 – 6th AGILE Conference on Geographic Information Science, Lyon, France, Michael F. Gould, Robert Laurini, and Stéphane Coulondre (eds.), 63–72. Lausanne: Presses Polytechniques et Universitaires Romandes.
Lang, Ewald
1991 The LILOG ontology from a linguistic point of view. In Herzog and Rollinger (eds.), 464–481.
Levin, Beth
1993 English Verb Classes and Alternations: A Preliminary Investigation. Chicago/London: University of Chicago Press.
Masolo, Claudio, Stefano Borgo, Aldo Gangemi, Nicola Guarino, and Alessandro Oltramari
2003 Ontologies library (final). WonderWeb Deliverable D18, ISTC-CNR, Padova, Italy.
Miller, George A.
1990 WordNet: an online lexical database. International Journal of Lexicography 3 (4): 235–312. ftp://ftp.cogsci.princeton.edu/pub/wordnet/5papers.pdf.
Pease, Adam
this vol. Formal representation of concepts: The Suggested Upper Merged Ontology and its use in linguistics.
Poli, Roberto
2001 The basic problem of the theory of levels of reality. Axiomathes 12: 261–283.
Quantz, Joachim, Manfred Gehrke, Uwe Küssner, and Birte Schmitz
1994 The VERBMOBIL domain model. Version 1.0. Verbmobil Report 29, University of the Saarland, Saarbrücken, Germany.
Ross, Robert J., Hui Shi, Tilman Vierhuff, Bernd Krieg-Brückner, and John A. Bateman
2005 Towards dialogue based shared control of navigating robots. In Spatial Cognition IV: Reasoning, Action, Interaction. International Conference Spatial Cognition 2004, Frauenchiemsee, Germany, October 2004, Proceedings, Christian Freksa, Markus Knauff, Bernd Krieg-Brückner, Bernhard Nebel, and Thomas Barkowsky (eds.), 478–499. Berlin: Springer.
Smith, Barry, and Berit Brogaard
2003 A unified theory of truth and reference. Logique et Analyse 43 (169–170): 49–93. http://ontology.buffalo.edu/smith/articles/truthandreference.htm.
Smith, Barry, and Pierre Grenon
2004 The cornucopia of formal-ontological relations. Dialectica 58 (3): 279–296. http://wings.buffalo.edu/philosophy/ontology/smith/articles/cornucopia.pdf.
Smith, Barry, and Achille C. Varzi
1999 The Niche. Noûs 33 (2): 214–238.
Tenbrink, Thora
2005 Semantics and application of spatial dimensional terms in English and German. Technical Report, SFB/TR8 Spatial Cognition, University of Bremen, Germany.
Trautwein, Martin
this vol. On the ontological, conceptual, and grammatical foundations of verb classes.
Tversky, Barbara, Paul Lee, and Scott Mainwaring
1999 Why do speakers mix perspectives? Spatial Cognition and Computation 1: 399–412.
Vandeloise, Claude
1985 Description of Space in French. (Series A, No. 150.) L.A.U.D.T. – University of Duisburg: Linguistic Agency.
Whorf, Benjamin
1956 Whorf: Language, Thought, and Reality: Selected Writings, John Carroll (ed.). Cambridge, MA: MIT Press.
Semantic primes and conceptual ontology
Cliff Goddard
1. Introduction
The Natural Semantic Metalanguage (NSM) approach to language analysis, originated by Anna Wierzbicka (1972, 1996; Goddard and Wierzbicka 1994, 2002), claims to have identified some 65 universal semantic primes. As explained below, they can be grouped in various ways, using syntactic and/or “thematic” criteria. The present study concentrates on a set of primes which may be termed “substantive”, and which form the foundation of the nominal lexicon.1 After an introduction in Section 1., Section 2. gives an account of the NSM substantive primes. Section 3. addresses the question of how major divisions within the nominal vocabulary are constructed either exclusively from semantic primes, or from primes in combination with semantic molecules. Concluding remarks form Section 4. The foundational assumption of the NSM program is the assumption of the meta-semantic adequacy of natural languages, i.e. that it is possible to exhaustively and reductively paraphrase the words and grammatical constructions of any language using a subset of the word-meanings and grammatical constructions of the language itself. The methodological attractions of this approach can be itemised as follows. (i) Any system of semantic representation has to be interpreted in terms of some previously known system and since the only such system shared by all language users is natural language itself, it makes sense to keep the system of semantic representation as close as possible to natural language.2 (ii) Clear and accessible semantic representations enhance the predictiveness and testability of hypotheses. Most other systems of semantic analysis are hampered by the obscurity and artificiality of the terms of description. (iii) To the extent that the system is intended to represent the cognitive reality of ordinary language users, it would seem problematical to employ symbols whose meanings are completely opaque to language users themselves. 
The NSM assumption is that the basic conceptual elements relevant to natural language semantics exist as the simplest word-meanings in all natural languages, i.e. as universal semantic primes. The current NSM model is the result of a 30-year program inaugurated by Wierzbicka (1972). It started with lexical semantics but has since extended
into grammatical and illocutionary semantics, and into cultural pragmatics (with the theory of cultural scripts, cf. Goddard and Wierzbicka 2004; Goddard 2006). Over the years there have been a number of significant revisions and adjustments to the model in response to descriptive-analytical work (cf. Goddard 2002b). The current inventory of primes numbers in the mid-sixties, each of which has a well-specified set of syntactic properties (combinatorics, valency, complementation),3 so that the system as a whole can be seen as a kind of “mini-language”. The NSM metalanguage is not a fully natural system, however, precisely because it works within such strict and well-defined confines. It is better characterised as a formal system based on natural language and understandable on the basis of natural language (cf. Allan 2001: 8–9). A significant body of cross-linguistic evidence indicates that the primes and their associated grammar are universally manifested in all human languages. The current inventory is listed in a conventional arrangement in Table 1, using English exponents. All these terms identify simple and intuitively intelligible meanings which are grounded in ordinary linguistic experience. The NSM system aims to be exhaustive, i.e. capable of adequately paraphrasing the entirety of the vocabulary of any language.4 The claim is that semantic primes are essential for explicating the meanings of other words and grammatical constructions, and that they cannot themselves be explicated in a noncircular fashion. In relation to Table 1, however, it must be admitted that the groupings shown, though useful for some purposes, are based on several different, and sometimes competing, criteria. 
For example, the elements I and YOU appear in the “substantives” grouping, because in most contexts they are cross-substitutable with substantive SOMEONE; but they could equally well be grouped with HERE and NOW, on the grounds that all four items are deictic, and do not accept specification with determiners or quantifiers. Conversely, WHEN/TIME and WHERE/PLACE do not appear in the “substantives” grouping, but in the thematic groupings of “time” and “space”, respectively. On syntactic grounds, however, they could equally be regarded as substantives, because they can accept specification with determiners and quantifiers. Likewise, the thematic grouping of “speech” primes contains an element, namely WORDS, with similar substantive properties. Table 1 has some other questionable features as well. For example, the five items in the “logical concepts” grouping are heterogeneous from a syntactic point of view: MAYBE, CAN, and NOT can perhaps be termed “operators” (albeit with different scope properties), but BECAUSE forms adjunct phrases, and IF introduces a dependent clause. This is not the place to canvass revisions and improvements to the current thematic table as a whole (cf. Goddard in press a). I will, however, return to the issue of a better delimitation of the “substantives” grouping in Section 2.

Table 1. Thematic table of semantic primes – English exponents (updated from Goddard and Wierzbicka 2002)
Substantives: I, YOU, SOMEONE, SOMETHING/THING, PEOPLE, BODY
Relational substantives: KIND, PART
Specifiers: THIS, THE SAME, OTHER/ELSE, ONE, TWO, SOME, ALL, MUCH/MANY
Attributes: GOOD, BAD, BIG, SMALL
Mental predicates: THINK, KNOW, WANT, FEEL, SEE, HEAR
Speech: SAY, WORDS, TRUE
Actions, events, movement, contact: DO, HAPPEN, MOVE, TOUCH
Location, existence, possession, specification: BE (SOMEWHERE), THERE IS/EXIST, HAVE, BE (SOMEONE/SOMETHING)
Life and death: LIVE, DIE
Time: WHEN/TIME, NOW, BEFORE, AFTER, A LONG TIME, A SHORT TIME, FOR SOME TIME, MOMENT
Space: WHERE/PLACE, BE (SOMEWHERE), HERE, ABOVE, BELOW, FAR, NEAR, SIDE, INSIDE
Logical concepts: NOT, MAYBE, CAN, BECAUSE, IF
Augmentor, intensifier: VERY, MORE
Similarity: LIKE/AS
Importantly, semantic primes are not postulated to exist as whole lexemes, but as identifiable meanings of lexical units.5 Language-specific polysemy can therefore obscure the identification of exponents of particular primes in particular languages. Nevertheless, there has been a great deal of gritty empirical research into the realisations of semantic primes across a wide range of languages (Goddard and Wierzbicka 1994, 2002; Goddard in press b), and
it appears that language-specific evidence is always available to support an identification which depends on a polysemy analysis. The lexical exponents of primes can be formally complex. The English terms something and someone, for example, each consist of two morphological elements, while a long time and for some time are phrasemes. The NSM claim is that these expressions nevertheless represent unitary meanings. Something, for example, does not mean the same as some thing; and a long time cannot be composed from long plus a time. In many languages, of course, exponents of SOMETHING and A LONG TIME are morphologically simple. Exponents of semantic primes can also have multiple realisations (allolexes) in a single language (Wierzbicka 1996: 26–27). The “double-barrelled” items in Table 1, such as SOMETHING/THING, OTHER/ELSE, and WHEN/TIME, indicate meanings which, in English, are expressed by means of different allolexes in different grammatical contexts. The English forms something and thing, for example, can be regarded as expressing the same meaning, but with different combinatorial properties: something cannot combine with a specifier, while thing requires one. Compare: (a) Something happened, (b) The same thing happened again, (c) I don’t know when this thing happened. The forms something and thing are therefore regarded as English-specific allolexic realisations of a single prime, which is designated SOMETHING. The criterial property for items to be regarded as allolexes is that there is no paraphrasable meaning difference between them. On the same grounds, else is an English-specific allolex for OTHER, and don’t is an English-specific allolex of NOT. Generally, one of the allolexes, either the most common or the form which is found in “bare” (minimal) syntactic contexts, can be regarded as the primary allolex, and is adopted as the name for the meaning in question, e.g. SOMETHING for something/thing, OTHER for other/else, NOT for not/don’t.
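The something/thing pattern can be made concrete with a small Python sketch. It assumes a hypothetical lookup table with one entry per prime and a two-way contrast between a "bare" and a "specified" context; only the SOMETHING entry is taken from the discussion above, and the function names are invented for illustration rather than drawn from any NSM implementation.

```python
from typing import Optional

# One prime, two English allolexes. The table layout is an assumption;
# the forms themselves come from examples (a)-(c) in the text.
ALLOLEXES = {
    "SOMETHING": {"bare": "something", "specified": "thing"},
}


def realise(prime: str, specifier: Optional[str] = None) -> str:
    """Return the English surface form of a prime in its context.

    Without a specifier the primary allolex is used; with a specifier
    the allolex that requires one is selected instead.
    """
    forms = ALLOLEXES[prime]
    if specifier is None:
        return forms["bare"]
    return f"{specifier} {forms['specified']}"


def prime_of(form: str) -> str:
    """Map any allolex back to the single prime it expresses."""
    for prime, forms in ALLOLEXES.items():
        if form in forms.values():
            return prime
    raise KeyError(form)


print(realise("SOMETHING"))              # something
print(realise("SOMETHING", "the same"))  # the same thing
print(realise("SOMETHING", "this"))      # this thing
print(prime_of("thing") == prime_of("something"))  # True
```

The choice of form depends only on the combinatorial context, while both forms map back to one and the same prime, which is the sense in which the difference between allolexes carries no paraphrasable meaning.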
When the distribution of allolexes is conditioned by the syntactic (combinatorial) context, as it frequently is, allolexes can be thought of as having a “structure-indexing” function (Goddard 2002a). In an English-based NSM, for example, the use of the allolex thing “indexes” the presence of a specifier. Such a structure-indexing function is not paraphrasable however. Furthermore, patterns of allolexy vary from language to language. They can be regarded as part of the language-specific instantiation of the underlying “pure” NSM. Semantic primes are not necessarily maximally independent from an intuitive point of view, and at first blush it may seem that certain groups of them
share some semantic component; for example, SOMEONE and SOMETHING may seem to share an “indefiniteness” component; BEFORE and AFTER may seem to share a “temporal sequence” component; GOOD and BAD may seem to share an “evaluation” component, etc. On the other hand, the mere existence of an intuitive affiliation between primes does not mean that they necessarily share a common paraphrasable component of meaning. It is entirely possible for a group of semantic primes to have functional alignments and to share common properties without necessarily having any parts in common. Abstract and/or technical components such as “indefiniteness”, “sequentiality”, or “evaluation” would not pass muster in the NSM framework, because they are more obscure and complex than those they are supposed to explain, and because they are not translatable into most languages. On the contrary, it is better to view the abstract terms as “derived” from semantic primes. “Indefinite”, for example, can be seen as a collective term for expressions which use the primes SOMEONE or SOMETHING to mention a person (or thing) without saying about it ‘it is this person (thing)’. The term “temporal sequence” can be seen as a collective term for expressions which say about some event that ‘it happened before’ or ‘it happened after’ some other event. “Evaluation” can be seen as a cover term for saying about something that ‘it is good’, ‘it is bad’, or the like. Nor, in the case of pairs of converse terms like GOOD and BAD, BEFORE and AFTER, and so on, can one term be satisfactorily explicated using the other. Sometimes the putative equation simply does not work, e.g. GOOD does not mean NOT BAD. At other times the putative equation appears to work from a purely referential point of view, but fails as a plausible representation of human thinking. For example, ‘X happened after Y’ cannot be analysed from a cognitive point of view as ‘X happened at some time, not before Y, not at the same time as Y’. 
From a cognitive point of view it would be very peculiar to claim that someone who says ‘John was born after Mary’ means to say the same as: ‘John was born at some time, not before Mary was born, not at the same time Mary was born’.
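The referential equivalence at issue can be written out as a short sketch. The notation is an assumption introduced here for illustration and is not part of NSM: t_X and t_Y stand for the times of events X and Y, taken to be linearly ordered by <.

```latex
% Assumed notation (not NSM): t_X, t_Y are the times of events X and Y.
% On a linear order exactly one of t_X < t_Y, t_X = t_Y, t_Y < t_X holds, so
\[
  t_Y < t_X \;\Longleftrightarrow\; \neg\,(t_X < t_Y) \,\wedge\, \neg\,(t_X = t_Y)
\]
```

The two sides pick out the same situations, which is all that a purely referential point of view requires; the argument above is that the right-hand side is nonetheless implausible as a representation of what a speaker means by AFTER.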
2. The “substantive” primes
2.1. Delimiting the substantive grouping
As mentioned, one way of grouping semantic primes is in terms of shared syntactic (combinatorial) properties. This applies in particular to the set being singled out for treatment in this chapter, which I will refer to as “substantives”, with the caveat that this grouping has an expanded membership compared with that shown in Table 1. The term substantive is familiar from traditional grammar. For present purposes, we can think of it as grouping together a collection of semantic primes which share the following syntactic properties: (i) they can combine with the specifiers THIS, THE SAME and OTHER to form “substantive phrases” which have the same syntactic possibilities as themselves; or to put it another way, they can form the heads of specified phrases, such as THIS PERSON, THE SAME THING, OTHER PEOPLE, and so on; (ii) they can combine with predicate primes such as MOVE, THERE IS / EXIST, HAPPEN, and DO to form clauses, in the sense of combinations of words which can express self-contained messages; or to put it another way, they can fill argument and adjunct slots for predicates, in combinations such as SOMETHING MOVED, SOMETHING HAPPENED IN THIS PLACE, SOMEONE DID SOMETHING AT THIS TIME, etc. (cf. Goddard and Wierzbicka 2002).

According to NSM research, natural languages universally recognise four general substantive categories which correspond largely with classical Aristotelian categories. They surface as lexical items in indefinite/interrogative systems in the world’s languages (‘who’, ‘what’, ‘where’, ‘when’), and as generic nouns (‘person’, ‘thing’, ‘place’, ‘time’) which can be combined with determiners and quantifiers. They will be discussed in Section 2.2. The NSM system has also identified three more specific ontological categories: PEOPLE, BODY, and WORDS. These less conventional items will be discussed in Section 2.3. The primes KIND and PART can also be regarded as substantives, insofar as they can combine directly with specifiers and quantifiers in expressions such as ‘this kind/part’, ‘the same kind/part’, ‘one kind/part’, ‘two kinds/parts’, ‘many kinds/parts’, ‘all kinds/parts’, etc., which appear to be realisable in all languages.
These will be discussed in Section 2.4. The relationships between the primes considered in this chapter can be summarised as shown in Table 2 below.

Table 2. Substantive primes discussed in this chapter

Categoricals         Specifics   Relationals
SOMEONE (PERSON)     PEOPLE      KIND
SOMETHING (THING)    BODY        PART
WHERE (PLACE)        WORDS
WHEN (TIME)

2.2. Categorical substantives: SOMETHING, SOMEONE, WHERE / PLACE, WHEN / TIME

The classical predicate calculus does not recognise any fundamental ontological categories, assuming a notion of an unspecified “entity”, “individual”, etc. represented by a free variable, which can be characterised any way one wishes by way of a predicate. For example, one could characterise persons and things by means of predicates ‘be a person’ and ‘be a thing’. The NSM claim is that ordinary human thinking does not work this way but rather draws a fundamental categorical distinction between persons and things, or, more precisely, between SOMEONE and SOMETHING.

Categoricals are indefinite and number-neutral. Although in isolation a term like SOMEONE or SOMETHING may suggest a single referent, in indefinite contexts there is no real implication of singularity. If, for example, one asks ‘Who did it?’ or ‘What happened?’ one is not implying that only one person did it or that only one thing happened. SOMETHING does not mean the same as “entity”, or any other technical notion. It can be used indifferently in contexts such as I said something, I did something, I saw something, Something moved, and even Something happened. That this same range of occurrence is possible across languages (cf. Goddard and Wierzbicka 1994, 2002) is really quite astonishing.

In the technical literature, it is common to find the distinction between SOMEONE and SOMETHING analysed into “finer features”, such as +human, +animate, +personal, and so on. As Wierzbicka (1996: 39) says, “accounts of this kind are a good example of pseudo-analysis, since the features which are invented to account for the difference between SOMEONE and SOMETHING need to be defined (or explained) in terms of SOMEONE and SOMETHING”. Needless to say, such pseudo-analyses are unable to meet the NSM requirement of preserving intelligibility upon substitution into natural contexts. For example, as Wierzbicka notes, the sentence I met someone nice can hardly be paraphrased (except in jest) as ‘I met a nice human entity (animate entity, personal entity)’,
and I saw something interesting can hardly be paraphrased as ‘I saw an interesting non-human entity (inanimate entity, impersonal entity)’. The vast majority of languages have separate words for SOMEONE and SOMETHING (Goddard 2001: 8–11; Haspelmath 1997). Sometimes the same words are used as interrogatives; more commonly, one set of forms is morphologically basic and the other is built upon it. Very occasionally, a single expression appears to cover both SOMEONE and SOMETHING, but in all cases examined closely to date there is language-internal evidence for polysemy. For example, in the Australian language Wambaya (Nordlinger 1998: 120–122) the stem gayini is used for both categories, but the distinction is made by way of different gender suffixes: with the inanimate gender suffix gayini means SOMETHING, with the masculine animate gender suffix it means SOMEONE (masculine animate is used when the actual gender of the referent is unknown). In some languages, the word for SOMEONE is identical in form with the word for OTHER (also a semantic prime). For example, Yankunytjatjara kutjupa means OTHER when it is adnominal (e.g. kungka kutjupa ‘another woman’) and SOMEONE when it is the head of an NP in its own right, as in Kutjupa-ngku katingu ‘Someone-ergative took (it)’. It might be thought that the latter usage is elliptical, with an implied head noun such as anangu ‘people, person’, but this analysis is not viable since kutjupa SOMEONE can be used to refer to non-human beings such as the Christian God, who could never be referred to as an anangu ‘(human) person’ (Goddard 1994). On the NSM view, the distinction between WHEN / TIME and WHERE / PLACE is also an irreducible element of human thinking, deeply embedded in the structure of all human languages. Both concepts are capable of combining with determiners and quantifiers. 
Like many languages, English provides distinctive allolexes in these contexts, namely, TIME and PLACE.6 Unlike the other categoricals, however, WHEN / TIME and WHERE / PLACE can appear with an “adjunct” status in syntactic terms, frequently indexed by a grammatical element such as an adposition or case marker; for example, in English phrases such as:

at this time, at the same time, at one time, at many times, at all times
in this place, in the same place, in one place, in many places, in all places
Just as interrogative systems in the world’s languages typically distinguish WHO (SOMEONE) from WHAT (SOMETHING), so too they typically distinguish ‘when?’ from ‘where?’, though the division is not quite as vivid (Goddard and Wierzbicka 1994; Haspelmath 1997). Some languages are reported to have a single interrogative item which can be used to elicit either time or place information, i.e. a ‘when/where’ word, but in such cases the WHEN vs. WHERE distinction is clearly made elsewhere in the lexicon.
2.3. Specific substantives: PEOPLE, BODY, WORDS
As far as I know, no other semantic theory has proposed PEOPLE, BODY, and WORDS as fundamental semantic units. Perhaps no other theory would, because the motivation behind it is closely tied to the twin NSM principles that all non-primes must be paraphrasable in terms of primes and that primes must be identifiable meanings in ordinary language. Before summarising the “indefinability arguments” in favour of PEOPLE, BODY and WORDS, it may be helpful to observe that all three terms differ from the categoricals in their “numerosity”. While the categoricals are number-neutral, PEOPLE has a clear collective quality (cf. Mihatsch this vol.), BODY has a palpable “discreteness” (a bounded, unit-like aspect), while WORDS is characterised by a vaguely “multiple” feel.

The semantically prime sense of PEOPLE is found in non-specific uses like ‘people think ...’, ‘people say ...’, and so on. These expressions are crucial for explicating a host of “social” meanings. These include nominal social categories and collectives (the names for ethnic categories and nationalities, words like family, crowd, etc.), as well as contrary “non-human” categories, such as animal, God, angel, etc. PEOPLE is also needed for human activities and practices (words like game, language), and for the names of artefacts and tools (various kinds of things made by people). Many abstract concepts involve PEOPLE, including terms for values and ideals (ways of acting, etc. which ‘people’ think are good), so-called social emotions such as shame and pride (which involve a cognitive component ‘people can think something bad/good about me’), and less obviously certain epistemic meanings which appeal to “public knowability” (words like facts, evidence, etc.).

Obviously the prime PEOPLE has an intuitive affiliation with SOMEONE, but the crucial point is that ‘someones’ are not necessarily PEOPLE. The essence of the term SOMEONE is that it represents a ‘being’, regardless of its kind.
Most, if not all, cultures recognise the possible existence of nonhuman beings, such as deities, nature spirits, jins, angels, elves, and so on. Consequently, there is an irreducible semantic difference between SOMEONE
and PEOPLE. Presumably the concept human is a combination of the two, e.g. a human being = ‘a being (someone) of the kind people’. In some European languages the word for PEOPLE is a collective term, unrelated to the word for an individual human being; for example, French les gens, Russian ljudi; in others, a single word exists in both plural (collective) form, expressing PEOPLE, and as a singular noun, referring to an individual human person, as with German Menschen and Mensch.7 In languages without obligatory number marking, the word for PEOPLE can usually be used to refer to a single human individual, as with Japanese hito, Yankunytjatjara anangu, and Malay orang. Sometimes the expression for PEOPLE appears to be a pluralised version of the term for SOMEONE; for example, Mangaaba-Mbula zin tomtom (lit. pl. marker + ‘someone’), but this expression cannot be regarded as the sum of its parts, semantically speaking, because tomtom SOMEONE can refer to beings other than humans, whereas zin tomtom PEOPLE cannot. These various language-specific overlaps in formal realisation between PEOPLE and SOMEONE, reflective of non-compositional affiliations, deserve deeper treatment than is possible here (cf. Wierzbicka 2002: 68–77).

As for BODY, it has a clear intuitive affiliation with both SOMETHING, given that a BODY is necessarily SOMETHING, and also with SOMEONE, given that most someones in this life have a BODY which is intimately tied up with their identity and with their experience. Many years ago, Wierzbicka (1972) proposed to explicate BODY as ‘something that can be thought of as someone’, but this explication is not adequate for several reasons and has long since been abandoned. First, there were technical problems with the explication. The predicate THINK lacks an explicit subject (presumably, PEOPLE would be most appropriate), and the expression ‘to think of as’ presumably involves semantic prime LIKE.
When these matters are attended to, the explication appears more complex and less intuitively appealing: ‘something, people can think about this something: this is like someone’. Second, it would be hard to get this formulation to work in practical explications for “physical” meanings, such as words for body-parts, for sensations (hunger, thirst, pain, etc.), for bodily processes, for illnesses, and so on. Early surveys of body-part nomenclature, such as Andersen (1978), reported that the meaning BODY is universally lexicalised, but such claims have also been disputed. Unfortunately, many of the counter-claims are found in the anthropological literature, where either insufficient data is given to decide whether or not polysemy is involved, or the data given is at best equivocal (cf. the discussion of Lewis 1974 by Wilkins 1996 and Goddard 2001). Even
when linguists claim that a particular language makes no distinction between, say, ‘body’ and ‘flesh’ (e.g. Meira 2006), one often finds that such claims are impressionistic and that no systematic lexicographic investigation has been undertaken. One well-established fact about the lexical expression of BODY is that it is often highly unstable, due to polysemic overlaps with meanings such as ‘person’, ‘skin’, and ‘flesh’ (Wilkins 1996).

Wierzbicka (1996) proposed WORDS as a universal semantic prime, not in the sense of a discrete individual unit, but in a vaguer sense as something that can be used to express a message and that can be heard (cf. Polish słowo, rather than wyraz). Semantic prime WORDS is needed to explicate concepts such as name, language, paraphrase, read, write; for explicating speech formulas, for certain speech act verbs, e.g. speak, utter, pronounce, swear, curse, and for explicating the meaning of metaphors and other figurative language (Goddard 2004). Obviously there is a semantic affiliation between SAY and WORDS, but on the other hand there is a clear difference between ‘saying something’ and ‘saying words’. For example, there is a clear difference between ‘X said something bad to person Y’ and ‘X said bad words to person Y’. Equally, it makes sense to speak of ‘saying the same thing in other words’. The form of expression (WORDS) must be distinguished from their content (what is said). Nevertheless it must be acknowledged that the conceptual syntax of WORDS is not yet fully understood.

Concerning the cross-linguistic lexicalisation of WORDS, it is known that in many languages the word for WORDS is polysemous, and can also carry meanings such as ‘talk’, ‘way of speaking’, ‘story’ and ‘message’. In some languages it is morphologically related to ‘say’ or to ‘mouth’; for example, Malay kata-kata ‘words’ is a reduplication of kata ‘say’; Lao kham3 is polysemous between ‘word’ and ‘mouthful’.
One question which comes easily to mind is whether polysynthetic languages have an item meaning WORDS, given that words in such languages are often multimorphemic, and that they can often convey a self-contained meaning equivalent to that of a whole sentence in English. The available data suggest that terms for WORDS do exist in polysynthetic languages, though as one might expect they are sometimes bound items. For example, Mohawk (Iroquoian) wenn-, Ojibwa (Algonquian) kidwin (Rhodes 1985: 207), Ngan’gityemerri (Daly, Australia) ngan’gi (Nicholas Reid, pers. comm.), Yimas (Papuan) pia-/-mpwi (William Foley, pers. comm.), Sm’algyax (Tsimshianic) algayax (Tonya Stebbins, pers. comm.). The available evidence indicates that sign languages also have a word for WORDS (Zeshan 2002: 154).
2.4. Relational substantives: KIND, PART
NSM research affirms that the concepts of KIND and PART, which underlie “vertical” taxonomic and meronomic relations (Schalley and Zaefferer this vol.), are universally lexicalised semantic primes. It is widely recognised that the concept ‘a kind of thing’ is the basic component for prototypical “common nouns”. That is, that words like dog, fish, seed, table, cup, etc. do not designate individual things but rather “classes” (kinds) of thing, cf. e.g. Carlson and Pelletier (1995). It would be mistaken to assume that all common nouns are semantically alike in this respect, but there can be no doubt that it is valid for a very large section of the nominal vocabulary. Linguists and cognitive anthropologists agree, furthermore, that taxonomic hierarchy is a basic principle of lexical organisation in the realm of living things (Berlin 1992). All languages have hierarchies of designations which specify that certain individually named animals and plants are ‘kinds’ of some higher-level “life forms”; for example, a sparrow is a kind of bird, a trout is a kind of fish, an oak is a kind of tree.8 From a syntactic point of view such expressions represent a distinctive construction type, which may be termed the “classifier relation”. In a phrase like many kinds of birds the relationship between many kinds and the phrase of birds is not like that between a head and a modifier in an attributive relation, but needs to be recognised as sui generis. All 17 of the languages canvassed in Goddard and Wierzbicka (1994) had a lexical unit with the meaning KIND, though in several languages this unit belonged to a polysemous lexeme. For example, in Yankunytjatjara KIND is expressed by a lexeme ini which can also mean ‘name’; e.g. wayuta kuka ini kutjupa ‘the possum is another kind of meat-animal’, while in another Australian language Kayardild, it is expressed by minyi, which can also mean ‘colour’ (Goddard 1994; Evans 1994). 
It is not possible to eliminate KIND (categorisation) in favour of shared properties (similarity), based on semantic prime LIKE. Though it is natural to think of ‘things of one kind’ as LIKE one another, the implication does not work the other way around, i.e. things of very different kinds may be alike in certain respects without this in any way compromising their categorical distinctiveness. There is a considerable literature in cognitive and developmental psychology upholding the early emergence of the KIND concept and its independence of similarity judgements (e.g. Gelman 2005; Gopnik and Meltzoff 1997, Chapter 6.; Keil 1989).
Regarding PART, linguists seem to agree that the part-whole relationship is fundamental to the vocabulary structure of all languages.9 It is well-known, however, that there are languages which do not have a unique word for PART, and that in such languages the meaning is typically expressed by means of the word for ‘something’, ‘thing’, or ‘what’, used in a grammatical construction associated with “possession”. This can be illustrated with the Yankunytjatjara example below (Goddard 1994); cf. Bugenhagen (2002) on Mangaaba-Mbula, Durie, Daud and Hasan (1994) on Acehnese.

(1) Puntu  kutju,  palu  kutjupa-kutjupa  tjuta-tjara.
    body   one     but   something        many-having
    ‘one body, but with many parts (somethings).’
This is a convenient time to observe that PART has a special compositional affinity with BODY. All languages provide numerous lexemes for ‘parts of a person’s body’ and the human BODY is, in a sense, the common prototype of SOMETHING with PARTS.

The relational substantives KIND and PART can both combine with all the categorical substantives, not only with SOMEONE/PERSON and SOMETHING/THING, but also with PLACE and TIME. The latter combinations are not as salient, but they are necessary for explicating certain kinds of concept. For example, towns and cities are ‘a kind of place’ where many people live, seasons (summer, winter, etc.) are ‘a kind of time’ when certain things happen and certain conditions prevail. Words such as day (in the “unit of time” sense) seem to qualify as ‘parts of a time’, and words like interior and outskirts are ‘parts of a place’. Geographical terms such as mountain, hill, and lake involve four elements SOMETHING, KIND, PART and PLACE ‘a kind of something, things of this kind are parts of places’, plus additional specifications.

We turn now to an overview of how the nominal lexicon is built up in complex ways from substantive primes and other elements.

3. Semantic molecules and the structure of the nominal lexicon
3.1. Semantic molecules
In addition to “atomic-level” categories and relations (semantic primes), much of the vocabulary structure of natural languages relies on intermediate-level concepts, termed “semantic molecules”, which are constructed from semantic primes (Goddard 1998, Chapter 6.; Wierzbicka 2003, 2004a, 2004b).10 For example, the semantic molecule ‘animal’ is necessary in the explications of cats, dogs, horses, etc.; the molecule ‘tree’ is needed in the explications of birch, pine, elm, and so on; the molecules ‘mother’ and ‘father’ are needed for the explication of other kin-words such as brother, sister, aunt, uncle, cousin, etc.; the molecule ‘hand’ is needed as a standard of size in the explications of numerous artefact items and other things, such as fruits and vegetables, which are handled by people; and almost all concrete vocabulary items require semantic molecules such as ‘long’, ‘round’, ‘flat’, ‘hard’, among others.

Before we inquire into different ways in which semantic molecules function in explications, it is helpful to observe that there are many recurrent semantic components which are not molecules in the sense intended here. Some simple examples are top-level semantic components, such as: ‘a part of someone’s body’ (for body-part terms), ‘a kind of living thing’ (for natural kind terms), ‘a kind of place where people live’ (for words like town, city, village, etc). Though these components are very simple meanings to grasp, they are not lexicalised as individual words in the ordinary English vocabulary.11 Likewise, many verbs contain components like ‘X did this because X wanted to do it’ (for volitional verbs) or ‘X didn’t do this because X wanted to do it’ (for non-volitional verbs).

For a more complex example, we can look to emotion terms in English, such as angry, sad, happy, jealous, etc. The explications for such terms (in frames like ‘this person feels angry/sad/happy’) conform to a common semantic template (Wierzbicka 1999). They begin with the top-level component: ‘this person feels something (good/bad), as people feel when they feel something because they think something like this: —’.
Then follows a prototypical cognitive scenario, setting out certain characteristic thoughts, wants and assumptions appropriate to the individual emotion term being explicated. But the recurrent top-level component is not a “semantic molecule” in the sense under discussion, because it is not encapsulated, so to speak, as the meaning of a surface lexical item. When we speak of semantic molecules, what we have in mind is a packet of semantic components which exists as the meaning of a lexical unit: properly speaking, a lexico-semantic molecule. This is not to dismiss or to marginalise the significance of those recurrent but “non-molecular” semantic components. They are extremely important for the creation of lexical classes and for setting up correspondences between
lexical items. For example, “emotion words” such as angry, sad, happy, jealous, and so on, are a lexical class precisely because they are all constructed in terms of a common template or schema, as just described. Recurrent but non-molecular semantic components are also important for the interface between lexical and grammatical semantics. For example, so-called semantic roles in the verbal lexicon are recurrent components of this kind, e.g. ‘X did something’ (actor), ‘someone did something to X’ (patient), ‘someone did something to something with X’ (instrument). Semantic molecules, however, have a particular kind of cognitive significance. They allow a kind of “conceptual chunking” which makes it possible for the mind to manage concepts of great semantic complexity. It is an empirical finding of NSM research that some kinds of concept (such as those to do with emotions, values, speech acts, and interpersonal relations) are semantically much simpler than others (such as those to do with artefacts, animals and plants, the environment, and human activities), precisely because the former can be explicated directly in terms of semantic primes, whereas the latter can only be explicated in stages using semantic molecules. Consider artefact terms such as cup, knife, umbrella, hammer, and the like (Wierzbicka 1985, 2003; Goddard 1998, Chapter 6.). To characterise such items as things made by people requires us to make use of the concept of ‘making something’, and although this concept appears to be very widely lexicalised across languages (Goddard 2001), it is not semantically simple. It involves a configuration of a certain intention, some actions involving some materials, and an outcome, i.e. something exists which did not exist before (cf. Wierzbicka 2003: 21): (2)
someone made something (Y) out of Z (e.g. made an omelette out of eggs)
  someone did some things to some things
  because this person thought like this:
    “I want there to be something of kind Y here now
    because of this I have to do some things to these other things (Z)”
  because this person did like this, afterwards there was something of kind Y in this place
  the other things (Z) were parts of this thing of kind Y
Having established a satisfactory explication for ‘make’, we are free to employ it as a semantic molecule in other explications, marking its status by means of an [M]. Artefact words can then begin with the following component, which is itself a recurrent component of hundreds of English words:
(3)
one kind of thing
  things of this kind exist because people make [M] them
  people make [M] things of this kind because they want to do things with them
To take an example from the realm of natural kinds, specifically from ethnozoology, let us adopt the standard assumption that the meanings of generic terms like cats, mice, horses, etc. incorporate reference to a “life form”; specifically, that they begin with the component ‘a kind of animal [M]’. The concept of animals can be explicated as follows. (4)
animals
  one kind of living thing
  living things of this kind are like people in some ways, they are not like people in other ways
  some parts of their bodies are like parts of people’s bodies, other parts of their bodies are not like parts of people’s bodies
  there are many kinds of living things of this kind
‘Animal [M]’ qualifies as a semantic molecule because, firstly, it is the meaning of an ordinary English lexeme,12 and secondly, because this lexical item appears in the explication of subsequent terms, such as cats, mice, horses, etc. Another kind of semantic molecule concerns certain features of the natural world, such as the sky and the ground. They can be explicated directly in terms of primes; for example (Wierzbicka 2003: 18): (5)
sky
  a place
  this place is very big
  this place is above all other places
  this place is very far from all other places
  people can see this place
The molecule ‘sky [M]’ is needed in the explication of other “environmental” concepts, such as rain, clouds, sun, and moon, and to explicate the English colour concept blue, and allied concepts in other languages. Molecule ‘sun [M]’ enters into words for certain temperature concepts (such as English hot and warm, cf. Goddard and Wierzbicka in press) and certain colour concepts (such as English yellow), and for various other visual concepts such as bright and shiny.
3.2. Levels of structure in the nominal lexicon
It is already apparent that there can be multiple levels of “nesting” within the structure of complex concepts. Questions such as the following arise: How many levels of semantic nesting are there? Are there any differences in the kinds of molecules found at different levels? Do they function differently, in terms of how they fit into the overall meaning structure? To approach these questions it is useful to look at part of the explication for a complex concept, such as the natural kind cats. The following is the first part of such an explication (adapted from Goddard 1998). According to Wierzbicka (1985), explications for animal terms follow a uniform semantic template: (a) category within the taxonomic hierarchy, (b) habitat, (c) size, (d) appearance, (e) behaviour, (f) relation with people. The explication below only includes sections (a)–(d). In the case of cats, the subsequent sections are very long and elaborate, reflecting the detailed folk knowledge people have of these animals which share their lives. The (a) component establishes cats as ‘a kind of animal [M]’. The (b) components claim that cats are conceptualised primarily as domestic animals. Notice that the size component (c) is defined in relation to the human body, a kind of anthropocentrism which recurs in countless words of diverse types. In the case of cats, the size component mentions handling by humans, further confirming their status as a domestic animal. The components in (d) identify the distinctive physical features of cats as soft fur, a round head with pointy ears, a special kind of eyes, whiskers, a long tail, and soft feet with small sharp claws. (6)
cats ⇒
a. one kind of animal [M]
b. animals of this kind live with people
   they can live in places where people live
   they can live near places where people live
c. they are not big
   a person can pick up [M] one with two hands [M]
d. they have soft [M] fur [M]
   their ears [M] are on both sides of the top [M] part of the head [M], they are pointed [M]
   their eyes [M] are not like people’s eyes [M]
   they have some stiff [M] hairs [M] near the mouth [M], on both sides of the mouth [M]
   they have a tail [M]
   they have soft [M] feet [M]
   they have small sharp [M] claws [M]
Two kinds of semantic molecules predominate in this explication: body-part terms (hands, head, ears, eyes, mouth, hairs, fur, feet, tail, claws), and physical descriptors of various kinds (round, pointed, soft, stiff, sharp). To these two groupings, one can add bodily actions and postures (eat, climb, jump, etc.) which are found in abundance in subsequent sections of the cats explication, and in many other explications. Obviously bodily actions and postures themselves require the use of body-part terms.

If we now look into the semantics of body-part terms, it emerges that certain physical descriptors are required as molecules in these explications. For example, head (in the sense of a human person’s head) requires the shape descriptor ‘round [M]’, and legs requires the shape descriptor ‘long [M]’ (Wierzbicka 2003, to appear).13 For example:14

(7)
(someone’s) head
  one part of someone’s body
  this part is above all the other parts of this someone’s body
  this part is something round [M]
(8)
(someone’s) legs
  two parts of someone’s body
  these two parts are below the other parts of this someone’s body
  these two parts are long [M]
  these two parts of someone’s body can move as this someone wants
  because people’s bodies have these parts, people can move in many places as they want
Given that these body-part terms rely on shape descriptors, can we conclude that shape descriptors are relatively more basic, in the sense of belonging to a deeper level of decompositional semantics? Recent research indicates that the true situation is more complex and more interesting than that (Wierzbicka 2003, 2004a, to appear). It turns out that one human body-part, namely, hands, has a special priority over all others. Not only can hands be explicated directly in semantic primes, without the need for shape descriptors or any other semantic molecules, but hands itself is required as a molecule
in the explication of shape descriptors. This is because shape descriptors designate properties which are jointly visual and “tangible”, and to spell out the nature of the latter concept requires both the semantic prime TOUCH (contact) and the semantic molecule ‘hands [M]’. For example (Wierzbicka 2004a): (9)
something long (e.g. a tail, a stick, a cucumber)
    when a person sees this thing this person can think about it like this:
        “two parts of this thing are not like any other parts
        because one of these two parts is very far from the other”
    if a person’s hands [M] touch this thing everywhere on all sides,
        this person can think about it in the same way

(10) something round (e.g. an orange)
    when a person sees this thing this person can think about it like this:
        “I can’t say: ‘some parts of this thing are not like some other parts
        because some parts are far from some other parts’ ”
    if a person’s hands [M] touch this thing everywhere on all sides,
        this person can think about it in the same way
To back up the assertion that hands can be explicated directly in terms of semantic primes, I will adduce the following explication (Wierzbicka 2003: 28, to appear). Unfortunately, for reasons of space it is not possible to discuss or justify it in any detail. (11)
(someone’s) hands
    two parts of someone’s body
    they are on two sides of this someone’s body
    these two parts of someone’s body can move as this someone wants
    these two parts of someone’s body have many parts
    if this someone wants it, all the parts on one side of one of these two parts can touch all the parts on the same side of the other one at the same time
    because people’s bodies have these two parts, people can do many things with many things as they want
    because people’s bodies have these two parts, people can touch many things as they want
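The contrast between explications (9) and (11) can be checked mechanically. The following is only a minimal sketch: it treats an explication as plain text and collects every word followed by the [M] marker. The function name `molecules` and the two constants are illustrative assumptions, and the strings are excerpts copied from the explications above, not the full texts.

```python
import re

# Excerpts copied from explications (9) and (11); [M] marks a semantic molecule.
LONG_EXCERPT = (
    "if a person's hands [M] touch this thing everywhere on all sides, "
    "this person can think about it in the same way"
)
HANDS_EXCERPT = (
    "two parts of someone's body\n"
    "they are on two sides of this someone's body\n"
    "these two parts of someone's body can move as this someone wants"
)


def molecules(explication: str) -> set[str]:
    """Return the words that are immediately followed by the [M] marker."""
    return set(re.findall(r"(\w+) \[M\]", explication))


print(molecules(LONG_EXCERPT))   # {'hands'}
print(molecules(HANDS_EXCERPT))  # set()
```

An empty result for the hands excerpt is what one would expect if the explication is indeed phrased in semantic primes alone; the sketch says nothing about whether each remaining word really is a prime.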
The idea that shape concepts depend on the body-part concept ‘hand’ has profound implications for human experience and human conceptualisation: [W]hat might seem to be objective properties of the physical world are in fact often projections of tactile experience onto the world of objects. The experience of “handling” things, of touching them with one’s hands and moving
the hands in an exploratory way plays a crucial role in making sense of the physical world... [The] very hiddenness of ‘hands’ in some of our most basic everyday concepts – such as ‘long’, ‘round’ and ‘flat’ – is a witness to its fundamental importance in human cognition: human hands mediate, to a large extent, between the world and the human mind (Wierzbicka in press).
3.3. Conceptual complexity and conceptual transparency
So far as the questions raised above about semantic levels and nesting are concerned, we can provisionally conclude as follows. Concepts can differ greatly both in their conceptual complexity and in their “conceptual transparency”, in the sense of how readily their semantic complexity is apparent to ordinary untutored semantic intuition. It would appear that concepts such as those designated by non-concrete vocabulary, such as words for emotions, epistemic states and processes, and speech-acts are moderately complex (in the sense that they are constituted directly from semantic primes), but also relatively transparent. The same goes for environmental terms, like sky and ground, and for certain social category concepts, such as men, women, children (Goddard and Wierzbicka, forthcom.). Words like these can be explored relatively easily by systematic introspective study and tested relatively easily against naïve speakers’ intuitions, because they can be brought relatively easily to the surface of people’s consciousness. The situation is different with words for generic physical descriptors, such as those of shape (long, round, flat, among others) and for “ethnogeometrical” terms, such as edges and ends (cf. Brotherson in press). Although they are only of a moderate level of semantic complexity, inasmuch as they can be explicated in terms of semantic primes and a single semantic molecule (‘hands’), they are conceptually opaque. To ordinary intuition they appear basic and unanalysable; and explications for them require considerable concentration to process and evaluate (let alone to derive), even for highly trained and experienced semantic analysts. Presumably this is because such concepts are formed very early in childhood and are subsequently incorporated into the molecular substructure of so many other concepts.
Terms for most body-parts and for simple bodily actions and processes represent a step up in semantic complexity, because they usually incorporate several semantic molecules for shape and dimension, but they too are relatively conceptually transparent. Several other lexical domains also combine moderate semantic complexity with relative transparency; including words
for simple items of furniture such as chair, bed, table; words for bodily activities such as walk, sit, eat; words for places connected with human activity, house, shop, school; and words for landforms and geographical features and zones, such as river, mountain, beach, forest, bush. See the Appendix for two sample explications. Terms for natural kinds and artefacts are still more complex because they incorporate many more semantic molecules, and also because they can encapsulate tremendous amounts of cultural knowledge. For natural kinds, this applies in particular to terms for animal species with which people have close relationships, so to speak, either currently or in historically recent times. To illustrate from English, cat, dog, mouse, and horse are far richer in conceptual content than, say, stoat, moose, and kangaroo. Similarly, artefact terms can differ in their amount of conceptual information depending on the complexity of the function or functions they are intended to serve, including their degree of cultural embeddedness. For example, a word like knife is simpler, in this sense, than one like cup, because cup involves a great deal more information about the canonical uses of the referent (Wierzbicka 1985). Despite the extra complexity, however, words of these kinds can be regarded as relatively conceptually transparent. Relatedly, there is usually an abundance of lexical evidence of various kinds (common phrases, endonyms, patterns of reference, etc.) which assist the analyst in bringing to light different aspects of the underlying concept. It appears that there are as many as five levels of nesting in high complexity words such as natural kind and artefact terms. In the explication for cats or chairs, for example, the most complex molecules are bodily action verbs like ‘eat [M]’ or ‘sit [M]’. Nested within them are body-part molecules such as ‘mouth [M]’ and ‘legs [M]’. 
They in turn contain shape descriptors, such as ‘long [M]’, ‘round [M]’, ‘flat [M]’, and others, and they in turn harbour the molecule ‘hands [M]’, composed purely of semantic primes. Additionally, a further level of nesting occurs in some words, because natural kind and artefact terms can themselves function as semantic molecules at a shallow level of semantic structure. This can occur in at least two ways. First, there are words for unfamiliar species, such as tiger and zebra, which contain a “likeness” reference to a familiar kind, such as cat and horse, respectively. Second, there are endonymic terms such as purr and saddle, which again contain references to cat and horse, respectively. Finally, it should be noted that while on current evidence it appears that some semantic molecules (such as ‘hand’) are probably universal, many others are known to be language-specific. For example, Wierzbicka (2006) argues that the English shape concept ‘long [M]’ does not exactly match the comparable Polish molecule ‘podłużny [M]’ ‘oblong, elongated’. Brotherson (in press) argues that the English “ethnogeometrical” molecule ‘ends [M]’ differs from its nearest counterpart tapu in Makasai (East Timor), with follow-on effects for many shape words (e.g. Makasai asan, bokun and leben do not exactly match their nearest English counterparts ‘long’, ‘round’ and ‘flat’, respectively). There can also be differences in the inventories of productive higher-level molecules in particular languages. For example, though many taxonomic concepts in the Polish lexicon are either identical or closely correspond to their counterparts in English (e.g. zwierz ‘animal’, ptak ‘bird’, ryba ‘fish’, drzewo ‘tree’), the word grzyb ‘mushroom’ functions as a semantic molecule in Polish while its English counterpart does not (Polish has many words for kinds of grzyby ‘mushrooms’, and various other endonymic words which include ‘grzyb [M]’ in their meanings). Though still in a formative stage, the exploration of semantic molecules promises to shed a great deal of light on conceptual structure, as well as to contribute to a general theory of vocabulary structure.

4. Concluding remark
This chapter has outlined a section of the ontology implicit in the NSM approach to semantic analysis, concentrating on the substantive elements. In part this ontology can be seen as rather traditional and conservative; for example, in its recognition of the four general categoricals SOMEONE, SOMETHING/THING, WHERE/PLACE, and WHEN/TIME, and the relational substantives KIND and PART. On the other hand, its postulation of the specific substantives PEOPLE, BODY, and WORDS is unusual and distinctive. The chapter also sought to explain and illustrate how very complex, language-specific meanings can be built up from semantic primes, via a series of nested semantic molecules composed from semantic primes.
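The nesting of semantic molecules summarised here can be pictured as a dependency table in which each word lists the molecules its explication relies on. The sketch below is an illustration under stated assumptions: the table `CONTAINS` is a deliberately simplified, hypothetical fragment built from the chain described in Section 3 (hands, shape descriptors, body parts, bodily actions, artefact terms), and the depths it yields reflect this fragment only, not the full explications.

```python
# Hypothetical, simplified dependency table: each word maps to the semantic
# molecules that its own explication is assumed to contain.
CONTAINS = {
    "hands": set(),            # explicated purely in semantic primes
    "long": {"hands"},         # shape descriptors rely on 'hands [M]'
    "flat": {"hands"},
    "legs": {"long"},          # body-part terms rely on shape descriptors
    "sit": {"legs"},           # bodily actions rely on body-part terms
    "chairs": {"sit", "flat"}, # artefact terms rely on actions and shapes
}


def nesting_depth(word: str) -> int:
    """Count the layers of molecules nested beneath a word (0 = primes only)."""
    inner = CONTAINS[word]
    if not inner:
        return 0
    return 1 + max(nesting_depth(molecule) for molecule in inner)


for word in ("hands", "long", "legs", "sit", "chairs"):
    print(word, nesting_depth(word))
```

A recursive maximum is used because a word counts as being as deep as its most deeply nested molecule, which is one plausible way of reading the notion of nesting levels.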
APPENDIX: Two complex meanings: walking (Wong, Goddard, and Wierzbicka to appear), chairs (unpublished work); semantic molecules marked as [M]

A. someone (person X) was walking (in a place) for some time
a. person X was doing something somewhere for some time
   at the same time some parts of this person’s body were moving as this person wanted
b. this person was doing this as people do when they are doing something somewhere with their legs [M] for some time
   because a short time before they thought like this: “I want to be after some time in another place”
c. when someone does something like this, this person does the same thing with the legs [M] many times:
   when this person does something with one leg [M], one foot [M] moves for a short time as this person wants
   at the same time the other foot [M] is touching the ground [M]
   after this, this person does the same thing with the other leg [M]
   at this time, the other foot [M] moves for a short time in the same way
   it happens like this:
   before one foot [M] moves, it is touching the ground [M] somewhere
   after this, for a short time, it is not touching the ground [M]
   after this, it is touching the ground [M] in another place
   this other place is in front [M] of this person
d. if someone does something like this for some time, afterwards they can be far from somewhere where they were before

B. chairs
a. one kind of thing
   things of this kind exist because people make [M] them
b. people make [M] them because they want them to be in places where people live
   people want them to be in these places because they want people to be able to sit [M] on them when they have to do something somewhere for some time
   people want people to be able to sit [M] on them at times like these because they don’t want them to feel something bad in their bodies at these times
c. people make [M] things of this kind for one person to sit [M] on
   something of this kind doesn’t have to be in the same place all the time
d. two parts of something of this kind are big
   one of these two parts is flat [M]
   the other one of these two parts is above this one, on one side
   the bottom [M] part of this part touches the flat [M] part
   some other parts are below these two parts
   the top parts of these other parts touch the big flat [M] part
   the bottom [M] parts of these parts touch the ground [M]
e. when a person is sitting [M] on something of this kind, this person’s bottom [M] is touching the flat [M] part
   the bottom [M] parts of the upper [M] parts of this person’s legs [M] are touching the same part
   the bottom [M] parts of this person’s feet [M] are touching the ground [M]
   the person’s back [M] is touching the other big part
Notes
1. The choice of substantive primes is partly motivated by reference to traditional metaphysical ontology, which concerned itself with the nature of existence and with “kinds of being”. This also accounts for my use of the word ‘conceptual’ in the expression ‘conceptual ontology’, which would seem to be redundant from the standpoint of cognitive semantics and AI (Schalley and Zaefferer this vol.).
2. Many logicians and formal semanticists assume that rigorous representation of meaning using natural language, even a disciplined and regularised form of natural language, is impossible. From an NSM perspective, there is little point in debating this assumption in the abstract, since we claim to be able to show by demonstration that it is false. As to the argument that “higher mathematics needs to go beyond natural language, so why not linguistic semantics?”, my reply would be as follows. In the case of mathematics it has been shown, not simply assumed, that ordinary language is insufficient; but more importantly, mathematics is not purporting to represent the meanings expressed in ordinary everyday utterances or to represent the cognitive reality of ordinary language users.
3. It would be an exaggeration to say that the NSM metalanguage is fully specified, since the syntax of some proposed primes is still subject to ongoing research, and because there is still a deal of work to be done on formalising the structure of explications. Nonetheless, the system is much better specified than most critics or casual observers realise.
4. The published body of NSM work is not confined to a few staple examples (such as tiger, gold, bachelor, chase, etc.) but consists of many hundreds, if not thousands, of explications. Numerous languages are also represented, including Chinese, Ewe, French, Japanese, Korean, Malay, Mangaaba-Mbula, Polish, Russian, Yankunytjatjara, among others. The NSM research bibliography is extensive. See the NSM Homepage: http://www.une.edu.au/arts/LCL/nsm/.
5. A lexical unit is the pairing of a single specifiable sense with a lexical form (Cruse 1986: 77–78, cf. Mel’cuk 1989). A polysemous word is a lexeme which consists of more than one lexical unit.
6. Of course one does not expect that exponents of semantic prime WHEN/TIME in other languages will exhibit the same range of polysemic extensions and phraseological potentials as English time, e.g. in expressions like to have time, time flies, lunch-time, and so on. Semantic prime WHEN/TIME is required to manifest itself only in a narrow range of syntactic contexts, such as in expressions like ‘at this time’, ‘at the same time’, and ‘when this happens, ...’.
7. German Leute is not suitable as the exponent of PEOPLE because it does not work in certain necessary contexts and uses. For example, in order to explicate the concepts of ‘men’ and ‘women’ (cf. Goddard and Wierzbicka, forthcom.), the following components are necessary (among others): ‘there are two kinds of people, the bodies of people of one kind are not like the bodies of people of the other kind’. Leute would not be a suitable match for PEOPLE here. However, the exact semantics of Leute and its relationship to Menschen remains an important problem for research.
8. Taxonomic hierarchy does not apply across the board, even in the realm of living things. For example, the explication of apple does not begin ‘a kind of fruit’, because fruit is not a true taxonomic term at all, but rather a “functional collective” term (Wierzbicka 1985). For the same reason, explications for words like table and chair do not begin with the component ‘a kind of furniture’. Functional collective terms generalise across a range of kinds of things in terms of function; e.g. furniture (very roughly) designates ‘many kinds of things people make to have in places where they live, because they want some things to be in these places, because they want to be able to do some things with them’. Functional collectives, such as furniture, have distinct grammatical characteristics compared with true taxonomic terms. For example, they are not count nouns (*two furnitures), and they can enter into “unit counter” constructions, such as two pieces/items of furniture.
9. In English the term part can be used not only for semantic prime PART, but also in “partitive” contexts, such as He ate part of the melon. However, the latter usage is semantically based on another prime, namely SOME (OF).
10. The importance of intermediate-level concepts has long been stressed by the Moscow School semanticists; cf. Apresjan (1997, 2000a, 2000b, 2003), Mel’cuk (1989).
11. The English word creature could possibly match ‘a kind of living thing’, if one were to assume that semantic prime LIVE does not correspond precisely to English live. The problem is that, in ordinary English, plants can count as ‘living things’ (albeit not particularly salient instances of the category). Notably however, the LIVE WITH valency of LIVE is not applicable to plants, though it can be extended to animals which live in flocks, herds, etc., or which look after their young in “families”, cf. Goddard (in press a).
12. This explication applies to the common “naïve” usage of animal, which contrasts with bird, insect, etc. (cf. book titles like Animals and Birds of Australia). There is another, quasi-scientific sense of the word, in which even an insect or a spider could be termed an animal.
13. The English words long, round and flat are polysemous. The simpler meanings, in each case, can be termed “shape descriptors”. They have a classifier-like function, are not gradable, and are found in expressions of the form ‘something long’, ‘something round’, and ‘something flat’, respectively. An animal’s tail, for example, is inherently ‘something long’, in this sense. In a different and (slightly) more complex meaning, long is a dimension term: it is gradable, and it contrasts with the term short. It should be evident from the syntactic context which meaning is intended in an explication.
14. In the NSM view, the primary meanings of body-part terms are based on the human body and when the same words are used about other creatures, they are being used in different meanings, which are polysemic extensions from the core “anthropocentric” meanings. This explains why these explications are not applicable to animals, which typically have four legs rather than two, and whose heads are typically not above the other parts of their bodies.
Acknowledgements I am grateful to Andrea Schalley and Birgit Hellwig for comments which have helped improve an earlier version of this chapter. Needless to say, they are not responsible for remaining errors or infelicities. The explications were developed collaboratively with Anna Wierzbicka.
References Allan, Keith 2001
Natural Language Semantics. Oxford: Blackwell.
Andersen, Elaine S. 1978 Lexical universals of body-part terminology. In Universals of Human Language, Volume III: Word Structure, Joseph H. Greenberg (ed.), 335–368. Stanford: Stanford University Press. Apresjan, Juri 1997 Novyj Ob”jasnitel’nyi Slovar’ Sinonimov Russkogo Jazyka [New Explanatory Dictionary of the Synonyms of the Russian Language]. Vol. 1. Moscow: Jazyki Russkoj Kul’tury. 2000a Novyj Ob”jasnitel’nyi Slovar’ Sinonimov Russkogo Jazyka [New Explanatory Dictionary of the Synonyms of the Russian Language]. Vol. 2. Moscow: Jazyki Russkoj Kul’tury.
2000b Systematic Lexicography. Translated by Kevin Windle. Oxford: Oxford University Press. 2003 Novyj Ob”jasnitel’nyi Slovar’ Sinonimov Russkogo Jazyka [New Explanatory Dictionary of the Synonyms of the Russian Language]. Vol. 3. Moscow: Jazyki Russkoj Kul’tury.
Berlin, Brent 1992
Ethnobiological Classification: Principles of Categorization of Plants and Animals in Traditional Society. Princeton, NJ: Princeton University Press. Brotherson, Anna in press The ethnogeometry of Makasai. In Goddard (ed.), in press b. Bugenhagen, Robert D. 2002 The syntax of semantic primes in Mangaaba-Mbula. In Goddard and Wierzbicka (eds.), Vol. II, 1–64. Carlson, Gregory N., and Francis J. Pelletier (eds.) 1995 The Generic Book. Chicago: University of Chicago Press. Cruse, D. Alan 1986 Lexical Semantics. Cambridge: Cambridge University Press. Durie, Mark, Bukhari Daud, and Mawardi Hasan 1994 Acehnese. In Goddard and Wierzbicka (eds.), 171–201. Evans, Nicholas 1994 Kayardild. In Goddard and Wierzbicka (eds.), 203–228. Gelman, Susan A. 2005 The Essential Child: Origins of Essentialism in Everyday Thought. New York: Oxford University Press. Goddard, Cliff 1994 Lexical primitives in Yankunytjatjara. In Goddard and Wierzbicka (eds.), 229–262. 1998 Semantic Analysis. Oxford: Oxford University Press. 2001 Lexico-semantic universals: A critical overview. Linguistic Typology 5 (1): 1–66. 2002a Ethnosyntax, ethnopragmatics, sign-functions, and culture. In Ethnosyntax. Explorations in Grammar and Culture, Nick J. Enfield (ed.), 52–73. Oxford: Oxford University Press. 2002b The on-going development of the NSM research program. In Goddard and Wierzbicka (eds.), Vol. II, 301–322. 2004 The ethnopragmatics and semantics of “active” metaphors. Journal of Pragmatics 36: 1211–1230. 2006 Natural Semantic Metalanguage. In Encyclopedia of Language and Linguistics, 2d ed., Keith Brown (ed.), 544–551. Oxford: Elsevier. in press a Towards a systematic table of semantic elements. In Goddard (ed.), in press b. Goddard, Cliff (ed.) in press b Crosslinguistic Semantics. Amsterdam: John Benjamins.
Goddard, Cliff, and Anna Wierzbicka 2002 Semantic primes and universal grammar. In Goddard and Wierzbicka (eds.), Vol. I, 41–86. in press NSM analyses of the semantics of physical qualities: sweet, hot, hard, heavy, rough, sharp in cross-linguistic perspective. Studies in Language. forthcom. Men, women and children: The semantics of basic social categories. Goddard, Cliff, and Anna Wierzbicka (eds.) 1994 Semantic and Lexical Universals – Theory and Empirical Findings. Amsterdam: John Benjamins. 2002 Meaning and Universal Grammar – Theory and Empirical Findings. Vol. I & II. Amsterdam: John Benjamins. 2004 Cultural Scripts. Special issue of Intercultural Pragmatics 1 (2). Gopnik, Alison, and Andrew N. Meltzoff 1997 Words, Thoughts and Theories. Cambridge, MA: MIT Press. Haspelmath, Martin 1997 Indefinite Pronouns. Oxford: Clarendon Press. Keil, Frank C. 1989 Concepts, Kinds, and Cognitive Development. Cambridge, MA: MIT Press. Lewis, Gilbert 1974 Gnau anatomy and vocabulary for illness. Oceania 45 (1): 50–78. Mel’cuk, Igor 1989 Semantic Primitives from the Viewpoint of the Meaning-Text Linguistic Theory. Quaderni di Semantica 10 (1): 65–102. Meira, Sérgio 2006 Tiriyó body part terms. Language Sciences 28: 262–279. Mihatsch, Wiltrud this vol. Taxonomic and meronomic superordinates with nominal coding. Nordlinger, Rachel 1998 A Grammar of Wambaya, Northern Australia. Canberra: Pacific Linguistics. Rhodes, Richard A. 1985 Eastern Ojibwa-Chippewa-Ottawa Dictionary. Berlin: Mouton. Schalley, Andrea C., and Dietmar Zaefferer this vol. Ontolinguistics – An outline. Wierzbicka, Anna 1972 Semantic Primitives. Frankfurt: Athenäum. 1985 Lexicography and Conceptual Analysis. Ann Arbor: Karoma. 1996 Semantics: Primes and Universals. Oxford: Oxford University Press. 1999 Emotions across Languages and Cultures. Cambridge: Cambridge University Press. 2002 Semantic primes and universal grammar in Polish. In Goddard and Wierzbicka (eds.), Vol. II, 65–144.
2004a
Colour and shape in language and thought. Plenary paper at the International Language and Cognition Conference, Coffs Harbour, Australia. September 10–12, 2004. 2004b The semantics of colour: a new paradigm. Keynote paper at the 4th international conference Progress in Colour Studies (PICS04), Glasgow University, June 29–July 3, 2004. 2006 Shape in grammar revisited. Studies in Language 30 (1): 115–177. in press Shape and colour in language and thought. In Mental States: Language and Cognitive Structure, Andrea C. Schalley and Drew Khlentzos (eds.). Amsterdam: John Benjamins. to appear Bodies and their parts: An NSM approach to semantic typology. Wilkins, David P. 1996 Natural tendencies of semantic change and the search for cognates. In The Comparative Method Reviewed: Regularity and Irregularity in Language Change, Mark Durie and Malcolm Ross (eds.), 264– 304. New York: Oxford University Press. Wong, Jock Onn, Cliff Goddard, and Anna Wierzbicka to appear Walk, run, climb, jump: Linguistic conceptualisation of bodily motion. Zeshan, Ulrike 2002 Towards a notion of ‘word’ in sign languages. In Word: A Cross-Linguistic Typology, R. M. W. Dixon and Alexandra Aikhenvald (eds.), 153–179. Cambridge: Cambridge University Press.
Using ‘Ontolinguistics’ for language description
Scott Farrar

1. Introduction: The knowledge sort problem
An aim of descriptive linguistics is to provide an account of the observable facts concerning individual languages. As such, descriptive linguistics is primarily concerned with data that bring out notable characteristics of particular languages. But in descriptive accounts, as well as those of a more theoretical nature, it is often a problematic endeavor to determine the difference between language and what language is about. The difficulty is often reflected in the terminology used to analyze language data, as very often linguists mix linguistic with non-linguistic terms. For example, an outsider to linguistics might be surprised to find in our theoretical machinery such notions as ‘animate’ and ‘shape’, and especially to see them intermixed so freely with notions as ‘tense’ and ‘grammatical gender’. Such notions are juxtaposed routinely in descriptive accounts of language and usually without consideration of how such notions relate to one another, or if they even make sense as categories. The two sorts listed here differ such that the former is usually associated with what linguists call world or simply non-linguistic knowledge. The latter on the other hand falls squarely under the heading of linguistic knowledge. The situation with descriptive linguistics is indicative of a larger issue within the entire field: the precise ontological nature of linguistic knowledge, a primary aim of Ontolinguistics. Indeed there exists a relatively large body of research debating the necessity of world knowledge in accounting for linguistic phenomena and whether, in the context of particular linguistic theories at least, there ought to be such a distinction in the first place. An excellent introduction to the main issues of the knowledge sort problem, as it is called here, is given in Peeters (2000). Theories of world knowledge have traditionally been left up to philosophers or, more specifically, to ‘ontologists’. 
Recently, however, the fields of knowledge engineering and artificial intelligence have emphasized the creation and use of ontologies as tools for organizing world and domain-specific knowledge for expert systems and other knowledge-rich applications (e.g., Niles and Pease 2001). Though this work (and the literature concerning Ontology proper) seems quite relevant to the knowledge sort problem in linguistics, it
has until very recently been largely separate from the linguistics literature. Bateman (1997) offers a discussion of some notable exceptions to this trend. This situation is changing, and the emergence of knowledge-rich linguistics – known in this volume as ‘Ontolinguistics’ – is one example of the change. Ontolinguistics concerns itself with describing the conceptual content behind linguistic code and how the conceptual system might affect the organization of language, and possibly vice versa. Ontolinguistics, then, must address the knowledge sort problem head on. This chapter proposes a descriptive ontology that is compatible with the aims of Ontolinguistics. This ontological account makes explicit the distinction between linguistic and non-linguistic knowledge and allows for flexible relationships among the elements of these knowledge sorts. It is hoped that by clearly defining the knowledge sort problem, it may be better formalized, and the issue will become easier to resolve. The ontology that is proposed is the General Ontology for Linguistic Description (GOLD), first introduced by Farrar and Langendoen (2003). GOLD attempts to give an account of the most basic categories and relations used in the scientific description of human language. That is, GOLD should – ideally at least – capture the knowledge of a well-trained documentary linguist. In an attempt to capture knowledge that is widely accepted – what can be considered the “standard knowledge” of the field – canonical academic sources were used in the construction of GOLD, especially for the wide variety of morphosyntactic features (e.g., Crystal 1997). As a descriptive ontology meant to be useful for real-world data, GOLD is also empirically grounded in the sense that actual data produced by field linguists has been consulted, especially for the more specific categories. The specific goals of this chapter are as follows: Section 2 describes the methodology for ontology construction with reference to various knowledge components. Section 3 describes the GOLD ontology itself. It is concluded in Section 4 that GOLD offers a possible starting point for further development of a comprehensive Ontolinguistics framework.
2. Methodology for ontological engineering
This section will serve to introduce a concrete methodology that derives from the basic tenets of ontological engineering. As such, the details of the proposed ontology will not be taken up in the current section, rather the general
steps in its creation will be explained. The following discussion is inspired from Borgida and Brachman (2003: 379), Guarino and Welty (2002, 2004), and Franconi (2004: 30–34). The first step in the realization of the ontology is to enumerate the basic categories found in the domain of discourse. Linguistics has been traditionally subdivided into coarse domains such as syntax, discourse, phonology and the like, but there are also subdomains within the major domains, e.g., feature theory in phonology. Linguistic (sub)domains are relatively well-delineated in particular theories, but, as discussed in the introduction, a much more far-reaching, theory-independent organization is difficult to achieve due to overlapping terminologies and the mixing of linguistic types. For example, ‘Warumungu absolutive’, ‘Russian perfective’, and ‘English past tense’ are instances of morphosyntactic categories, and they are usually taken to be the atoms in approaches to morphosyntax. It would be odd to categorize a notion such as ‘noun’ as an instance of a morphosyntactic category or, more to the point, as an instance at all. The notion of ‘noun’ is clearly categorial in nature, and is thus a class. Classes have instances, e.g., particular nouns in particular sentences. In general whether an entity is category-like or instance-like is determined by meta-ontological criteria. To some extent, this is an arbitrary modeling choice. For example, in some special contexts ‘noun’ could be modelled just as well as an instance of ‘part of speech’. However, the commonly accepted notion of ‘noun’ is usually construed as a category for most linguistic theories. The point here is that ontological principles can be applied to reveal what the best modeling choice is for a notion such as ‘noun’. Once the meta-ontological status of the various domain entities is determined, the next step is to develop class taxonomies. 
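The two modeling choices for ‘noun’ described above can be made concrete in a few lines. This is a sketch for illustration only: the names `Noun` and `PartOfSpeech`, their fields, and the sample sentence are assumptions introduced here and are not part of GOLD or of any ontology cited in this chapter.

```python
from dataclasses import dataclass
from enum import Enum


# Choice 1: 'noun' is a class; its instances are particular nouns
# occurring in particular sentences.
@dataclass(frozen=True)
class Noun:
    form: str
    sentence: str


# Choice 2: 'noun' is itself an instance (here, an enum member)
# of a more general class 'part of speech'.
class PartOfSpeech(Enum):
    NOUN = "noun"
    VERB = "verb"


token = Noun(form="cat", sentence="The cat sat.")
print(isinstance(token, Noun))                       # True
print(isinstance(PartOfSpeech.NOUN, PartOfSpeech))   # True
```

Under the first choice further instances can be created for every attested noun token, whereas under the second ‘noun’ is an atom with no instances of its own, which is the practical difference the meta-ontological decision turns on.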
The way taxonomies are structured ultimately derives from basic philosophical assumptions and from the theoretical assumptions in the domain. One general methodology for constructing taxonomies is the OntoClean methodology (cf. Guarino and Welty 2002, 2004). OntoClean uses the notion of meta-properties to motivate distinctions within an ontology. Meta-properties are properties of properties, not of objects in the world, and are used to constrain ontology development and to evaluate particular proposed ontological organizations. The meta-properties particularly important for OntoClean are ‘rigidity’, ‘identity’, ‘unity’, and ‘dependence’. Rigidity refers to essential properties, i.e., properties that an entity cannot lose without ceasing to exist; identity refers to properties used for discriminating among entities; unity refers to the wholeness of an entity, i.e., whether it has parts, boundaries and so on; and dependence determines whether an entity can exist independently or needs to be carried by another (e.g., the color of an object is dependent for its existence on that object, the hole of a doughnut is dependent for its existence on that doughnut, etc.). OntoClean would ensure that a class such as Compound could not be a subclass of both Word and Constituent, at least not in the same ontological stratum. But at a more general level, the various domain-specific classes must be merged with the upper ontology. In order to carry this out, the nature of each domain-specific class must be determined. For instance, to use the categories of SUMO (Niles and Pease 2001), whether a class is Abstract or Physical is the most basic distinction that can be made at the upper level. As most linguistic phenomena are indeed abstract – that is, other than actual utterances or printed words (see Section 3) – a more problematic decision is whether an entity belongs to Attribute or Relation. To anticipate the discussion that follows, it will be argued that linguistic features are one example of a class that lends itself to classification as an Attribute, more precisely an InternalAttribute. Discourse relations are one example of a class of entities best modelled as instances of Relation. Once the various taxonomies are developed, it is then necessary to add the individuals that are always present in the domain, those class instances that must be there for further ontological modeling. Particular linguistic relations, such as ‘constituentOf’, are examples of this. The next step in the methodology is to take the basic relations and identify the domain and range restrictions according to the available classes. For example, the discourse relation ‘expansion’ must take two discourse entities as its domain and range arguments.
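The idea of domain and range restrictions on a relation can be sketched as follows. This is a minimal illustration under stated assumptions, not GOLD's actual encoding: the class names `DiscourseEntity` and `Phoneme`, the `Relation` record, and its `accepts` method are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


class DiscourseEntity:
    """Hypothetical class for the units that discourse relations connect."""

    def __init__(self, label: str) -> None:
        self.label = label


class Phoneme:
    """An unrelated class, used only to show a rejected argument."""

    def __init__(self, symbol: str) -> None:
        self.symbol = symbol


@dataclass(frozen=True)
class Relation:
    name: str
    domain_cls: type  # class required of the first argument
    range_cls: type   # class required of the second argument

    def accepts(self, first: object, second: object) -> bool:
        """True if both arguments satisfy the domain and range restrictions."""
        return isinstance(first, self.domain_cls) and isinstance(second, self.range_cls)


# 'expansion' takes two discourse entities, as stated in the text.
expansion = Relation("expansion", domain_cls=DiscourseEntity, range_cls=DiscourseEntity)

claim = DiscourseEntity("claim")
elaboration = DiscourseEntity("elaboration")

well_typed = expansion.accepts(claim, elaboration)
ill_typed = expansion.accepts(claim, Phoneme("p"))
print(well_typed, ill_typed)  # True False
```

Stating the restriction on the relation itself, rather than on each use, means that ill-typed assertions can be detected mechanically when a language description is checked against the ontology.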
At this point, it is necessary to further refine the various ontological entities by providing definitional axioms that state what must hold given the classes, relations, and individuals in the ontology. For example, one axiom might postulate the following: syntactic elements belonging to the category ‘adverb’ never have a ‘tense’ feature. Likewise, if Greenbergian-type universals (Greenberg 1966) are to be encoded in the ontology, then it is at this point that they should be included. The amount of detail depends of course on the particular application of the ontology. In general, axioms should be limited to asserting what must be the case, as opposed to what can be the case. In short, the following enumeration shows the methodology used for the current work, and is a suggestion for constructing ontologies in general:
1. Development of the overall structure of the ontology
   (a) Enumerate the entities found in all states of the domain of discourse.
   (b) Classify the entities according to whether they are a kind of class, relation, or instance.
   (c) Develop class and relation taxonomies.
   (d) Add the instances that always occur in the domain.
   (e) Devise partitions in class taxonomy.
   (f) Establish relations among classes according to available relations.
2. Axiomatization
   (a) Define internal structure of classes.
       i. Add intrinsic properties of classes.
       ii. Enumerate part-whole relationships and include them as relations in the relations hierarchy.
       iii. Add equalities and inclusions for classes.
   (b) Define the internal structure of relations.
       i. Encode the value and cardinality restrictions on relations.
       ii. Add equalities and inclusions for relations.
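The axiomatization step can be illustrated with the adverb example given above. The sketch below is only one possible rendering, assuming a hypothetical `SyntacticElement` record with a category label and a feature dictionary; an ontology language with a reasoner would state the axiom declaratively rather than as a Python check.

```python
from dataclasses import dataclass, field


@dataclass
class SyntacticElement:
    category: str                                 # e.g. "adverb", "verb"
    features: dict = field(default_factory=dict)  # feature name -> value


def violates_adverb_axiom(element: SyntacticElement) -> bool:
    """Axiom from the text: an 'adverb' never has a 'tense' feature."""
    return element.category == "adverb" and "tense" in element.features


def find_violations(elements):
    """Return the elements that break the axiom."""
    return [e for e in elements if violates_adverb_axiom(e)]


walked = SyntacticElement("verb", {"tense": "past"})
quickly = SyntacticElement("adverb")
ill_formed = SyntacticElement("adverb", {"tense": "past"})  # deliberately wrong

violations = find_violations([walked, quickly, ill_formed])
print(len(violations))  # 1
```

The check asserts only what must hold, in line with the recommendation that axioms be limited to necessary conditions.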
3. The ontology
3.1. Strata of linguistic analysis
A complete ontological account of language must take into account its ‘multistratal’ nature, as noted for example by Halliday and Matthiessen (2004: 24). Language is multi-stratal because it can be analyzed from a variety of points of view: form, meaning, structure, expression, etc. (We limit ourselves here to these particular strata, but note that a complete ontological account requires a treatment of the social aspects of language as well.) For descriptive purposes at least, the separation of these different kinds of entities into various strata is necessary, because it is then possible to focus on only one stratum in an analysis, as is often done in descriptive linguistics. This section, then, argues for a separation of the concerns of linguistics into various strata, but also for a unifying entity that acts as the “glue” holding the various linguistic strata together.
Traditionally, the notion of the ‘linguistic sign’ is used for such a unifying entity (cf. de Saussure 1959; Hjelmslev 1953). In Saussurean terminology, the sign exists only by virtue of the existence of a signifiant ‘expression’ and a signifié ‘content’. That is, the construct of the sign captures the so-called ‘duality of language’, or the simultaneous merging of form and meaning. GOLD embraces a more detailed account of the sign that is closer to the semiology of Hjelmslev. For Hjelmslev, a sign’s expression is actually composed of ‘expression-form’ and ‘expression-substance’, the former being the abstract (signifiant) portion of the sign, while the latter is the physical realization of that form, i.e., the actual speaking, writing, or signing event in the world. Likewise, ‘content-form’ is the abstract thought/concept (signifié) which is ‘meant’ by the expression-form, while ‘content-substance’ is the actual physical event or object in the world. Modern approaches to grammar of course recognize a more complex sign, one that also takes into account its structure, for example, any of the various generative approaches to grammar (cf. HPSG, Pollard and Sag 1994). We can say, then, that the GOLD sign is a merging of form, meaning, and structure. Including also a class for the sign’s realization, Figure 1 shows the GOLD class Sign and illustrates its relationship to the major strata of interest for descriptive linguistics. The following sections give an account of each of the classes listed in the figure.
[Figure 1 diagram: Sign is linked to StructUnit by hasStruct, to SemUnit by hasSem, to FormUnit by hasForm, and to Gesture by realization.]
Figure 1. The Sign as it relates to various strata
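The structure in Figure 1 can be sketched as a record type. The Python below is an illustration under assumptions: the field contents (`transcription`, `gloss`, `category`, `description`) are placeholders invented here, and treating the realization as optional is a modeling assumption of this sketch, not a claim made by GOLD.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FormUnit:
    transcription: str  # placeholder for the sign's form


@dataclass
class SemUnit:
    gloss: str          # placeholder for the sign's content


@dataclass
class StructUnit:
    category: str       # placeholder for the sign's structure


@dataclass
class Gesture:
    description: str    # placeholder for the physical realization


@dataclass
class Sign:
    has_form: FormUnit
    has_sem: SemUnit
    has_struct: StructUnit
    # Assumption of this sketch: a sign may be described without a
    # recorded realization.
    realization: Optional[Gesture] = None


dog = Sign(
    has_form=FormUnit("dog"),
    has_sem=SemUnit("canine animal"),
    has_struct=StructUnit("noun"),
)
print(dog.has_struct.category)  # noun
```

Keeping the three strata as separate fields lets an analysis address one stratum at a time while the `Sign` record holds them together.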
3.2. Substance and form of the sign
The first stratum to be discussed is that of linguistic form, called in the Hjelmslevian account ‘expression-form’. We propose the class SignForm to subsume all kinds of expression-form units. For this preliminary investigation, we focus on the type of form that corresponds to spoken expressions, PhonForm. Using the terminology of de Saussure (1959), these types of form are the abstract sound “images” of the sign. We do note, however, that a complete account of sign form must take into account various other kinds of form units, including those based on the visual medium (e.g., various sign languages) and the tactile medium (e.g., in the case of Braille). PhonForm, and all kinds of SignForm, however, have to be formulated according to expression-content, or what is physically observable in the world, e.g., the process and product of speaking (see also Bateman 2004 for a discussion). The strategy for this section is to start with various observables of the spoken sign – acoustic and articulatory – and build up various form units. The investigation into the observables of the form-content is facilitated by GOLD’s connection to various sound- and process-related categories from the upper ontology. The general assumption is that various sorts of gestures or actions produce sounds. Following Browman and Goldstein (1992) and others, for example, we may capture various “units of action” by introducing the category SpeechGesture. Such a class can be minimally defined as a subclass of SUMO:BodyMotion. Such gestures have an inherent temporal dimension and form the articulatory basis of SignForm. Subclasses of SpeechGesture include gestures such as GlottisOpening, TongueRootAdvancing, Vocalizing, etc. The character of various types of PhonForm, then, can be defined according to what kinds of articulatory qualities or features are possessed by particular types of SpeechGesture. But PhonForm can also be defined in terms of acoustic observables. The primary entity of concern in acoustic analysis is the sound signal, captured in GOLD as SpeechSignal, a proposed subclass of the more general SUMO SoundRadiating process.
We can state a preliminary dependency relation between SpeechGesture and SpeechSignal: any instance of SpeechSignal is dependent upon an instance of SpeechGesture, as there can be no sound signal without a speech gesture. Various types of PhonForm, then, can be defined according to a combination of articulatory and acoustic observables. For example, a key type of form is the ‘phoneme’, or Phoneme in GOLD. The traditional view is that a phoneme is a ‘bundle’ of recurring articulatory or acoustic features in a language. The obvious ontological question that arises is: where do the features come from? Do they have an independent existence, or are features dependent on some other kind of entity? We take up a more systematic
discussion of features in Section 3.5. It should be noted here, however, that an ontological treatment suggests a view of ‘distinctive features’ that goes against the traditional understanding, according to which phonological features are “...the ultimate components [of speech], capable of distinguishing morphemes from each other” (Jakobson and Halle 2002: 3). If the class Phoneme is said to exist at all, then it is possible to create subclasses of phonemes, e.g., Vowel and Consonant, based on certain observable phenomena. But simply enumerating various subclasses of PhonForm according to bundles of features and calling them phonemes is not sufficient. Finally, forms are either simple or complex. While the simple forms are captured by Phoneme, the complex units are composed of one or more instances of Phoneme. The simplest type of complex unit is the PhonologicalWord. Further investigation needs to be undertaken to build out these subclasses.
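Two of the constraints in this section lend themselves to a short sketch: the dependence of a speech signal on a speech gesture, and the composition of a complex form from one or more phonemes. The Python below is a hedged illustration; the constructor checks and the field names `produced_by` and `phonemes` are assumptions of this example, not GOLD definitions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SpeechGesture:
    kind: str  # e.g. "Vocalizing"


@dataclass(frozen=True)
class SpeechSignal:
    produced_by: SpeechGesture  # no sound signal without a speech gesture

    def __post_init__(self) -> None:
        if not isinstance(self.produced_by, SpeechGesture):
            raise TypeError("a SpeechSignal depends on a SpeechGesture")


@dataclass(frozen=True)
class Phoneme:
    symbol: str


@dataclass(frozen=True)
class PhonologicalWord:
    phonemes: Tuple[Phoneme, ...]

    def __post_init__(self) -> None:
        if not self.phonemes:
            raise ValueError("a complex form needs at least one Phoneme")


signal = SpeechSignal(produced_by=SpeechGesture("Vocalizing"))
word = PhonologicalWord((Phoneme("d"), Phoneme("o"), Phoneme("g")))

# An empty word and a signal without a gesture are both rejected.
try:
    PhonologicalWord(())
    empty_word_rejected = False
except ValueError:
    empty_word_rejected = True

try:
    SpeechSignal(produced_by=None)
    orphan_signal_rejected = False
except TypeError:
    orphan_signal_rejected = True
```

Enforcing the dependency at construction time mirrors the ontological claim: an instance of the dependent class cannot come into existence without its bearer.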
3.3. Content of the sign
To begin the discussion of how GOLD models the sign content, some assumptions concerning the overall approach to meaning are in order. GOLD assumes that the semantic system of a language is the bridge between its grammar and the non-linguistic knowledge which it is meant to convey. In other words, the semantics ‘mediates’ (in the sense of Bateman this vol.) between the non-linguistic entities, e.g., SUMO:Process and SUMO:Object, and actual linguistic expressions. It is the job of semantic structure to lay out those aspects of meaning that are encoded directly in the grammar and lexicon, or simply how the speaker represents the world given the limitations of the linear linguistic signal. The link between a non-linguistic account of the world and a linguistic one expressed by the grammar is indirect. Thus, GOLD’s account of meaning adopts a form of two-level semantics in line with that of Bierwisch (1982), Lang (1991), Bateman (1997), Wunderlich (2000), and others. Since it is the semantic system of a language that acts as an intermediary between a non-linguistic and a language-based account of the world, a language’s semantic system is necessarily unique. It must be unique because it is intimately tied to the grammar and lexicon of a particular language. In order to accommodate this view in the current ontology, it is proposed to have language-specific extensions of GOLD for individual semantic systems. It is nevertheless possible to introduce generalized ‘upper’ categories into the semantics.
Inspiration for this assumption is taken from the Generalized Upper Model (Bateman et al. 1990; Bateman, Henschel, and Rinaldi 1995) mentioned earlier. However, more research into semantic typology is necessary before a more complete account of semantics can be given. GOLD captures all aspects of ‘sign content’ under the class SemUnit, as shown in Figure 1. To exemplify what kinds of categories could fit here, we explore one possible theory of meaning that is particularly complete, that which is suggested by Halliday (1985). According to Halliday, the semantic system is composed of at least three levels or metafunctions. A metafunction can be described as a particular mode, facet, or layer of meaning. The three metafunctions in the work of Halliday are the textual, the interpersonal, and the ideational metafunction. The textual metafunction captures the meaning of the clause as ‘message’, or how it is used to construct a text. The textual metafunction is manifested by the theme-rheme and information structure of the grammar. The interpersonal metafunction captures the meaning of the clause as ‘interaction’, or how it is used to act in a discourse. The interpersonal metafunction is associated with the mood element of the grammar. Finally, the ideational metafunction captures the meaning of the clause as ‘experience’, or the propositional content of the sentence. The ideational metafunction can be seen, for example, in a language’s transitivity system and reflects the way grammars classify and organize the world. In the ontology, then, the three metafunctions are represented by three subclasses of SemUnit: TextualUnit, InterpersonalUnit, and IdeationalUnit. It is precisely the ideational units that are laid out in the Generalized Upper Model (Bateman, Henschel, and Rinaldi 1995). That is, classes in the GUM correspond to conceptualized units of meaning that make up the propositional content or ideational metafunction of an utterance.
Ontologically, it is not possible for the meaning of an utterance to consist of only one of the above units. In many semantic theories, however, only the ideational content is given.
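The three subclasses of SemUnit and the constraint just stated can be sketched as follows. This is a minimal illustration: the function name `draws_on_several_metafunctions` is hypothetical, and the check implements only the literal claim above (a meaning may not consist of a single kind of unit), not a full Hallidayan analysis.

```python
class SemUnit:
    """Hypothetical base class for all units of sign content."""


class TextualUnit(SemUnit):
    """Meaning of the clause as 'message'."""


class InterpersonalUnit(SemUnit):
    """Meaning of the clause as 'interaction'."""


class IdeationalUnit(SemUnit):
    """Meaning of the clause as 'experience'."""


def draws_on_several_metafunctions(units) -> bool:
    """False when the meaning consists of only one kind of unit."""
    kinds = {type(unit) for unit in units}
    return len(kinds) > 1


ideational_only = [IdeationalUnit()]
fuller_meaning = [TextualUnit(), InterpersonalUnit(), IdeationalUnit()]

print(draws_on_several_metafunctions(ideational_only))  # False
print(draws_on_several_metafunctions(fuller_meaning))   # True
```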
3.4. Structure of the sign
Aspects of the sign’s structure are the traditional concerns of morphology, formal syntax, and to some extent discourse analysis. That is, the structural component accounts for notions such as morpheme, syntactic word, and text unit. We refer to the general category subsuming all structural units as
StructUnit, which in turn subsumes the classes MorphUnit, SynUnit, and TextUnit. The criterion used to classify StructUnit into the three major classes is the kind of relationship in which they participate. Since the kinds of relations relevant for morphological, syntactic, and discourse level units are quite different (cf. highly structured syntactic relations vs. functional discourse relations), there is something fundamentally different about the structures themselves. From another point of view, a split based on types of relations makes clear the intuitive distinction (for linguists anyway) among morphology, syntax, and text. The relation of morphological constituency is called CONSTITUENT, and holds between various instances of MorphUnit, that is, morphological units both simple (morphemes proper) and complex (e.g., multi-morphemic signs such as Stem). The point is that structures of type MorphUnit do not otherwise participate in syntactic relations, and only subclasses of MorphUnit can participate in CONSTITUENT relations. Next, there are various syntactic relations: HEAD, ADJUNCT, SPEC, just to name a few. These types of relations hold between instances of SynUnit, that is, syntactic constituents both simple (e.g., SynWord) and complex (e.g., Construction). Finally, there are textual constituency relations that relate instances of DiscUnit. As the classification of StructUnit gets more specific, the criteria of (a) structural complexity and (b) how the units relate to a system as a whole are used. Consider SynWord and SynConstruct, where the former is a single structure that occupies exactly one syntactic position and the latter consists of multiple signs standing in syntactic relations to one another. Relations also hold between the various levels of StructUnit including, for example, those that hold between a SynWord and a MorphUnit: INFL, DERIV, and ROOT.
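The inter-level relations between a syntactic word and its morphological units can be sketched as a data structure. The Python below is an illustration under assumptions: representing ROOT, DERIV, and INFL as list-valued fields, and the helper `morph_units`, are choices made for this example only.

```python
from dataclasses import dataclass, field
from typing import List


class StructUnit:
    """Hypothetical stand-in for the top structural class."""


@dataclass
class MorphUnit(StructUnit):
    form: str


class SynUnit(StructUnit):
    """Syntactic units, simple or complex."""


@dataclass
class SynWord(SynUnit):
    # Inter-level relations from a syntactic word to morphological units.
    root: List[MorphUnit] = field(default_factory=list)
    deriv: List[MorphUnit] = field(default_factory=list)
    infl: List[MorphUnit] = field(default_factory=list)

    def morph_units(self) -> List[MorphUnit]:
        """All morphological units related to this word, roots first."""
        return self.root + self.deriv + self.infl


dogs = SynWord(root=[MorphUnit("dog")], infl=[MorphUnit("-s")])
forms = [m.form for m in dogs.morph_units()]
print(forms)  # ['dog', '-s']
```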
These relations are available for use in a language description, but mainly they act to axiomatize the entities in the ontology. So, for example, SynWord by definition must have at least one root, or Stem must have at least one derivational component.
3.5. Features of the sign
It is a very common assumption in linguistics that signs carry features, as they are commonly called. And since the analysis of linguistic data routinely utilizes the notion of a linguistic feature, GOLD should provide for the integration of features into its overall ontological account. First of all we propose that features are attribute-like entities predicated of the sign instances themselves. That is, the various components of the sign, e.g., FormUnit or SemUnit, are not composed of features; rather, their make-up determines the sign’s features. There are several varieties of features which linguists find useful for language description, including formal, semantic and phonological features. The structural component provides formal, structural features, called StructFeature; the form component provides form-related features subsumed under FormFeature, e.g., phonological features; and the content component provides features that show the meaning of the sign, called SemFeature. As an illustration of the ontological treatment of features, consider the so-called ‘grammatical features’, a special type of StructFeature. The class of structural features determining the formal behavior of signs in morphosyntax is called MorphosynFeature. Morphosyntactic features determine the behavior of morphosyntactic units. We propose that the morphosyntactic features TENSE, ASPECT, NUMBER, etc. are instances of MorphosynFeature. We also give a general class representing the morphosyntactic feature values, called MorphosynValue. Subclasses of MorphosynValue include TenseValue, AspectValue, NumberValue, etc., those corresponding to the instances of MorphosynFeature. For example, each instance of MorphosynFeature, e.g., TENSE, has a corresponding value from MorphosynValue, in this case, from TenseValue.
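The pairing of feature instances with value classes can be sketched as follows. This is an illustration only: the particular values (`PAST`, `PLURAL`, and so on), the `value_class` field, and the function `is_admissible` are assumptions introduced here and do not reproduce GOLD's inventory.

```python
from dataclasses import dataclass
from enum import Enum


class TenseValue(Enum):
    PAST = "past"
    PRESENT = "present"


class NumberValue(Enum):
    SINGULAR = "singular"
    PLURAL = "plural"


@dataclass(frozen=True)
class MorphosynFeature:
    name: str
    value_class: type  # the class of values this feature takes


# Features are instances; their values come from the matching value class.
TENSE = MorphosynFeature("tense", TenseValue)
NUMBER = MorphosynFeature("number", NumberValue)


def is_admissible(feature: MorphosynFeature, value: object) -> bool:
    """True if the value belongs to the feature's value class."""
    return isinstance(value, feature.value_class)


print(is_admissible(TENSE, TenseValue.PAST))     # True
print(is_admissible(TENSE, NumberValue.PLURAL))  # False
```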
3.6. Types of signs
With the major components of the sign now described, we turn to the classification and organization of the sign itself. Traditional conceptions of the sign usually focus on single words or morphemes. Following Hervey (1979) we assume that signs are not limited to words and morphemes, but encompass more complex syntactic and textual units. In short, every level of formal linguistic structure (from morphemes to whole texts) can be related to the Sign. Various linguistic theories assume some sort of ‘ontology’ for these kinds of units, though only a limited number of theories specifically refer to the sign in their account. Head-Driven Phrase Structure Grammar (Pollard and Sag 1994) is one example that does refer to the sign. In fact the GOLD account draws inspiration from and is similar in many respects to that of HPSG.
Signs can be classified according to a number of ontologically different, cross-classifying criteria: (1) according to the kinds of relationships in which the sign participates (e.g., syntax- vs. discourse-level relations) (Hjelmslev 1953: 41; Pollard and Sag 1994); (2) according to the complexity of the sign (cf. the complexity of a morpheme to a main clause) (Hervey 1979: 10); and (3) according to the kind of system to which the sign belongs (e.g., English signs vs. Swahili signs). Signs can conceivably be classified according to what they mean or according to their form (e.g., how they sound); however, no literature was found to support either of these possibilities. Since all these criteria play a role in the nature of signs, the problem becomes how to make the cut in a way that accords with commonly-accepted categories of modern linguistics and in a way that makes sense according to principles of formal ontology, e.g., without resorting to multiple inheritance, which leads to overly complex taxonomies and is not considered good practice in ontological engineering (cf. Guarino and Welty 2002, 2004). The question really comes down to this: why categorize Sign? The answer that will be defended here is that signs are categorized to capture generalizations. Since many of the formal characteristics of the sign are already captured in the structural component, a classification according to Criterion 1 is redundant. A similar argument can be made for Criterion 2, since complexity is expressed at every stratum. The generalization that has yet to be dealt with concerns the nature of the sign with respect to the overall framework of language, namely Criterion 3. Thus, some resulting subclasses would be: SwahiliSign, EnglishSign, RussianSign, etc. But having sign classes based on particular languages does not provide any insight into the ontological treatment of language itself, assuming that a class Language can be formalized.
The problem is that, using the mechanisms of formal ontology, there needs to be a way to state that a sign belongs to a particular language. The problem is actually much more general, in that there needs to be a way to distinguish between the knowledge of specific languages and the knowledge of language as a general concept. Preliminary to this discussion is the ontological status of language itself. Given GOLD’s sign-based approach, Language could simply be defined, after the Saussurean tradition, as a ‘system of signs’. The first step in such an endeavor is to formalize the notion of a sign inventory for particular languages, that is, to formalize Language as a set-theoretic notion. The next step would be to elaborate on the notion of ‘system’. But in this chapter, because GOLD is still on-going research, we will only introduce some preliminary ideas on this
topic, leaving aside an ontological treatment of ‘system’. Thus the questions to be addressed here are:
– How can the notion of a sign that is shared across a system (a speech community) be formalized?
– What criteria can distinguish individual signs from one another?
The first question concerns the difference between, for example, a word as shared among speakers of a particular language and the word used in a particular linguistic event. The current argument differentiates, for example, between the sign Dog as shared by speakers of English and the instance of that sign in my saying of ‘Only mad dogs and Englishmen go out in the noonday sun’ last summer. Fortunately, the work of Hervey (1979) already establishes a basis for such an ontological treatment and attempts to answer just this question. In Hervey’s analysis, an individual sign in a given language is actually the set of all its occurrences in the language or, as he puts it, “a model for a set of speech facts” (Hervey 1979: 10), where the notion of a ‘speech fact’ is equated with the products of linguistic events (i.e., speaking, writing, or signing events). The model for a particular speech fact is called an ‘utterance’ (Hervey 1979: 11). Utterances, then, are the sign classes shared among members of a speech community. Hervey is not working in the framework of formal ontology; however, his formalization of the sign is a useful one for the purposes of GOLD. The task, then, is to alter Hervey’s formalization slightly to coincide with the formal machinery of ontology: namely, the characterization of the sign in terms of classes and instances. This means that Dog is the class of signs that are similar enough in form and meaning to be recognized over and over in the speech community. If signs are classes, then every time someone utters dog, the sign Dog is instantiated. These instances can be represented graphically as: Dog_A, Dog_B, ..., Dog_n. Each one of these instances, by virtue of its being a sign, has the structure given in Figure 1. Thus, language-specific signs can be added to the taxonomy of signs in general: the sign Dog is a subclass of EnglishSign; Perro is a subclass of SpanishSign; etc. The second question concerns sign identity.
That is, given two utterances, mad dogs which was uttered on January 1st, 1999, and mad dogs which was uttered on June 6th, 2004, how can it be determined whether or not their corresponding signs are instances of the same class? The identity criteria used by Hervey include whether or not the signs have similar forms and whether or not the signs have similar reference. (Hervey uses the term ‘reference’
for what we have called SemUnit earlier.) A set of signs that have similar forms, but not similar references, is called a ‘form class’. A set of signs that have similar reference, but not similar forms, is a ‘reference class’ (Hervey 1979: 13). Crucially, in neither case are the signs identical. It is only when two signs have similar form and similar reference that they can be classed together as instances of the same sign. Mentioning form and reference as a means to determine sign identity raises the question of why form or content cannot be used in the classification of the sign taxonomy. That is, why not group together all signs in all languages that refer to the concept Dog? Or why not group together all signs in all languages that share a similar form? In terms of form, the class of all signs that share a similar form, even across languages, is not a very interesting concept linguistically – except perhaps when considering the issue of the arbitrariness of the sign. In terms of content, such a classification already exists in the ontological model, namely in the classification of the entities that signs refer to, as in the structure of various semantic units. That is, there is no need to repeat such a structure in the taxonomy of the sign when it can be derived from the overall organization of knowledge.
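Hervey's identity criteria can be sketched as a small decision procedure. The Python below is a rough illustration: similarity of form and of reference is approximated by string equality, which is an assumption of this sketch, and the record `SignOccurrence`, the function `compare`, and the example references are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignOccurrence:
    form: str       # stand-in for the sign's form
    reference: str  # stand-in for what Hervey calls 'reference'
    date: str       # when the occurrence was uttered


def compare(a: SignOccurrence, b: SignOccurrence) -> str:
    """Classify a pair of occurrences by the two identity criteria."""
    same_form = a.form == b.form                 # assumption: equality as similarity
    same_reference = a.reference == b.reference  # assumption: equality as similarity
    if same_form and same_reference:
        return "same sign"
    if same_form:
        return "form class"
    if same_reference:
        return "reference class"
    return "unrelated"


first = SignOccurrence("mad dogs", "MAD-DOGS", "1999-01-01")
second = SignOccurrence("mad dogs", "MAD-DOGS", "2004-06-06")
bank_river = SignOccurrence("bank", "RIVERSIDE", "2004-06-06")
bank_money = SignOccurrence("bank", "FINANCIAL-INSTITUTION", "2004-06-06")
couch = SignOccurrence("couch", "SOFA", "2004-06-06")
sofa = SignOccurrence("sofa", "SOFA", "2004-06-06")

print(compare(first, second))           # same sign
print(compare(bank_river, bank_money))  # form class
print(compare(couch, sofa))             # reference class
```

The date of utterance plays no role in the comparison, which reflects the point that occurrences from 1999 and 2004 can instantiate the same sign class.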
4. Summary and Discussion
A descriptive ontology was discussed in the context of Ontolinguistics, namely the General Ontology for Linguistic Description (GOLD) as introduced in Farrar and Langendoen (2003). First, a step-by-step methodology for creating such an ontology was given. Of particular importance is whether entities in the domain of discourse are classes, relations, or instances. Second, a detailed description of GOLD was presented which focused on the class Sign as the central entity that binds elements from all strata of language. Concrete suggestions were given concerning how GOLD could be linked to SUMO (Suggested Upper Merged Ontology) (Niles and Pease 2001). The resulting ontology has been presented in the context of what I have referred to as ‘the knowledge sort problem’, the problem of distinguishing linguistic from non-linguistic knowledge. It has been argued that GOLD offers a jumping-off point for exploring the knowledge sort problem by offering a clear formalization of the kinds of knowledge needed for linguistic analysis and description. The knowledge sort problem is fundamental to the field of linguistics because the pursuit of a solution gets at the relationship of linguistics to other scientific disciplines. How can it be possible for the linguistic sign
to mean something to someone? What aspects of reality are coded in speech? How is the conceptualization of reality related to language? How does the conceptualization of reality affect the way in which these signs are arranged and interpreted? These questions are exactly what the emerging field of Ontolinguistics attempts to answer.
Acknowledgements
The initial development of GOLD was supported by the Electronic Meta-structure for Endangered Language Data (E-MELD) grant from the U.S. National Science Foundation (NSF 0094934), which is gratefully acknowledged. Subsequent support came from the Data-Driven Linguistic Ontology grant (NSF 0411348). I would also like to thank Andrea Schalley, Dietmar Zaefferer, Achim Stein, John Bateman, and Adam Pease for their comments on earlier drafts of this paper. I am indebted to Terry Langendoen, Will Lewis, Gary Simons, and Adam Pease for their help in the development of GOLD.
References
Bateman, John A.
1992 The theoretical status of ontologies in natural language processing. In Text Representation and Domain Modelling – Ideas from Linguistics and AI (Papers from the KIT-FAST Workshop, Technical University Berlin, October 9th–11th 1991), Susanne Preuß and Birte Schmitz (eds.), 50–99. (KIT-Report 97, Technische Universität Berlin, Berlin, Germany).
2004 The place of language within a foundational ontology. In Formal Ontology in Information Systems: Proceedings of the Third International Conference (FOIS 2004), Achille C. Varzi and Laure Vieu (eds.), 222–233. Amsterdam: IOS Press.
this vol. Linguistic interaction and ontological mediation.
Bateman, John A., Renate Henschel, and Fabio Rinaldi
1995 Generalized upper model 2.0: documentation. Technical Report, GMD/Institut für Integrierte Publikations- und Informationssysteme, Darmstadt, Germany.
Bateman, John A., Robert T. Kasper, Johanna D. Moore, and Richard A. Whitney
1990 A general organization of knowledge for natural language processing: The PENMAN upper model. Technical report, USC/Information Sciences Institute, Marina del Rey, California.
Bierwisch, Manfred
1982 Formal and lexical semantics. Linguistische Berichte 80: 3–17.
Borgida, Alex, and Ronald J. Brachman
2003 Conceptual modeling with Description Logics. In The Description Logic Handbook, Franz Baader, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter F. Patel-Schneider (eds.), 349–372. Cambridge, UK: Cambridge University Press.
Browman, Catherine P., and Louis Goldstein
1992 Articulatory phonology: An overview. Phonetica 49: 155–180.
Crystal, David
1997 Cambridge Encyclopedia of Language. 2d ed. Cambridge, UK: Cambridge University Press.
Farrar, Scott, and D. Terence Langendoen
2003 A linguistic ontology for the Semantic Web. GLOT International 7 (3): 1–4.
Franconi, Enrico
2004 Description logics for conceptual design, information access, and ontology integration: Research trends. Online tutorial. http://www.inf.unibz.it/~franconi/dl/course/tutorial/ (accessed 22 May 2006).
Greenberg, Joseph
1966 Language Universals. The Hague: Mouton.
Guarino, Nicola, and Christopher Welty
2002 Evaluating ontological decisions with OntoClean. Communications of the ACM 45 (2): 61–65.
2004 An overview of OntoClean. In Handbook on Ontologies, Steffen Staab and Rudi Studer (eds.), 151–172. Berlin: Springer.
Halliday, Michael A. K.
1985 An Introduction to Functional Grammar. London: Edward Arnold.
Halliday, Michael A. K., and Christian M. I. M. Matthiessen
2004 An Introduction to Functional Grammar. 3d ed. London: Edward Arnold.
Hervey, Sándor
1979 Axiomatic Semantics: A Theory of Linguistic Semantics. Edinburgh: Scottish Academic Press.
Hjelmslev, Louis
1953 Prolegomena to a Theory of Language. Bloomington, IN: Indiana University Publications in Anthropology and Linguistics.
Jakobson, Roman, and Morris Halle
2002 Reprint. Fundamentals of Language. 2d revised edition. Berlin/New York: Mouton de Gruyter. Original edition, The Hague: Mouton, 1971.
Lang, Ewald
1991 The LILOG ontology from a linguistic point of view. In Text Understanding in LILOG: Integrating Computational Linguistics and Artificial Intelligence. Final Report on the IBM Germany LILOG-Project, Otthein Herzog and Claus-Rainer Rollinger (eds.), 464–481. (Lecture Notes in Artificial Intelligence 546.) Berlin: Springer.
Niles, Ian, and Adam Pease
2001 Toward a Standard Upper Ontology. In Formal Ontology in Information Systems. Proceedings of the 2nd International Conference (FOIS-2001), Christopher Welty and Barry Smith (eds.), 2–9. New York: ACM Press.
Peeters, Bert (ed.)
2000 The Lexicon-Encyclopedia Interface. New York: Elsevier.
Pollard, Carl, and Ivan Sag
1994 Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press.
Saussure, Ferdinand de
1959 Course in General Linguistics. London: Peter Owen.
Wunderlich, Dieter
2000 Predicate composition and argument extension as general options: A study in the interface of semantic and conceptual structure. In Lexicon in Focus, Barbara Stiebels and Dieter Wunderlich (eds.), 247–270. Berlin: Akademie.
Language as mind sharing device: Mental and linguistic concepts in a general ontology of everyday life
Dietmar Zaefferer
The challenge of theory formation is not so much in finding correct generalizations as in creating fruitful conceptualizations. (after Ernst Mayr 2002¹)
1. The ontolinguistic approach and its motivations
1.1. The Cross-linguistic Reference Grammar database project
The ideas to be presented in this article arose from an effort to develop a general framework for comparable descriptions of languages of any kind with the help of contemporary technology. In the electronic age it has become possible to manage amounts of data that a century ago nobody would have dared to dream of. So the size of a database that would include comprehensive descriptions of all the currently known languages (somewhere between six and seven thousand) would be nothing that could scare a specialist. But such a database, something not only every typologist, but also every theoretical linguist would love to have access to, is far from being realized although, as is well known, its realization is a rather urgent matter since the linguistic diversity on this planet is rapidly decreasing.² Fortunately, there are already several projects under way that are working on this huge task.³ One of them is CRG, the Cross-linguistic Reference Grammar database project.⁴ The CRG enterprise goes back to a joint initiative of Bernard Comrie, Bill Croft, Christian Lehmann and the author of this article at the end of the last century (cf. Comrie et al. 1993; Zaefferer 1998). The initial idea was to create some kind of revised electronic version of the famous Lingua descriptive studies questionnaire (Comrie and Smith 1977), a general framework for the description of human languages. Any project like this has to come to grips with three fundamental problems:
1. The comparability problem;
2. The typological bias problem;
3. The theoretical bias problem.
In the following sections I will not address the second and the third problem (cf. Zaefferer 2006a for a discussion of these issues); rather, I will present ideas that originated in an effort to come to grips with the first problem above: What are the foundations for a fair and unbiased comparison of languages, if such a thing is possible at all?
1.2. The challenge of general comparability
The first, rather obvious, aspect of the comparability problem is terminology. Since languages are abstract and large it is next to impossible to compare them directly; one has to describe them first. But as long as it is not clear that the terms used in the descriptions of different languages are the same, i.e. that they are based on the same cross-linguistically valid operationalizations, there is plenty of room for misanalysis. Both faux amis (ambiguity: use of the same terminological label for different concepts) and faux ennemis (synonymy: use of different labels for the same concept) occur again and again and are a major obstacle to the proper comparison of languages. So there has to be a common terminology, and since the best way to organize the descriptive categories of a domain consists in building an ontology of that domain, proposals like the AVG systematics⁵ or the GOLD ontology⁶ are necessary and highly welcome.
A second aspect of the comparability problem is organizational in nature and therefore related to metadata: How are all the examples and partial descriptions organized into a whole? This problem has been solved in the Descriptive Grammars series (now published by Routledge) by the use of a common table of contents (provided by the Lingua descriptive studies questionnaire), but not for instance in the Mouton Grammar Library volumes (published by Mouton de Gruyter). The latter have the advantage that they are able to evolve, but they pay the price of problematic comparability; the former have the advantage of high comparability at the expense of becoming outdated in their structure (cf. Comrie 1998). This seems to be a dilemma that cannot be solved in the realm of paper grammars and that therefore requires a migration to electronic grammars (for a comparison of the two cf. Zaefferer 1997), where updates and reorganization are much less expensive.
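The two terminological pitfalls named above, faux amis and faux ennemis, can be made concrete with a small sketch. It assumes, purely for illustration, that the terminology of each grammar is available as a mapping from labels to concept identifiers. The function name, the labels and the concept identifiers are invented for this example and are not taken from GOLD, the AVG systematics or any other existing resource.

```python
from collections import defaultdict


def terminology_mismatches(desc_a, desc_b):
    """Compare two label -> concept mappings (hypothetical data format).

    Returns a pair (faux_amis, faux_ennemis):
      faux_amis    maps a shared label to the two different concepts it names;
      faux_ennemis maps a shared concept to the two disjoint label sets used for it.
    """
    # Same label, different concept: ambiguity across descriptions.
    faux_amis = {
        label: (desc_a[label], desc_b[label])
        for label in desc_a.keys() & desc_b.keys()
        if desc_a[label] != desc_b[label]
    }

    def invert(desc):
        # Turn label -> concept into concept -> set of labels.
        inverted = defaultdict(set)
        for label, concept in desc.items():
            inverted[concept].add(label)
        return inverted

    inv_a, inv_b = invert(desc_a), invert(desc_b)

    # Same concept, no label in common: synonymy across descriptions.
    faux_ennemis = {
        concept: (inv_a[concept], inv_b[concept])
        for concept in inv_a.keys() & inv_b.keys()
        if inv_a[concept].isdisjoint(inv_b[concept])
    }
    return faux_amis, faux_ennemis


# Invented toy terminologies of two grammars.
grammar_a = {"absolutive": "C_S_AND_P_CASE", "converb": "C_ADVERBIAL_VERB_FORM"}
grammar_b = {"absolutive": "C_UNMARKED_NOUN_FORM", "gerund": "C_ADVERBIAL_VERB_FORM"}

amis, ennemis = terminology_mismatches(grammar_a, grammar_b)
print(amis)
print(ennemis)
```

A shared ontology removes both problems at once in such a setting, since every description then refers to the same concept identifiers and labels become mere display names.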
The third aspect of the comparability problem is a methodological one. Whenever one speaks of standardizing language descriptions across language types, taking up the challenge of general comparability, the specter of Procrustes enters the room and scares the discussants. Procrustes, as is well known, used brute force in making different customers alike, stretching the short ones and cutting off the feet of the taller ones. Wouldn’t a general framework for language description be bound to be like Procrustes? This question is inseparably connected with the typological bias problem mentioned above. Once the typological bias problem is solved, a more adequate picture should replace the one of Procrustes. It would be the picture of a linguist who simply measures distances on different scales of dimensions of variation. Instead of saying these two languages are rather similar in a given respect, say their vowel inventory, she could come up with an objective value on a scale.
This methodological aspect of the comparability problem is also related to the sociology of linguistics. As is well known among typologists, there are what one might call familiarizers, linguists who try to show that all languages are basically the same and therefore tend to play down the differences, and there are exoticizers, linguists who try to prove that the language they are working on is incommensurable with all others and who therefore tend to play down the commonalities. Both attitudes are of course exaggerations, but without measurements it will be hard to find a non-arbitrary balanced position between the two extremes.
The fourth and deepest aspect of the comparability problem is a theoretical one. It concerns the foundations for the solution to the first, the terminological aspect.
For an ontology for linguistic descriptions to be general it has to be based on an adequate conceptualization of its domain, and this means first and foremost an adequate explication of the notion of language itself. That this is not a trivial matter becomes clear when we have a look into the well-respected Encyclopedia Britannica. There we read as definition of language:
(D *) Language: a system of conventional spoken or written symbols by means of which human beings, as members of a social group and participants in its culture, communicate.⁷
It seems that the editors of the famous Encyclopedia have not noticed the fact that linguists nowadays agree on the status of the gestural-visual communication systems of the deaf as full-fledged languages. The discussion that led to this consensus has sharpened our awareness that the essence of language is not in its means but in its ends.
Therefore it seems advisable to base the comparability of human languages not on the similarity of their means, but on their global functional equivalence.
1.3. Conceptualizing language from the other side
“Language is a defining characteristic of the human species . . . ” (Hyman 2004: 1). This is a claim frequently made by linguists, but why should this be so? One possible answer is that the human species differs so much from its fellow primates because it is the only one with a highly differentiated evolving distributed mind,⁸ and because human language is its primary mind sharing device. And this is the definition of language that will be presupposed for the remainder of this article:
(D 1) A language is a general purpose unbounded mind sharing device.
Conversely, the human species could be defined as that species of animals that has the least constrained mind sharing device. Note that (D 1) does not prevent other species than Homo sapiens from having a language as long as they are equipped with a mind.⁹ Mind sharing is based on attention sharing¹⁰ and subsumes as its three core components knowledge sharing, emotion sharing and goal sharing. All of them are possible without language as well, but only for special purposes and to a very limited extent. So the evolution of mind made a quantum leap when it crossed the line between special-purpose limited and general-purpose unlimited mind sharing. Seen from this point of view, the different languages that have evolved are all solutions to the same problem: how to enable one mind to share unlimited mental content with other minds.
In order to flesh out the seemingly nebulous notion of mind sharing it is necessary to recognize first that there are two levels on which a given content can be said to be shared by two or more minds. The first one is the dispositional level on which there are time-stable representations of categories and concepts that are largely equivalent across minds. On this level mind sharing is concept sharing. The second one is the manifestation level and contains behavior that is based on these representations of categories and concepts. Mind sharing on this level is concept activation sharing. In order to find out about the first level one has to go through the second: A necessary condition for concepts to be shared by (or type-identical across) several minds and a central indicator for this condition to hold is parallel
behavior in the task of distinguishing phenomena that fall under the given concept from others that do not.
From these assumptions a picture emerges according to which the possession of a common language requires the possession of a shared system of concepts (disposition) and the communicative use of a language requires the coactivation of concepts and conceptual structures from this system (manifestation). Obviously the possession of a system of concepts or ontology is logically and therefore also phylo- and ontogenetically prior to the possession of a language.¹¹
The basic idea of this chapter is to conceptualize language from the other side, from the ends’ end, so to speak, i.e., to conceptualize it not from its forms, but from its contents. This entails viewing the individual languages not as posing decoding problems, but rather as different solutions to roughly the same coding problem.
In order to see what exactly this coding problem consists in the reader is kindly invited to go through the following Gedankenexperiment: Suppose there is a group of human minds¹² each of which (a) possesses a full-fledged ontology for the phenomena of its everyday life and (b) assumes that this ontology is more or less shared among the members of the group. How would they go about achieving activated mind sharing on this basis?
With respect to current emotion sharing they would not have to do very much: They would have to get the other individual’s (and mind’s) attention, to express the emotion, i.e. to make it perceivable,¹³ and to hope that it will be shared. An example for a corresponding content, described from the perspective of the active mind, would be this: Look here, I am happy, and I want you to be happy too! No special means are needed.
The problem is somewhat more difficult with respect to current goal sharing: Again the active mind would have to get the other one’s attention, but then it would have to get the other one to activate the concept of a possible future course of events that leads to its goal,¹⁴ and finally, again, it would have to hope that it will be shared. Here is an example: Look here, I am too small to reach that fruit, and I want you to reach it for me.
The hardest problem for activated mind sharing concerns currently activated knowledge sharing: As before the members of our group would have to get the other one’s attention first, but then they would have to get the other one to activate the same knowledge that is currently active in their mind, and finally, once more, they would have to hope that it will be shared. An example would be this: Look here, I know that this fruit is poisonous, and I want
you to know that too. The coactivation of knowledge is relatively easy with respect to perceivable properties, but a challenge as soon as non-perceivable properties are at stake, as in the example.
The key to the solution for this coding problem is the same in all languages: In order to become accessible to one mind, the hidden content of another one has to be correlated with something perceivable. This holds already for one agent’s happiness, which is inaccessible to another mind unless she makes it inferable through her smile. The progress towards language began when the relation between perceivable patterns and inferable contents became less and less restricted to natural correlations and was more and more complemented by artificial correlations. Artificial correlations can start out from mimicking natural correlations (as, e.g., in demonstrative yawning), but they have to extend into arbitrary correspondences if they are to fulfill their design purpose, namely unlimited coding of concepts and conceptual structures, since there are too many concepts that are not naturally correlated with any perceivable patterns, and this holds even more when the latter are restricted to audible patterns.¹⁵
The precise ways these artificial correlations have evolved are unknown and the present paper will not contribute to the speculations about more or less plausible scenarios for that development.¹⁶ It will rather focus on a consequence of the assumptions outlined above: If it is correct that linguistic communication consists in utterance-driven coactivation of conceptual structures, then it presupposes, as stated above, both phylo- and ontogenetically the existence of shared conceptual systems alias ontologies. So a central concern of any research based on these ideas must be a precise representation of concepts and conceptual structures.
Therefore the bulk of this paper will be devoted to the presentation and partial discussion of a proposal for a general ontology of everyday life that includes most of the concepts and categories that are currently used in linguistics under such labels as aspectuality and others.
But is this enough? Any reflection on the very foundations of general cross-linguistic language description shows that another ontology is required for that purpose: A special ontology or system of concepts for the description of linguistic phenomena in addition to the general ontology comprising the concepts we are able to coactivate through the use of linguistic signs. However, given that every human language allows its users to speak not only about non-linguistic but also about linguistic phenomena, this domain ontology cannot be a separate system; rather, it has to be embedded in a reentrant and recursive way into the general ontology of everyday life.
The ontology to be presented in the following section has these properties: it is a general ontology that includes a special or domain ontology of linguistic phenomena. Illocutions play a key role in the latter; situations, different kinds of propositions and the corresponding attitudes play a key role in both. Some aspects of attention, knowledge, emotion and goal sharing and the devices provided for that purpose by all human languages will illustrate the proposal.
Given the philosophical origins of the term ‘ontology’ it seems appropriate to emphasize that this general ontology, including the embedded ontology of linguistic phenomena, differs from philosophical ontologies insofar as it aims to model the conceptual structures that have evolved in the natural history of the human mind and these are restricted only by their usefulness for everyday life. And since in everyday life ontological parsimony seems to be much less important than in philosophy, Occam’s razor does not play the role it plays in philosophy and is replaced by the razor of common sense.
What is the razor of common sense? Whereas Occam’s razor (abbreviated OR below) postulates ontological minimalism by admitting only categories that are required, the razor of common sense (labeled RCS below) practices pragmatism by admitting all categories that are useful:
(OR) Entia non sunt multiplicanda praeter necessitatem.
(RCS) Entia non sunt multiplicanda praeter utilitatem.
The two coincide of course as long as any form of redundancy is useless. But since in cognitive systems as well as in many others lack of redundancy causes brittleness, the most useful degree of redundancy is certainly not zero, and therefore an (OR)-guided ontology will differ considerably from an (RCS)-guided one.¹⁷
1.4. Ontology-based cross-linguistic research
One basic assumption of all ontology-based approaches in linguistics is that the relation between the ontologies in our minds and the languages we participate in is not arbitrary. This assumption is supported by the fact that, across different types, languages tend to include monomorphemic lexical codings for concepts like ‘water’ or ‘big’ or ‘die’, which are basic level concepts in the terms of Rosch et al. (1976) and which are ontologically clearly central, and that they tend to code more general concepts such as ‘instrument’ or ‘causation’ or ‘past’ with closed-class items (grammemes). Languages all over the world offer sentential codings for conceptual structures such as those denoted
by verbs, eventity frames, as Schalley (this vol.) calls them, i.e., combinations of a static periphery including the participant roles and a dynamic core like a transition from being asleep to being awake (Schalley 2004, this vol.), as soon as these frames are filled to constitute propositions (cf. below). The linguistic coding means for concepts tend to reflect frequency of use in their compactness (number of syllables, morphemes and words; Zipf’s law) and generality in their degree of grammaticalization.
The consequences of these kinds of considerations for cross-linguistic grammatography are not difficult to draw: The semasiological (decoding) and the onomasiological (encoding) perspective are not on a par; rather, the latter should be given priority over the former. The reason for this asymmetry is simple: If comparison is based on assumptions like ‘there must be a way, compact or complex, of expressing this given content in the language under consideration’, it is safe, but if it is based on assumptions like ‘there must be a copula or a noun-verb distinction in this language’, it is not. (The copula assumption is clearly falsified by many pidgins and creoles, the noun-verb assumption is quite controversial.) In order to fulfill its mind sharing function a language must enable its user to ask and answer questions, but it need not mark the difference between assertions and questions by sentence particles.
Unfortunately existing reference grammars are almost exclusively organized from a semasiological perspective (cf. Mosel 2006). The explanation is not hard to provide: Linguistics has a longer tradition of precisely defining forms than of precisely defining contents.¹⁸
Investigating the connection between the ontologies in our minds and language contributes to the explanatory adequacy of linguistic theories since, as mentioned above, ontologies precede language not only logically, but also onto- and phylogenetically.
In their contribution to the present volume Thomas Metzinger and Vittorio Gallese write:
To have an ontology is to interpret a world. . . . the brain, viewed as a representational system aimed at interpreting our world, possesses an ontology too. . . . the motor system constructs goals, actions, and intending selves as basic constituents of the world it interprets. . . . Empirical evidence demonstrates that the brain models movements and action goals in terms of multimodal representations of organism-object-relations. Under a representationalist analysis, this process can be conceived of as an internal, dynamic representation of the intentionality-relation itself. (Metzinger and Gallese this vol.)
A shared action ontology seems also to be what is required for successful linguistic communication: Agents must agree on what for instance an assertive or erotetic speech act is and what kind of perceivable action counts as performing one. So the domain ontology of linguistic phenomena must include several speech act concepts such as illocution and propositional content. These issues will be addressed in Section 3. below. The following section will outline the top distinctions of the general ontology.
2. Conceptual building blocks of everyday life: GOEdL
In order to have a nice nickname and by the same token to be constantly reminded of its necessary incompleteness the general ontology that will be presented in this section has been dubbed GOEdL.¹⁹ The idea of GOEdL, the General Ontology of Everyday Life, is to make explicit the most basic distinctions humans make in everyday life, i.e., the distinctions that seem to underlie their normal behavior and which may contrast with what they think up when they start philosophizing. The relevant parts of the latest version of GOEdL are included in the appendix. In this section only the first few top levels will be discussed. The embedded domain ontology of linguistic phenomena will be the topic of the next section, together with some parts of the domain ontology of mental entities that are crucial for it.
2.1. Top level distinctions
Many ontologies start out with a binary distinction between concrete and abstract entities,²⁰ the distinguishing criterion being that the former are spatiotemporally located whereas the latter are not. But how about space and time and their kin themselves? They seem to be neither fully concrete (local space lacks temporal location, temporal space lacks localization) nor entirely abstract (they are not beyond both time and space) and so they are assigned to a third basic category called framing entity.²¹ Finally there is a fourth top-level category for those entities that are both, albeit only partially, one part being concrete, the other one abstract. Therefore they are called concrete-abstract hybrids. Examples will be provided below. Here is the quaternary branching of the top node called Entity:
A. Framing entity (provides spatiotemporal location)
B. Concrete entity (has spatiotemporal location)
C. Abstract entity (lacks spatiotemporal location)
D. Concrete-abstract hybrid (partly has spatiotemporal location)
In order to understand the rest of the structure it is important to have a look at the daughters of the Concrete entity node first. There are three of them: External entities, located entirely outside anybody’s mind, mental entities, located completely inside some mind or minds, and external-mental hybrids, with a location that includes outside and inside portions, and each of the three has three daughters in turn.
B. Concrete entity (has spatiotemporal location)
   I. External entity (outside any mind)
      A. Situation (container; inherently bounded)
      B. Inventity (content; primarily spatial meronomy)
      C. Eventity (content; primarily temporal meronomy)
   II. Mental entity (inside some mind or minds)
      A. M-situation (mental container; inherently bounded)
      B. M-inventity (mental content; primarily ‘spatial’ meronomy)
      C. M-eventity (mental content; primarily temporal meronomy)
   III. External-mental hybrid (partly outside, partly in some mind(s))
      A. E-m hybrid situation (compound of external situation and mental entity)
      B. E-m hybrid inventity (compound of external inventity and mental entity)
      C. E-m hybrid eventity (compound of external eventity and mental entity)
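The regularity of this subtree, three loci each with the same three daughters, can be made visible by generating the nine category labels instead of listing them. The following is only a sketch under that reading: the dictionary and function names are my own, and the label prefixes are simply copied from the outline above.

```python
from itertools import product

# Locus of a concrete entity, paired with the label prefix used in the outline.
LOCI = {
    "external": "",
    "mental": "M-",
    "external-mental hybrid": "E-m hybrid ",
}

# The three daughters that recur under every locus.
KINDS = ("situation", "inventity", "eventity")


def concrete_subcategories():
    """Return a dict {(locus, kind): label} for all nine combinations."""
    grid = {}
    for (locus, prefix), kind in product(LOCI.items(), KINDS):
        label = prefix + kind
        # Capitalize the first letter only, as in the outline.
        grid[(locus, kind)] = label[0].upper() + label[1:]
    return grid


for key, label in concrete_subcategories().items():
    print(key, "->", label)
```

Generating the labels in this way makes explicit that the situation, inventity and eventity distinction is orthogonal to the external, mental and hybrid distinction.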
The first one of the three kinds of external entities is the situation kind. A situation is conceived of as a spatiotemporally coherent container with inherent boundaries. An external situation can be something very big like our universe from the Big Bang until today or something rather modest in size such as my office this afternoon. Being conceived as containers, situations can be viewed exclusively, abstracting away from their contents, or inclusively, together with their contents. The content of a situation consists of its inventory entities (individuals and other entities or, shorter, inventities) and the events or similar entities (eventities) that take place or are instantiated in it.
Inventities include objects such as my coffee mug or substances such as the air in my office. Although inventory entities exist both in local and temporal space, our everyday life conceptualizations seem to emphasize the former at the expense of the latter.²² So when we start to think about the parts of inventities we think about the handle of the mug (which just came off) and the portion of the air over the heater (which is warmer than the rest) and not about the mug or the air half an hour ago, when the mug was still intact and the air was much colder. Eventities, by contrast, especially events, tend to be subdivided primarily into temporal sections or phases (initial, central and final part of eating lunch) rather than local parts (contributions of left and right hand to eating lunch).
Internal or mental entities show a structure that is analogous to that of the external entities, but there are some significant differences. A mental situation is the mind frame that includes the mental contents of a given mind in a given time span. The inventory of a mental situation consists of all the mental objects it contains including the mental images of what goes on outside the mind. The eventities of a mental situation include the mental properties, states, activities, events and processes.
The third kind of concrete entities will prove crucial for the cognition-oriented approach to ontolinguistics advocated here: it is the category of the external-mental hybrids. The prototype of an external-mental hybrid inventity is the person. A person is a compound consisting of an external inventity, the person’s body, and a mental situation, the person’s mind.²³ The prototype of an external-mental hybrid eventity is an action.
An action is a compound consisting of an external eventity, an agent’s bodily movements, and a mental inventity, a currently activated representation of a reward-oriented goal-state.²⁴ And therefore a linguistic action (speech act, illocution) is a compound consisting of an agent’s (mostly sound producing) bodily movements and a currently activated representation of a goal-state. Since for normal illocutions this goal-state includes a state where the agent has been understood, this need not be at variance with Searle’s notion of illocutions as intrinsically rooted in power and deontology,²⁵ but it does not imply it either.
2.2. Further distinctions
At the next level of specificity the subcategories of the three types of external entities introduced above include what is more or less known from
the literature on noun-related and verb-related classes of entities and what is treated under different labels such as count-mass distinction, aspectualities etc. What is not so common, but needed for the upcoming sections, is the division of individuals into absolute individuals, which are meronomically free, autonomous so to speak, and relational individuals, which are meronomically bound to either their constituting sub-individuals or to some super-individual they co-constitute. Examples are hats (absolute individuals), pairs of shoes or scissors (super-individuals), and single shoes or scissor halves (sub-individuals).²⁶
Eventities include the well-known Aristotle-Ryle-Kenny-Vendler categories²⁷ (states, activities, achievements, and accomplishments²⁸), but also properties and transients, i.e. events and processes that are not conceptualized as leading from an anterior to a posterior characteristic (property or stage changes), and that are sometimes called semelfactives or intergressives.
I. External entity (outside any mind)
   A. Situation (container; inherently bounded)
      1. Exclusive situation (situation without its content)
      2. Inclusive situation (situation including its content)
   B. Inventity (content; primarily spatial meronomy)
      1. Individual (inherently space-bounded, question of completeness central)
         a. Absolute individual (meronomically free)
         b. Relational individual (meronomically bound)
            i. Super-individual (meronomically superordinated)
            ii. Sub-individual (meronomically subordinated)
      2. Dividual (not inherently space-bounded, question of completeness peripheral, homogeneous)
         a. Substance (non-atomic)
         b. Collection (atomic)
   C. Eventity (content; primarily temporal meronomy)
      1. Characteristic (not inherently time-bounded)
         a. Property (inalienable)
         b. Stage (alienable)
            i. Static stage: State (force required for termination)
            ii. Dynamic stage: Activity (force required for maintenance)
      2. Transition (stage changing; doubly time-bounded change)
         a. Transitional event: Achievement (not extended)
         b. Transitional process: Accomplishment (extended)
      3. Transient (stage preserving; doubly time-bounded interlude)
         a. Transient event: Semelfactive (not extended)
         b. Transient process: Intergressive (extended)
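For readers who want to query this taxonomy rather than read it, the outline above can be encoded as a nested structure. The sketch below is one possible encoding, not part of GOEdL itself: the names TAXONOMY, path_to and is_a are hypothetical, and compound labels such as "Static stage: State" are shortened to their second half to keep lookups simple.

```python
# Each node is a pair (label, list of daughter nodes).
TAXONOMY = ("External entity", [
    ("Situation", [
        ("Exclusive situation", []),
        ("Inclusive situation", []),
    ]),
    ("Inventity", [
        ("Individual", [
            ("Absolute individual", []),
            ("Relational individual", [
                ("Super-individual", []),
                ("Sub-individual", []),
            ]),
        ]),
        ("Dividual", [
            ("Substance", []),
            ("Collection", []),
        ]),
    ]),
    ("Eventity", [
        ("Characteristic", [
            ("Property", []),
            ("Stage", [("State", []), ("Activity", [])]),
        ]),
        ("Transition", [("Achievement", []), ("Accomplishment", [])]),
        ("Transient", [("Semelfactive", []), ("Intergressive", [])]),
    ]),
])


def path_to(label, node=TAXONOMY, trail=()):
    """Return the tuple of labels from the root down to `label`, or None."""
    name, daughters = node
    trail = trail + (name,)
    if name == label:
        return trail
    for daughter in daughters:
        found = path_to(label, daughter, trail)
        if found is not None:
            return found
    return None


def is_a(label, ancestor):
    """True if `ancestor` lies on the path from the root to `label`."""
    path = path_to(label)
    return path is not None and ancestor in path


print(path_to("Accomplishment"))
print(is_a("State", "Eventity"), is_a("State", "Inventity"))
```

With this encoding the four Aristotle-Ryle-Kenny-Vendler categories come out as daughters of two different nodes (Stage and Transition), which is exactly what the outline claims.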
3. A Domain Ontology of Mental Entities: DOME
The five top levels of DOME are included in Appendix 1. under B. II. Mental entity. In order to develop DOLPhen, however, the Domain Ontology of Linguistic Phenomena to be presented in Section 4. below, two categories will be required which are more specific, hence belong to lower levels and therefore are presented in a different place (they are printed in Appendix 2.). These categories are special cases of mental individuals on the one hand and of mental states on the other and will be presented in turn.
3.1. An ontology of propositional contents
The concept of propositional content that is used in DOLPhen is not based on possible worlds (whole worlds don’t show up in GOEdL at any place), but on an idea that goes back to Austin and has been revived and reshaped under the name of Austinian propositions by Barwise and Etchemendy (1987). According to this idea propositions are compounds consisting of a situation token and a situation type. Whereas Barwise and Etchemendy give a completely formal definition, DOME (and based on it DOLPhen) uses a cognitivized version of Austinian propositions where a mental image of a situation token (be it concrete or abstract, external, mental, or e-m-hybrid) is paired up with a mental representation of a situation type (which latter is always abstract). So each DOME-proposition is a mental superindividual consisting of two subindividuals, a meronomic sum of two constituents. A DOME-proposition is true (exactly in the spirit of Austin and Barwise and Etchemendy) if the given token is of the given type, and false otherwise.
Since propositional contents of interrogatives are neither true nor false, they are treated as underspecified or near-propositions, as opposed to the full propositions that can be the content of assertions. The situation token of a near-proposition is only a token of an exclusive situation, so the corresponding truth value must remain indeterminate until the situation token is completed (filled with some content) to become that of an inclusive situation.
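The pairing of token and type, and the three possible outcomes (true, false, indeterminate), can be illustrated with a minimal sketch. It rests on simplifying assumptions that are mine and not part of DOME: the content of an inclusive situation is modeled as a set of fact strings, a situation type as a predicate over such sets, and an exclusive situation as a token whose content is left unspecified. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional


@dataclass(frozen=True)
class SituationImage:
    """Mental image of a situation token (an 'i-situation')."""
    description: str
    # None models an exclusive situation: the container without its content.
    content: Optional[FrozenSet[str]] = None

    @property
    def is_inclusive(self) -> bool:
        return self.content is not None


@dataclass(frozen=True)
class Proposition:
    """Meronomic sum of a situation token image and a situation type."""
    token: SituationImage
    situation_type: Callable[[FrozenSet[str]], bool]

    def truth_value(self) -> Optional[bool]:
        # Near-proposition: the token is exclusive, so the value stays open.
        if not self.token.is_inclusive:
            return None
        # Full proposition: true if the token is of the type, false otherwise.
        return self.situation_type(self.token.content)


def mug_is_broken(content: FrozenSet[str]) -> bool:
    """Toy situation type: situations in which the mug is broken."""
    return "the mug is broken" in content


office = SituationImage(
    "my office this afternoon",
    frozenset({"the mug is broken", "the air is warm"}),
)
assertion_content = Proposition(office, mug_is_broken)
question_content = Proposition(SituationImage("my office this afternoon"), mug_is_broken)

print(assertion_content.truth_value())  # True
print(question_content.truth_value())   # None
```

Filling the exclusive token with content, that is, replacing it by an inclusive one, is what turns the indeterminate value into a definite one in this toy model.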
Open situation types are types that are parametric in at least one respect, so they describe the corresponding token only partially in the sense that the token value of the corresponding parameter is missing. Here are the two top levels of the propositional content token subontology (the plus-sign denotes meronomic sum formation, the shorthand i-situation stands for mental image of a situation):

Propositional content token
1. Near-proposition token (exclusive i-situation + i-situation type)
   a. Closed near-proposition (exclusive i-situation + closed i-sit type)
   b. Open near-proposition (exclusive i-situation + open i-sit type)
2. Full proposition token (inclusive i-situation + i-situation type)
   a. Closed proposition (inclusive i-situation + closed i-situation type)
   b. Open proposition (inclusive i-situation + open i-situation type)

1. Plain proposition token (i-situation + plain i-situation type)
2. Modalized proposition token (i-situation + modalized i-situation type)
   a. Non-mentally modalized proposition token
   b. Attitudinally modalized proposition token
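The completeness section of this subontology can be illustrated with a small sketch. The Python below is not part of DOME; it is a minimal illustration under stated assumptions. The class names, the modeling of a situation type as a predicate over a set of facts, and the use of None for the content-less token of an exclusive situation are all hypothetical choices made here for the sake of the example.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional

# Hypothetical stand-in for the content of a situation: a set of fact labels.
Facts = FrozenSet[str]


@dataclass(frozen=True)
class SituationType:
    """Sketch of an i-situation type (assumed representation, not from DOME)."""
    holds_of: Callable[[Facts], bool]  # does a token with these facts fit the type?
    is_open: bool = False              # open = parametric in at least one respect


@dataclass(frozen=True)
class PropositionalContent:
    """Sum of an i-situation (token) and an i-situation type."""
    token_content: Optional[Facts]  # None models an exclusive situation
    sit_type: SituationType

    def category(self) -> str:
        """Name the cell of the completeness section this content falls into."""
        completeness = "near-proposition" if self.token_content is None else "proposition"
        openness = "open" if self.sit_type.is_open else "closed"
        return f"{openness} {completeness}"

    def truth_value(self) -> Optional[bool]:
        """True or False for full propositions, None for near-propositions."""
        if self.token_content is None:
            return None  # indeterminate until the situation token is completed
        return self.sit_type.holds_of(self.token_content)

    def complete(self, content: Facts) -> "PropositionalContent":
        """Fill the exclusive situation with content, giving a full proposition."""
        return PropositionalContent(content, self.sit_type)


# Hypothetical example: the content of a polar question and one completion.
hungry = SituationType(lambda facts: "addressee is hungry" in facts)
question_content = PropositionalContent(None, hungry)
full = question_content.complete(frozenset({"addressee is hungry"}))

print(question_content.category(), question_content.truth_value())
print(full.category(), full.truth_value())
```

The design choice worth noting is that truth is computed only once a token content is present, which mirrors the claim that a near-proposition has no truth value until its situation token is completed.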
The italicized category numbers (the second series 1. and 2. above) code cross-classification with the categories on the same level whose numbers are set in roman type, i.e., both full and near-propositions can be either plain or modalized. Note that the elements of the first section of the propositional content subontology, let us call it the completeness section, consist exclusively of elements that are there in DOME anyway, so no new building blocks are needed. The same is not true of the second section of this subontology, which can be called the modalization section. Therefore a few words are in order about the domain ontology of modal categories included in GOEdL.29 Both non-mental and attitudinal modalizations map propositions to ‘higher’ propositions that include them, but only the latter are about mental states or events, in short, propositional attitudes. Now to the propositional attitudes themselves.

3.2. An ontology of propositional attitudes
The notion of a propositional attitude used in DOME is that of a mental role of some propositional content, i.e., of a mental eventity of an agent with respect to such a content. Here are the four top levels of the subontology of propositional attitudes included in DOME:
Propositional attitude token
1. Presentative attitude token (mental role of a blueprint proposition)
   a. Intention token (feasibility by attitude holder required)
   b. Volition token (feasibility required)
   c. Wish token (feasibility irrelevant)
2. Representative attitude token (mental role of a picture proposition)
   a. Knowledge token (uncontroversially true)
      i. Transparent knowledge token (content: full proposition token)
         A. Passive transparent knowledge (mental state)
         B. Activated transparent knowledge (mental activity)
      ii. Opaque knowledge token (content: near-proposition token)
         A. Passive opaque knowledge (mental state)
         B. Activated opaque knowledge (mental activity)
   b. Belief token (content: full proposition token; assumed to be true by attitude holder)
   c. Hypothetical assumption token (content: full proposition token; truth irrelevant)
The first binary branching is in line with what most theoreticians assume, although under different labels: Davidson (1963) for instance speaks of conative or ‘pro’ attitudes as opposed to cognitive attitudes. In DOME the labels are presentative as opposed to representative attitudes; the former correspond to a world-to-mind direction of fit of their content (‘blueprint’), the latter to a mind-to-world direction of fit of their content (‘picture’) (cf. Searle 1983). Of the three presentative attitudes the volitional one will be of prime importance for illocutions, but wishes play a role as well, and intentions are of course required for the performance of an illocution to take place at all. Among the representative attitudes knowledge has the most central role for language use. In line with Zaefferer (2005, 2006b) an important distinction is made between transparently conceptualized knowledge, where the content including its truth value is fully visible because it is a full proposition, and opaquely conceptualized knowledge, where the content is only partly visible and hence no truth value is available because it is a near-proposition. The distinction will be mainly needed for capturing the ontological difference between the propositional contents of questions and assertions. At the lowest level, the difference between passive and activated knowledge is represented, which is inherited from the state vs. activity distinction at higher levels of DOME. It will be needed for the description of the range of goals of epistemic volitionals below.
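The two distinctions just described, direction of fit for the first branching and the transparent/opaque and passive/activated splits for knowledge, can be sketched as a small lookup. This is only an illustration: the names Fit, ATTITUDES, attitude_class and KnowledgeToken are hypothetical and appear nowhere in DOME; the category labels themselves are taken from the subontology above.

```python
from dataclasses import dataclass
from enum import Enum


class Fit(Enum):
    """Direction of fit of the attitude's content (cf. Searle 1983)."""
    WORLD_TO_MIND = "blueprint"  # presentative attitudes
    MIND_TO_WORLD = "picture"    # representative attitudes


# Hypothetical table: attitude label -> direction of fit of its content.
ATTITUDES = {
    "intention": Fit.WORLD_TO_MIND,
    "volition": Fit.WORLD_TO_MIND,
    "wish": Fit.WORLD_TO_MIND,
    "knowledge": Fit.MIND_TO_WORLD,
    "belief": Fit.MIND_TO_WORLD,
    "hypothetical assumption": Fit.MIND_TO_WORLD,
}


def attitude_class(label: str) -> str:
    """Return the top-level branch an attitude belongs to."""
    if ATTITUDES[label] is Fit.WORLD_TO_MIND:
        return "presentative"
    return "representative"


@dataclass(frozen=True)
class KnowledgeToken:
    """Sketch of the two lowest levels below 'Knowledge token'."""
    content_is_full: bool  # full proposition (True) vs. near-proposition (False)
    activated: bool        # mental activity (True) vs. mental state (False)

    def label(self) -> str:
        mode = "Activated" if self.activated else "Passive"
        view = "transparent" if self.content_is_full else "opaque"
        return f"{mode} {view} knowledge"


print(attitude_class("wish"), "/", attitude_class("belief"))
print(KnowledgeToken(content_is_full=False, activated=True).label())
```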
4. Conceptualizing linguistic phenomena: DOLPhen

4.1. The overall picture
All building blocks that are needed for DOLPhen have been presented in the preceding sections. Here is one way of putting them together, a conceptualization of language use that is intended to be compatible with most if not all linguistic research programs, schools, enterprises and theories that are currently in use. The first thing to note is that linguistic phenomena show up in several different places in GOEdL. For the sake of illustration only the top categories of oral illocution tokens will be discussed here. An oral illocution token is an action and hence an external-mental hybrid. Therefore it can be either monadic, involving only the agent’s mind, or polyadic, involving also what happens in the mind(s) of the addressee(s) or other witnesses. In both cases it is an external-mental hybrid eventity where an external eventity causes one or more internal eventities. The subcategories of these eventities, however, are significantly different: Externally, the sound waves fade away, and the external situation shows no traces of a change, therefore the external part of an oral illocution is a (semelfactive or intergressive) transient. Internally, however, the situation representations of the involved minds have changed in a characteristic way,30 therefore the internal parts of oral illocutions are transitions, in general internal achievements. This leads to the conclusion that ontologically both spoken and signed illocution tokens are transition-causing transients. The conceptualization of illocutions presented below is based on modal categories of different kinds (cf. Zaefferer 2001, 2005): It is assumed that structured illocutionary modalities can be defined as stacked modalities, i.e., a combination of the general autonomous modality of causation, and different attitudinal modalities in its scope. What is expressed in a structured illocutionary act is a certain propositional attitude towards a certain propositional content. 
And the achievement of mind sharing in the sense of sharing (part of) this attitude is the core of performing an illocution. But in order to share something mental and hence internal, one has to make it external, one has to express it. Expressing a propositional attitude is conceptualized in DOLPhen as making it ascribable: If an agent expresses her wish to be famous she entitles her audience to ascribe this attitude to her. In other words, she causes or brings about a situation in which this attitude can be inferred, if sincerity is added
to the premises. In DOME propositions about acts of causation are treated as modalizations of the propositions that are about the corresponding effects. Causation belongs to the external modalizations, a sister category of attitudinal modalizations. This rather encompassing notion of modal category makes it possible, as mentioned above, to construct all higher ingredients of DOLPhen from other parts of GOEdL. Below is an excerpt from the Domain Ontology of Linguistic Phenomena containing the major and some minor speech act types that are discussed in the literature. The first branching separates the holistic from the structured illocutions. Examples of the former are utterances of interjections like Ouch, Oops, or Wow; they have purely expressive functions and they lack a propositional content. Alleged paraphrases like I am in pain for Ouch are different insofar as only the paraphrases contain possible antecedents for propositional anaphora such as the occurrence of that in reactions like That’s a lie.31 Of the structured illocutions only those with propositional contents are considered. Structured illocutions are subdivided according to the major attitude they express. The largest group is that of the volitionals, which aim at a clearly inferable goal and are divided into general and epistemic volitionals. The content of a general volitional is a proposition that defines the goal of the action directly, a goal which either comes from the speaker and can only be reached when the addressee shares it, or comes from the addressee and is to be shared by the speaker. The former case is that of a directive, which is defined by a strong volitional attitude (Open the door! in the sense of ‘I want you to do that’); the latter case is that of a permissive, which is defined by a weak volitional attitude (Open the door! in the sense of ‘I don’t want you to refrain from doing that’).
By contrast with the epistemic volitionals, whose propositional content is interpreted as epistemically modalized and which will be discussed in the next section, the general volitionals are not constrained by the kind of goal they aim at, which is also the reason for their denomination. The second group, the purely expressive illocutions, is the smallest one. It includes the optatives, which are defined as primarily serving to express some emotional attitude towards their content and which, apart from expressing or sharing this attitude, don’t involve further goals. Third, there is an interesting group of hybrids. These are called expressive epistemics because they aim not only at shared knowledge but also at shared emotions. It is proposed that the indirect assertions or rhetorical questions are subsumed here alongside three kinds of exclamations: those which base
their form on constituent interrogatives, those which use demonstratives to mark the constituent whose scalar value is at issue, and those without special means for this purpose. Here are the four top levels of DOLPhen’s category ‘Oral illocution token’:

DOLPhen-o Oral illocution token
a. Holistic illocution token (without propositional content)
b. Structured illocution token (with propositional content)
   i. Volitional illocution
      A. General volitional (for goal sharing)
         I. Directive
         II. Permissive
      B. Epistemic volitional (for knowledge sharing)
         I. Assertive (content is a full proposition)
            1. Assertion
            2. Commissive
            3. Declaration
         II. Erotetic (content is a near-proposition)
            1. Polar question (content is a closed near-proposition)
            2. Constituent question (content is an open near-proposition)
   ii. Expressive illocution (for emotion sharing)
      A. Optative
         I. Narrow optative
         II. Imprecative
   iii. Hybrid illocution (for knowledge and emotion sharing)
      A. Expressive assertive
         I. Direct expressive assertion (content is a full proposition)
         II. Indirect assertion (rhetorical question)
            1. Rhetorical polar question (content is a closed near-proposition)
            2. Rhetorical constituent question (content an open near-proposition)
      B. Exclamation (assertion with degree ineffability enrichment)
         I. Overt constituent exclamation
            1. Interrogative constituent exclamation (content is an open near-proposition)
            2. Demonstrative constituent exclamation (content is an open proposition)
         II. Covert constituent exclamation (content is an open proposition)
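Because the excerpt above is a strict tree, subsumption between illocution categories can be checked mechanically. The following Python sketch encodes the excerpt as nested dictionaries and looks up the path from the root to a category. The encoding and the helper names path_to and is_a are assumptions made for illustration; they are not part of DOLPhen, and the parenthetical glosses of the original list are omitted.

```python
from typing import List, Optional

# The DOLPhen-o excerpt as a nested dict: category label -> subcategories.
DOLPHEN_O = {
    "Oral illocution token": {
        "Holistic illocution token": {},
        "Structured illocution token": {
            "Volitional illocution": {
                "General volitional": {"Directive": {}, "Permissive": {}},
                "Epistemic volitional": {
                    "Assertive": {"Assertion": {}, "Commissive": {}, "Declaration": {}},
                    "Erotetic": {"Polar question": {}, "Constituent question": {}},
                },
            },
            "Expressive illocution": {
                "Optative": {"Narrow optative": {}, "Imprecative": {}},
            },
            "Hybrid illocution": {
                "Expressive assertive": {
                    "Direct expressive assertion": {},
                    "Indirect assertion": {
                        "Rhetorical polar question": {},
                        "Rhetorical constituent question": {},
                    },
                },
                "Exclamation": {
                    "Overt constituent exclamation": {
                        "Interrogative constituent exclamation": {},
                        "Demonstrative constituent exclamation": {},
                    },
                    "Covert constituent exclamation": {},
                },
            },
        },
    },
}


def path_to(category: str, tree: Optional[dict] = None) -> Optional[List[str]]:
    """Return the labels from the root down to `category`, or None if absent."""
    if tree is None:
        tree = DOLPHEN_O
    for name, children in tree.items():
        if name == category:
            return [name]
        below = path_to(category, children)
        if below is not None:
            return [name] + below
    return None


def is_a(category: str, ancestor: str) -> bool:
    """True if `ancestor` lies on the path from the root to `category`."""
    path = path_to(category)
    return path is not None and ancestor in path


print(" > ".join(path_to("Polar question")))
print(is_a("Directive", "Epistemic volitional"))
```

A nested dictionary is used rather than a class hierarchy so that the encoding stays close to the printed outline and can be compared with it line by line.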
The place these entities are assigned in GOEdL entails that they are all hybrids consisting of a perceivable external behavior and an inferable mental entity, a currently activated representation of a goal-state in at least the agent’s mind. Finally it should be noted that oral illocution types as well as other types of language use are ontologically quite different in that they are not action tokens and therefore not external-mental hybrids, but being types they are homogeneously abstract entities. In linguistics sometimes a third kind of ontological category plays a role, concrete-abstract hybrids. When Searle (2006: 12) states: “It is an objective fact that the piece of paper in my hand is a twenty dollar bill”, he is talking about two ontologically different, but in their external aspect coinciding entities. The piece of paper in his hand is an external individual, the twenty dollar bill in his hand perceived by him as such is an external-mental hybrid, but the twenty dollar bill in his hand being what it is as an objective fact belongs to still another category, that of the concrete-abstract hybrids.

4.2. A central kind of mind sharing: Knowledge sharing
Epistemic volitionals differ from the general ones in that their goal is not directly defined by their propositional content, but indirectly as the state of having activated knowledge of this content. As I argue elsewhere in more detail (Zaefferer 2006b), this is the common denominator of both assertive and erotetic illocutions, whereas the difference between them results from the two ontologically different kinds of propositional content introduced above: full propositions, which are either true or false, and near-propositions, which are neither. Note that ascribing knowledge of a near-proposition to a person does not imply that this person is ignorant of the truth value; on the contrary, it implies that the knower knows the corresponding true (and hence full)
proposition, but this part of the knowledge ascription is not spelled out by the ascriber. In other words, knowledge is always knowledge of true full propositions, but it can be characterized either transparently with the help of a full proposition or opaquely by using a near-proposition which stands in for its true completion. Among the most important kinds of mind sharing, knowledge sharing plays a special role. This is reflected in the fact that the grammars of all human languages include a clause type called declarative which, when it surfaces as a sentence type, is normally said to have as its primary function the making of statements. In this section I will outline a more precise picture of the content of the declarative clause type that not only unites its different functions but also shows its close connections with another clause and sentence type called interrogative, which shows a structure that is in general closely related to and derived from the structure of declaratives. Since the primary function of interrogative sentences is to perform erotetic illocutions, i.e., to ask questions, and since I claim the propositional content of erotetics to be near-propositions, it follows that I assume that in all human languages interrogatives code near-propositions just as declaratives code full propositions. The fact that there are also embedded declaratives and interrogatives fits well with the claim that their formal distinction codes a difference in content, but how about the different forms of root sentences of the two kinds? Of course, they also code different contents according to this theory, but do they also code different forces? Whereas it seems uncontroversial in the literature that the answer has to be in the affirmative, I claim that a fuller picture emerges from a negative answer.
The corresponding theory, which I have come to call the theory of the content-type-driven interrogative-declarative distinction (CID Theory), consists of several hypotheses, but its central claim, content-type drivenness, can be formulated as follows:

(CTD) The difference in structural meaning between declarative and interrogative sentences is the difference between different kinds of propositional content.
Whereas (CTD) entails that the illocutionary forces of utterances of declarative and interrogative sentences are identical to the extent that they are determined by the structural meanings of these sentences, it does not entail that assertions and questions have the same illocutionary force. But given that to make assertions and to ask questions are the default functions of utterances of declarative and interrogative sentences, there must be a default mechanism that turns identical basic forces determined by structural semantics into different actual forces provided by the pragmatics of concrete utterance tokens. CID Theory says that this is indeed the case and that the key to this default mechanism is the difference between the two ways of characterizing knowledge described above. To say on the one hand that both assertive and erotetic speech acts are volitionals which differ from the general volitionals by an implicit epistemic operator, and on the other that the contents of these kinds of speech acts differ in being either full or near-propositions, entails that both aim primarily at knowledge sharing (at least in the dialogical cases), but one kind uses a transparently described goal whereas the other presents an opaquely described goal: one wants it to be known that, the other wants it to be known whether something is the case. Two factors are important for a correct understanding of the CID Theory: first, the Davidsonian conceptualization of the goal, and second, its precise aspectuality. For the goal to be conceptualized in a Davidsonian way means that the propositional attitude itself can be focused and its possessor can be abstracted away from. This is necessary since according to the CID Theory the identity of the intended possessor of the intended knowledge is not fixed by the structural meaning of the sentence mood, but rather by the utterance situation.
Furthermore, the intended goal is not any kind of knowledge, but rather activated knowledge or awareness, which means that there are three ways of reaching this goal: First, by knowledge creation, if the corresponding knowledge was not there beforehand, second by knowledge activation, if the corresponding knowledge was there, but was inactive beforehand, and third by activation maintenance, if the corresponding activated knowledge is already there, but the speaker wants to make sure that this activation goes on. We are now prepared to state a second hypothesis of CID Theory, which complements the CTD-hypothesis (content-type drivenness), and which will be called CFS, Common Force-type Specification:

(CFS) The common denominator of the structural meanings of declarative and interrogative sentences is their underlying force type which is characterized by the Epistemo-Volitional Schema (EVS).

(EVS) An agent A performs an epistemo-volitional illocution with propositional content p in doing a iff a causes the possibility of ascribing to A a volitional towards the goal that there be an activated epistemic attitude towards p.
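The three ways of reaching the goal of activated knowledge, and the transparent versus opaque description of that goal, can be summarized in a short sketch. The Python below is a hedged illustration only: KnowledgeState, route_to_goal and describe_goal are hypothetical names introduced here, and the mapping merely restates the case distinction given in the text.

```python
from enum import Enum


class KnowledgeState(Enum):
    """Assumed prior state of the intended knower with respect to p."""
    ABSENT = "absent"        # no knowledge of p beforehand
    PASSIVE = "passive"      # knowledge of p is there but inactive
    ACTIVATED = "activated"  # knowledge of p is already activated


def route_to_goal(prior: KnowledgeState) -> str:
    """Name the way in which activated knowledge is reached from `prior`."""
    routes = {
        KnowledgeState.ABSENT: "knowledge creation",
        KnowledgeState.PASSIVE: "knowledge activation",
        KnowledgeState.ACTIVATED: "activation maintenance",
    }
    return routes[prior]


def describe_goal(content_is_full: bool, p: str) -> str:
    """Phrase the goal transparently ('that') or opaquely ('whether')."""
    complementizer = "that" if content_is_full else "whether"
    return f"there be activated knowledge {complementizer} {p}"


for state in KnowledgeState:
    print(state.value, "->", route_to_goal(state))
print(describe_goal(False, "the addressee is hungry"))
print(describe_goal(True, "the addressee is hungry"))
```

Note that the sketch deliberately leaves the possessor of the knowledge out of the goal description, in keeping with the claim that the possessor is fixed by the utterance situation rather than by the structural meaning.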
A further claim of CID Theory is that the default forces of assertion and question that are commonly associated with the declarative and interrogative sentence type can be derived from (EVS) with the help of some principles of rational action and the markedness asymmetry between normal and exceptional situations. This is spelled out in detail in Zaefferer (2006b); here only the basic ideas of this derivation will be outlined. First, suppose the utterance under consideration is that of an interrogative sentence, say (1):

(1) Are you hungry?
According to (CFS) and the (EVS), if an agent A utters (1), this licenses the inference that A wants this utterance token to lead to the existence of activated knowledge of the proposition that correctly describes the addressee as being in a state of present or absent food deprivation. Who is the most likely possessor of this knowledge? Of course the addressee, and it does not matter whether the result of his understanding the act performed by the speaker amounts to knowledge activation maintenance (he was aware of it anyway), knowledge activation (he had not thought of it), or knowledge creation (he had to find out about it). In any case, the fact that the knowledge is described opaquely points to the possibility that the speaker does not have it, so under normal cooperative circumstances the addressee will conclude that it is a request for knowledge sharing. Second, let us look at the utterance of a declarative sentence, say (2):

(2) You are hungry.
According to (CFS) and the (EVS), if an agent A utters (2), this licenses the inference that A wants this utterance token to lead to the existence of activated knowledge of the proposition that describes the addressee as being in a state of food deprivation. Again the addressee is the most likely possessor of this knowledge, but in this case the fact that the knowledge is described transparently entails that the speaker has it too, so the addressee will assume that the speaker wants to share it, either in order to maintain activated knowledge (Don’t forget!), or to activate it (Let me remind you!), or maybe even to create it (You didn’t notice, but . . . ). These short remarks conclude the illustration of the usefulness of GOEdL in linguistic theorizing in the sense of linguistic conceptualizations. The building blocks of this general ontology together with its domain ontologies,
especially those of mental entities and linguistic phenomena, allow for a wide variety of different assemblies, and it is hoped that they prove helpful in other applications as well.

5. Conclusion
Starting from the problem of language comparison I have proposed in this article a radically functional conceptualization of language: Independent of the means used a language is conceived as any open-ended general purpose mind sharing device for mind-endowed agents. Mind sharing in terms of sharing activated concept representations presupposes shared systems of concepts, in other words, shared ontologies. Therefore an ontology has been proposed that is meant to include the most basic building blocks of everyday life conceptualizations which are reflected in everyday language. Against this backdrop I have outlined a way of recombining some of these building blocks for a basic account of what happens when language is used for activated mind sharing, an ontology-based speech act theory. In order to show its usefulness, the core of this theory, a theory of knowledge sharing with the help of questions and assertions, has been elaborated a little further. The article will have served its basic purpose if it has provided its reader with an idea of what an ontolinguistic research program might be and achieve. And it will have reached its more ambitious goals if at least some readers will be convinced that this is a promising line of research.

Appendix 1.: GOEdL (General Ontology of Everyday Life)

A. Framing entity (provides spatiotemporal location)
   I. Local space (not inherently directed)
   II. Temporal space (inherently directed)
   III. Direction (directed, not bounded or closed)
   IV. Path (bounded or closed directed space)
      A. Bounded open path (semi-bounded)
         1. Allative path (goal bounded)
         2. Ablative path (source bounded)
      B. Bounded closed path (completely bounded)
      C. Unbounded closed path (cyclic)
B. Concrete entity (has spatiotemporal location)
B. I. External entity (outside any mind)
   A. External situation (spatiotemporally coherent container)
      1. Exclusive situation (situation without its content)
      2. Inclusive situation (situation including its content) [= 1. Exclusive situation + I. External entity]
   B. External inventity (content; primarily spatial meronomy)
      1. Individual: Count entity (inherently space-bounded, question of completeness central)
         a. Absolute individual (meronomically free)
         b. Relational individual (meronomically bound)
            i. Super-individual (meronomically superordinated)
            ii. Sub-individual (meronomically subordinated)
      2. Dividual: Mass entity (not inherently space-bounded, question of completeness peripheral, homogeneous)
         a. Substance (non-atomic)
         b. Collection (atomic)
   C. External eventity (content; primarily temporal meronomy)
      1. Characteristic (not inherently time-bounded)
         a. Property (inalienable)
         b. Stage (alienable)
            i. Static stage: State (force required for termination)
            ii. Dynamic stage: Activity (force required for maintenance)
      2. Transition (stage changing; doubly time-bounded change)
         a. Transitional event: Achievement (not extended)
         b. Transitional process: Accomplishment (extended)
      3. Transient (stage preserving; doubly time-bounded interlude)
         a. Transient event: Semelfactive (not extended)
         b. Transient process: Intergressive (extended)
DOME (Domain Ontology of Mental Entities)

B. II. Mental entity (inside some mind or minds)
   A. M-situation (mental container; inherently bounded)
   B. M-inventity (mental content, primarily ‘spatial’ meronomy)
      1. M-individual (inherently space-bounded, question of completeness central)
         a. Absolute m-individual (meronomically free)
            i. I-situation
            ii. I-inventity (mental image of an inventity)
            iii. I-eventity (mental image of an eventity)
         b. Relational m-individual (meronomically bound)
            i. M-super-individual (meronomically superordinated)
               ⇒ Propositional content token [cf. Appendix 2.1.]
            ii. M-sub-individual (meronomically subordinated)
   C. M-eventity (mental content; primarily temporal meronomy)
      1. M-characteristic (not inherently time-bounded)
         a. M-property (inalienable)
         b. M-stage (alienable)
            ⇒ Propositional attitude token [cf. Appendix 2.2.]
            i. Static m-stage: Mental state
            ii. Dynamic m-stage: Mental activity
      2. M-transition (doubly time-bounded mental change)
         a. Transitional m-event: Mental achievement
            ⇒ Understanding token
         b. Transitional m-process: Mental accomplishment
      3. M-transient (doubly time-bounded mental interlude)
         a. Transient m-event: Mental semelfactive
         b. Transient m-process: Mental intergressive
B. III. External-mental hybrid (located partly in and partly outside some mind or minds)
   A. E-m hybrid situation (compound of external situation and mental entities)
   B. E-m hybrid inventity (compound of external inventity and mental entities)
      ⇒ Person
      ⇒ Self, Other
   C. E-m hybrid eventity (compound of external eventity and mental entities)
      ⇒ Action performance token
      ⇒ Oral illocution token: DOLPhen-o [cf. Appendix 3.]
      ⇒ Signed illocution token
C. Abstract entity (lacks spatiotemporal location)
   I. Abstract framing entity (provides abstract location)
   II. Type of concrete entity
      A. Type of external entity
      B. Type of mental entity
      C. Type of external-mental hybrid
         1. Type of e-m hybrid situation
         2. Type of e-m hybrid inventity
            ⇒ Inscription type
         3. Type of e-m hybrid eventity
            ⇒ Oral illocution type: DOLPhen-O
            ⇒ Signed illocution type: DOLPhen-S
   III. Type of abstract entity
   IV. Type of concrete-abstract hybrid
D. Concrete-abstract hybrid (partly has spatiotemporal location)
   I. C-a hybrid situation (compound of concrete situation and abstract entities)
   II. C-a hybrid inventity (compound of concrete inventity and abstract entities)
      1. C-a hybrid individual
         ⇒ Inscription token as objective sign
   III. C-a hybrid eventity (compound of concrete eventity and abstract entities)
      3. C-a hybrid transient (inherently doubly time-bounded; extended)
         a. Transient event: Semelfactive
         b. Transient process: Intergressive
            ⇒ Anadic oral language use token (oral illocution token)
            ⇒ DOLPhen-o’
            ⇒ Signed language use token
Appendix 2.: Excerpts from DOME (Domain Ontology of Mental Entities)

Appendix 2.1.: Propositional content token (mental super-individual)
1. Near-proposition token (exclusive i-situation with i-situation type)
   a. Closed near-proposition (exclusive i-situation with closed i-sit type)
   b. Open near-proposition (exclusive i-situation with open i-sit type)
2. Full proposition token (inclusive i-situation with i-situation type)
   a. Closed proposition (inclusive i-situation with closed i-sit type)
      i. True closed proposition token (mental image of a correctly typed sit)
      ii. False closed proposition token (mental image of an incorrectly typed sit)
   b. Open proposition (inclusive i-situation with open i-situation type)

1. Plain proposition token (i-situation with plain i-situation type)
2. Modalized proposition token (i-situation with modalized i-situation type)
   a. Non-mentally modalized proposition token
      i. Modalized action proposition token
      ii. Modalized general proposition token
   b. Attitudinally modalized proposition token
      i. Presentatively modalized proposition token [cf. Presentative attitude token]
      ii. Representatively modalized proposition token [cf. Representative attitude token]
Appendix 2.2.: Propositional attitude token (mental role of propositional content, mental stage of an agent wrt a content)
1. Presentative attitude token (mental role of a blueprint proposition)
   a. Intention token (feasibility by attitude holder required)
   b. Volition token (feasibility required)
   c. Wish token (feasibility irrelevant)
2. Representative attitude token (mental role of a picture proposition)
   a. Knowledge token (uncontroversially true)
      i. Transparent knowledge token (content: full proposition token)
         A. Passive transparent knowledge (mental state)
         B. Activated transparent knowledge (mental activity)
      ii. Opaque knowledge token (content: near-proposition token)
         A. Passive opaque knowledge (mental state)
         B. Activated opaque knowledge (mental activity)
   b. Belief token (content: full proposition token assumed to be true by attitude holder)
   c. Hypothetical assumption token (content: full proposition token; truth irrelevant)
Appendix 3.: DOLPhen (Domain Ontology of Linguistic Phenomena; excerpt)

DOLPhen-o Oral illocution token
a. Holistic illocution token (without propositional content)
b. Structured illocution token (with propositional content)
   i. Volitional illocution
      A. General volitional (for goal sharing)
         I. Directive
         II. Permissive
      B. Epistemic volitional (for knowledge sharing)
         I. Assertive (content is a full proposition)
            1. Assertion
            2. Commissive
            3. Declaration
         II. Erotetic (content is a near-proposition)
            1. Polar question (content is a closed near-proposition)
            2. Constituent question (content is an open near-proposition)
   ii. Expressive illocution (for emotion sharing)
      A. Optative
         I. Narrow optative
         II. Imprecative
   iii. Hybrid illocution (for knowledge and emotion sharing)
      A. Expressive assertive
         I. Direct expressive assertion (content is a full proposition)
         II. Indirect assertion (rhetorical question)
            1. Rhetorical polar question (content is a closed near-proposition)
            2. Rhetorical constituent question (content an open near-proposition)
      B. Exclamation (assertion with degree ineffability enrichment)
         I. Overt constituent exclamation
            1. Interrogative constituent exclamation (content is an open near-proposition)
            2. Demonstrative constituent exclamation (content is an open proposition)
         II. Covert constituent exclamation (content is an open proposition)
Notes

1. I assume to hold mutatis mutandis for linguistics what Mayr says about biology: Its exceptional position is based on the fact that living processes obey a dual causality: the ahistorical one of natural laws and the historical one of genetic programs. Thus biology connects in a unique way sciences and humanities. It is one of the most fundamental differences between biology and the so-called exact sciences that theories in biology are based on concepts, whereas in the physical sciences they are based on laws of nature. (“Ihre Sonderstellung liegt darin begründet, daß Lebewesen eine doppelte Kausalität auszeichnet: . . . Hierdurch verbindet die Biologie in einzigartiger Weise Naturwissenschaften und Geisteswissenschaften . . . Es ist einer der fundamentalsten Unterschiede zwischen Biologie und den so genannten exakten Naturwissenschaften, daß Theorien in der Biologie auf Konzepten beruhen, während sie in den physikalischen Wissenschaften auf Naturgesetzen fußen.” [Mayr 2002: 23, 27])
2. For a short overview compare Crystal (2000); some supplementing remarks are added in Zaefferer (2003), a critical review of this book.
3. Cf. the efforts, e.g., of the Hans Rausing Endangered Languages Project at SOAS (University of London), the UK-based Foundation for Endangered Languages, the US-based Endangered Language Fund (ELF), the Electronic Metastructure for Endangered Languages Data project (EMELD), and the Germany-based Documentation of Endangered Languages – Dokumentation bedrohter Sprachen (DOBES) funded by the Volkswagen Foundation.
4. For documentation of the software architecture and the linguistic categories see Nickles (2001) and Peterson (2002), respectively. Their work as well as that of the other project members was supported by grants Za 111/7–5 and Za 111/7–6 from the DFG (German Research Association) to the author, which are gratefully acknowledged. The location of the project’s homepage is: http://www.crg.lmu.de.
5. AVG (Allgemein-Vergleichende Grammatik) is the German name of the CRG project, cf. Peterson (2002).
6. GOLD (General Ontology for Linguistic Description) was first envisioned by Scott Farrar in his (2003) dissertation and introduced by Farrar and Langendoen (cf. Farrar 2003, this vol.; Farrar and Langendoen 2003).
7. Article ‘language.’ Encyclopædia Britannica. Retrieved March 8, 2005, from Encyclopædia Britannica Premium Service. http://www.britannica.com/eb/article?tocId=9108460.
8. By distributed mind I mean a top-down view of the fact that different minds may entertain representations and even have simultaneous activations of the same concepts.
9. So if the recently discovered species of Homo floresiensis had a language, and recent measurements indicate that he had at least a rather well-developed brain and mind (http://www.sciencemag.org/sciencexpress/recent.shtml), then this would be covered by our definition, although it would not be a human language.
10. Tomasello et al. (2005) see the origins of cultural cognition in understanding and sharing intentions and attention (shared intentionality). This is in line with the finding that attention steering ranks fifth after stating, asserting, asking questions, and requesting among illocution types used by preschoolers in a cooperation situation (Zaefferer and Frenz 1979).
11. There is growing evidence for this phylogenetical priority and its presence at least in non-human primates, cf. e.g. Gallese (2003). As for the ontogenetic priority there is a debate whether the prelinguistic ontologies of infants stem from innate ontological knowledge (Soja, Carey, and Spelke 1991) or from early learning (Gentner and Boroditsky 2001). Presumably both positions are partly right.
12. Or sufficiently similar minds.
13. The facial expression of at least the basic emotions tends to converge across cultures. For a critical discussion of the issue cf. Russell (1994).
14. There is growing evidence that the brains our minds run on are ‘pre-wired’ to support the achievement of this task: “The conventional view on intention understanding is that the description of an action and the interpretation of the reason why that action is executed rely on largely different mechanisms. In contrast, the present data show that the intentions behind the actions of others can be recognized by the motor system using a mirror mechanism. Mirror neurons are thought to recognize the actions of others, by matching the observed action onto its motor counterpart coded by the same neurons. The present findings strongly suggest that coding the intention associated with the actions of others is based on the activation of a neuronal chain formed by mirror neurons coding the observed motor act and by ‘logically related’ mirror neurons coding the motor acts that are most likely to follow the observed one, in a given context. To ascribe an intention is to infer a forthcoming new goal, and this is an operation that the motor system does automatically.” (Iacoboni et al. 2005: 5)
15. This is part of the explanation for the higher degree of iconicity to be found in sign languages, e.g., in their so-called classifier predicates (cf. Liddell 2003: 261–316).
16. Readers with an interest in this kind of discussion are referred to the proceedings of the Evolution of Language conferences: Hurford, Studdert-Kennedy, and Knight (1998); Knight, Studdert-Kennedy, and Hurford (2000); Wray (2002); Tallerman (2005).
17. This is also the reason why I am skeptical about the Minimalist Program (Chomsky 1995) as long as it fails to provide a place for (non-zero) optimal redundancy in its concept of optimal design.
18. In the CRG project, the priority of onomasiology was originally reflected in the development of a concepticon, a universal inventory of linguistically codable concepts. But then it turned out that as soon as the rich network of interconceptual relations is also accounted for, such a concepticon turns out to be what people from artificial intelligence call an ontology. This is how the research program called ontolinguistics emerged.
19. Kurt Gödel (1906–1978) was a logician, mathematician, and philosopher of mathematics. Gödel’s most famous works were his incompleteness theorems, the second of which says, roughly, that no formal system which is strong enough to express arithmetic can prove its own consistency unless it is inconsistent.
20. Often also under the label physical versus abstract, e.g. in SUMO (cf. Pease this vol.). Sowa (2000: 67) points out that this distinction goes back at least to Heraclitus’ dichotomy of physis ‘nature’ and logos ‘word, reason’.
21. This is of course inspired by Kant’s ([1781] 2003) idea of the exceptional role of the Formen der Anschauung.
22. This corresponds to Sowa’s (2000: 500–501) distinction between continuants and occurrents.
23. There is no inconsistency if one reads ‘located inside’ as ‘included’ and conceives of inclusion as a weakly ordered relation.
24. This is in line with Metzinger and Gallese (this vol.), who give the following definition: “Actions are a specific subset of goal-directed movements: A series of movements that are functionally integrated with a currently active representation of a goal-state as leading to a reward constitute an action.”
25. For a recent reformulation of his view of the logical dependence of the ontology of social phenomena including language on what he calls the deontology of the corresponding society see Searle (2006).
26. Note that the bottom-up conceptualization of pair objects (sub-individual: glove, super-individual: pair of gloves) is not a universal. There are cultures and languages that prefer the opposite direction, e.g. Hungarian fél szemmel ‘with one eye’, literally ‘with half an eye’ (Bechert 1991: 67).
27. Dowty (1979: 51) uses this lengthy label in order to point at the roots and the history of the classification.
28. Compare also Trautwein (this vol.) and Schalley (this vol.).
29. For a more comprehensive (but still rather general) discussion of modal categories compare Zaefferer (2005).
30. For Krifka (2004) “speech act types are commitment change potentials.”
31. David Kaplan (1999) distinguishes in this context between expressive and descriptive content.
References

Ameka, Felix, Alan Dench, and Nicholas Evans (eds.)
    2006 Catching Language: The Standing Challenge of Grammar Writing. Berlin/New York: Mouton de Gruyter.
Barwise, Jon, and John Etchemendy
    1987 The Liar. An Essay in Truth and Circularity. Oxford: Oxford University Press.
Bechert, Johannes
    1991 The problem of semantic incomparability. In Semantic Universals and Universal Semantics, Dietmar Zaefferer (ed.), 60–71. Berlin/New York: Foris.
Chomsky, Noam
    1995 The Minimalist Program. Cambridge, MA: MIT Press.
Comrie, Bernard
    1998 Ein Strukturrahmen für deskriptive Grammatiken: Allgemeine Bemerkungen. In Deskriptive Grammatik und allgemeiner Sprachvergleich, Dietmar Zaefferer (ed.), 7–16. Tübingen: Niemeyer.
Comrie, Bernard, William Croft, Christian Lehmann, and Dietmar Zaefferer
    1993 A framework for descriptive grammars. In Proceedings of the XVth International Congress of Linguists, Vol. I, André Crochetière, Jean-Claude Boulanger, and Conrad Ouellon (eds.), 159–170. Sainte-Foy, Québec: Les Presses de l’Université Laval.
Comrie, Bernard, and Norval Smith
    1977 Lingua Descriptive Studies: questionnaire. Lingua 42: 1–71.
Crystal, David
    2000 Language Death. Cambridge: Cambridge University Press.
Davidson, Donald
    1963 Actions, reasons and causes. Journal of Philosophy 60: 685–700.
Dowty, David R.
    1979 Word Meaning and Montague Grammar: The Semantics of Verbs and Times in Generative Semantics and Montague’s PTQ. Dordrecht: Reidel.
Farrar, Scott
    2003 An ontology for linguistics on the Semantic Web. Ph.D. diss., Department of Linguistics, University of Arizona.
    this vol. Using ‘Ontolinguistics’ for language description.
Farrar, Scott, and D. Terence Langendoen
    2003 A linguistic ontology for the Semantic Web. GLOT International 7 (3): 97–100.
Gallese, Vittorio
    2003 A neuroscientific grasp of concepts: From control to representation. Philosophical Transactions of the Royal Society B: Biological Sciences 358: 1231–1240.
Gentner, Dedre, and Lera Boroditsky
    2001 Individuation, relational relativity and early word learning. In Language Acquisition and Conceptual Development, Melissa Bowerman and Stephen Levinson (eds.), 215–256. Cambridge: Cambridge University Press.
Hurford, James R., Michael Studdert-Kennedy, and Chris Knight (eds.)
    1998 Approaches to the Evolution of Language: Social and Cognitive Bases. Cambridge: Cambridge University Press.
Hyman, Larry M.
    2004 Why join the Linguistic Society of America? http://www.lsadc.org/.
Iacoboni, Marco, Istvan Molnar-Szakacs, Vittorio Gallese, Giovanni Buccino, John C. Mazziotta, and Giacomo Rizzolatti
    2005 Grasping the intentions of others with one’s own mirror neuron system. PLoS Biology 3 (3): e79.
Kant, Immanuel
    2003 Reprint. Kritik der reinen Vernunft, Ingeborg Heidemann (ed.), Ditzingen: Reclam. Original edition, Riga: Johann Friedrich Hartknoch, 1781.
Kaplan, David
    1999 What is meaning? Explorations in the theory of Meaning as Use. Brief version – Draft 1, ms, UCLA.
Knight, Chris, Michael Studdert-Kennedy, and James R. Hurford (eds.)
    2000 The Evolutionary Emergence of Language: Social Function and the Origins of Linguistic Form. Cambridge: Cambridge University Press.
Krifka, Manfred
    2004 Semantics below and above speech acts. Handout for a talk at Stanford University, April 9, 2004.
Liddell, Scott K.
    2003 Grammar, Gesture, and Meaning in American Sign Language. Cambridge: Cambridge University Press.
Mayr, Ernst
    2002 Die Autonomie der Biologie. Naturwissenschaftliche Rundschau 55 (1): 23–29.
Metzinger, Thomas, and Vittorio Gallese
    this vol. The emergence of a shared action ontology: Building blocks for a theory.
Mosel, Ulrike
    2006 Grammaticography – the art and craft of writing grammars. In Ameka, Dench, and Evans (eds.), 41–68.
Nickles, Matthias
    2001 Systematics – Ein XML-basiertes Internet-Datenbanksystem für klassifikationsgestützte Sprachbeschreibungen. CIS-Bericht–01–129. Universität München: Centrum für Informations- und Sprachverarbeitung. http://www.cis.uni-muenchen.de/CISPublikationen.html.
Pease, Adam
    this vol. Formal representation of concepts: The Suggested Upper Merged Ontology and its use in linguistics.
Peterson, John
    2002 AVG 2.0. Cross-linguistic Reference Grammar. Final Report. CIS-Bericht–02–130. Universität München: Centrum für Informations- und Sprachverarbeitung. http://www.cis.uni-muenchen.de/CISPublikationen.html.
Rosch, Eleanor, Carolyn B. Mervis, Wayne Gray, David M. Johnson, and Penny Boyes-Braem
    1976 Basic objects in natural categories. Cognitive Psychology 8: 382–439.
Russell, James A.
    1994 Is there universal recognition of emotion from facial expression? A review of the cross-cultural studies. Psychological Bulletin 115: 102–141.
Schalley, Andrea C.
    2004 Cognitive Modeling and Verbal Semantics. A Representational Framework Based on UML. (Trends in Linguistics. Studies and Monographs 154.) Berlin/New York: Mouton de Gruyter.
    this vol. Relating ontological knowledge and internal structure of eventity concepts.
Searle, John R.
    1983 Intentionality: An Essay in the Philosophy of Mind. Cambridge: Cambridge University Press.
    2006 Social ontology: Some basic principles. Anthropological Theory 6: 12–29.
Soja, Nancy N., Susan Carey, and Elizabeth S. Spelke
    1991 Ontological categories guide young children’s inductions of word meaning: object terms and substance terms. Cognition 38: 179–211.
Sowa, John F.
    2000 Knowledge Representation. Logical, Philosophical, and Computational Foundations. Pacific Grove: Brooks and Cole.
Tallerman, Maggie (ed.)
    2005 Language Origins. Perspectives on Evolution. Oxford: Oxford University Press.
Tomasello, Michael, Malinda Carpenter, Josep Call, Tanya Behne, and Henrike Moll
    2005 Understanding and sharing intentions: The origins of cultural cognition. Behavioral and Brain Sciences 28: 675–691.
Trautwein, Martin
    this vol. On the ontological, conceptual, and grammatical foundations of verb classes.
Wray, Alison (ed.)
    2002 The Transition to Language. Oxford: Oxford University Press.
Zaefferer, Dietmar
    1997 Neue Technologien in der Sprachbeschreibung. Der Paradigmenwechsel von linearen P-Grammatiken zu vernetzten E-Grammatiken. Zeitschrift für Literaturwissenschaft und Linguistik 106: 76–82.
    2001 Modale Kategorien. In Sprachtypologie und sprachliche Universalien/Language Typology and Linguistic Universals, Martin Haspelmath, Ekkehart König, Wulf Oesterreicher, and Wolfgang Raible (eds.), 784–816. (HSK 20.1.) Berlin/New York: Mouton de Gruyter.
    2003 Besprechung von: David Crystal, Language Death, Cambridge: Cambridge University Press 2000. Linguistische Berichte 196: 479–484.
    2005 A general typology of modal categories. Tidsskrift for Sprogforskning 3 (2), tema “Modality. Linguistic, Philosophical and Logical Aspects of a Universal Category”: 19–49.
    2006a Realizing Humboldt’s dream: Cross-linguistic grammatography as database creation. In Ameka, Dench, and Evans (eds.), 113–135.
    2006b Types, moods, and force potentials. Towards a comprehensive account of German sentence mood meanings. Theoretical Linguistics 32 (3): 335–351.
Zaefferer, Dietmar (ed.)
    1998 Deskriptive Grammatik und allgemeiner Sprachvergleich. Tübingen: Niemeyer.
Zaefferer, Dietmar, and Hans-Georg Frenz
    1979 Sprechakte bei Kindern. Eine empirische Untersuchung zur Entwicklung der sprachlichen Handlungsfähigkeit im Vorschulalter. Linguistik und Didaktik 38: 91–132.
Part III: Concepts with closed-class coding
The representation of spatial structure in spoken and signed language: A neural model

Leonard Talmy

1. Introduction¹
This paper presents and relates novel perspectives on the representation of spatial structure in three domains: in spoken language, in signed language, and in the neural underpinning of these two language modalities. The analysis of spatial schemas in spoken language has now progressed to where we can catalog most of the basic elements that make them up and observe how combinations of such basic elements behave. We can now also observe the ways in which the representation of spatial structure in signed language differs from that in spoken language. And we can accordingly propose a new neural model of our language capacity that accommodates both the similarities and the differences found in these two systems of spatial representation. Linguistic research to date has determined many of the factors that structure the spatial schemas found across spoken languages (e.g. Gruber 1965; Fillmore 1968; Leech 1969; Clark 1973; Bennett 1975; Herskovits 1982; Jackendoff 1983; Zubin and Svorou 1984; as well as myself, Talmy 1983, 2000a, 2000b). It is now feasible to integrate these factors and to determine the comprehensive system they constitute for spatial structuring in spoken language. This system is characterized by several features. With respect to constituency, there is a relatively closed universally available inventory of fundamental spatial elements that in combination form whole schemas. There is a relatively closed set of categories that these elements appear in. And there is a relatively closed small number of particular elements in each category, hence, of spatial distinctions that each category can ever mark. With respect to synthesis, selected elements of the inventory are combined in specific arrangements to make up the whole schemas represented by closed-class spatial forms. Each such whole schema that a closed-class form represents is thus a “pre-packaged” bundling together of certain elements in a particular arrangement. 
Each language has in its lexicon a relatively closed set of such pre-packaged schemas (larger than that of spatial closed-class forms, due to polysemy) that a speaker must select among in depicting a spatial scene. Finally, with respect to the whole schemas themselves, these schemas can undergo a certain set of processes that extend or deform them. Such processes are perhaps part of the overall system so that a language’s relatively closed set of spatial schemas can fit more spatial scenes.

An examination of signed language² shows that its structural representation of space systematically differs from that in spoken language in the direction of what appear to be the structural characteristics of scene parsing in visual perception. Such differences include the following: Signed language can mark finer spatial distinctions with its inventory of more structural elements, more categories, and more elements per category. It represents many more of these distinctions in any particular expression. It also represents these distinctions independently in the expression, not bundled together into pre-packaged schemas. And its spatial representations are largely iconic with visible spatial characteristics.

Our findings point to a new view of the neural implementation of language. They suggest that instead of some discrete whole-language module, spoken language and signed language are both based on some more limited core linguistic system responsible for their commonalities. This system then connects with different further neural subsystems for the full functioning of the two different language modalities.

When formal linguistic investigation of signed language began several decades ago, it was important to establish in the context of that time that signed language was in fact a full genuine language, and the way to do this, it seemed, was to show that it fit the prevailing model of language, the Chomskyan-Fodorian language module. Since then, however, evidence has been steadily accruing that signed language does diverge in various respects from spoken language. The modern response to such observations – far from once again calling into question whether signed language is a genuine language – should be to rethink what the general nature of language is. Part of that enterprise is undertaken in this paper.

2. Fundamental space-structuring elements and categories in spoken language
An initial main finding emerges from analysis of the spatial schemas expressed by closed-class (grammatical) forms across spoken languages. There is a relatively closed and universally available inventory of fundamental conceptual elements that recombine in various patterns to constitute those spatial
schemas. These elements fall within a relatively closed set of categories, with a relatively closed small number of elements per category.

2.1. The target of analysis
As background to this finding, spoken languages universally exhibit two different subsystems of meaning-bearing forms. One is the “open-class” or “lexical” subsystem, comprised of elements that are great in number and readily augmented – typically, the roots of nouns, verbs, and adjectives. The other is the “closed-class” or “grammatical” subsystem, consisting of forms that are relatively few in number and difficult to augment – including such bound forms as inflections and such free forms as prepositions and conjunctions. As argued in Talmy (2000a, Chapter 1), these subsystems basically perform two different functions: open-class forms largely contribute conceptual content, while closed-class forms determine conceptual structure. Accordingly, our discussion focuses on the spatial schemas represented by closed-class forms so as to examine the concepts used by language for structuring purposes.

Across spoken languages, only a portion of the closed-class subsystem regularly represents spatial schemas. We can identify the types of closed-class forms in this portion and group them according to their kind of schema.

a. Types of closed-class forms with schemas for paths or sites include:
   i. forms in construction with a nominal, such as prepositions like English across (as in across the field) or noun affixes like the Finnish illative suffix -:n ‘into’, as well as prepositional complexes such as English in front of or Japanese constructions with a “locative noun” like ue ‘top surface’ (as in teeburu no ue ni ‘table GEN top at’ = “on the table”);
   ii. forms in construction with a verb, such as verb satellites like English out, back and apart (as in They ran out / back / apart);
   iii. deictic determiners and adverbs such as English this and here;
   iv. indefinites, interrogatives, relatives, etc. (such as English everywhere / whither / wherever);
   v. qualifiers such as English way and right (as in It’s way / right up there);
   vi. adverbials like English home (as in She isn’t home).

b. Types of closed-class forms with schemas for the spatial structure of objects include:
   i. forms modifying nominals such as markers for plexity or state of boundedness, like English -s for multiplexing (as in birds) or -ery for debounding (as in shrubbery);
   ii. numeral classifiers like Korean chang ‘planar object’;
   iii. forms in construction with the verb, such as some Atsugewi Cause prefixes, like cu- ‘as the result of a linear object moving axially into the Figure’.

c. Sets of closed-class forms that represent a particular component of a spatial event of motion/location (see Talmy 2000b, Chapters 1 and 2) include:
   i. the Atsugewi verb-prefix set that represents different Figures;
   ii. the Atsugewi verb-suffix set that represents different Grounds (together with Paths);
   iii. the Atsugewi verb-prefix set that represents different Causes;
   iv. the Nez Perce verb-prefix set that represents different Manners.

2.2. Determining the elements and categories
A particular methodology is used to determine fundamental spatial elements in language. One starts with any closed-class spatial morpheme in any language, considering the full schema that it expresses and a spatial scene that it can apply to. One then determines any factor one can change in the scene so that the morpheme no longer applies to it. Each such factor must therefore correspond to an essential element in the morpheme’s schema. To illustrate, consider the English preposition across and the scene it refers to in The board lay across the road. Let us here grant the first two elements in the across schema (demonstrated elsewhere): (1) a Figure object (here, the board) is spatially related to a Ground object (here, the road); and (2) the Ground is ribbonal – a plane with two roughly parallel line edges that are as long as or longer than the distance between them. The remaining elements can then be readily demonstrated by the methodology. Thus, a third element is that the Figure is linear, generally bounded at both ends. If the board were instead replaced by a planar object, say, some wall siding, one could no longer use the original across preposition but would have to switch to the schematic domain of another preposition, that of over, as in The wall siding lay over the road. A fourth element is that the axes of the Figure and of the Ground are roughly
perpendicular. If the board were instead aligned with the road, one could no longer use the original across preposition but would again have to switch to another preposition, along, as in The board lay along the road. Additionally, a fifth element of the across schema is that the Figure is parallel to the plane of the Ground. In the referent scene, if the board were tilted away from parallel, one would have to switch to some other locution such as The board stuck into / out of the road. A sixth element is that the Figure is adjacent to the plane of the Ground. If the board were lowered or raised away from adjacency, even while retaining the remaining spatial relations, one would need to switch to locutions like The board lay (buried) in the road / The board was (suspended) above the road. A seventh element is that the Figure’s length is at least as great as the Ground’s width. If the board were replaced by something shorter, for example, a baguette, while leaving the remaining spatial relations intact, one would have to switch from across to on, as in The baguette lay on the road. An eighth element is that the Figure touches both edges of the Ground. If the board in the example retained all its preceding spatial properties but were shifted axially, one would have to switch to some locution like One end of the board lay over one edge of the road. Finally, a ninth element is that the axis of the Figure is horizontal (the plane of the Ground is typically, but not necessarily, horizontal). Thus, if one changes the original scene to that of a spear hanging on a wall, one can use across if the spear is horizontal, but not if it is vertical, as in The spear hung across the wall / The spear hung up and down on the wall. 
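The element-by-element test just applied to across can be restated as a small checklist. The following Python sketch is only an illustration under stated assumptions: the identifiers (ACROSS_SCHEMA, violated_elements, across_applies, alternatives, and all feature names) are hypothetical labels invented here, not notation from the analysis, and the pairing of each element with a replacement locution merely records the English examples given above.

```python
# Hypothetical encoding of the nine elements of the English "across" schema.
# Each entry pairs an element with the locution the text switches to when
# that element alone is altered (None where the text gives no alternative).
ACROSS_SCHEMA = [
    ("figure_related_to_ground", None),
    ("ground_is_ribbonal", None),
    ("figure_is_linear", "over"),
    ("figure_axis_perpendicular_to_ground_axis", "along"),
    ("figure_parallel_to_ground_plane", "stuck into / out of"),
    ("figure_adjacent_to_ground_plane", "lay in / was above"),
    ("figure_length_at_least_ground_width", "on"),
    ("figure_touches_both_ground_edges", "one end ... lay over one edge of"),
    ("figure_axis_horizontal", "hung up and down on"),
]


def violated_elements(scene):
    """Return the schema elements that do not hold in the scene."""
    return [name for name, _ in ACROSS_SCHEMA if not scene.get(name, False)]


def across_applies(scene):
    """'across' applies only if every element of its schema holds."""
    return not violated_elements(scene)


def alternatives(scene):
    """Locutions recorded in the text for each violated element."""
    lookup = dict(ACROSS_SCHEMA)
    return [lookup[name] for name in violated_elements(scene) if lookup[name]]


# The referent scene of "The board lay across the road": all elements hold.
board_across_road = {name: True for name, _ in ACROSS_SCHEMA}

# Minimal variation: align the board with the road instead.
board_along_road = dict(board_across_road)
board_along_road["figure_axis_perpendicular_to_ground_axis"] = False

print(across_applies(board_across_road))   # True
print(across_applies(board_along_road))    # False
print(alternatives(board_along_road))      # ['along']
```

Changing one feature at a time mirrors the methodology: a factor counts as an element of the schema exactly when altering it alone makes across_applies return False.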
Thus, from this single example, the methodology shows that at least the following elements figure in closed-class spatial schemas: a Figure and a Ground, a point, a line, a plane, a boundary (a point as boundary to a line, a line as boundary to a plane), parallelness, perpendicularity, horizontality, adjacency (contact), and relative magnitude. In the procedure of systematically testing candidate factors for their relevance, the elements just listed have proved to be essential to the selected schema and hence, to be in the inventory of fundamental spatial elements. But it is equally necessary to note candidates that do not prove out, so as to know which potential spatial elements do not serve a structuring function in language. In the case of across, for example, one can probe whether the Figure, like the board in the referent scene, must be planar – rather than simply linear – and coplanar with the plane of the Ground. It can be seen, though, that this is not an essential element to the across schema, since this factor can be altered in the scene by standing the board on edge without any need to alter the preposition, as in The board lay flat / stood on edge across the road.
Thus, coplanarity is not shown by across to be a fundamental spatial element. However, it does prove to be so in other schemas, and so in the end must be included in the inventory. This is seen for one of the schemas represented by English over, as in The tapestry hung over the wall. Here, both the Figure and Ground must be planes and coplanar with each other. If the tapestry here were changed to something linear, say, a string of beads, it is no longer appropriate to use over but only something like against, as in The string of beads hung *over / against the wall. Now, another candidate element – that the Figure must be rigid, like the board in the scene – can be tested and again found to be inessential to the across schema, since a flexible linear object can be substituted for the board without any need to change the preposition, as seen in The board / The cable lay across the road. Here, however, checking this candidate factor across numerous spatial schemas in many languages might well never yield a case in which it does figure as an essential element and so would be kept off the inventory. This methodology affords a kind of existence proof: it can demonstrate that some element does occur in the universally available inventory of structural spatial elements since it can be seen to occur in at least one closed-class spatial schema in at least one language. The procedure is repeated numerous times across many languages to build up a sizable inventory of elements essential to spatial schemas. The next step is to discern whether the uncovered elements comprise particular structural categories and, if so, to determine what these categories are. It can be observed that for certain sets of elements, the elements in a set are mutually incompatible – only one of them can apply at a time at some point in a schema. Such sets are here taken to be basic spatial categories. 
Along with their members, such categories are also part of language’s fundamental conceptual structuring system for space. A representative sample of these categories is presented next. It will be seen that these categories generally have a relatively small membership. This finding depends in part on the following methodological principles. An element proposed for the inventory should be as coarse-grained as possible – that is, no more specific than is warranted by cross-schema analysis. Correlatively, in establishing a category, care must be taken that it include only the most generic elements that have actually been determined – that is, that its membership have no finer granularity than is warranted by the element-abstraction procedure. For example, the principle of mutual incompatibility yields a spatial category of “relative orientation” between two lines
or planes, a category with perhaps only two member elements (both already seen in the across schema): approximately parallel and approximately perpendicular. Some evidence additionally suggests an intermediary ‘oblique’ element as a third member of the category. Thus, some English speakers may distinguish a more perpendicular sense from a more oblique sense, respectively, for the two verb satellites out and off, as in A secondary pipe branches out / off from the main sewer line. In any case, though, the category would have no more than these two or three members. Although finer degrees of relative orientation can be distinguished by other cognitive systems, say, in visual perception and in motor control, the conceptual structuring subsystem of language does not include anything finer than the two- or three-way distinction. The procedures of schema analysis and cross-schema comparison, together with the methodological principles of maximum granularity for elements and for category membership, can lead to a determination of the number of structurally distinguished elements ever used in language for a spatial category.

2.3. Sample categories and their member elements
The fundamental categories of spatial structure in the closed-class subsystem of spoken language fall into three classes according to the aspect of a spatial scene they pertain to: the segmentation of the scene into individual components, the properties of an individual component, and the relations of one such component to another. In a fourth class are categories of non-geometric elements frequently found in association with spatial schemas. A sampling of categories and their member elements from each of these four classes is presented next. Category names are enclosed by quote marks here and throughout. The examples provided here are primarily drawn from English but can be readily multiplied across a diverse range of languages (see Talmy 2000a, Chapter 3).

2.3.1. Categories pertaining to scene segmentation
The class designated as scene segmentation may include only one category, that of “major components of a scene”, and this category may contain only three member elements: the Figure, the Ground, and a Secondary Reference Object. Figure and Ground were already seen for the across schema. Schema
comparison shows the need to recognize a third scene component, the Secondary Reference Object – in fact, two forms of it: encompassive of or external to the Figure and Ground. The English preposition near, as in The lamp is near the TV, specifies the location of the Figure (the lamp) only with respect to the Ground (the TV). But localizing the Figure with the preposition above, as in The lamp is above the TV, requires knowledge not only of where the Ground object is, but also of the encompassive earth-based spatial grid, in particular, of its vertical orientation. Thus, above requires recognizing three components within a spatial scene, a Figure, a Ground, and a Secondary Reference Object of the encompassive type. Comparably, the schema of past in John is past the border only relates John as Figure to the border as Ground. One could say this sentence on viewing the event through binoculars from either side of the border. But John is beyond the border can be said only by someone on the side of the border opposite John, hence the beyond schema establishes a perspective point at that location as a Secondary Reference Object – in this case, of the external type.

2.3.2. Categories pertaining to an individual scene component
A number of categories pertain to the characteristics of an individual spatial scene component. This is usually one of the three major components resulting from scene segmentation – the Figure, Ground, or Secondary Reference Object – but it could be others, such as the path line formed by a moving Figure. One such category is that of “dimension” with four member elements: zero dimensions for a point, one for a line, two for a plane, and three for a volume. Some English prepositions require a Ground object schematizable for only one of the four dimensional possibilities. Thus, the schema of the preposition near as in near the dot requires only that the Ground object be schematizable as a point. Along, as in along the trail, requires that the Ground object be linear. Over as in a tapestry over a wall requires a planar Ground. And throughout, as in cherries throughout the jello, requires a volumetric Ground. A second category is that of “number” with perhaps four members: one, two, several, and many. Some English prepositions require a Ground comprising objects in one or another of these numbers. Thus, near requires a Ground consisting of just one object, between of two objects, among of several objects, and amidst of numerous objects, as in The basketball lay near the boulder / between the boulders / among the boulders / amidst the cornstalks. The category of number appears to lack any further members – that
is, closed-class spatial schemas in languages around the world seem never to incorporate any other number specifications – such as ‘three’ or ‘even-numbered’ or ‘too many’. A third category is that of “motive state”, with two members: motion and stationariness. Several English prepositions mark this distinction for the Figure. Thus, in one of its senses, at requires a stationary Figure, as in I stayed / *went at the library, while into requires a moving Figure, as in I went / *stayed into the library. Other prepositions mark this same distinction for the Ground object (in conjunction with a moving Figure). Thus, up to requires a stationary Ground (here, the deer), as in The lion ran up to the deer, while after requires a moving Ground as in The lion ran after the deer. Apparently no spatial schemas mark such additional distinctions as motion at a fast vs. slow rate, or being located at rest vs. remaining located fixedly. A fourth category is that of “state of boundedness” with two members: bounded and unbounded. The English preposition along requires that the path of a moving Figure be unbounded, as shown by its compatibility with a temporal phrase in for but not in, as in I walked along the pier for 10 minutes / *in 20 minutes. But the spatial locution the length of requires a bounded path, as in I walked the length of the pier in 20 minutes / *for 10 minutes.3 While some spatial schemas have the bounded element at one end of a line and the unbounded element at the other end, apparently no spatial schema marks any distinctions other than the two cited states of boundedness. For example, there is no cline of gradually increasing boundedness, nor a gradient transition, although just such a “clinal boundary” appears elsewhere in our cognition, as in geographic perception or conception, e.g., in the gradient demarcation between full forest and full meadowland (Mark and Smith 2004). 
Continuing the sampling of this class, a fifth category is that of “directedness” with two members: basic and reversed. A schema can require one or the other of these elements for an encompassive Ground object, as seen for the English prepositions in The axon grew along / against the chemical gradient, or for the Atsugewi verb satellites for (moving) ‘downstream’ and ‘upstream’. Or it can require one of the member elements for an encompassive Secondary Reference Object (here, the line), as in Mary is ahead of / behind John in line. A sixth category is “type of geometry” with two members: rectilinear and radial. This category can apply to an encompassive Secondary Reference Object to yield reference frames of the two geometric types. Thus, in a subtle
effect, the English verb satellite away, as in The boat drifted further and further away / out from the island, tends to suggest a rectilinear reference frame in which one might picture the boat moving rightward along a corridor or sea lane with the island on the left (as if along the x-axis of a Cartesian grid). But out tends to suggest a radial reference frame in which the boat is seen moving from a center point along a radius through a continuum of concentric circles. In the type-of-geometry category, the radial-geometry member can involve motion about a center, along a radius, or along a periphery. The first of these is the basis for a further category, that of “orientation of spin axis”, with two members: vertical and horizontal. The English verb satellites around and over specify motion of the Figure about a vertical or horizontal spin axis, respectively, as in The pole spun around / toppled over and in I turned the pail around / over. An eighth category is “phase of matter”, with three main members, solid, liquid, and empty space, and perhaps a fourth member, fire. Thus, among the dozen or so Atsugewi verb satellites that subdivide the semantic range of English into plus a Ground object, the suffix -ik’s specifies motion horizontally into solid matter (as chopping an ax into a tree trunk), -ic’t specifies motion into liquid, -ipsnu specifies motion into the empty space of a volumetric enclosure, and -caw specifies motion into a fire. The phase of matter category even figures in some English prepositions, albeit covertly. Thus, in can apply to a Ground object of any phase of matter, whereas inside can apply only to one with empty space, as seen in The rock is in / inside the box; in / *inside the ground; in / *inside the puddle of water; in / *inside the fire. A final category in this sampled series is that of “state of consolidation” with apparently two members: compact (precisional) and diffuse (approximative). 
The English locative prepositions at and around distinguish these two concepts, respectively, for the area surrounding a Ground object, as in The other hiker will be waiting for you at / around the landmark. The two deictic adverbs in The hiker will be waiting for you there / thereabouts mark the same distinction (unless there is better considered neutral to the distinction). And in Malagasy (Imai 2002), two locative adverbs for ‘here’ mark this distinction, with eto for ‘here within this bounded region’, typically indicated with a pointing finger, and ety for ‘here spread over this unbounded region’, typically indicated with a sweep of the hand. In addition to this sampling, some ten or so further categories pertaining to properties of an individual schema component, each category with a small number of fixed contrasts, can be readily identified.
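As an illustration only, the closed membership of two of the sampled categories, “dimension” and “number”, together with the Ground requirements of the English prepositions cited for them, can be sketched in Python. The names `Dimension`, `Number`, `GROUND_DIMENSION`, `GROUND_NUMBER`, `ground_fits`, and `prepositions_for_number` are assumptions introduced for this sketch and are not part of the analysis itself; the tables cover only the senses exemplified above.

```python
from enum import Enum


class Dimension(Enum):
    """The four member elements of the "dimension" category."""
    POINT = 0
    LINE = 1
    PLANE = 2
    VOLUME = 3


class Number(Enum):
    """The (perhaps) four member elements of the "number" category."""
    ONE = "one"
    TWO = "two"
    SEVERAL = "several"
    MANY = "many"


# Ground requirements, limited to the English examples given in the text
# (near the dot, along the trail, a tapestry over a wall, throughout the jello).
GROUND_DIMENSION = {
    "near": Dimension.POINT,
    "along": Dimension.LINE,
    "over": Dimension.PLANE,
    "throughout": Dimension.VOLUME,
}

# near / between / among / amidst, as in the boulder and cornstalk examples.
GROUND_NUMBER = {
    "near": Number.ONE,
    "between": Number.TWO,
    "among": Number.SEVERAL,
    "amidst": Number.MANY,
}


def ground_fits(preposition, dimension):
    """True if the Ground, schematized with this dimension, suits the preposition."""
    return GROUND_DIMENSION[preposition] is dimension


def prepositions_for_number(number):
    """List the cited prepositions whose Ground comprises this number of objects."""
    return [p for p, required in GROUND_NUMBER.items() if required is number]
```

Because each category is an enumeration with a fixed membership, a specification such as ‘three’ or ‘even-numbered’ simply has no member to map to, which mirrors the observation that closed-class schemas seem never to incorporate such specifications.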
2.3.3.
Categories pertaining to the relation of one scene component to another
Another class of categories pertains to the relations that one scene component can bear to another. One such category was described earlier, that of “relative orientation”, with two or three members: parallel, perpendicular, and perhaps oblique. A second such category is that of “degree of remove”, of one scene component from another. This category appears to have four or five members, two with contact between the components – coincidence and adjacency – and two or three without contact – proximal, perhaps medial, and distal remove. Some pairwise contrasts in English reveal one or another of these member elements for a Figure relating to a Ground. Thus, the locution in the front of, as in The carousel is in the front of the fairground, expresses coincidence, since the carousel as Figure is represented as being located in a part of the fairground as Ground. But in front of (without a the) as in The carousel is in front of the fairground, indicates proximality, since the carousel is now located outside the fairground and near it but not touching it. The distinction between proximal and distal can be teased out by noting that in front of can only represent a proximal but not a distal degree of remove, as seen in the fact that one can say The carousel is 20 feet in front of the fairground, but not, *The carousel is 20 miles in front of the fairground, whereas above allows both proximal and distal degrees of remove, as seen in The hawk is 1 foot / 1 mile above the table. The distinction between adjacency and proximality is shown by the prepositions on and over, as in The fly is on / over the table. Need for a fifth category member of ‘medial degree of remove’ might come from languages with a ‘here / there / yonder’ kind of distinction in their deictic adverbs or demonstratives. A third category in this series is that of “degree of dispersion” with two members: sparse and dense. 
To begin with, English can represent a set of multiple Figures, say, 0-dimensional peas, as adjacent to or coincident with a 1-, 2-, or 3-dimensional Ground, say, with a knife, a tabletop, or aspic, in a way neutral to the presence or absence of dispersion, as in There are peas on the knife; on the table; in the aspic. But in representing dispersion as present, English can (or must) indicate its degree. Thus, a sparse degree of dispersion is indicated by the addition of the locution here and there, optionally together with certain preposition shifts, as in There are peas here and there on / along the knife; on / over the table; in the aspic. And for a dense degree of dispersion, English has the three specialized forms all along, all over and
throughout, as seen in There are peas all along the knife; all over the table; throughout the aspic. A fourth category is that of “path contour” with perhaps some four members: straight, arced, circular, and meandering. Some English prepositions require one or another of these contour elements for the path of a Figure moving relative to a Ground. Thus, across indicates a straight path, as seen in I drove across the plateau / *hill, while over – in its usage referring to a single path line – indicates an arced contour, as in I drove over the hill / *plateau. In one of its senses, around indicates a roughly circular path, as in I walked around the maypole, and about indicates a meandering contour, as in I walked about the town. Some ten or so additional categories for relating one scene component to another, again each with its own small number of member contrasts, can be readily identified.
2.3.4.
Non-geometric categories
All the preceding elements and their categories have broadly involved geometric characteristics of spatial scenes or the objects within them – that is, they have been genuinely spatial. But a number of non-geometric elements are recurrently found in association with otherwise geometric schemas. One category of such elements is that of “force dynamics” (see Talmy 2000a, Chapter 7.) with two members: present and absent. Thus, geometrically, the English prepositions on and against both represent a Figure in adjacent contact with a Ground, but in addition, on indicates that the Figure is supported against the pull of gravity through that contact while against indicates that it is not, as seen in The poster is on / *against the wall and The floating helium balloon is against / *on the wall. Cutting the conceptualization of force somewhat differently (Melissa Bowerman, pers. comm.), the preposition op in the Dutch of the Netherlands indicates a Figure supported comfortably in a natural rest state through its contact with a Ground, whereas aan indicates that the Figure is being actively maintained against gravity through contact with the Ground, so that flesh is said to be “op” the bones of a live person but “aan” the bones of a dead person. A second non-geometric category is that of “accompanying cognitive/affective state”, though its extent of membership is not clear. One recurrent member, however, is the attitude toward something that it is unknown, mysterious, or risky. Perhaps in combination with elements of inaccessibility
or non-visibility, this category member is associated with the Figure’s location in the otherwise spatial indications of the English preposition beyond, whereas it is absent from the parallel locution on the other side of, as in He is beyond / on the other side of the border (both these locutions – unlike past seen above – are otherwise equivalent in establishing a viewpoint location as an external Secondary Reference Object). A third non-geometric category – in the class that relates one scene component to another – is that of “relative priority”, with two members: coequal and main/ancillary. The English verb satellites together and along both indicate joint participation, as seen in I jog together / along with him. But together indicates that the Figure and the Ground are coequal partners in the activity, whereas along indicates that the Figure entity is ancillary to the Ground entity, who would be assumed to engage in the activity even if alone (see Talmy 2000b, Chapter 3.).
2.4.
Properties of the inventory
By our methodology, the universally available inventory of structural spatial elements includes all elements that appear in at least one closed-class spatial schema in at least one language. These elements may indeed be equivalent in their sheer availability for use in schemas. But beyond that, they appear to differ in their frequency of occurrence across schemas and languages, ranging from very common to very rare. Accordingly, the inventory of elements – and perhaps also that of categories – may have the property of being hierarchical, with entries running from the most to the least frequent. Such a hierarchy suggests asking whether the elements in the inventory, the categories in the inventory, and the elements in each category form fully closed memberships. That is, does the hierarchy end at a sharp lower boundary or trail off indefinitely? With many schemas and languages already examined, our sampling method may have yielded all the commoner elements and categories, but as the process slows down in the discovery of the rarer forms, will it asymptotically approach some complete constituency and distinctional limit in the inventory, or will it be able to go on uncovering sporadic novel forms as they develop in the course of language change? The latter seems likelier. Exotic elements with perhaps unique occurrence in one or a few schemas in just one language can be noted, including in English. Thus, in referring to location at the interior of a wholly or partly enclosed
vehicle, the prepositions in and on distinguish whether the vehicle lacks or possesses a walkway. Thus, one is in a car but on a bus, in a helicopter but on a plane, in a grain car but on a train, and in a rowboat but on a ship. Further, Fillmore has observed that this on also requires that the vehicle be currently in use as transport: The children were playing in / *on the abandoned bus in the junkyard. Thus, schema analysis in English reveals the element “(partly) enclosed vehicle with a walkway currently in use as transport”. This is surely one of the rarer elements in schemas around the world, and its existence, along with that of various others that can be found, suggests that indefinitely many more of them can sporadically arise. In addition to being only relatively closed at its hierarchically lower end, the inventory may include some categories whose membership seems not to settle down to a small fixed set. One such category may be that of “intrinsic parts”. Frequently encountered are the five member elements ‘front’, ‘side’, ‘back’, ‘top’, and ‘bottom’, as found in the English prepositions in The cat lay before / beside / behind / atop / beneath the TV. But languages like Mixtec seem to distinguish a rather different set of intrinsic parts in their spatial schemas (Brugmann and Macaulay 1986), while Makah distinguishes many more and finer parts, such as with its verb suffixes for ‘at the ankle’ and ‘at the groin’ (Matthew Davidson, pers. comm.). Apart from any such fuzzy lower boundary or non-coalescing categories, there does appear to exist a graduated inventory of basic spatial elements and categories that is universally available and, in particular, is relatively closed. Bowerman has raised the main challenge to this notion (see, e.g., Bowerman 1989). 
She notes, for example, that at the same time that children acquiring English learn its in/on distinction, children acquiring Korean learn its distinction between kkita ‘put [Figure] in a snug fit with [Ground]’ and nehta ‘put [Figure] in a loose fit with [Ground]’. She argues that since the elements ‘snug fit’ and ‘loose fit’ are presumably rare among spatial schemas across languages, they do not come from any preset inventory, one that might plausibly be innate, but rather are learned from the open-ended semantics of the adult language. My reply is that the spatial schemas of genuinely closed-class forms in Korean may well still be built from the proposed inventory elements, and that the forms she cites are actually open-class verbs. Open-class semantics – whether for space or other domains – seems to involve a different cognitive subsystem, drawing from finer discriminations within a broader perceptual / conceptual sphere. The Korean verbs are perhaps learned at the same age as English space-related open-class verbs like squeeze. Thus, English-acquiring children probably understand that squeeze involves centripetal pressure from encircling or bi-/multi-laterally placed Antagonists (typically the arm(s) or hand(s)) against an Agonist that resists the pressure but yields down to some smaller compass where it blocks further pressure, and hence that one can squeeze a teddy bear, a tube of toothpaste, or a rubber ball, but not a piece of string or sheet of paper, juice or sugar or the air, a tabletop or the corner of a building. Thus, Bowerman’s challenge may be directed at the wrong target, leaving the proposed roughly preset inventory of basic spatial building blocks intact.
2.5.
Basic elements assembled into whole schemas
The procedure so far has been analytic, starting with the whole spatial schemas expressed by closed-class forms and abstracting from them an inventory of fundamental spatial elements. But the investigation must also include a synthetic procedure: examining the ways in which individual spatial elements are assembled to constitute whole schemas. Something of such an assembly was implicit in the initial discussion of the across schema. But an explicit example here can better illustrate this part of the investigation. Consider the schema represented by the English preposition past as in The ball sailed past my head at exactly 3 PM. This schema is built out of the following fundamental spatial elements (from the indicated categories) in the indicated arrangements and relationships: There are two main scene components (members of the “major scene components” category), a Figure and a Ground (here, the ball and my head, respectively). The Figure is schematizable as a 0-dimensional point (a member element of the “dimension” category). This Figure point is moving (a member element of the “motive state” category). Hence it forms a 1-dimensional line (a member of the “dimension” category). This line constitutes the Figure’s “path”. The Ground is also schematizable as a 0-dimensional point (a member of the “dimension” category). There is a point P at a proximal remove (a member of the “degree of remove” category) from the Ground point, forming a 1-dimensional line with it (a member of the “dimension” category). This line is parallel (a member of the “relative orientation” category) to the horizontal plane (a member of the “intrinsic parts” category) of the earth-based grid (a member of the “major scene components” category). The Figure’s path is perpendicular (a member of the “relative orientation” category) to this line. The Figure’s path is also
parallel to the horizontal plane of the earth-based grid. If the Ground object has a front, side, and back (members of the “intrinsic parts” category), then point P is proximal to the side part. A non-boundary point (a member of the “state of boundedness” category) of the Figure’s path becomes coincident (a member of the “degree of remove” category) with point P at a certain point of time. Note that here the Figure’s path must be specified as passing through a point proximal to the Ground because if it instead passed through the Ground point, one would switch from the preposition past to into, as in The ball sailed into my head, and if it instead passed through some distal point, one might rather say something like The ball sailed along some ways away from my head. And the Figure’s path must be specified both as horizontal and as located at the side portion of the Ground because, for example here, if the ball were either falling vertically or traveling horizontally at my front, one would no longer say that it sailed “past” my head. The least understood aspect of the present investigation is what well-formedness conditions, if any, may govern the legality of such combinations. As yet, no obvious principles based, say, on geometric simplicity, symmetry, consistency, or the like are seen to control the patterns in which basic elements assemble into whole schemas. On the one hand, some seemingly byzantine combinations – like the schemas seen above for across and past – occur with some regularity across languages. On the other hand, much simpler combinations seem never to occur as closed-class schemas. For example, one could imagine assembling elements into the following schema: down into a surround that is radially proximal to a center point. One could even invent a preposition apit to represent this schema. This could then be used, say, in I poured water apit my house to refer to my pouring water down into a nearby hole dug in the field around my house. 
But such schemas are not found. Similarly, a number of schematic distinctions in, for example, the domain of rotation are regularly marked by signed languages, as seen below, and could readily be represented with the inventory elements available to spoken languages, yet they largely do not occur. It could be argued that the spoken language schemas are simply the spatial structures most often encountered in everyday activity. But that would not explain why the additional sign-language schemas – presumably also reflective of everyday experience – do not show up in spoken languages. Besides, the different sets of spatial schemas found in different spoken languages are diverse enough from each other that arguing on the basis of the determinative force of everyday experience is problematic. Something else is at work but it is not yet clear what that is.
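The assembly of the past schema described in this section can be laid out as data, purely as a reading aid. The sketch below lists each element together with the category the text assigns it, in the order of introduction, and then groups the elements by category. The names `PAST_SCHEMA` and `elements_by_category` and the short element labels are assumptions made for this illustration; only category assignments stated explicitly in the text are included.

```python
from collections import defaultdict

# (element, category) pairs for the schema of English "past",
# in the order in which the text introduces them.
PAST_SCHEMA = [
    ("Figure", "major scene components"),
    ("Ground", "major scene components"),
    ("Figure as 0-dimensional point", "dimension"),
    ("Figure moving", "motive state"),
    ("Figure's path as 1-dimensional line", "dimension"),
    ("Ground as 0-dimensional point", "dimension"),
    ("point P at proximal remove from Ground", "degree of remove"),
    ("line from P to Ground as 1-dimensional line", "dimension"),
    ("P-Ground line parallel to horizontal plane", "relative orientation"),
    ("horizontal plane", "intrinsic parts"),
    ("earth-based grid", "major scene components"),
    ("Figure's path perpendicular to P-Ground line", "relative orientation"),
    ("side part of Ground", "intrinsic parts"),
    ("non-boundary point of Figure's path", "state of boundedness"),
    ("path point coincident with P", "degree of remove"),
]


def elements_by_category(schema):
    """Group a schema's elements under the categories they are drawn from."""
    grouped = defaultdict(list)
    for element, category in schema:
        grouped[category].append(element)
    return dict(grouped)


if __name__ == "__main__":
    for category, elements in elements_by_category(PAST_SCHEMA).items():
        print(f"{category}: {len(elements)} element(s)")
```

Laid out this way, the single preposition draws on seven categories, with “dimension” alone contributing four elements. The listing records which elements co-occur; it does not attempt to state the well-formedness conditions on such combinations, which the text identifies as not yet understood.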
2.6.
Properties and processes applying to whole spatial schemas
It was just seen that selected elements of the inventory are combined in specific arrangements to make up the whole schemas represented by closed-class spatial forms. Each such whole schema is thus a “pre-packaged” bundling together of certain elements in a particular arrangement. Each language has in its lexicon a relatively closed set of such pre-packaged schemas – a set larger than that of its spatial closed-class forms, because of polysemy. A speaker of the language must select among these schemas in depicting a spatial scene. We now observe that such schemas, though composite, have a certain unitary status in their own right, and that certain quite general properties and processes can apply to them. In particular, certain properties and processes allow a schema represented by a closed-class form to generalize to a whole family of schemas. In the case of a generalizing property, all the schemas of a family are of equal priority. On the other hand, a generalizing process acts on a schema that is somehow basic, and either extends or deforms it to yield nonbasic schemas (see Talmy 2000a, Chapters 1. and 3., 2000b, Chapter 5.). Such properties and processes are perhaps part of the overall spoken-language system so that any language’s relatively closed set of spatial closed-class forms and the schemas that they basically represent can be used to match more spatial structures in a wider range of scenes. Looking first at generalizing properties of spatial schemas, one such property is that they exhibit a topological or topology-like neutrality to certain factors of Euclidean geometry. Thus, they are magnitude neutral, as seen in such facts as that the across schema can apply to a situation of any size, as in The ant crawled across my palm / The bus drove across the country. 
Further, they are largely shape-neutral, as seen by such facts as that, while the through schema requires that the Figure form a path with linear extent, it lets that line take any contour, as in I zig-zagged / circled through the woods. And they are bulk-neutral, as seen by such facts as that the along schema requires a linear Ground without constraint on the Ground’s radial extension, as in The caterpillar crawled up along the filament / tree trunk. Thus, while holding to their specific constraints, schemas can vary freely in other respects and so cover a range of spatial configurations.
Among the generalizing processes that extend schemas, one is that of “extendibility from the prototype”, which can actually serve as an alternative interpretation for some forms of neutrality, otherwise just treated under generalizing properties. Thus, in the case of shape, as for the through schema above, this schema could alternatively be conceived as prototypically involving a straight path line for the Figure, one that can then be bent to any contour. And, in the case of bulk, as for the along schema above, this schema could be thought prototypically to involve a purely 1-dimensional line that then can be radially inflated. Another such process is “extendibility in ungoverned dimensions”. By this process, a scene component of dimensionality N in the basic form of a schema can generally be raised in dimensionality to form a line, plane, or volume aligned in a way not conflicting with the schema’s other requirements. To illustrate, it was seen earlier under the “type of geometry” category that the English verb satellite out has a schema involving a point Figure moving along a radius away from a center point through a continuum of concentric circles, as in The boat sailed further and further out from the island. This schema with the Figure idealizable as a point is the basic form. But the same satellite can be used when this Figure point is extended to form a 1-dimensional line along a radius, as in The caravan of boats sailed further and further out from the island. And the out can again be used if the Figure point were instead extended as a 1-dimensional line forming a concentric circle, as in A circular ripple spread out from where the pebble fell into the water. In turn, such a concentric circle could be extended to fill in the interior plane, as in The oil spread out over the water from where it spilled. 
Alternatively, the concentric circle could have been extended in the vertical dimension to form a cylinder, as in A ring of fire spread out as an advancing wall of flames. Or again, the circle could have been extended to form a spherical shell, as in The balloon I blew into slowly puffed out. And such a shell can be extended to fill in the interior volume, as in The leavened dough slowly puffed out. Thus, the same form out serves for this series of geometric extensions without any need to switch to some different form. One more schema-extending process is “extendibility across motive states”. A schema basic for one motive state and Figure geometry can in general be systematically extended to another motive state and Figure geometry. For example, a closed-class form whose most basic schema pertains to a point Figure moving to form a path can generally serve as well to represent the related schema with a stationary linear Figure in the same location as the
path. Thus, probably the most basic across schema is actually for a moving point Figure, as in The gopher ran across the road. By the present process, this schema can extend to the static linear Figure schema first seen in The board lay across the road. All the spatial properties uncovered for that static schema hold as well for the present basic dynamic schema, which in fact is the schema in which these properties originally arise. Among the generalizing processes that deform a schema, one is that of “stretching”, which allows a slight relaxing of one of the normal constraints. Thus, in the across schema, where the Ground plane is either a ribbon with a long and short axis or a square with equal axes, a static linear Figure or the path of a moving point Figure must be aligned with the short Ground axis or with one of its equal axes. Accordingly, one can say I swam across the canal and I swam across the square pool when moving from one side to the other, but one cannot say *I swam across the canal when moving from one end of the canal to the other. But, by moderately stretching one axis length relative to the other, one might just about be able to say I swam across the pool when moving from one end to the other of a slightly oblong pool. Another schema deforming process is that of “feature cancellation”, in which a particular complex of elements in the basic schema is omitted. Thus, the preposition across can be used in The shopping cart rolled across the boulevard and was hit by an oncoming car, even though one feature of the schema – ‘terminal point coincides with the distal edge of the Ground ribbon’ – is canceled from the Figure’s path. Further, both this feature and the feature ‘beginning point coincides with the proximal edge of the Ground ribbon’ are canceled in The tumbleweed rolled across the prairie for an hour. 
Thus, the spoken language system includes a number of generalizing properties and processes that allow the otherwise relatively closed set of abstracted or basic schemas represented in the lexicon of any single language to be applicable to a much wider range of spatial configurations.
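The “stretching” process described above for across can be given a rough computational reading. In the minimal sketch below, a path is judged by the ratio of the Ground axis it traverses to the other Ground axis: a ratio of at most 1 satisfies the basic constraint (short axis or equal axes), and a ratio slightly above 1 counts as a stretched, marginal use. The function name `judge_across` and, in particular, the numeric `stretch_tolerance` are hypothetical; the text says only that the relaxation is moderate and gives no threshold.

```python
def judge_across(path_axis_length, other_axis_length, stretch_tolerance=0.25):
    """Judge the use of 'across' for a path along one axis of a Ground plane.

    path_axis_length:  length of the Ground axis the Figure's path follows.
    other_axis_length: length of the Ground's other axis.
    stretch_tolerance: hypothetical allowance for "stretching"; the value is
                       an illustrative assumption, not a measured quantity.
    """
    if path_axis_length <= 0 or other_axis_length <= 0:
        raise ValueError("axis lengths must be positive")
    ratio = path_axis_length / other_axis_length
    if ratio <= 1.0:
        # Path follows the short axis of a ribbon or an axis of a square.
        return "acceptable"
    if ratio <= 1.0 + stretch_tolerance:
        # Slightly oblong Ground traversed end to end: the stretched case.
        return "marginal"
    return "unacceptable"
```

Under these assumptions, swimming from side to side of a canal 5 units wide and 50 long is judged acceptable, swimming the same canal end to end is unacceptable, and swimming end to end in a pool measuring 12 by 10 comes out as marginal.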
3.
Spatial structuring in signed language
All the preceding findings on the linguistic structuring of space have been based on the patterns found in spoken languages. The inquiry into the fundamental concept structuring system of language leads naturally to investigating its character in another major body of linguistic realization, signed language. The value in extending the inquiry in this way would be to discover whether
the spatial structuring system is the same or is different in certain respects across the two language modalities, with either discovery having major consequences for cognitive theory. In this research extension, a problematic issue is exactly what to compare between spoken and signed language. The two language systems appear to subdivide into somewhat different sets of subsystems. Thus, heuristically, the generalized spoken language system can be thought to consist of an open-class or lexical subsystem (generally representing conceptual content); a closed-class or grammatical subsystem (generally representing conceptual structure); a gradient subsystem of “vocal dynamics” (including loudness, pitch, timbre, rate, distinctness, unit separation); and an accompanying somatic subsystem (including facial expression, gesture, and “body language”). On the other hand, by one provisional proposal, the generalized sign language system might instead divide up into the following: a subsystem of lexical forms (including noun, verb, and adjective signs); an “inflectional” subsystem (including modulations of lexical signs for person, aspect); a subsystem of size-and-shape specifiers (or SASS’s); a subsystem of so-called “classifier expressions”; a gestural subsystem (along a gradient of incorporation into the preceding subsystems); a subsystem of face, head, and torso representations; a gradient subsystem of “bodily dynamics” (including amplitude, rate, distinctness, unit separation); and an associated or overlaid somatic subsystem (including further facial expression and “body language”). In particular here, the subsystem of classifier expressions – which is apparently present in all signed languages – is a formally distinct subsystem dedicated solely to the schematic structural representation of objects moving or located with respect to each other in space (see Liddell 2003; Emmorey 2002). 
Each classifier expression, perhaps generally corresponding to a clause in spoken language, represents a so-conceived event of motion or location.4 The research program of comparing the representation of spatial structure across the two language modalities ultimately requires considering the two whole systems and all their subsystems. But the initial comparison – the one adopted here – should be between those portions of each system most directly involved with the representation of spatial structure. In spoken language, this is that part of the closed-class subsystem that represents spatial structure and, in signed language, it is the subsystem of classifier constructions. Spelled out, the shared properties that make this initial comparison apt include the following. First, of course, both subsystems represent objects relating to each other in space. Second, in terms of the functional distinction between “structure”
and “content” described earlier, each of the subsystems is squarely on the structural side. In fact, analogous structure-content contrasts occur. Thus, the English closed-class form into represents the concept of a path that begins outside and ends inside an enclosure in terms of schematic structure, in contrast with the open-class verb enter that represents the same concept in terms of substantive content (see Talmy 2000a, Chapter 1., for this structure-content distinction). Comparably, any of the formations within a classifier expression for such an outside-to-inside path represents it in terms of its schematic structure, in contrast with the unrelated lexical verb sign that can be glossed as ‘enter’. Third, in each subsystem, a schematic structural form within an expression in general can be semantically elaborated by a content form that joins or replaces it within the same expression. Thus, in the English sentence I drove it ( – the motorcycle – ) in (to the shed) the parenthesized forms optionally elaborate on the otherwise schematically represented Figure and Ground. Comparably, in the ASL sentence (SHED) (MOTORCYCLE) vehicle-move-into-enclosure, the optionally signed forms within parentheses elaborate on the otherwise schematic Figure and Ground representations within the hyphenated classifier expression.
To illustrate the classifier system, a spatial event that English could express as The car drove past the tree could be expressed in ASL as follows: The signer’s dominant hand, used to represent the Figure object, here has a “3 handshape” (index and middle fingers extended forward, thumb up) to represent a land vehicle. The non-dominant hand, used to represent the Ground object, here involves an upright “5 handshape” (forearm held upright with the five fingers extended upward and spread apart) to represent a tree. The dominant hand is moved horizontally across the signer’s torso and past the non-dominant forearm.
Further though, this basic form could be modified or augmented to represent additional particulars of the referent spatial event. Thus, the dominant hand can show additional characteristics of the path. For example, the hand could move along a curved path to indicate that the road being followed was curved, it could slant upward to represent an uphill course, or both could be shown together. The dominant hand can additionally show the manner of the motion. For example, as it moves along, it could oscillate up and down to indicate a bumpy ride, or move quickly to indicate a swift pace, or both could be shown together, as well as with the preceding two path properties. And the dominant hand can show additional relationships of the Figure to the Ground. For example, it could pass nearer or farther from the non-dominant hand to indicate the car’s distance from the tree when
passing it, it could make the approach toward the non-dominant hand longer (or shorter) than the trailing portion of the path to represent the comparable relationship between the car’s path and the tree, or it could show both of these together or, indeed, with all the preceding additional characteristics.
The essential finding of how signed language differs from spoken language is that it more closely parallels what appear to be the structural characteristics of scene parsing in visual perception. This difference can be observed in two venues, the universally available spatial inventory and the spatial expression. These two venues are discussed next in turn.

3.1. In the inventory
The inventory of forms for representing spatial structure available to the classifier subsystem of signed language has a greater total number of fundamental elements, a greater number of categories, and generally a greater number of elements per category than the spoken language closed-class inventory. While many of the categories and their members seem to correspond across the two inventories, the signed language inventory has an additional number of categories and member elements not present in the spoken language inventory. Comparing the membership of the corresponding categories in terms of discrete elements, the number of basic elements per category in signed language actually exhibits a range: from being the same as that for spoken language to being very much greater. Further, though, while the membership of some categories in signed language may well consist of discrete elements, that of others appears to be gradient. Here, any procedure of tallying some fixed number of discrete elements in a category must give way to determining the approximate fineness of distinctions that can be practicably made for that category. So while some corresponding categories across the two language modalities may otherwise be quite comparable, their memberships can be of different types, discrete vs. analog. Altogether, then, given its greater number of categories, generally larger membership per category, and a frequently gradient type of membership, the inventory of forms for building a schematic spatial representation available to the classifier subsystem of signed language is more extensive and finer than for the closed-class subsystem of spoken language. This greater extensiveness and finer granularity of spatial distinctions seems more comparable to that of spatial parsing in visual perception. The following are some spatial categories in common across the two language modalities, but with increasing disparity in size of membership. First,
some categories appear to be quite comparable across the two modalities. Thus, both the closed-class subsystem of spoken language and the classifier subsystem of signed language structurally segment a scene into the same three components, a Figure, a Ground, and a secondary Reference Object. Both subsystems represent the category of dimensionality with the same four members – a point, a line, a plane, and a volume. And both mark the same two degrees of boundedness: bounded and unbounded. For certain categories, signed language has just a slightly greater membership than does spoken language. Thus, for motive state, signed language structurally represents not only moving and being located, but also remaining fixedly located – a concept that spoken languages typically represent in verbs but not in their spatial preposition-like forms. For some other spatial categories, signed language has a moderately greater membership than spoken language. In some of these categories, the membership is probably gradient, but without the capacity to represent many fine distinctions clearly. Thus, signed language can apparently mark moderately more degrees of remove than spoken language’s four or five members in this category. It can also apparently distinguish moderately more path lengths than the two – short and long – that spoken language marks structurally (as in English The bug flew right / way up there). And while spoken language can mark at most three distinctions of relative orientation – parallel, perpendicular, and oblique – signed language can distinguish a moderately greater number, for example, in the elevation of a path’s angle above the horizontal, or in the angle of the Figure’s axes to that of the Ground (e.g. in the placement of a rod against a wall). Finally, there are some categories for which signed language has an indefinitely greater membership than spoken language. 
Thus, while spoken language structurally distinguishes some four path contours as seen in Section 2.3.3., signed language can represent perhaps indefinitely many more, including zigzags, spirals, and ricochets. And for the category “locus within referent space”, spoken language can structurally distinguish perhaps at most three loci relative to the speaker’s location – ‘here’, ‘there’, and ‘yonder’ – whereas sign language can distinguish indefinitely many more within sign space. Apart from membership differences across common categories, signed language represents some categories not found in spoken language. One such category is the relative lengths of a Figure’s path before and after encounter with the Ground. Or again, signed language can represent not only the category of “degree of dispersion” (which spoken language was seen to represent in Section 2.3.3.), but also the category “pattern of distribution”. Thus, in representing multiple Figure objects dispersed over a planar surface, it could in addition structurally indicate that these Figure objects are linear (as with dry spaghetti over a table) and are arrayed in parallel alignment, crisscrossing, or in a jumble.
This difference in the number of structurally marked spatial category and element distinctions between spoken and signed language can be highlighted with a closer analysis of a single spatial domain, that of rotational motion. As seen earlier, the closed-class subsystem in spoken language basically represents only one category within this domain, that of “orientation of spin axis”, and within this category distinguishes only two member elements, vertical and horizontal. These two member elements are expressed, for example, by the English verb satellites around and over as in The pole spun around / toppled over. ASL, by contrast, distinguishes more degrees of spin axis orientation and, in addition, marks several further categories within the domain of rotation. Thus, it represents the category of “amount of rotation” and within this category can readily distinguish, say, whether the arc of a Figure’s path is less than, exactly, more than, or many times one full circuit. These are differences that English might offer for inference only from the time signature, as in I ran around the house for 20 seconds / in 1 minute / for 2 minutes / for hours, while using the same single spatial form around for all these cases. Further, while English would continue using just around and over, ASL further represents the category of “relation of the spin axis to an object’s geometry” and marks many distinctions within this category.
Thus, it can structurally mark the spin axis as being located at the center of the turning object – as well as whether this object is planar like a CD disk, linear like a propeller, or an aligned cylinder like a pencil spinning on its point. It distinguishes this from the spin axis located at the boundary of the object – as well as whether the object is linear like the “hammer” swung around in a hammer toss, a transverse plane like a swinging gate, or a parallel plane like a swung cape. And it further distinguishes these from the spin axis located at a point external to the object – as well as whether the object is point-like like the earth around the sun, or linear like a spinning hoop. Finally, ASL can structurally represent the category of “uniformity of rotation” with its two member elements, uniform and non-uniform, where English could mark this distinction only with an open-class form, like the verbs in The hanging rope spun / twisted around, while once again continuing with the same single structural closed-class form
around. Thus, while spoken language structurally marks only a minimal distinction of spin axis orientation throughout all these geometrically distinct forms of rotation, signed language marks more categories as well as finer distinctions within them, and a number of these appear to be distinguished as well by visual parsing of rotational movement.
To expand on the issue of gradience, numerous spatial categories in the classifier subsystem of signed language – for example, many of the 30 spatial categories listed in Section 3.2.3.1. – are gradient in character. Spoken language has a bit of this, as where the vowel length of a waaay in English can be varied continuously. But the preponderant norm is the use of discrete spatial elements, typically incorporated into distinct morphemes. For example, insofar as they represent degree of remove, the separate forms in the series on / next to / near / away from represent increasing distance in what can be considered quantal jumps. That is, the closed-class subsystem of spoken language is a type of cognitive system whose basic organizing principle is that of the recombination of discrete elements (i.e., the basic conceptual elements whose combinations, in turn, comprise the meanings of discrete morphemic forms). By contrast, the classifier subsystem of signed language is the kind of cognitive system whose basic organizing principle largely involves gradience, much as would seem to be the case as well for the visual and motor systems. In fact, within a classifier expression, the gradience of motor control and of visual perception are placed in sync with each other (for the signer and the addressee, respectively), and conjointly put in the service of the linguistic system.
While this section provides evidence that the classifier subsystem in signed language diverges from the schematizing of spoken language in the direction of visual parsing, one must further observe that the classifier subsystem is also not “simply” a gestural system wholly iconic with visual perception. Rather, it incorporates much of the discrete, categorial, symbolic, and metaphoric character that is otherwise familiar from the organization of spoken language. Thus, as already seen above, spatial representation in the classifier subsystem does fall into categories, and some of these categories contain only a few discrete members – in fact, several of these are much the same as in spoken language. Second, the handshapes functioning as classifiers for the Figure, manipulator, instrument, or Ground within classifier expressions are themselves discrete (non-gradient) members of a relatively closed set. Third, many of the hand movements in classifier expressions represent particular concepts or metaconcepts and do not mimic actual visible
movements of the represented objects. Here is a small sample of this property. After one lowers one’s two extended fingers to represent a knife dipping into peanut butter – or all one’s extended fingers in a curve to represent a scoop dipping into coffee beans – one curls back the fingertips while moving back up to represent the instrument’s “holding” the Figure, even though the instrument in question physically does nothing of the sort. Or again, the free fall of a Figure is represented not only by a downward motion of the dominant hand in its classifier handshape, but also by an accompanying rotation of the hand – whether or not the Figure in fact rotated in just that way during its fall. As another example, a Figure is shown as simply located at a spot in space by the dominant hand in its classifier handshape being placed relaxedly at a spot in signing space, and as remaining fixedly at its spot by the hand’s being placed tensely and with a slight final jiggle, even though these two conceptualizations of the temporal character of a Figure’s location are visually indistinguishable. Or, further, a (so-conceivedly) random spatial distribution of a mass or multiplex Figure along a line, over a plane, or through a volume is represented by the Figure hand being placed with a loose nonconcerted motion, typically three times, at uneven spacings within the relevant n-dimensional area, even though that particular spacing of three exemplars may not correspond to the actual visible distribution. And finally, a classifier hand’s type of movement can indicate whether this movement represents the actual path of the Figure, or is to be discounted. 
Thus, the two flat hands held with palms toward the signer, fingertips joined, can be moved steadily away to represent a wall’s being slid progressively outward (as to expand a room), or instead can be moved in a quick up-and-down arc to a point further away to represent a wall relocated to a further spot, whatever its path from the starting location. That is, the latter quick arc movement represents a metaconcept: that the path followed by the hands does not represent the Figure’s actual path and is to be disregarded from calculations of iconicity. All in all, then, the classifier subsystem presents itself as a genuine linguistic system, but one having more extensive homology with the visual structuring system than spoken language has.

3.2. In the expression
The second venue, that of any single spatial expression, exhibits further respects in which signed language differs from spoken language in the apparent direction of visual scene parsing. Several of these are outlined next.

3.2.1. Iconic representation in the expression
Spatial representation in signed classifier expressions is iconic with scene parsing in visual perception in at least the following four respects.

3.2.1.1. Iconic clustering of elements and categories
The structural elements of a scene of motion are clustered together in the classifier subsystem’s representation of them in signed language more as they seem to be clustered in perception. When one views a motion event, such as a car driving bumpily along a curve past a tree, it is perceptually the same single object, the car, that exhibits all of the following characteristics: it has certain object properties as a Figure, it moves, it has a manner of motion, it describes a path of a particular contour, and it relates to other surrounding objects (the Ground) in its path of motion. The Ground object or objects are perceived as separate. Correspondingly, the classifier subsystem maintains exactly this pattern of clustering. It is the same single hand, the dominant hand, that exhibits the Figure characteristics, motion, manner, path contour, and relations to a Ground object. The other hand, the non-dominant, separately represents the Ground object. All spoken languages diverge to a greater or lesser extent from this visual fidelity. Thus, consider one English counterpart of the event, the sentence The car bumped along past the tree. Here, the subject nominal, the car, separately represents the Figure object by itself. The verb complex clusters together the representations of the verb and the satellite: The verb bumped represents both the fact of motion and the manner of motion together, while its sister constituent, the satellite along represents the presence of a path of translational motion. The prepositional phrase clusters together the preposition past, representing the path conformation, and its sister constituent, the nominal the tree, representing the Ground object. 
It in fact remains a mystery at this point in the investigation why all spoken languages using a preposition-like constituent to indicate path always conjoin it with the Ground nominal and basically never with the Figure nominal,5 even though the Figure is what executes the path, and is so represented in the classifier construction of signed language.

3.2.1.2. Iconic representation of object vs. action
The classifier subsystem of signed language appears to be iconic with visual parsing not only in its clustering of spatial elements and categories, as just
seen, but largely also in its representation of them. For example, it marks one basic category opposition, that between an entity and its activity, by using an object like the hand to represent an object, and motion of the hand to represent motion of the object. More specifically, the hand or other body part represents a structural entity (such as the Figure) – with the body part’s configuration representing the identity or other properties of the entity – while movements or positionings of the body part represent properties of the entity’s motion, location, or orientation. For example, the hand could be shaped flat to represent a planar object (e.g. a sheet of paper), or rounded to represent a cup-shaped object. And, as seen, any such handshape as Figure could be moved along a variety of trajectories that represent particular path contours. But an alternative to this arrangement could be imagined. The handshape could represent the path of a Figure – e.g., a fist to represent a stationary location, the outstretched fingers held flat together to represent a straight line path, the fingers in a curved plane for a curved path, and the fingers alternately forward and backward for a zigzag path. Meanwhile, the hand movement could represent the Figure’s shape – e.g., the hand moving in a circle to represent a round Figure and in a straight line for a linear Figure. However, no such mapping of referents to their representations is found.6 Rather, the mapping in signed language is visually iconic: it assigns the representation of a material object in a scene to a material object in a classifier complex, for example, the hand, and the representation of the movements of that object in the scene to the movements of the hand. No such iconic correspondence is found in spoken language. Thus, while material objects are prototypically expressed by nouns in English, they are instead prototypically represented by verb roots in Atsugewi (see Talmy 2000b, Chapter 1.). 
And while path configurations are prototypically represented in Spanish by verbs, this is done by prepositions and satellites in English.

3.2.1.3. Iconic representation of further particular categories
Finer forms of iconicity are also found within each branch of the broad entity-activity opposition. In fact, most of the spatial categories listed in Section 3.2.3.1. that a classifier expression can represent are largely iconic with visual parsing. Thus, an entity’s form is often represented by the form of the hand(s), its size by the compass of the hand(s), and its number by the number of digits or hands extended. And, among many other categories in the list, an
entity’s motive state, path contour, path length, manner of motion, and rate of motion are separately represented by corresponding behaviors of the hand(s). Spoken language, again, has only a bit of comparable iconicity. As examples, path length can be iconically represented in English by the vowel length of way, as in The bird flew waay / waaaay / waaaaaay up there. Path length can also be semi-iconically represented by the number of iterations, as in The bird flew up / up up / up up up and away. Perhaps the number of an entity can be represented in some spoken language by a closed-class reduplication. But the great majority of spoken closed-class representations show no such iconicity.

3.2.1.4. Iconic representation of the temporal progression of a trajectory
The classifier subsystem is also iconic with visual parsing in its representation of temporal progression, specifically, that of a Figure’s path trajectory. For example, when an ASL classifier expression represents “The car drove past the tree”, the “past” path is shown by the Figure hand progressing from the nearer side of the Ground arm to a point beside it and then on to its further side, much like the path progression one would see on viewing an actual car passing a tree. By contrast, nothing in any single closed-class path morpheme in a spoken language corresponds to such a progression. Thus, the past in The car drove past the tree is structurally a single indivisible linguistic unit, a morpheme, whose form represents no motion ahead in space. Iconicity of this sort can appear in spoken language only where a complex path is treated as a sequence of subparts, each with its own morphemic representation, as in I reached my hand down around behind the clothes hamper to get the vacuum cleaner.

3.2.2. A narrow time-space aperture in the expression
Another way that the classifier expression in signed language may be more like visual perception is that it appears to be largely limited to representing a narrow time-space aperture. The tentative principle is that a classifier complex readily represents what would appear within a narrow scope of space and time if one were to zoom in with one’s scope of perception around a Figure object, but little outside that narrowed scope. Hence, a classifier expression readily represents the Figure object as to its shape or type, any manipulator or instrument immediately adjacent to the Figure, the Figure’s current state
of Motion (motion or locatedness), the contour or direction of a moving Figure’s path, and any Manner exhibited by the Figure as it moves. However, a classifier expression can little represent related factors occurring outside the current time, such as a prior cause or a follow-up consequence. And it can little represent even concurrent factors if they lie outside the immediate spatial ambit of the Figure, factors like the ongoing causal activity of an intentional Agent or other external instrumentality. By contrast, spoken languages can largely represent such non-local spatiotemporal factors within a single clause. In particular, such representation occurs readily in satellite-framed languages such as English (see Talmy 2000b, Chapters 1. and 3.). In representing a Motion event, this type of language regularly employs the satellite constituent (e.g. the verb particle in English) to represent the Path, and the main verb to represent a “co-event”. The co-event is ancillary to the main Motion event and relates to it as its precursor, enabler, cause, manner, concomitant, consequence, or the like. Satellite-framed languages can certainly use this format to represent within-aperture situations that can also be represented by a classifier complex. Thus, English can say within a single clause – and ASL can sign within a single classifier expression – a motion event in which the Figure is moved by an adjacent manipulator, as in I pinched some moss up off the rock and I pulled the pitcher along the counter, or in which the Figure is moved by an adjacent instrument, as in I scooped jelly beans up into the bag. The same holds for a situation in which a moving Figure exhibits a concurrent Manner, as in The cork bobbed past the seaweed. But English can go on to use this same one-clause format to include the representation of co-events outside the aperture, either temporally or spatially. 
Thus, temporally, English can include the representation of a prior causal event, as in I kicked the football over the goalpost (first I kicked the ball, then it moved over the goalpost). And it can represent a subsequent event, as in They locked the prisoner into his cell (first they put him in, then they locked it). But ASL cannot represent such temporally extended event complexes within a single classifier expression. Thus, it can represent the former sentence with a succession of two classifier expressions: first, flicking the middle finger of the dominant hand across the other hand’s upturned palm to represent the component event of kicking an object, and next moving the extended index finger of the dominant hand axially along a line through the space formed by the up-pointing index and little fingers of the non-dominant hand, representing the component event of the ball’s passing over the goalpost. But it cannot represent the whole event complex within a single expression – say, by flicking one’s middle finger against the other hand whose extended index finger then moves off axially along a line.
Further, English can use the same single-clause format to represent events with spatial scope beyond a narrow aperture, for example, an Agent’s concurrent causal activity outside any direct manipulation of the Figure, as in I walked / ran / drove / flew the memo to the home office. Again, ASL cannot represent the whole event complex of, say, I ran the memo to the home office within a single classifier expression. Thus, it could not, say, adopt the classifier for holding a thin flat object (thumb pressed against flat fingers) with the dominant hand and placing this atop the non-dominant hand while moving forward with it as it shows alternating strokes of two downward pointed fingers to indicate running (or concurrently with any other indication of running). Instead a sequence of two expressions would likely be used, for example, first one for taking a memo, then one for a person speeding along.7
Although the unacceptable examples above have been devised, they nevertheless show that it is physically feasible for a signed language to represent factors related to the Figure’s Motion outside its immediate space-time ambit. Accordingly, the fact that signed languages, unlike spoken languages, do avoid such representations may follow from deeper structural causes, such as a greater fidelity to the characteristics of visual perception. However apt, though, such an account leaves some facts still needing explanation. Thus, on the one hand, it makes sense that the aperture of a classifier expression is limited temporally to the present moment – this accords with our usual understanding of visual perception. But it is not clear why the aperture is also limited spatially.
Visual perception is limited spatially to a narrow scope only when attention is being focused, but is otherwise able to process a wide-scoped array. Why then should classifier expressions avoid such wide spatial scope as well? Further, sign languages can include representation of the Ground object within a single classifier expression (typically with the non-dominant hand), even where that object is not adjacent to the Figure.

3.2.3. More independent distinctions representable in the expression
This third property of classifier expressions has two related aspects – the large number of different elements and categories that can be represented together, and their independent variability – and these are treated in succession next.

3.2.3.1. Many more elements / categories representable within a single expression
Although the spatiotemporal aperture that can be represented within a single classifier expression may be small compared to that in a spoken-language clause, the number of distinct factors within that aperture that can be represented is enormously greater. In fact, perhaps the most striking difference between the signed and the spoken representation of space in the expression is that the classifier system in signed language permits the representation of a vastly greater number of distinct spatial categories simultaneously and independently. A spoken language like English can separately represent only up to four or five different spatial categories with closed-class forms in a single clause. As illustrated in the sentence The bat flew way back up into its niche in the cavern, the verb is followed in turn by: a slot for indication of path length (with three members: “zero” for ‘neutral’, way for ‘relatively long’, right for ‘relatively short’); a slot for state of return (with two members: “zero” for ‘neutral’, back for ‘return’); a slot for displacement within the earth-frame (with four members: “zero” for ‘neutral’, up for ‘positive vertical displacement’, down for ‘negative vertical displacement’, over for ‘horizontal displacement’); a slot for geometric conformation (with many members, including in, across, past); and perhaps a slot for motive state and vector (with two members: “zero” for ‘neutral between location AT and motion TO’ as seen in in / on, and -to for ‘motion TO’ as seen in into / onto). Even a polysynthetic language like Atsugewi has closed-class slots within a single clause for only up to six spatial categories: path conformation combined with Ground type, path length, vector, deixis, state of return, and cause or manner. In contrast, by one tentative count, ASL has provision for the separate indication of thirty different spatial categories. 
These categories do exhibit certain co-occurrence restrictions, they differ in obligatoriness or optionality, and it is unlikely – perhaps impossible – for all thirty of them to be represented at once. Nevertheless, a sizable number of them can be represented in a single classifier expression and varied independently there. The table below lists the spatial categories that I have provisionally identified as available for concurrent independent representation. The guiding principle for positing a category has been that its elements are mutually exclusive: different elements in the same category cannot be represented together in the same classifier expression. If certain elements can be concurrently represented, they belong to different categories. Following this principle has, on the one hand, involved
joining together what some sign language analyses have treated as separate factors. For example, the first category below covers equally the representation of Figure, instrument, or manipulator (handling classifier), since these three kinds of elements apparently cannot be separately represented in a single expression – one or another of them must be selected. On the other hand, the principle requires making distinctions within some categories that spoken languages treat as uniform. Thus, the single “manner” category of English must be subdivided into a category of “divertive manner” (e.g. moving along with an up-down bump) and a category of “dynamic manner” (e.g. moving along rapidly) because these two factors can be represented concurrently and varied independently.

A. entity properties
1. identity (form or semantic category) of Figure / Instrument / Manipulator
2. identity (form or semantic category) of Ground
3. magnitude of some major entity dimension
4. magnitude of a transverse dimension
5. number of entities

B. orientation properties
6. an entity’s rotatedness about its left-right axis (“pitch”)
7. an entity’s rotatedness about its front-back axis (“roll”)
8. a. an entity’s rotatedness about its top-bottom axis (“yaw”)
   b. an entity’s rotatedness relative to its path of forward motion

C. locus properties
9. locus within sign space

D. Motion properties
10. motive state (moving / resting / fixed)
11. internal motion (e.g. expansion/contraction, form change, wriggle, swirling)
12. confined motion (e.g. straight oscillation, rotary oscillation, rotation, local wander)
13. translational motion

E. Path properties
14. state of continuity (unbroken / saltatory)
15. contour of path
16. state of boundedness (bounded / unbounded)
17. length of path
18. vertical height
19. horizontal distance from signer
20. left-right positioning
21. up-down angle (“elevation”)
22. left-right angle (“direction”)
23. transitions between motion and stationariness (e.g. normal, decelerated, abrupt as from impact)

F. Manner properties
24. divertive manner
25. dynamic manner

G. relations of Figure or Path to Ground
26. path’s conformation relative to Ground
27. relative lengths of path before and after encounter with Ground
28. Figure’s path relative to the Path of a moving Ground
29. Figure’s proximity to Ground
30. Figure’s orientation relative to Ground
It seems probable that something more on the order of this number of spatial categories is concurrently analyzed out by visual processing on viewing a scene than the much smaller number present in even the most extreme spoken language patterns.
3.2.3.2.
Elements / categories independently variable in the expression – not in pre-packaged schemas
The signed-spoken language difference just presented was mainly considered for the sheer number of distinct spatial categories that can be represented together in a single classifier expression. Now, though, we stress the corollary: their independent variability. That is, apart from certain constraints involving co-occurrence and obligatoriness in a classifier expression, a signer can generally select a category for inclusion independently of other categories, and select a member element within each category independently of other selections. For example, a classifier expression can separately include and independently vary a path’s contour, length, vertical angle, horizontal angle, speed, accompanying manner, and relation to Ground object.
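As a rough illustration of this independent variability, the following Python sketch treats a classifier expression as a record with one separately chosen value per category. The category names and member labels are hypothetical stand-ins invented for the sketch, not an analysis of ASL, and real categories are often gradient rather than a short list of values. The only point shown is that, when categories vary independently, the number of expressible combinations is the product of the per-category choices, and one category can be changed without touching the others.

```python
from itertools import product

# Hypothetical member elements for a few path-related categories.
# The labels are illustrative only; they are not taken from the chapter.
CATEGORIES = {
    "contour": ["straight", "curved", "zigzag"],
    "length": ["short", "long"],
    "elevation": ["level", "upward", "downward"],
    "dynamic_manner": ["slow", "fast"],
}


def all_expressions(categories):
    """Enumerate every combination of one element per category."""
    names = list(categories)
    return [dict(zip(names, combo)) for combo in product(*categories.values())]


def vary(expression, category, element):
    """Return a copy of the expression with one category changed, others untouched."""
    if element not in CATEGORIES[category]:
        raise ValueError(f"{element!r} is not a member of {category!r}")
    changed = dict(expression)
    changed[category] = element
    return changed


expressions = all_expressions(CATEGORIES)
base = expressions[0]
faster = vary(base, "dynamic_manner", "fast")

print(len(expressions))  # product of the per-category counts: 3 * 2 * 3 * 2
print(base)
print(faster)
```

The sketch deliberately ignores the co-occurrence and obligatoriness constraints mentioned in the text; modeling those would mean filtering the enumerated combinations.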
By contrast, it was seen earlier that spoken languages largely bundle together a choice of spatial member elements within a selection of spatial categories for representation within the single complex schema that is associated with a closed-class morpheme. The lexicon of each spoken language will have available a certain number of such “pre-packaged” spatial schemas, and the speaker must generally choose from among those to represent a spatial scene, even where the fit is not exact. The system of generalizing properties and processes seen in Section 2.6. that apply to the set of basic schemas in the lexicon (including their plastic extension and deformation) may exist to compensate for the pre-packaging and closed stock of the schemas in any spoken language. Thus, what are largely semantic components within a single morpheme in spoken language correspond to what can be considered separate individually controllable morphemes in the signed classifier expression. The apparent general lack in classifier expressions of pre-packaging, of a fixed set of discrete basic schemas, or of a system for generalizing, extending, or deforming such basic schemas may well accord with comparable characteristics of visual parsing. That is, the visual processing of a viewed scene may tend toward the independent assessment of spatial factors without much pre-packeting of associated factors or of their plastic alteration. If shown to be the case, then signed language will once again prove to be closer to perceptual spatial structuring than spoken language is.
4.
Cognitive and neural implications of spoken / signed language differences
The preceding comparison of the space-structuring subsystems of spoken and of signed language has shown a number of respects in which these are similar and in which they are different. It can be theorized that their common characteristics are the product of a single neural system, what can be assumed to be the core language system, while each set of distinct characteristics results from the activity of some further distinct neural system. These ideas are outlined next.
4.1.
Where signed and spoken language are alike
We can first summarize and partly extend the properties above found to hold both in the closed-class subsystem of spoken language and in the classifier
subsystem of signed language. Both subsystems can represent multifarious and subtly distinct spatial situations – that is, situations of objects moving or located with respect to each other in space. Both represent such spatial situations schematically and structurally. Both have basic elements that in combination make up the structural schematizations. Both group their basic elements within certain categories that themselves represent particular categories of spatial structure. Both have certain conditions on the combination of basic elements and categories into a full structural schematization. Both have conditions on the co-occurrence and sequencing of such schematizations within a larger spatial expression. Both permit semantic amplification of certain elements or parts of a schematization by open-class or lexical forms outside the schema. And in both subsystems, a spatial situation can often be conceptualized in more than one way, so that it is amenable to alternative schematizations.
4.2.
Where spoken and signed language differ
Beside the preceding commonalities, though, the two language modalities have been seen to differ in a number of respects. First, they appear to divide up into somewhat different sets of subsystems without clear one-to-one matchups. Accordingly, the spatial portion of the spoken language closed-class subsystem and the classifier subsystem of signed language may not be exactly corresponding counterparts, but only those parts of the two language modalities closest to each other in the representation of schematic spatial structure. Second, within this initial comparison, the classifier subsystem seems closer to the structural characteristics of visual parsing than the closed-class subsystem in all of the following ways: It has more basic elements, categories, and elements per category in its schematic representation of spatial structure. Its category membership exhibits much more gradient representation, in addition to discrete representation. Its elements and categories exhibit more iconicity with the visual in the pattern in which they are clustered in an expression, in their observance of an object/action distinction, in their physical realization, and in their progression through time. It can represent only a narrow temporal aperture in an expression (and only a narrow spatial aperture as well, though this difference from spoken language might not reflect visual fidelity). It can represent many more distinct elements and categories together in a single expression. It can more readily select categories and category elements independently of each other for representation in an expression. And it avoids pre-packaged category-element combinations as well as generalizations of their range and processes for their extension or deformation.
4.3.
A new neural model
In its strong reading, the Fodor-Chomsky model relevant here is of a complete inviolate language module in the brain, one that performs all and only the functions of language without influence from outside itself – a specifically linguistic “organ”. But the evidence assembled here challenges such a model. What has here been found is that two different linguistic systems, the spoken and the signed, both of them undeniably forms of human language, share extensive similarities but – crucially – also exhibit substantial differences in structure and organization. A new neural model can be proposed that is sensitive to this finding. We can posit a “core” language system in the brain, more limited in scope than the Fodor-Chomsky module, that is responsible for the properties and performs the functions found to be in common across both the spoken and the signed modalities. In representing at least spatial structure, this core system would then further connect with two different outside brain systems responsible, respectively, for the properties and functions specific to each of the two language modalities. It would thus be the interaction of the core linguistic system with one of the outside systems that would underlie the full functioning of each of the two language modalities. The particular properties and functions that the core language system would provide would include all the spoken-signed language properties in Section 4.1. specific to spatial representation, though presumably in a more generic form. Thus, the core language system might have provision for all of the following. It might use individual unit concepts as the basis for representing broader conceptual content. It might group individual concepts into categories. It might associate individual concepts with overt physical representations, whether vocal or manual. It might combine individual concepts – and their physical representations – under certain constraints to represent a conceptual complex. 
And it might establish a subset of individual concepts as the basic schematic concepts that, in combinations, represent conceptual structure. When in use for signed language, this core language system might then further connect with particular parts of the neural system for visual perception. I have previously called attention to the already great overlap of structural properties between spoken language and visual perception (see Talmy 2000a, Chapter 2.), which might speak to some neural connection already in place between the core language system and the visual system. Accordingly, the proposal here is that in the case of signed language, still further connections are brought into play, ones that might underlie the finer granularity, iconicity, gradience, and aperture limitations we have seen in signed spatial representations. When in use for spoken language, the core language system might further connect with a putative neural system responsible for some of the characteristics present in spoken spatial representations but absent from signed ones. These could include the packeting of spatial elements into a stable closed set of patterned combinations, and a system for generalizing, extending, and deforming the packets. It is not clear why such a further system might otherwise exist but, speculatively, one might look to see if any comparable operations hold, say, for motor control. A cognitive capacity may have evolved (perhaps already in early animals) for the formation and maintenance of certain general-outline motor patterns suited to regularly encountered types of circumstances – that is, for the packeting of motor schemas – as well as for their modification and tailoring to the particular details of a specific occasion. The present proposal of a more limited core language system connecting with outlying subsystems for full language function seems more consonant with contemporary neuroscientific findings that relatively smaller neural assemblies link up in larger combinations in the subservience of any particular cognitive function. 
In turn, the proposed core language system might itself be found to consist of an association and interaction of still smaller units of neural organization, many of which might in turn participate in subserving more than just language functions.
4.4.
Cognitive and neural plausibility
We can further briefly consider why a neural arrangement of this sort might be plausible. As a precursor, note that each of the two language modalities must be characterized in terms of the combination both of a particular form of stimulus production and of the perception of that stimulus type: vocal-auditory for spoken language and manual-visual for signed language (where “manual” is meant to cover bodily movements more broadly). Each of these two production-perception modalities has certain basic properties of structure and organization, some of which differ across the two. The differences pertinent here are in the type of iconicity, the degree of parallelism, and the type of representation.
4.4.1.
Type of iconicity
The types of iconicity at hand in each of the two language modalities can account for the types of representational subsystems present in them. The concept of iconicity operative here is that of available relevant iconicity. This refers to the characteristics available in a modality that are iconic with referential areas of greater relevance to communication. “Iconicity” here can be understood to apply to the relationship between a representation and what it represents: they must both be realized in the same physical domain (for example, spatial, temporal, qualitative) and must exhibit a correspondence of degree or kind in that domain. “Relevant” here applies to referential areas that occur in communication more frequently, more pervasively, and more ramifiedly (for reasons that themselves can be separately examined). One referential area of evidently great relevance is that of objects moving or located with respect to each other in space. The manual-visual modality of signed language includes among its characteristics the motion or location of objects (the hands) with respect to each other in space. And this manual system in fact exhibits the two conditions for iconicity. It is a representation that is realized in space, and what it represents is that same spatial domain. And the representations and the represented are in a relation of correspondence in degree or kind. For example, within the classifier subsystem, the more toward the vertically upward that a hand moves, the more toward the vertically upward that the angle is of the object’s path being represented – not, say, the more downward or the more circular. Likewise, the faster a hand moves, the faster the motion of the object represented – not, say, the slower or the larger. By contrast, the vocal-auditory modality of spoken language does not have this form of spatial realization available among its characteristics. 
Whereas auditory perception by itself can determine the locations and paths of sound emitters, vocal production is fixed in place, and so cannot iconically represent such phenomena. If “throwing one’s voice” were somehow a genuine physical option, spoken language might well have formed a subsystem for the iconic representation of objects following particular paths through space
or occupying particular sites within space. But since such spatial localization is absent in the joint production-perception modality of spoken language, it is not available there for use as an iconic representation. Spoken language does have other available characteristics that could have served for iconic representation. But some of these – for example, vocal timbre – are realized in a physical domain that does not also constitute a referential area of great relevance to communication. Thus, while a speaker can effect a different vocal timbre to mimic another speaker or to convey a certain attitude, this is typically only an occasional practice, not a pervasive communicative necessity, and has in fact not become part of spoken language’s systematic organization. Perhaps curiously, spoken language does have other characteristics – ones in the temporal domain: rate of speed and length of interval – that could have served as a communicatively relevant iconic subsystem, but these have not entered into linguistic structure. If they had, it might have been obligatory, for example, to utter each of the three successive phrases in the sentence The pen lay on the table, rolled to the edge, and fell off progressively faster in iconic correspondence with the speed of the three events depicted. Or one might have had to introduce pauses between the three phrases in the sentence I entered, sat down, and fell asleep – in fact, pauses longer than the utterance time of each phrase – to iconically represent the duration of the phases in the depicted situation. It is not clear why this form of available and seemingly relevant iconicity was not adopted into the spoken language modality, but one explanation will be suggested below in Section 4.4.3. 
Having seen that the manual-visual modality has at least one available and relevant type of iconicity – spatial localization – and that the vocal-auditory modality has neither this nor much of any other relevant type of iconicity available, we could round out the picture by noting a type of iconicity not available in the manual-visual modality. It was pointed out above that, although auditory perception is attuned to the spatial domain, the combined vocal-auditory modality is not so attuned. In a parallel way, although visual perception is well attuned to surface texture, the combined manual-visual modality is not. This is because hand shapes and movements are not realized in the physical domain of texture. Accordingly, the manual-visual modality cannot represent texture iconically, and no subsystem attempting such textural representation is found in signed language. The conclusion of the present consideration is that where a production-perception modality has available a communicatively relevant form of iconic representation, that form may become a structural part of the modality’s functional system, and the neural connections that can subserve this function may be established. This conclusion accounts for why signed language has the additional subsystems that it does – ones not present in spoken language. Signed language’s classifier subsystem and the subsystem of size-and-shape-specifiers specifically represent the location, motion, form, and size of objects in space – comprising a physical domain that is part of the manual-visual modality. Where signed language is not representing such spatial factors, it relies on other subsystems – principally, the lexical and inflectional subsystems – whose properties are much closer to those of spoken language.
4.4.2.
Degree of parallelism
The second set of properties that differ across the two language modalities can be called “degree of parallelism”. This is the number of independently variable factors that can be produced and perceived concurrently, that is, “in parallel”. For the vocal-auditory modality of spoken language, the main independently variable factors would seem to be on the order of five in number: phonetic quality, pitch, loudness, rate, and timbre. By contrast, the number of independently variable factors in the manual-visual modality of signed language would appear to be more on the order of thirty, as listed in Section 3.2.3.1. Again, while auditory perception alone might add several further independent factors to the former list – for example, the location in space of a stationary sound and the direction of a moving sound – and while visual perception alone could add at least color and texture to the latter list, the joint production-perception modalities would seem to have roughly the indicated degrees of parallelism. Perhaps this difference can simply be regarded as due to the nature of the respective mediums and the kinds of neural processing they afford. Given this difference, though, it is reasonable that signed language, at least within its classifier subsystem, takes advantage of this expanded set of dimensions.
4.4.3.
Representational type
The third set of property differences across the two language modalities is in representational type. This involves the following three factors: granularity,
categoriality, and recombinance. The granularity of a dimension pertains to the size of the components occupying that dimension (perhaps relative to the size of the whole dimension). With sufficiently fine granularity, the dimension can be considered gradient; otherwise, it can be considered to consist of discrete chunked elements. The second factor of categoriality pertains to whether a coarse-grained chunked element in a dimension is considered to be simply a discrete step along the dimension or a qualitatively distinct category in its own right. And the third factor of recombinance pertains to whether the categories in a dimension occur there solely with their own identities and at sites relevant to that identity or can also recombine in different arrangements to constitute new higher-level entities. In the classifier subsystem of signed language, many of the independently variable dimensions have a fine degree of granularity, in effect behaving as gradient continua for both motoric production and visual perception. For example, it seems that over a roughly continuous range, a signer can vary the locus of a hand within sign space, the contour, length, and speed of a path of motion, and the distance between Figure and Ground. By contrast, the handshapes that represent the Figure (or Manipulator, Instrument, or Ground) are for the most part organized into categories – hence, the term “classifiers”. Thus, ASL has two distinct handshapes to represent a ground vehicle and an aircraft, but the first handshape cannot be continuously morphed into the second handshape to represent a series of hybrid machines that progress in design from ground vehicles to aircraft. However, certain classifier handshapes do allow a representation of magnitude, apparently not over a continuous range but only with at most three values. 
As an example, with the thumb and forefinger of each hand extended and curved to form a semicircle, the two hands can be held touching, slightly separated, or much separated to represent a planar circular Figure object that is small, medium, or large. Such forms should perhaps be regarded as coarse-grained steps along a dimension rather than as distinct categories. (Note with respect to Section 4.4.1. that, even under this interpretation, the separation of the two hands exhibits iconic correspondence with the disk diameter it represents, so that neither the concept of iconicity generally nor that of correspondence of degree in particular requires gradience.) With regard to recombinance, relatively little in the classifier subsystem seems to involve this representational type. Within spoken language, the set of dimensions that I term “vocal dynamics” – including pitch, loudness, rate, and timbre – exhibit fine granularity and so act as gradients both in vocal production and in auditory perception. If
any coarser discrete steps occur in these dimensions, they do not seem to behave as categories. And there is certainly no recombinance. Vocal dynamics – which can convey speaker affect and attitude – seems to be an older inherited system antedating the evolution of the central structural system of human language. By contrast with the other systems considered so far for their representational type, though, the central structural system of spoken language relies heavily on categories and their recombination. As is well known, this type of representation occurs at two levels. To begin with, phonetic quality is not treated as varying continuously over a range, but rather as segmented into discrete phonetic categories: the phonetic units or phonemes of a language. At the first level of recombinance, then, selections of such phonetic units from a language’s inventory are arranged in particular sequences to constitute distinct higher-level units, the morphemes. Thus, there is no continuous phonetic transition from, say, “red” through “rej” to “reg” to represent the color shift from red through orange to yellow; rather, the three phonetically unrelated morphemes just mentioned are used. In turn, selections of morphemes from a language’s inventory are arranged in accordance with that language’s principles of morphosyntax to constitute still higher-level entities, sentences. Significantly, except for the contribution of vocal dynamics, the conveying of all conceptual content in spoken language, no matter the kind of content, is accomplished by this same format of two tiers of the recombining of discrete categories. This consistent reliance on the same single format for the central structural system of spoken language may account for why temporal iconicity – as noted in Section 4.4.1. – did not become part of this central structural system. 
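The two tiers of recombinance described above can be sketched in a few lines of Python. This is a toy model under loud assumptions: the inventory uses letters as stand-ins for phonemes, the three-entry lexicon is invented for the sketch, and the sentence tier simply sequences morphemes without modeling any morphosyntax. What it shows is that the same discrete units, rearranged, yield distinct higher-level units, and that those units recombine in turn.

```python
# Tier 0: a closed inventory of discrete phonetic categories
# (letters used as hypothetical stand-ins for phonemes).
PHONEMES = {"r", "e", "d", "t", "n"}

# Tier 1: morphemes are particular sequences drawn from that inventory
# (a toy lexicon invented for this sketch).
MORPHEMES = {
    "red": ("r", "e", "d"),
    "ten": ("t", "e", "n"),
    "net": ("n", "e", "t"),
}


def is_well_formed(morpheme):
    """A morpheme is well formed here if every segment is in the inventory."""
    return all(segment in PHONEMES for segment in MORPHEMES[morpheme])


def build_sentence(*morphemes):
    """Tier 2: arrange morphemes in sequence (no real morphosyntax is modeled)."""
    unknown = [m for m in morphemes if m not in MORPHEMES]
    if unknown:
        raise KeyError(f"not in the toy lexicon: {unknown}")
    return " ".join(morphemes)


# "ten" and "net" reuse exactly the same segments in a different arrangement,
# yet count as two distinct morphemes.
same_segments = sorted(MORPHEMES["ten"]) == sorted(MORPHEMES["net"])
distinct_units = MORPHEMES["ten"] != MORPHEMES["net"]

print(same_segments, distinct_units)
print(build_sentence("ten", "red", "net"))
```

Nothing in this format allows an intermediate form between two morphemes, which mirrors the observation that there is no continuous phonetic transition between color terms.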
The classifier subsystem of signed language is for the most part not structured in terms of this single-format recombinant system of spoken language. First, there are a number of distinct “formats”: the Figure type is represented by a handshape; the Figure’s path by a linear movement of the hand; the Figure’s Manner by quick hand motions outside this linear path; the Figure’s angle relative to the path of motion by the angle at which the hand is held; the distance between the Figure and the Ground by the distance between the dominant and the non-dominant hand; etc. Second, the representations in these different formats largely combine with each other in a single compatible arrangement and do not shift their relations with each other to represent novel relationships among the referents. The one part of this subsystem that does seem to recombine in this way is the classifier handshapes themselves. For example, a signer can show an animal moving past a vehicle, or a vehicle moving past an animal, by reversing the handshapes of the dominant and non-dominant hands. In addition to spoken language’s being based on discrete recombinance at the level of morphemes into sentences, the subsystems of signed language that were above suggested as being governed by the neural core language system – principally the lexical and inflectional subsystems – do exhibit recombinance at the level of morphemes into sentences. Thus, the operation of recombinance may be one of the characteristics of the neural core language system. The question arises whether, as it evolved, the core language system could have adopted this recombinant form of organization from other cognitive systems already extant. Other cognitive systems do at least exhibit the first two representational types discussed in this section. Thus, both visual perception and auditory perception include the recognition of gradients, for example, that of the location of an object in space, as well as the recognition of discrete categories, for example, the identity of an object. But it is not clear whether these perceptual modalities exhibit the third representational type, recombinance. Perhaps visual perception can concurrently maintain the recognition of the identities of several objects as they move about relative to each other in space. But such repositionings may not yield novel higher-level arrays and hence may not constitute a form of recombinance. Perhaps recombinance can yet be found in visual perception – for example, in the ways in which the perceptions of surfaces, edges, and vertices combine in different patterns to be perceived as different objects. Or again, perhaps motor patterns will be found to involve recombinations of some set of basic “motor units”. 
However, if recombinance is in fact not found in other cognitive modalities, or is minor there, then the evolving core language system must have developed recombinance newly or ramified it into playing a major role. If that is the case, there is the question of why it might have been advantageous or necessary for that to happen. Part of the answer may lie in the following consideration. Apart from cognitive systems involved with communication across organisms, the sphere of most cognitive modalities is entirely internal to a single organism. The connection between one part of a cognitive modality and another part (for example, within the visual processing system) is neural, which has the properties of being massively parallel and of high transmission fidelity over the relatively short distances involved. By contrast, the vocal-auditory modality between two different organisms has available only a few dimensions in parallel (again, mainly phonetic quality, pitch, loudness,
timbre, and rate), and the fidelity of transmission over the relatively longer distances through the air can be low. It may be that whereas a predominantly analog representational system served for intra-organismal transmission, the newly evolving inter-organismal system had to become, as it were, “digital” to afford sufficient fidelity for the transmission of more than just a few communicative concepts. If the new language system had originally evolved as a manual-visual modality, the greater degree of parallelism and perhaps finer granularity available in that modality might well have permitted a continuation of the more analog representational type. But the utilization of the vocal-auditory channel, which lacks these properties, may have necessitated the development of something like a two-tiered discrete recombinant system. In turn, once evolved, this new representational type now carries over to those subsystems of signed language (the lexical and the inflectional) based on the core language system, whereas other subsystems of signed language instead have returned to the earlier analog representational type of the manual and visual modalities.
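The fidelity argument above can be made concrete with a small Python sketch. It is an engineering analogy supplied for illustration, not a model taken from the chapter: the four category levels, the additive distortion, and the fixed noise values are all assumptions chosen for the example. A gradient ("analog") reading inherits whatever distortion the channel introduces, whereas a categorical ("digital") reading recovers the sent value exactly as long as the distortion stays under half the spacing between categories.

```python
# Category centers on a 0..1 dimension (four discrete steps; a hypothetical choice).
LEVELS = [0.0, 1 / 3, 2 / 3, 1.0]


def transmit(value, noise):
    """Model a lossy channel as additive distortion of the sent value."""
    return value + noise


def decode_analog(received):
    """A gradient reading keeps whatever arrives, distortion included."""
    return received


def decode_discrete(received):
    """A categorical reading snaps the received value to the nearest level."""
    return min(LEVELS, key=lambda level: abs(level - received))


sent = [0.0, 1 / 3, 2 / 3, 1.0]
# Fixed distortions for reproducibility; each is smaller than half the
# spacing between levels (1/6), which is the condition for exact recovery.
noise = [0.10, -0.12, 0.08, -0.15]

received = [transmit(v, n) for v, n in zip(sent, noise)]
analog_errors = [abs(decode_analog(r) - v) for r, v in zip(received, sent)]
discrete_errors = [abs(decode_discrete(r) - v) for r, v in zip(received, sent)]

print(max(analog_errors))    # the largest distortion survives decoding
print(max(discrete_errors))  # every value snaps back to its category
```

The price of the categorical reading is resolution: only four values can be sent along this dimension, which is one way to see why a discrete system would lean on recombination to express more than a few concepts.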
Notes
1. In the continuing research reported on here, the version in the present article supersedes that in a prior article, Talmy (2003).
2. I here approach signed language from the perspective of spoken language because it is not at this point an area of my expertise. For their help with my questions on signed language, my thanks to Paul Dudis, Karen Emmorey, Samuel Hawk, Nini Hoiting, Marlon Kuntze, Scott Liddell, Stephen McCullough, Dan Slobin, Ted Suppala, Alyssa Wolf, and others – who are not responsible for my errors and oversights.
3. As it happens, most motion prepositions in English have a polysemous range that covers both the unbounded and the bounded sense. Thus, through as in I walked through the tunnel for 10 minutes refers to traversing an unbounded portion of the tunnel’s length, whereas in I walked through the tunnel in 20 minutes, it refers to traversing the entire bounded length.
4. The “classifier” label for this subsystem – originally chosen because its constructions largely include a classifier-like handshape – can be misleading, since it names the whole expression complex for just one of its components. A more apt term might be the “Motion-event subsystem”.
5. As the only apparent exception, a “demoted Figure” (see Talmy 2000b, Chapter 1) can acquire either of two “demotion particles” – e.g., English with and of – that mark whether the Figure’s path had a “TO” or a “FROM” vector, as seen in The fuel tank slowly filled with gas / drained of its gas.
6. The size and shape specifiers (SASS’s) in signed languages do permit movement of the hands to trace out an object’s contours, but the hands cannot at the same time adopt a shape representing the object’s path.
7. The behavior here of ASL cannot be explained away on the grounds that it is simply structured like a verb-framed language, since such spoken languages typically can represent concurrent Manner outside a narrow aperture, in effect saying something like: “I walking / running / driving / flying carried the memo to the home office”.
References
Bennett, David C. 1975 Spatial and Temporal Uses of English Prepositions: An Essay in Stratificational Semantics. London: Longman.
Bowerman, Melissa 1989 Learning a semantic system: What role do cognitive predispositions play? In The Teachability of Language, Mabel L. Rice and Richard L. Schiefelbusch (eds.), 133–169. Baltimore: P.H. Brookes.
Brugmann, Claudia, and Monica Macaulay 1986 Interacting semantic systems: Mixtec expressions of location. In Proceedings of the Twelfth Annual Meeting of the Berkeley Linguistics Society, 315–327. Berkeley: Berkeley Linguistics Society.
Clark, Herbert H. 1973 Space, time, semantics, and the child. In Cognitive Development and the Acquisition of Language, Timothy E. Moore (ed.), 27–63. New York: Academic Press.
Emmorey, Karen 2002 Language, Cognition and the Brain: Insights from Sign Language Research. Mahwah, NJ: Lawrence Erlbaum.
Emmorey, Karen (ed.) 2003 Perspectives on Classifier Constructions in Sign Language. Mahwah, NJ: Lawrence Erlbaum.
Fillmore, Charles 1968 The case for case. In Universals in Linguistic Theory, Emmon Bach and Robert T. Harms (eds.), 1–88. New York: Holt, Rinehart and Winston.
Gruber, Jeffrey S. 1965 Studies in lexical relations. Ph.D. diss., MIT. Reprinted as part of Lexical Structures in Syntax and Semantics, 1976, Amsterdam: North-Holland.
Herskovits, Annette 1982 Space and the prepositions in English: Regularities and irregularities in a complex domain. Ph.D. diss., Stanford University.
Imai, Shingo 2002 Spatial deixis: how demonstratives divide space. Ph.D. diss., University at Buffalo.
Jackendoff, Ray 1983 Semantics and Cognition. Cambridge, MA: MIT Press.
Leech, Geoffrey 1969 Towards a Semantic Description of English. New York: Longman Press.
Liddell, Scott 2003 Sources of meaning in ASL classifier predicates. In Emmorey (ed.), 199–220.
Mark, David M., and Barry Smith 2004 A science of topography: from qualitative ontology to digital representations. In Geographic Information Science and Mountain Geomorphology, Michael P. Bishop and John F. Shroder (eds.), 75–100. Chichester: Springer.
Talmy, Leonard 1983 How language structures space. In Spatial Orientation: Theory, Research, and Application, Herbert L. Pick, Jr. and Linda P. Acredolo (eds.), 225–282. New York: Plenum Press.
  2000a Toward a Cognitive Semantics, Volume I: Concept Structuring Systems. Cambridge, MA: MIT Press.
  2000b Toward a Cognitive Semantics, Volume II: Typology and Process in Concept Structuring. Cambridge, MA: MIT Press.
  2003 The representation of spatial structure in spoken and signed language. In Emmorey (ed.), 169–195.
Zubin, David, and Soteria Svorou 1984 Orientation and gestalt: conceptual organizing principles in the lexicalization of space. With Soonja Choi. In Papers from a Parasession on Lexical Semantics, David Testen, Veena Mishra, and Joseph Drogo (eds.), 333–345. Chicago: Chicago Linguistic Society.
Postural categories and the classification of nominal concepts: A case study of Goemai 1
Birgit Hellwig
This paper addresses the central question of this book – how the ontological status of concepts and categories is reflected in their linguistic coding – from the perspective of nominal classification. It looks at nominal classifiers, i.e., at systems characterized through the presence of a closed class of elements (termed “[nominal] classifiers”) that occur in specific morphosyntactic environments where they divide the nominal domain into a number of different classes. Cross-linguistically, these classes tend to be based on a limited and recurrent set of very general semantic domains. As such, classifiers are often said to tap into high-level concepts, thereby making them of interest to any study on ontolinguistics. This paper focuses specifically on one semantic domain – postural semantics – and examines its role in classification. Recent studies have already shown the importance of this domain for the coding of locative relations (see also Brala (this vol.) and Skopeteas (this vol.) for studies on the coding of locative relations through adpositions), but so far there has been little discussion of its classificatory use. The paper is structured as follows: Section 1. gives a brief introduction to the topic; Section 2. presents a detailed case study of postural classifiers in the West Chadic language Goemai; and Section 3. concludes this paper.
1. Overview
This section gives an overview of attested classifier systems (Section 1.1.) (drawing largely on studies by Aikhenvald 2000; Allan 1977; Craig 1986; Grinevald 2000; Senft 2000), and introduces the topic of postural semantics (Section 1.2.) (see especially Ameka and Levinson submitted; Newman 2002). The interested reader is referred to these studies for details.
1.1. Nominal classifiers
Nominal classifiers are concerned with categorizing the nominal domain:2 they exhaustively (or near-exhaustively) divide this domain into a set of
classes, and classification takes place in specific morphosyntactic environments only. Based on these environments, the following six types are generally recognized: noun classifiers (that occur with nouns), numeral classifiers (that occur in noun phrases with numerals), possessive classifiers (that occur in possessive noun phrases), locative classifiers (that occur on adpositions), deictic classifiers (that occur on demonstratives or articles), and verbal classifiers (that occur on the verb, but classify one of its arguments).3 Example (1) below illustrates a numeral classifier (from the Mayan language Yucatec, see Lucy and Gaskins 2001: 260–261) and Example (2) a deictic classifier (from the Chadic language Goemai).4

(1) ká’a-tz’íit           kib’     (numeral classifier)
    two-CL:long.and.thin  wax
    ‘two candles’

(2) lu          n-d’yem-nnoe                  (deictic classifier)
    settlement  ADVZ-CL:stand(SG)-DEM.PROX
    ‘this standing house’

It is generally assumed that each of the six classifier types is associated with specific semantic domains, diachronic origins, grammaticalization patterns, and discourse functions. But despite attested differences, all types draw upon a recurring set of semantic domains: they classify according to animacy (e.g., human vs. non-human), function (e.g., edible vs. non-edible), and physical properties such as extendedness (e.g., one-dimensional vs. two-dimensional; long vs. flat), consistency (e.g., flexible vs. rigid), constitution (e.g., liquid vs. solid), material (e.g., wood vs. metal), etc.5 These classifiers are often termed “sortal”, i.e., they set up disjoint classes based on inherent time-stable properties. Additionally, many classifier systems contain further elements that denote non-inherent temporary properties: mensural elements (i.e., quanta such as bunch vs. cluster) and temporary-state elements such as configuration (e.g., looped vs. coiled), distribution (e.g., heaped vs. scattered) and posture (e.g., standing vs. lying). There is a consensus in the literature that classifier systems are basically sortal.6 Mensural and temporary-state elements, by contrast, are only termed “classifiers” if they occur in constructions that are formally and functionally similar to the constructions of the prototypical sortal classifiers. While this requirement is met by many systems, Section 2. below introduces a system that is – at first sight – based on temporary properties only (i.e., on posture). It is shown there how a language can use temporary properties to set up disjoint
time-stable classes, and how such postural information can then complement the information coded in nouns.
1.2. Postural semantics and classification
One semantic domain that is found in different classifier systems is the domain of posture: it plays a role in the verbal classifiers and classificatory verbs of Athapaskan and Papuan languages, in the numeral classifiers of Mayan languages, and in the deictic classifiers of Siouan and Guaykuruan languages. More generally, this domain is not only relevant to classification, but also to spatial semantics (illustrated with data from Goemai in (3a) to (3f) below): all known postural-based classifiers have developed from postural verbs that code the static location of a figure relative to a ground (as t’ong ‘sit’ in (3a)). According to Stassen (1997: 55–61), a majority of the world’s languages employs postural verbs in comparable locative contexts. In such languages, speakers choose a postural from a small set of contrastive verbs. These verbs often constitute a closed form class, and they include verbs that have a human/animate-based origin (such as ‘sit’, ‘stand’, ‘lie’), but also other verbs (notably, ‘hang/be attached’ and ‘move/be in a natural habitat’, and sometimes a semantically general verb ‘exist/be located’). Frequently, the same or similar distinctions are coded in transitive verbs of placement (as leng ‘hang/move’ in (3b)). And in some languages this contrastive set has further grammaticalized into aspectual markers (usually expressing progressive aspect, as lang ‘hang/move’ in (3c); and sometimes resultative notions, as d’e ‘exist’ in (3d)), into verbs or copulas having equative, ascriptive or possessive functions (as d’yam ‘stand’ in (3e)), or into deictic classifiers (as t’o ‘lie’ in (3f)).

(3) a. Wang  t’ong     k’a       pepe.
       pot   sit(SG)   HEAD(SG)  woven.cover
       ‘The pot sits on the cover.’

    b. Tangzem  leng           lu          n-k’a         muk.
       wasps    hang/move(PL)  settlement  LOC-head(SG)  3SG.POSS
       ‘Wasps hung up houses (i.e., built their hives) in his hair.’

    c. Ko  lang                 n-su           yi     b’e?
       or  PROGR:hang/move(SG)  PROGR-run(SG)  PROGR  EMPH
       ‘Or does (it) really move running?’

    d. Hangoed’e  hok  b’ang       d’e    nd’ûûn  cup.
       water      DEF  become.red  exist  INSIDE  cup
       ‘The water is red in the cup (i.e., the water has changed, and now exists in red color).’

    e. T’eng  d’yam      vel.
       tree   stand(PL)  two
       ‘The trees stand two (i.e., the trees are two).’

    f. Goe-n-t’o-nnoe                     fa?
       NOMZ(SG)-ADVZ-CL:lie(SG)-DEM.PROX  INTERR
       ‘What about this lying one?’

The context featuring a deictic classifier (in (3f)) is of particular interest to any study of nominal classification. Such classifiers are attested in only a few languages: in Siouan and neighboring languages of North America (Seiler 1986: 87–94), in Guaykuruan languages of South America (Klein 1979), and in some African languages: Khoisan languages (cited in Kuteva 1999: 204–205), Mbay (Keegan 2002), and Goemai (see Section 2.). While deictic classifiers are considered to be “classifiers”, the posturals illustrated in contexts (3a) to (3e) are usually not discussed under the heading of classification – the argument being that they are members of a major word class (i.e., verbs). There is some controversy as to whether verbs can be said to function as classifiers of the nominal domain: while some authors recognize the existence of “classificatory verbs” that classify through their verb stem, others consider them to be “a covert lexical means of nominal classification” that “can be found in any language” (Grinevald 2000: 68). The term “classificatory verb” has a long tradition in the literature on Athapaskan languages, where nominals are categorized into up to 13 classes on the basis of shape, posture, texture, consistency, animacy and number. Each class is associated with up to four sets of suppletive classificatory verb stems that are used in reference to members of that class being in a position of rest, handled, thrown or in motion. Crucially, they are distinguished from “non-classificatory” and “pseudo-classificatory” verb stems in that they form a consistent paradigmatic subset of the verb lexicon, i.e., the classes contrast with each other in well-defined morphosyntactic environments. For researchers such as Aikhenvald (2000: 153–159), Allan (1977: 287), McGregor (2002), or Seiler (1986: 77–86), the Athapaskan classificatory verbs thereby meet the crucial criterion of any classifier system: the classes are reflected in
grammar. While this criterion excludes lexical means of classification, they do not take it to exclude the possibility of classification through suppletive verb stems.7 Despite this controversy in terminology, it is acknowledged that – semantically – there are tendencies for postural verbs to develop “classificatory overtones” (Aikhenvald 2000: 362–363), i.e., to develop characteristics reminiscent of the prototypical sortal classifiers (see Section 1.1.). These tendencies have been especially discussed for postural-based existential verbs in Papuan languages (Aikhenvald 2000: 153–159). Furthermore, Ameka and Levinson (submitted) propose a hypothesis that languages using postural verbs in the locative construction allow for two uses: a “presuppositional” or classificatory use (which pays attention to the canonical position of a figure) and an “assertional” or non-classificatory use (which pays attention to the current position of a figure). For example, an animal canonically moves, i.e., it has the capacity or disposition to move. In their classificatory system, speakers of Goemai therefore place it into the category lang ‘hang/move’ – even though it may be currently in a stationary position (see Section 2.).8 Section 2. takes up this discussion and illustrates in more detail the sortal nature of postural-based classificatory verbs and deictic classifiers.
2. Case study: Classification in Goemai
Goemai is a West Chadic language that is spoken by about 150,000 speakers in Central Nigeria. This language has a system of nominal classification that is based on postural semantics. Table 1 below illustrates the categories, their forms and their referential range. Notice that aside from four specific posturals, the table also contains one unspecific postural: d’e ‘exist’. This element covers entities that cannot assume a physical position (i.e., all abstract concepts), entities that do not have a default or canonical position (e.g., a hole in a piece of cloth), and novel entities that cannot easily be placed into the existing system (e.g., attached objects that do not dangle or project away from the ground, such as a band-aid or a ring). Together, they exhaustively and disjointly divide up the nominal domain, i.e., each nominal concept is assigned to exactly one of the five categories. To a large extent, this assignment is semantically predictable, and there are only very few seemingly arbitrary assignments.
Table 1. The postural categories

category | forms (singular / plural) | nominals that occur in this category | typical referents
‘hang/move’ | lang / leng | nominals whose referents move | living beings, inanimate forces, dangling objects (e.g., fruits)
‘sit’ | t’ong / t’wot | nominals whose referents project away from the ground and maintain a stable position by their own means | containers that support themselves through their base, chairs that support themselves through their legs
‘stand’ | d’yem / d’yam | nominals whose referents project away from the ground and maintain a stable position with the help of the ground | trees and houses that are supported through being buried in the ground; caves that are ‘inserted’ into hills; ladders that lean against the ground
‘lie’ | t’o / t’oerep | nominals whose referents do not project away from the ground | pieces of cloth or ropes that mould their entire position onto the ground
‘exist’ | d’e | remainder category, containing all nominals that cannot be placed into one of the more specific postural categories | abstract concepts

The forms constitute a contrastive set in that Goemai speakers are required to choose one of them in each of the following morphosyntactic contexts: as intransitive verbs occurring in the locative/existential construction (see (3a) above), the progressive construction (see (3c)), different subtypes of resultative serial verb constructions (see (3d)), and the ascriptive construction (see (3e)); as transitive verbs of placement (see (3b)); and as classifiers within the demonstrative word (see (3f)). Their basic use is as intransitive verbs in the
locative construction, and all other uses are derived (see Hellwig 2003 for details). Within the verb lexicon, these five intransitive verbs constitute a single form class: in contrast to most other verbs in the language, they are unambiguously stative (including lang ‘hang/move’); furthermore, they occur with an obligatory semantic participant that denotes the ground. In a way, Goemai has a fairly typical postural system: the forms constitute a closed class, and similar categories are found in many other languages. However, Goemai is of particular interest because the posturals have spread to such a large variety of morphosyntactic contexts. In all these contexts, speakers can either choose the default, classificatory, postural (illustrated in Table 1; see Section 2.1.), or they can shift away from this default to either a different specific postural (see Section 2.2.) or to the existential in a more general use (see Section 2.3.).
2.1. Use of the default (specific or general) postural
In Goemai, each nominal concept is placed into one of the five classes, i.e., each of them has a default postural assigned to it, which can be used regardless of the current position of its referent. The criteria that determine their assignment are summarized – in a very simplified way – in Table 1 above. These assignments are based on canonical or typical positions. For example, in its canonical position, a container is upright, and it is in this position that it matches the criteria for t’ong ‘sit’. Containers are therefore assigned to the class of ‘sitting’ objects, and t’ong is the default element to be used with containers. Speakers resort to this default element whenever they focus on the existence of a referent at a location. This includes negative existence (as in (4)) – notice that this sentence is not about the current position, and cannot mean “there is a bottle on the table, but it is not sitting.” Furthermore, it includes the existence of non-canonically located referents, e.g., an upside-down pot in (5). This example is taken from a longer conversation between two speakers about upside-down pots in different locations (on the ground, on a table, in a tree). The speaker in (5) cannot identify the wang ‘pot’ that his interlocutor has mentioned in the preceding discourse, and thus asks for clarification. He ventures the guess that it is the pot located on the ground, using the default verb t’ong. That is, this speaker ignores the current position of the pot and instead focuses on its existence at a location.
(4) Kwalba  t’ong    k’a       tebul  ba.
    bottle  sit(SG)  HEAD(SG)  table  NEG
    ‘There sits no bottle on the table.’ (i.e., there is no bottle)

(5) Wang  goenang    nd’ûûn? . . .
    pot   which(SG)  INSIDE
    T’ong    n-yil       ai?
    sit(SG)  LOC-ground  INTERJ
    ‘Which pot among (them)? . . . Does (it) sit on the ground?’

In (4) and (5) above, the default element gives information about the class of the object. It is this phenomenon that makes the Goemai system similar to the prototypical sortal classifiers introduced in Section 1.1.: the elements set up disjoint classes that are independent of the current state of the referent. The interesting difference to prototypical sortal classifiers is that the Goemai classes are not based on inherent characteristics, but on postural information. Although this information is not inherent to figures, it has to be kept in mind that Goemai assigns classes on the basis of canonical positions. Like prototypical sortal classifiers, canonical positions do set up time-stable categories (see also Merlan, Roberts, and Rumsey 1997 for a report on similar developments in the Papuan language Imonda). Furthermore, it is sometimes argued that posturals code inherent shape properties, often combined with axial properties. For example, Aikhenvald (2000: 271–306) suggests that referents tend to ‘sit’ when they are three-dimensional or non-extended, ‘lie’ when they are two-dimensional or horizontally extended, and ‘stand’ when they are one-dimensional or vertically extended. In Goemai, these features play only a secondary role. As illustrated in Table 2, figures of all dimensions and extensions occur in almost all of the postural categories. (Lang ‘hang/move’ and d’e ‘exist’ are excluded from the table because verbs of this type do not play a role in the literature; nevertheless, they can occur with figures of all types.)

Table 2. Posturals and abstract shapes

abstract shape | t’ong ‘sit’ | d’yem ‘stand’ | t’o ‘lie’
3D | pot | house | stone
2D | pepe ‘woven cover’9 | wall | cloth
1D | – | tree | rope
non-extended | pot | house | stone
horizontal | pepe ‘woven cover’ | hook/nail | cloth
vertical | bottle | tree | bark of tree

The table illustrates that, in Goemai, there is only an incomplete overlap between an inherent shape and a postural category. Nevertheless, positions imply certain shapes: e.g., a self-supported ‘sitting’ figure is preferably three-dimensional, while a ‘lying’ figure that is supported fully by the ground tends to be horizontally-oriented and not three-dimensional. It is thus conceivable that classifiers based on inherent properties develop from classifiers based on posture. In present-day Goemai, however, their semantics are clearly not based on such inherent properties. There are indications that other languages follow a similar type of semantics in their postural-based elements (see Ameka and Levinson submitted). The parallels between Goemai and better-studied nominal classifier systems extend further to the interaction between classifier and noun semantics. Regardless of the semantic domain(s) coded in a given classifier system, there is a general agreement in the literature that classifiers do not mirror noun semantics but add semantic content to the utterance (e.g., Aikhenvald 2000: 317–333; Broschart 2000; Denny 1986; Lucy and Gaskins 2001; Seiler 1986: 94–110). With regard to numeral classifier languages, it is frequently argued that they are characterized through a large number of nouns denoting substances. Classifiers are then used to create individual, bounded and contoured, units of that substance. For example, in order to count a noun like kib’ ‘wax’ in the numeral classifier language Yucatec Maya, speakers have to add a classifier that specifies its shape, e.g., ká’a-tz’íit kib’ ‘two candles’ (lit., ‘two-CL:long.and.thin wax’) (see Example (1) above). This line of argumentation is further corroborated by psycholinguistic research (Lucy and Gaskins 2001), showing that speakers of Yucatec Maya attend more to the material of objects, while speakers of the non-classifier language English attend more to their shape.
Numeral classifiers are only one type of classifier – in many languages, classifiers do not occur in enumeration contexts at all. For these languages, the available information suggests that nouns are semantically general in that they tend not to differentiate between, e.g., a natural source and its natural or man-made produce, or between an individual and its collective (Broschart 2000; Merlan, Roberts, and Rumsey 1997: 82; Seiler 1986: 105–106; Wilkins 2000: 179–186). Classifiers are then considered to be one means to restrict the reference of such semantically general nouns. For example, in the Goemai utterances (6a) and (6b) below, the posturals leng ‘hang/move’ and t’o ‘lie’ are used to differentiate between a natural source (‘bees’) and its produce (‘honey’) – the noun nshi ‘bee/honey’ is compatible with both entities.
(6) a. nshi       n-leng-nnoe
       bee/honey  ADVZ-CL:hang/move(PL)-DEM.PROX
       ‘these moving bees’

    b. nshi       n-t’o-nnoe
       bee/honey  ADVZ-CL:lie(SG)-DEM.PROX
       ‘this lying honey’

Classifiers thereby highlight certain aspects of the meaning potential of a noun. In doing this, they either create bounded units out of substances (e.g., in numeral classifier languages) or restrict the reference of general nouns (e.g., in Goemai). In both cases, they add semantic information that enables a listener to successfully identify or track a referent.
2.2. Shift to a different specific postural
Speakers have the option to shift away from the default, classificatory, element (see Section 2.1.), and use a different specific postural instead. They resort to this alternative option whenever they focus on the current position of the referent. This includes all contrastive situations (as in (7)), and it includes the introduction of new referents into discourse (as in (8)). In both cases, the speakers use a postural that best matches the current non-canonical position. By drawing attention to its current position, they enable the hearer to correctly identify the intended referent.10

(7) Goe-nnoe            t’ong    k’a       tebul,
    NOMZ(SG)-LOC.ANAPH  sit(SG)  HEAD(SG)  table
    goe-nnoe            t’o      k’a       tebul.
    NOMZ(SG)-LOC.ANAPH  lie(SG)  HEAD(SG)  table
    ‘This one (referent: upright bottle) sits on the table, this one (referent: bottle on its side) lies on the table.’

(8) Kwalba  na    n-t’o!
    bottle  PRES  PRES-lie(SG)
    ‘Look, a bottle lies (there)!’ (referent: bottle on its side)

This shift is not classificatory by itself: it simply asserts a current position. This means that speakers can use the postural system in both a classificatory (focusing on the class of the figure) and a non-classificatory way (focusing on the current position of the figure). However, the two uses differ in terms of
their markedness, and hearers interpret defaults differently from non-defaults. This difference is illustrated in the following paragraphs.
Whenever speakers use the default, hearers do not pay attention to the current position of a referent. For example, in (9) and (10) below, two speakers – who could not see each other – were asked to compare pictures. These pictures were nearly identical, but differed in a few crucial items, e.g., in the position of referents. (The referents discussed in (9) and (10) are illustrated in Figure 1.) In Example (9), speaker A. introduces a canonically positioned (upright) bottle by means of its default t’ong ‘sit’ in a presentative construction. The picture of speaker N., however, contains an upside-down bottle. Nevertheless, he accepts and produces the default as an appropriate characterization of his upside-down bottle. But now consider the reverse situation. In Example (10), speaker A. introduces a non-canonically positioned (upside-down) calabash, using the non-default d’yem ‘stand’. Upon hearing the non-default, speaker N. pays close attention to the current position of his calabash, which happens to be upright. As a consequence, he rejects d’yem, and shifts first to d’e ‘exist’ (to confirm the existence of the calabash), and then uses t’ong ‘sit’ in a morphologically marked way to stress the current ‘sitting’ position.

[Figure 1. Referents discussed in Examples (9) and (10). Panels: Example (9), speaker A. and speaker N.; Example (10), speaker A. and speaker N.]

(9) A: Goe    na   kwalba  n-t’ong       k’a       kwati. (. . .)
       2SG.M  see  bottle  PRES-sit(SG)  HEAD(SG)  box
       ‘Look, see a bottle sitting on the box.’ (. . .) (referent: upright)
    N: Ni   t’ong    d’i        k’a.
       3SG  sit(SG)  LOC.ANAPH  HEAD(SG)
       ‘It sits there on top (of the box).’ (referent: upside down)

(10) A: D’a       n-d’yem         k’a       k’aram.
        calabash  PRES-stand(SG)  HEAD(SG)  mat
        ‘Look, a calabash stands on the mat.’ (referent: upside down)
     N: D’a–,     d’a       na    n-d’e       d’i (. . .).
        calabash  calabash  PRES  PRES-exist  LOC.ANAPH
        M-maan         t’ong    n-t’ong.
        NOMZ-1SG.POSS  sit(SG)  ADVZ-sit(SG)
        ‘The calabash–, look, there is a calabash (. . .). (But) mine sits sitting.’ (referent: upright)

Similar differences in interpretation are found in all comparable situations, suggesting that the two uses have a different status. This difference can be captured with the help of pragmatic implicatures, in particular, with Generalized Conversational Implicatures. Levinson (2000) suggests two complementary principles (M- and I-principles) that explain the distribution of marked and unmarked forms. The M-principle is based on two of Grice’s (1975) submaxims of Manner (i.e., “avoid obscurity of expression”, “avoid unnecessary prolixity”), while the I-principle is based on his second Maxim of Quantity (i.e., “do not make your contribution more informative than is required”). These two principles can be applied to the Goemai postural system in the following ways: the non-default element is the marked – unexpected – expression, while the default is unmarked and expected. To use the marked (non-default) expression in a context where the unmarked (default) expression could have been used draws attention to a marked situation (e.g., the referent is non-canonically positioned). Following the M-principle, the use of the non-default would therefore induce the hearer to closely monitor the current situation, looking for some marked property. As a consequence, speaker N. does not accept the non-default because it does not match the current position (in (10)). Following the I-principle, by contrast, the use of the default would not force the hearer to pay attention to the current situation. Instead, he takes the default to describe the class of the referent (e.g., of objects that ‘sit’ by default). As a consequence, speaker N. accepts the default, even though it does not match the current position (in (9)).
This discussion shows that the use of non-defaults is an integral part of the whole classificatory system: they can only receive their marked interpretation because there is an unmarked default or classificatory element available that the speaker could have used. But instead of using it, the speaker chose to place the referent temporarily into a different class. Similar phenomena are also observed to occur in better-known classifier systems, where speakers shift to different classifiers as a means to highlight different aspects of the referent in context (e.g., McGregor 2002: 8–13; Wilkins 2000).
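The hearer's behaviour in (9) and (10) can be restated as a small decision procedure. The following Python sketch is only an illustration of the reasoning in this section and not part of the original analysis. The function name hearer_accepts and the mini-lexicon are hypothetical. The sketch assumes that the calabash, as a container, has t’ong ‘sit’ as its default, that an upside-down bottle is best matched by ‘stand’ (by analogy with the calabash in (10)), and that d’e ‘exist’ is compatible with any current position.

```python
# Illustrative sketch only. The lexicon and function names are hypothetical.

# Default (classificatory) posturals. Containers are assigned to 'sit'
# (Section 2.1.); treating the calabash as a container is an assumption.
DEFAULT_POSTURAL = {
    "kwalba": "sit",  # 'bottle'
    "d'a": "sit",     # 'calabash' (assumed default)
}

EXISTENTIAL = "exist"


def hearer_accepts(noun: str, postural: str, current_position: str) -> bool:
    """Return True if a hearer would accept `postural` for their own referent.

    `current_position` is the specific postural that best matches the
    position of the hearer's referent, e.g. "sit" for an upright calabash.
    """
    if postural == EXISTENTIAL:
        # Assumption: the existential only asserts existence at a location.
        return True
    if postural == DEFAULT_POSTURAL[noun]:
        # I-principle: the unmarked default is taken to name the class,
        # so the current position is not checked.
        return True
    # M-principle: a marked non-default is checked against the current position.
    return postural == current_position


# Example (9): the default 'sit' is accepted for an upside-down bottle.
print(hearer_accepts("kwalba", "sit", "stand"))  # True
# Example (10): the non-default 'stand' is rejected for an upright calabash.
print(hearer_accepts("d'a", "stand", "sit"))     # False
```

The asymmetry lies in the second and third branches: only a non-default form triggers a comparison with the current position, which is why the same mismatch is tolerated in (9) but not in (10).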
2.3.
Shift to the general existential
Aside from the four specific posturals, the form class introduced in Table 1 above also includes an unspecific postural: the existential. This existential shares the same formal properties, but differs semantically in that it is more general, being in a superordinate / hyponym relationship with the posturals. The specific posturals code existence at a location in a specific canonical or presupposed position (see Section 2.1.), and speakers can shift to different specific posturals to highlight other temporary positions (see Section 2.2.). The existential, by contrast, does not give postural information by itself: it simply codes existence at a location, and picks up its postural information from its opposition to the specific posturals. In its classificatory use, it is therefore the default element for all concepts that cannot be subsumed under any of the more specific postural categories (see Section 2.1.). But in addition to its classificatory use, speakers can use it in a more general way: they can shift away from the default postural to the existential. In fact, given its general semantics, it could – in principle – replace all specific posturals in all their occurrences. However, its actual distribution is more restricted, and can be predicted on the basis of pragmatic principles. Speakers shift to the existential in two contexts. First, they shift when the focus is on the current posture, which happens to be unknown – if the current posture was known, this context would trigger the shift to a non-default specific postural (see Section 2.2.); and if the focus was not on the posture, the default element would be used (see Section 2.1.). Example (11) illustrates such a context with the help of a “where” question: the speaker focuses on the current referent (a calabash that he has misplaced), and uses the existential to explicitly seek locative information – part of this information is how the referent is positioned relative to the ground. 
By using the existential, he does not presuppose anything about its current posture, and invites the addressee to fill in this gap in his knowledge.
(11)
Yin, d’a hok d’e nnang?
say calabash DEF exist where
‘(He) said, where is the calabash?’
Second, speakers shift to the existential when they introduce or keep track of referents that can be identified by means of non-postural information (in presentative and demonstrative constructions only). This includes the second mention of previously-identified referents (as in (12)); and it includes referents that are non-canonically positioned, and thus identifiable through their marked non-stereotypical position (as in (13) below).
(12)
Goe-n-d’yem-nnoe a lemu.
NOMZ(SG)-ADVZ-CL:stand(SG)-DEM.PROX FOC orange
Lemu n-d’e-nnoe=hoe (. . . ).
orange ADVZ-CL:exist-DEM.PROX=exactly
‘This standing one is an orange (tree). This existing orange (tree) (. . . ).’
The contexts illustrated under (11), (12), and (13) are the only contexts in which speakers shift to the existential. In fact, whenever such a shift occurs, the hearer assumes that one of the above conditions applies. This assumption is illustrated in (13) below. It is taken from a matching game, in which two speakers (who could not see each other) were asked to compare pictures. Each speaker had an identical set of pictures, containing – among others – three bottles in different positions (upright, on its side, upside down, as shown in Figure 2). Speaker A. was asked to pick a picture and describe it to speaker N., who had to find the matching picture from his set. In this example, speaker A. picks one of the bottles, and introduces it with the existential predicate in the presentative construction. Semantically, his description could apply to any of the three bottles. Speaker N. thus asks for clarification – but notice that he only mentions the two non-canonically positioned bottles (the ‘lying’ and the ‘standing’, i.e., upside-down, bottles). For him, the shift to the existential implicated a non-canonical position.
Figure 2. Referents discussed in Example (13) (panel label: Possible referents)
(13)
A: Nde kwalba hok na n-d’e zak-yit.
one/other bottle DEF PRES PRES-exist again
‘Look, (there) is again a bottle.’
N: Goenang nd’ûûn?
which(SG) INSIDE
Goe-t’o n-t’o nnoe a
NOMZ(SG)-lie(SG) ADVZ-lie(SG) LOC.ANAPH INTERR
ko goe-d’yem n-d’yem?
maybe NOMZ(SG)-stand(SG) ADVZ-stand(SG)
‘Which among (them)? (Is it) this one that lies lying, or the one that stands standing?’
Similar differences in interpretation are found whenever the existential is used in place of specific posturals. Again, its interpretation can be captured with the help of pragmatic implicatures, more specifically with Levinson’s (2000) Q-principle, which is based on Grice’s (1975) first Maxim of Quantity (i.e., “make your contribution as informative as is required”). This principle captures the distribution of elements that are in a privative opposition: the general semantics of the existential (“existence at a location”) are entailed by the specific semantics of the posturals (“existence at a location in a position”). In this case, the use of the less informative term (i.e., the existential) implicates that the more informative term (i.e., the postural) is not applicable – if it were applicable, the speaker would have used the more informative term in the first place. This means that the use of the superordinate term d’e ‘exist’ is not always pragmatically appropriate: speakers only use it under the specified conditions – precisely because its use carries the implicature that a specific postural is not applicable (see also Skopeteas (this vol.), who uses this framework to explain the distribution of adpositions).
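The distribution of the default postural, a non-default postural, and the existential described in Sections 2.1 to 2.3 can be sketched as a speaker-side choice procedure. The Python fragment below is a simplified illustration, not a full model of the Goemai system: the function name `choose_form`, its parameters, and the reduction of each context to two boolean flags are hypothetical, and the class assignments in the usage lines are assumed for the sake of the example.

```python
from typing import Optional

EXISTENTIAL = "d'e"  # 'exist': codes existence at a location, no posture


def choose_form(default: str,
                current: Optional[str] = None,
                focus_on_posture: bool = False,
                identifiable_otherwise: bool = False) -> str:
    """Pick the element a speaker would use for a referent.

    default                -- classificatory element of the referent's class
                              (the existential for concepts in no postural class)
    current                -- current posture if known, None if unknown
    focus_on_posture       -- True if the current posture is at issue
    identifiable_otherwise -- True in presentative/demonstrative constructions
                              where non-postural information identifies the
                              referent (e.g. second mention)
    """
    if focus_on_posture:
        if current is None:
            # Posture at issue but unknown: shift to the existential.
            return EXISTENTIAL
        # Posture at issue and known: use the postural that describes it
        # (a non-default one if it differs from the class default).
        return current
    if identifiable_otherwise:
        # Referent tracked by other means: shift to the existential.
        return EXISTENTIAL
    # Otherwise the classificatory default is used.
    return default


classificatory = choose_form("t'ong")
temporary = choose_form("t'ong", current="t'o", focus_on_posture=True)
where_question = choose_form("t'ong", focus_on_posture=True)
second_mention = choose_form("d'yem", identifiable_otherwise=True)
```

Because the existential is returned only under the two specified conditions, a hearer who encounters it where a specific postural was available can infer, in line with the Q-principle, that one of these conditions holds.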
3.
Summary and discussion
This paper has discussed the coding of postural information in nominal classifier systems, focusing on a system that is based on postural semantics. The following two findings are of particular relevance to the topic of this book: First, it was shown that Goemai uses postural semantics to set up disjoint classes. Given our knowledge of established classifier systems, such a semantic basis is unexpected: while postural information plays a role in different systems, the system as a whole is usually based on inherent properties.
However, a semantic analysis has shown that the Goemai system is not fundamentally different from other classifier systems: it is based on canonical – and hence time-stable – positions. As such, its deictic classifiers are comparable to established classifier types. Furthermore, identical semantic classes are not only coded in deictic classifiers, but also in verbs. This finding has an even wider implication: after all, only few languages have deictic classifiers, but very many languages have postural verbs. Its semantic basis offers an interesting perspective on the topic of ontolinguistics: Goemai forces its speakers to conceptualize the nominal domain in terms of its postural characteristics, since – in many different morphosyntactic contexts – it requires its speakers to select one element from amongst the closed set of postural elements. Second, although each nominal concept is assigned to one class, speakers still have the possibility to override this default assignment and to temporarily assign a concept to a different class. It was argued that this possibility does not undermine the classificatory basis of the Goemai system. Instead, the distribution of default elements, non-default elements and the general existential is governed by pragmatic implicatures. These implicatures arise because speakers and hearers maintain expectations about normal language behavior and because they are aware of alternative expressions that a speaker could have used but did not. As such, the use of a marked alternative specific postural or of the semantically more general superordinate existential carries certain implicatures that the speaker may wish to avoid. The discussion has shown the important contribution of pragmatics to the understanding of the overall classificatory system.
Notes
1. This research was funded by the Max Planck Institute for Psycholinguistics (Netherlands) and the Endangered Languages Documentation Programme (UK). I am grateful to Friederike Lüpke, Andrea Schalley, Eva Schultze-Berndt, and Dietmar Zaefferer, who have given valuable comments on earlier drafts of this paper. All remaining mistakes are my own.
2. This paper does not discuss the categorization of the verbal domain (see McGregor 2002; Schultze-Berndt 2000), and it does not address noun class and gender systems (see Aikhenvald 2000: 19–80). It focuses on nouns denoting concrete physical concepts, but will also comment on the position of abstract concepts.
3. Another, less frequently discussed, possibility is the existence of classifier constructions without classifiers (see Wilkins 2000; Gaby in prep.).
4. The following abbreviations are used in the interlinear glosses:
ADVZ adverbializer    FOC focus    PL plural
ANAPH anaphor    INTERR interrogative    POSS possessive
CL classifier    LOC locative    PRES presentative
DEF definite article    M masculine    PROGR progressive
DEM demonstrative    NEG negation    PROX proximal
EMPH emphasis    NOMZ nominalizer    SG singular
5. Some domains are more important in some types than in others, e.g., physical properties are frequently found in numeral, locative, deictic and verbal classifiers, but only rarely in noun and possessive classifiers.
6. But see McGregor (2002) who argues against this assumption.
7. It is a matter of debate whether or not one would want to actually label such suppletive verb stems “classifiers”. From a typological perspective, it may be more appropriate to use the label “classificatory verbs”, and to restrict the label “classifier” to a distinct part of speech.
8. Notice that they distinguish postural-type systems from other types of systems that use superficially similar verbs, but that do not have classificatory uses. The present paper only discusses their postural type.
9. A pepe is a round mat that is used to cover food vessels. Its woven structure gives it a certain asymmetry, and the same side is consistently placed in contact with the food. I assume that this asymmetry is responsible for the occurrence of pepe in the category t’ong ‘sit’.
10. The possibility of such a shift is limited by the disambiguating function of the posturals (discussed in Section 2.1.): only if the referential context is unambiguous, is it possible to, e.g., use lang ‘hang/move’ and t’o ‘lie’ to contrast a moving bee with a lying dead bee.
References
Aikhenvald, Alexandra Y. 2000 Classifiers: A Typology of Noun Categorization Devices. (Oxford Studies in Typology and Linguistic Theory.) Oxford: Oxford University Press.
Allan, Keith 1977 Classifiers. Language 53 (2): 285–311.
Ameka, Felix K., and Stephen C. Levinson (eds.) submitted Locative predicates. (Linguistics, special issue.)
Brala, Marija M. this vol. Spatial ‘on’ – ‘in’ categories and their prepositional codings across languages: Universal constraints on language specificity.
Broschart, Jürgen 2000 Isolation of units and unification of isolates: The gestalt-functions of classifiers. In Senft (ed.), 239–269.
Craig, Colette (ed.) 1986 Noun Classes and Categorization. (Typological Studies in Language 7.) Amsterdam/Philadelphia: John Benjamins.
Denny, J. Peter 1986 The semantic role of noun classifiers. In Craig (ed.), 297–308.
Gaby, Alice in prep. A grammar of Kuuk Thaayorre. Ph.D. diss., Department of Linguistics and Applied Linguistics, University of Melbourne.
Grice, H. Paul 1975 Logic and conversation. In Syntax and Semantics. Vol. 3: Speech Acts, Peter Cole and Jerry L. Morgan (eds.), 41–58. New York/San Francisco/London: Academic Press.
Grinevald, Colette 2000 A morphosyntactic typology of classifiers. In Senft (ed.), 50–92.
Hellwig, Birgit 2003 The grammatical coding of postural semantics in Goemai (a West Chadic language of Nigeria). Ph.D. diss., Faculteit der Letteren, Katholieke Universiteit Nijmegen, and Max Planck Institut für Psycholinguistik.
Keegan, John M. 2002 Posture verbs in Mbay. In Newman (ed.), 333–358.
Klein, Harriet E. Manelis 1979 Noun classifiers in Toba. In Ethnolinguistics: Boas, Sapir and Whorf Revisited, Madeleine Mathiot (ed.), 85–95. (Contributions to the Sociology of Language 27.) Den Haag/Paris: Mouton.
Kuteva, Tania A. 1999 On ‘sit’/‘stand’/‘lie’ auxiliation. Linguistics 37 (2): 191–213.
Levinson, Stephen C. 2000 Presumptive Meanings: The Theory of Generalized Conversational Implicature. (Language, Speech, and Communication.) Cambridge, MA/London: MIT Press.
Lucy, John A., and Suzanne Gaskins 2001 Grammatical categories and the development of classification preferences: A comparative approach. In Language Acquisition and Conceptual Development, Melissa Bowerman and Stephen C. Levinson (eds.), 257–283. (Language, Culture and Cognition 3.) Cambridge: Cambridge University Press.
McGregor, William B. 2002 Verb Classification in Australian Languages. (Empirical Approaches to Language Typology 25.) Berlin/New York: Mouton de Gruyter.
Merlan, Francesca, Steven Powell Roberts, and Alan Rumsey 1997 New Guinea ‘classificatory verbs’ and Australian noun classification: A typological comparison. In Nominal Classification in Aboriginal Australia, Mark Harvey and Nicholas Reid (eds.), 63–103. (Studies in Language Companion Series 37.) Amsterdam/Philadelphia: John Benjamins.
Newman, John (ed.) 2002 The Linguistics of Sitting, Standing, and Lying. (Typological Studies in Language 51.) Amsterdam/Philadelphia: John Benjamins.
Schultze-Berndt, Eva 2000 Simple and complex verbs in Jaminjung: A study of event categorisation in an Australian language. Ph.D. diss., Faculteit der Letteren, Katholieke Universiteit Nijmegen, and Max Planck Institut für Psycholinguistik.
Seiler, Hansjakob 1986 Apprehension: Language, Object, and Order. Vol. 3: The Universal Dimension of Apprehension. (Language Universals Series 1/3.) Tübingen: Gunter Narr.
Senft, Gunter (ed.) 2000 Systems of Nominal Classification. (Language, Culture, and Cognition 4.) Cambridge: Cambridge University Press.
Skopeteas, Stavros this vol. Semantic categorizations and encoding strategies.
Stassen, Leon 1997 Intransitive Predication. (Oxford Studies in Typology and Linguistic Theory.) Oxford: Clarendon Press.
Wilkins, David P. 2000 Ants, ancestors and medicine: A semantic and pragmatic account of classifier constructions in Arrernte (Central Australia). In Senft (ed.), 147–216.
Spatial ‘on’ – ‘in’ categories and their prepositional codings across languages: Universal constraints on language specificity
Marija M. Brala
A good way to search for answers relative to the language-mind riddle might be departing from language, and posing the following hypothesis: if the human language faculty is constrained by ontological knowledge (some kind of ‘pre-knowledge’) then it is quite likely that this same ontology (or parts thereof) will be constraining other subsystems of human cognition. Under this view, we then ask: 1) which subsystems of human cognition are easily comparable with language (one valid answer is spatial cognition); 2) what is universally shared between – in this case – the way in which we elaborate space non-verbally and the way in which we talk about it; and 3) is there a way to reconcile language specific categorization of space and (universal) conceptual knowledge? In this paper, by proposing an answer to questions 1) and 2) we also try to contribute to a positive answer to the third question. Departing from a crosslinguistic analysis of the ‘on’ – ‘in’ range of prepositional spatial usages (Bowerman and Pederson 1992, 2003), we a) try to describe and explain the central aspects of crosslinguistic variation in the ‘on’ – ‘in’ semantic (prepositional) usage; b) propose the elements and principles identified as being at the core of crosslinguistic prepositional variation and also the main causes bringing about language specificity; and c) explore what seem to be some (cognitively based) universal constraints on language specificity. The result of the study is that it is possible to logically systematize this ‘on’ – ‘in’ range, in terms of different (but systematic, non arbitrary!) combinatorial patterns of features from only three domains: DIMENSIONALITY (points, axes, volumes), ATTACHMENT (contact; presence / absence / quantity of) and ORIENTATION (90° / 180° angle, directionality).
1.
Introduction: Language and space
The relations between space and language, or, more precisely, the expression or rather lexicalization of spatial concepts in (various) language(s), has
been studied extensively by linguists and psychologists (cf. e.g. all papers in Bloom et al. 1996; Bowerman 1996b; Brown 1994; Clark 1973; Gopnik and Meltzoff 1986; Johnston 1984; Johnston and Slobin 1979; Levinson and Meira 2003; Piaget and Inhelder 1956; Talmy 2000; Weist 1991 etc.). More specifically, studies of spatial language and conceptualization have been of fundamental importance in the development of cognitive semantics. Why is the domain of space i.e. spatial language so interesting? Research in comparative linguistics as well as research in cognitive semantics has revealed that there is a considerable variation in the ways in which different languages categorize space in order to talk about it. This crosslinguistic variation has brought about a series of questions, answering which might shed light on some crucial psycholinguistic issues. The concrete questions stemming from crosslinguistic research of spatial coding are:
– Is it possible to identify a universal subset of spatial meanings that are expressed in all languages (as suggested by e.g. Talmy 2000, this vol.)?
– Are there neuropsychological constraints on the nature of possible spatial meanings i.e. on what could be talked about and thus lexicalized in natural languages (see e.g. Jackendoff 1996; Landau and Jackendoff 1993)? And, relatedly, does spatial language depend upon prelinguistic spatial schematizations (as suggested by e.g. Mandler 1996), and if so, how?
– Does the representation of the ‘human body’ as a spatial ‘source domain’ play a role both in the structure and in the acquisition of (spatial) language (Lakoff 1987)?
And finally:
– Does (spatial) language interfere with spatial cognition (in e.g. the comprehension of spatial expressions), i.e. does crosslinguistic variation in (the semantics of) spatial categories bring about differences in the nonlinguistic spatial cognitive processes of speakers of different languages (as suggested by e.g. Bowerman 1996a, 1996b; Bowerman and Choi 2001; Levinson 1996, 2003 etc.)?
As said, answers to these questions bear relation to more general issues such as:
– the interesting but still controversial issue of the relation between language and thought;
– the issue of the structure and the ontological status of (spatial) concepts in both language as well as in other sub-systems of human cognition.
In addition to all the above, from the developmental point of view we also note that spatial words are frequently cited as prime evidence for the claim that children’s first words label non-linguistic concepts. Furthermore, from the cognitive point of view we note that spatial cognition is seen as being at the heart of our thinking (spatial thinking invades our conceptions of many other domains as diverse as time, social structure or mathematics – cf. e.g. Levinson 1996, 2003). These two facts represent two further justifications behind the scholarly interest in the relation between language and space. Given all the above, spatial words are seen by most scholars as good candidates in the search for the universal (perhaps also primitive, innate) in language. One question is duly reiterated at this point: if what has just been claimed about the universality (primitiveness) of spatial language is indeed so, how are we to account for the fact that languages vary substantially in their semantic structuring of space (cf. e.g. Ameka 1995; Brown 1994; Choi and Bowerman 1991; Lakoff 1987; Levinson 1996, 2003; Talmy 1983, 2000; see Bowerman 1996a for an overview)? Furthermore, and perhaps most interestingly, how are we to explain the fact that increasing evidence seems to suggest that children are sensitive to language-specific structural properties of the language they are acquiring from the one-word stage of development? Indeed, it has been shown that different linguistic patterns in the linguistic input influence the meanings of children’s spatial words from as early as 18 months (cf. e.g. Choi and Bowerman 1991; Bowerman 1996a, 1996b; Choi et al. 1999; Bowerman and Choi 2001). In the final analysis we necessarily wonder: is there a way to reconcile all these contradictions relative to the findings about the relationship between (spatial) language and (spatial) cognition?
The aim of this paper is to try and contribute to a positive answer to the latter question by first addressing another one: how is one to go about testing the hypothesis that ontological knowledge – intended as knowledge inherent in the way human cognition tends to conceptualize and categorize the phenomena of this world (i.e. some kind of pre-knowledge) – may considerably constrain both the human language faculty in general, and the lexicon and other lexical structures (of any one natural language) in particular, and what can the results of such an investigation reveal? One way of approaching the issue is by taking a domain and comparing and contrasting the ways in which this domain is realized in language vs. the ways in which we elaborate i.e. compute this domain in other cognitive modes. The idea is to find a domain or rather a subsystem of human cognition which is easily comparable with language (cf. Talmy 2000). As already
pointed out above, one such obvious domain is the domain of space. Spatial language has, indeed, for some time now been the focus of study of linguists, psychologists, anthropologists, even neuroscientists. Scholars have concentrated on the ways in which spatial meanings are expressed (lexicalized) by spatial verbs (posture / motion verbs), spatial nominals, spatial (locative) cases, and adpositions. Out of all these categories, in this work we shall focus on the class of adpositions or rather, prepositions.1 Here is why.
2.
Prepositional semantics
2.1.
Prepositions as a word class
Within the research domain of spatial language, particular interest has been drawn by prepositions. They are interesting both at the crosslinguistic level (since considerable crosslinguistic semantic variation in the field of prepositions very clearly points to the apparent gap between cognitive universality underlying the spatial lexicon on the one hand, and linguistic relativity that seems to be at play when it comes to the acquisition of spatial words, on the other) and intralinguistically, as a grammatical form.
(Grammatical forms) represent only certain categories, such as space, time (hence, also form, location, and motion), perspective point, distribution of attention, force, causation, knowledge state, reality status, and the current speech event, to name some main ones. And, importantly, they are not free to express just anything within these conceptual domains, but are limited to quite particular aspects and combinations of aspects, ones that can be thought to constitute the ‘structure’ of those domains. (Talmy 1983: 227)
Departing from Talmy’s view expressed in the words quoted above, Slobin (1985) proposes that children, like languages, are constrained in the meanings they assign to the grammaticized portions of language, and that, even more interestingly for our case, there exists a difference between the kinds of meaning expressed by open-class and closed-class forms. In fact, the meaning of the former is seen as being essentially unbounded, while the meaning of the latter is viewed as being constrained (cf. Slobin’s 1985 notion of ‘privileged set of grammaticizable notions’). As one of the closed classes of the lexicon, prepositions could then carry meaning which is constrained. Is this constrained meaning also definable?
Linguistics has, for some time now, been familiar with the idea that syntactic categories express certain semantic traits which are common for all members of a given syntactic category (e.g. Talmy 1983, 2000; Slobin 1985; Levin and Pinker 1991). In order to try and establish a ‘general meaning’ for the word-class of prepositions, let us first recall the traditional reading of the category. Linguists define prepositions as ‘relational words’. If prepositions are, by definition, relational words, then in order to understand the nature of their meaning, i.e. of the type of relation they can establish, we need to stop for a moment and think about the sort of things they put into relation. Herskovits (1986: 7) notes that the simplest type of prepositional spatial expression is composed of three constituents, i.e. the preposition and two noun phrases (NP), as in:
The spider (is) on the wall.
The two NPs are referred to in the literature by various names (‘theme’, ‘located entity’, ‘located object’, ‘spatial entity’, . . . for the first NP, and ‘reference object’, ‘reference entity’, ‘localizer’, ‘landmark’, . . . for the second NP). The terminology adopted in this paper is: Figure (abbreviated as ‘F’) for the first NP, i.e. the object being located, and Ground (abbreviated as ‘G’) for the second NP, i.e. the object in reference to which F is being located. The notions of Figure and Ground were originally described in Gestalt psychology, but their application in linguistics stems from Talmy (1983), who characterized them as follows:
The Figure is a moving or conceptually movable object whose site, path, or orientation is conceived as a variable the particular value of which is the salient issue. The Ground is a reference object (itself having a stationary setting within a reference frame) with respect to which the Figure’s site, path, or orientation receives characterization. (Talmy 1983: 232)
Given that a preposition seems to relate F’s location with respect to G (F’s location being static in the case of locational contexts and dynamic in the case of motional ones), we might easily be led to conclude that the relation established by a preposition (as word class) has a locational or rather topological nature. But let us contrast sentences (1) and (2) as given in Figure 1.
(1) The smoke is in the cheese cover.
(2) The pear is *in/under the cheese cover.2
Figure 1. Functional motivation of semantic profiling
Topologically, the relations between the smoke and the cheese cover in (1) and the pear and the cheese cover in (2) look very much alike. However, the problem of the unacceptability of the preposition in in (2) (unacceptable in many cases crosslinguistically) is easily resolved within the cognitive framework, which takes into account features that result from our functioning as and interacting with entities in the world. ‘Containment’ is a good example of an interactive relation established between entities (including ourselves) in the world. ‘Support’ is another one. While in (1) the cheese cover controls the location of the smoke, in (2) it is not the cheese cover (in terms of ‘containment’), but rather the table (in terms of ‘support’) that controls the location of the pear. If we remove the cheese cover, in (1) the smoke ‘leaves’ the original location (so we see that G’s volume controls the location of F), whereas in (2) the pear stays in the same place. So in (2) we have to say either that The pear is on the table, or, with respect to the cheese cover, that it is under, but in no case in it. It would hence appear that it is not ‘location’, but rather ‘the function of controlling location’ that is at the core of prepositional semantics. Furthermore, in (2) the cheese cover functions as an obstacle which controls the access to the pear, which also involves a different schematization of the cheese cover than in (1) (due to a different relation between F and G in (1) and (2)). This is a crucial point, one which we shall return to in Section 3.1.
Another theme is pervasive through most of the accounts of prepositional meaning: prepositions do not link objects, but rather geometric descriptions of objects (Herskovits 1986), different conceptualizations, i.e. views of objects (e.g. Leech 1969; Bennett 1975) or, as shall be suggested in the next section, views or abstractions of parts of objects. An easy way to understand what is meant here is by attempting a mental exercise whereby F and G are kept constant and only the preposition is replaced (e.g. frog in grass vs. frog on grass). We see that the mere change in preposition forces a particular type of construal on the scene.
Before moving on, let us take stock of the situation by noting that the notions of ‘function of control’ and ‘schematization’ represent the main ideas underlying and guiding our analysis, and that the semantics of prepositions is probably best interpreted as a ‘mixture’ of control and schematization.
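The reasoning behind the smoke and pear examples can be restated as a small lookup over Figure and Ground pairs. The Python sketch below is merely illustrative: the names `PREPOSITION_FOR_CONTROL`, `figure_leaves_if_removed`, and `control_preposition`, as well as the encoding of each scene by a single control label, are hypothetical simplifications and not part of the analysis itself.

```python
from typing import Optional

# Control relation that a Ground (G) exerts over a Figure (F), mapped to the
# English preposition that profiles it (assumed two-way simplification).
PREPOSITION_FOR_CONTROL = {"containment": "in", "support": "on"}

# (Figure, Ground) -> control relation, or None if G does not control F.
SCENES = {
    ("smoke", "cheese cover"): "containment",
    ("pear", "cheese cover"): None,   # the cover is only an obstacle
    ("pear", "table"): "support",
}


def figure_leaves_if_removed(control: Optional[str]) -> bool:
    """Removal test: F leaves its location only if G controls that location."""
    return control is not None


def control_preposition(control: Optional[str]) -> Optional[str]:
    """Return 'in' or 'on' if G controls F's location, otherwise None."""
    return PREPOSITION_FOR_CONTROL.get(control)


smoke = control_preposition(SCENES[("smoke", "cheese cover")])
pear_cover = control_preposition(SCENES[("pear", "cheese cover")])
pear_table = control_preposition(SCENES[("pear", "table")])
```

Under these assumptions the smoke scene yields in, the pear and table scene yields on, and the pear and cheese cover scene yields neither, so that a preposition such as under has to be chosen on other grounds.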
2.2.
The meaning of on
In the following pages, we will try to take a look at how the above theoretical notions and assumptions work in practice. We shall focus on two (English) prepositions, namely on and in. This choice of prepositions is motivated by the following reasons:
1) on and in are the first prepositions to be acquired by children (and, crosslinguistically speaking, the same is true of other words in other languages expressing the concepts of ‘containment’ and ‘support’ – cf. e.g. Johnston and Slobin 1979);
2) on and in (just like their counterparts in the other languages under examination) are the most frequently used prepositions (be it as independent lexical items, as verb particles or prefixes);
3) on and in fall in the middle range of semantic complexity, on a continuum between e.g. toward(s) and by (cf. Lindstromberg (2001: 83), who also points out that entries in the middle range of semantic complexity – i.e. those for which a reasonably comprehensive dictionary is unlikely to put either too many or too few semantic complexities – are best candidates for linguistic studies);
4) notwithstanding their polyfunctionality, a wide majority of treatments of these prepositions in the literature seem to concord on the fact that on and in seem to have quite a clear ‘prototypical’ or ‘ideal’ meaning. They are thus frequently referred to as ‘the two basic topological prepositions’ (cf. Herskovits 1986: 127; Leech 1969: 161), or ‘central locational prepositions’ (cf. Quirk et al. 1985);
5) on and in are the two prepositions that have most extensively been dealt with in the literature, which gives us a lot of ready data to verify / develop further.
Nevertheless (or precisely because of this extensive interest), the semantic analyses of these two prepositions have, if one leaves aside the central or prototypical meaning, in the final analysis usually proposed either an underdifferentiated or an overdifferentiated interpretation (which becomes particularly evident when we take a crosslinguistic look at these prepositions).
Some of the studies that have most influentially dealt with the issue of prepositional semantics, and – more importantly for our analysis – studies which seem to offer elements which could help us systematize prepositional (‘on’ – ‘in’ ) usage (crosslinguistically), are (in chronological order): Cooper
1968; Leech 1969; Bennett 1975; Quirk et al. 1985; Herskovits 1986; Vandeloise 1986, 1998; Cuyckens 1993; Bowerman and Pederson 1992, 2003; and Lindstromberg 1997. It should be noted that although these researchers do not come up with a single explanation or, in the case of the cognitive approaches, with a single mental structure that could be said to constitute the meaning of a given spatial preposition, most (see overview in Brala 2000; Lindstromberg 2001) seem to agree on the following: a) all prepositions are to some extent gestalt-kinetic in their nature (comparable to our ‘function of control’ introduced in Section 2.1. above); and b) all prepositions are sufficiently endowed with definable geometrical and functional structure as to enable (motivated, non-random) metaphorical extensions (comparable to the notion of ‘schematization’ also introduced in Section 2.1.).
Departing from these conclusions, in the remainder of this paper I will try to develop these general notions further and to explore them from the ontological perspective, i.e. to see whether these linguistic observations can be accounted for in terms of what is already known about language-independent (spatial) conceptualization.
2.3. The study by Bowerman and Pederson
Out of the many studies of spatial language, one that has been attracting a lot of interest and one that, in my view, offers extremely insightful data that could be taken as a starting point on the road to answering questions relative to the ontological status of the concepts lexicalized by different spatial prepositions in different languages is a crosslinguistic study of prepositional usage by Bowerman and Pederson (1992, 2003; cf. also Bowerman and Choi 2001: 484–487). In this detailed study the authors examine the physical (spatial) senses lexicalized by the English prepositions on and in, and the ways in which these same senses (i.e. types of spatial relations) are rendered in 33 other natural languages. Bowerman and Pederson look at all non-predicative topological relation markers (adpositions, spatial nominals, and cases) that are used to code what in English are instances of spatial relations coded by the prepositions on and in (e.g. cup on table, picture on wall, ring on finger, cigarette in mouth, apple in bowl – see Figure 2). In order to investigate crosslinguistic codings of
Spatial ‘on’ – ‘in’ categories
Spanish   Croatian   English   Italian   Berber   Finnish
en        u          in        in        di       -ssa
en        na         on        su/a      di       -ssa
en        na         on        su/a      di/x     -ssa
en        na         on        (su)/a    di/x     -ssa
en        na         on        su/a      di/x     -ssa
en        na         on        su/a      x        -lla
en        na         on        su        x        -lla
Figure 2. Some crosslinguistic differences in the semantic classification of static spatial configurations (based on Bowerman 1996b: 154–155; Bowerman and Choi 2001: 485)
these spatial relations, the authors have developed a stimulus set of drawings (as those represented in Figure 2), used as an elicitation tool. Each drawing represents a Figure and a Ground. The researcher uses the pictures to set up a verbal scenario and asks a consultant (a native speaker of one of the 33 languages under examination) to answer the question of the form: ‘Where is the [Figure] (in the sketched scenario)?’3 Using this elicitation procedure Bowerman and Pederson have aptly shown that all the instances of spatial relations under consideration4 can be divided into 11 categories, with categorial
boundaries being drawn whenever at least one language, in order to lexicalize one or more of these spatial relations, ‘switches’ from one preposition (or other lexical form)5 to another. Even more interestingly, the authors observe that these 11 categories identified by way of crosslinguistic comparison can be ordered as to form the sequence given in Figure 3:

Nr.  Category                     Example                     Scale
1    Support from below           Cup on table                ‘on’
2    Marks on a surface           Writing on paper
3    Clingy attachment            Raindrops on window
4    Hanging over / against       Picture on wall
5    Fixed attachment             Handle on cupboard
6    Point-to-point attachment    Apple on twig
7    Encircle with contact        Ring on finger
8    Impaled / spitted on         Apple on stick
9    Pierces through              Arrow in / through apple
10   Partial inclusion            Cigarette in mouth
11   Inclusion                    Apple in bowl               ‘in’
Figure 3. The ‘on’ – ‘in’ scale of spatial meaning categories (Bowerman and Pederson 1992; cf. also Bowerman and Choi 2001)
This ordered sequence of meaning categories is, at a crosslinguistic level, partitioned into meaning clusters in different ways. E.g. Spanish and Portuguese lexicalize the whole range with one preposition only (en and em, respectively), English uses two prepositions (on and in), while German and Dutch partition the scale into three ‘prepositional segments’ (auf, an, and in for German, op, aan, and in for Dutch), etc. The most striking observation is that the portions of the scale attributed to different prepositions are ‘compact’, i.e. there is no language which would lexicalize part of the scale with ‘on’, then part of the scale with ‘in’, and then part of the scale with ‘on’ again. If there is overlapping at all (i.e. if a language uses two prepositions interchangeably for one or more categories), this always occurs in the section of the scale which is ‘transitional’, i.e. between the categories in which the use of only one of the two prepositions is possible.6 In other words, Bowerman and Pederson show that their scenes (instances of spatial relations) scale on a cline between a prototypical ON (support from below) and a prototypical IN (containment proper), i.e. that if a
language has a broad extension for, say, ON, it can be predicted which scenes (relations) will be coded by this linguistic means. The fact that Bowerman and Pederson manage to show that all the instances of ‘on’ – ‘in’ spatial relations that are lexicalized in various languages can be categorized along a continuum that can systematically be mapped onto language, no matter which language one chooses, leads to the hypothesis that the ‘on’ – ‘in’ scale is not formed on a random basis, but that there must be an underlying ‘gradient’, something more powerful than ‘linguistic arbitrariness’ governing the formation and arrangement of its categories. However, neither the authors of the study nor other scholars who have used Bowerman and Pederson’s findings in their own research (cf. e.g. Levinson and Meira 2003) propose an answer to a key question: the one concerning the ‘structural order’ of the ‘on’ – ‘in’ gradient. Departing from Bowerman and Pederson’s findings (Figure 3), in the remainder of this paper we shall focus on three questions: 1) what governs the ‘on’ – ‘in’ gradient identified by the authors? 2) is it possible to find an answer to question 1) that would bear a relation to the universal, cognitively based view of the human language faculty? and 3) is it possible to determine the ontological status of the concepts underlying prepositional codings crosslinguistically? In other words, is it possible to determine the exact level (and modalities) of language specificity, and to see whether, and if so how, language specificity can be integrated with the idea of cognitive universality (primitiveness) of all the various elements which have been seen to play a role a) in the structuring of spoken-language spatial schemas (at the crosslinguistic level), b) in the schematic representation of space in signed languages, and possibly c) in what is known about scene parsing in visual perception (see also Talmy this vol.) and / or in other modalities (e.g. kinesthesis, i.e. limb position; vestibular senses, i.e. gravity and body acceleration; etc.).
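The ‘compactness’ observation reported in this section can be restated as a simple contiguity condition on the scale of Figure 3: once a language has moved from one marker to the next along the scale, it never returns to the earlier one. The following Python sketch only illustrates that condition and is not part of Bowerman and Pederson’s procedure. The function name is_compact and the three partitions are assumptions made for the example: the Spanish partition follows the text (en throughout), the English one is read off the examples in Figure 3 with Category 9 simply assigned to in, and the third is a deliberately invented pattern of the kind that is reported not to occur. Interchangeable use of two markers in the transitional zone is not modelled.

```python
from typing import Dict

# Categories 1..11 of the 'on' - 'in' scale (Figure 3), in scale order.
SCALE = list(range(1, 12))


def is_compact(marker_by_category: Dict[int, str]) -> bool:
    """Return True if every marker covers one unbroken stretch of the scale."""
    finished = set()   # markers whose stretch has already ended
    previous = None
    for category in SCALE:
        marker = marker_by_category[category]
        if marker != previous:
            if marker in finished:
                return False          # the marker reappears after a gap
            if previous is not None:
                finished.add(previous)
            previous = marker
    return True


# Spanish: one preposition for the whole range (as stated in the text).
SPANISH = {c: "en" for c in SCALE}
# English: read off the Figure 3 examples; Category 9 assigned to 'in' (assumption).
ENGLISH = {c: ("on" if c <= 8 else "in") for c in SCALE}
# Invented pattern 'on' ... 'in' ... 'on', of the kind reported not to occur.
INVENTED = {c: ("on" if c <= 4 or c >= 9 else "in") for c in SCALE}

print(is_compact(SPANISH), is_compact(ENGLISH), is_compact(INVENTED))
```

Under these assumptions the first two partitions satisfy the condition and the invented one does not.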
3. Understanding the ‘on’ – ‘in’ gradient

3.1. The rationale behind the decomposition
We depart from two working assumptions, both to be tested below. The first working hypothesis is that the categories of spatial relations are formed (and later organized into meaning clusters) on a combinatorial basis, out of universal, primitive, bodily based semantic features (for an extensive literature review see Brala 2000). The second working hypothesis is that these basic semantic elements have to be analyzed and interpreted assuming that semantic knowledge presupposes ontological knowledge, at least to a degree and at a deep, I-level (of semantic primes). This knowledge exists in our nervous system, inside a body that lives in constant interaction with a physical and social world (which, however, is not an ‘objective’ world, but rather a world registered through a series of sensory systems organized into modalities). Semantic primes are thus viewed as being shared between the human language faculty and other sub-systems of human cognition (cf. Talmy 2000). What exactly is meant by this? Taking things from the observational into the explanatory realm, Bowerman (e.g. 1996b) takes the position that although Categories 1 through 11 in Figure 3 could be universal, linguistic relativity might nevertheless be strongly at play when it comes to the distribution of (prelinguistic?) spatial concepts, i.e. the organization of the spatial lexicon. In fact, she seems to be suggesting that the universality of spatial conceptualization is difficult to reconcile with the diversity and relativity of the acquisition of spatial relational words (cf. Bowerman 1996a, 1996b). Yet the two need not be irreconcilable. Some initial arguments that offer hope for reconciliation between a set of spatial prelinguistic concepts and the view of linguistic relativity can be found in Vandeloise (1998), who makes two crucial points: a) prototypical spatial configurations are not essentially perceptual (as is the case with color or other natural categories), or perhaps more exactly, they are not locational but rather functional7 (cf. also the remarks in Section 2.1. regarding the function of control i.e.
force-dynamics); b) related to a) – the connection between the different words used for lexicalizing various portions of the ‘on’ – ‘in’ scale will remain difficult to establish for as long as one looks at categories described in the scale as topological concepts (as Bowerman does). What should be done is observe the distribution of (even locative) prepositions by taking into account dynamic factors (cf. also points a) and b) in Section 1.2.).
The dynamic factor which links containment and support is their function of control (in one, two or three dimensions). This fact leads to the possibility of connecting various categories (e.g. ‘containment’, ‘tight fit’, ‘attachment’ etc.) into a hierarchy.
3.2. A hierarchy of prelinguistic concepts
Vandeloise’s proposal (1998: 7) looks as given in Figure 4. Here, we are looking at an extremely valuable proposal. It is, namely, the first analysis (at least to my knowledge) that tries to ‘systematically’ decompose the classic primitive candidates of ‘containment’ and ‘support’ in terms of dynamic forces, hence suggesting another potential trait which might be underlying categorial intentions, and also the first view suggesting a ‘hierarchic’ organization of (pre)linguistic (spatial) concepts, this latter being of particular relevance for our analysis.

control
    control in more than one direction: containment
        virtual or effective control: containment
        effective control: tight fit
    control in the vertical axis: support
        direct control: support
        control by intermediary: attachment
Figure 4. Vandeloise’s hierarchy of prelinguistic concepts (Vandeloise 1998: 7)
In fact, Bowerman’s categories can now, following Vandeloise’s suggestions, be treated as complex primitives (referring to relationships, i.e. dynamic factors). They are called ‘primitive’ because they are seen as prelinguistic concepts, and ‘complex’ because they need to be described by a list of properties which behave like traits of family resemblance (cf. also Vandeloise 1998: 11–15). It should be noted, however, that in order to apply Vandeloise’s analysis summarized in Figure 4 to the semantic clusters in Figure 3, we have to distinguish virtual from effective control and we have to rearrange the hierarchically lower components in the ‘support’ branch (‘attachment’ comes before ‘support’ proper), so that the lowest components in the tree in Figure 4 match the categories in Figure 3 (for the full elaboration of Figure 4 see Brala 2000, Chapter 3.). The aim of this exercise is to decompose the internal structure of the complex primitives that, at a lexical level, are mapped onto the ‘on’ – ‘in’ prepositional range. Departing from the
hypothesis that Bowerman’s categories of spatial relations are formed (and later organized into meaning clusters) on a combinatorial basis, out of universal, primitive, bodily based semantic features, shared between the human language faculty and other sub-systems of human cognition, in our concrete case the range of ‘on’ – ‘in’ static spatial meanings can be explicated in terms of varying combinatorial patterns of different values (or features) within only three domains:8 DIMENSIONALITY, ORIENTATION, and ATTACHMENT. Let us take a look at each of them.

DIMENSIONALITY
This domain, illustrated in Figure 5, relates to the number of axes of the Ground that are taken into consideration when a prepositional label is attributed to a physical relation. In e.g. cup on table, only the horizontal surface of the table is taken into account, so the suggested feature from within the ‘dimensionality’ structure is ‘1 DIM’. The feature ‘1 DIM’ always selects the horizontal axis, in that it is in this axis that the force of gravity is ‘felt’, i.e. where the dynamic control over the force of gravity is exercised. As we move from the top of the gradience scheme towards the bottom, the 1-DIM is no longer sufficient to distinguish between e.g. the German an and auf, or the Dutch aan and op. In fact, with Category 3 (termed by Bowerman ‘clingy attachment’), the horizontal axis is no longer sufficient for categorial distinctions (for the German an, the Dutch aan, some uses of the Italian a, and the Croatian o, the critical distinction seems to be in terms of the attachment to the vertical axis), so a second, vertical axis of the Ground starts playing a semantically salient role. From this point on we have the feature ‘2 DIM’. Moving on, from the ‘2 DIM’ feature, as the two lines ‘lose’ the 90° angle between them and their ends ‘meet’, the ‘2 DIM’ ‘develops’ into a ‘CIRCLE’ (Category 7: ‘encircle with contact’). Then, as we move on, the ‘CIRCLE’ is ‘elongated’, i.e. developed into the third axis, and we finally get to ‘3 DIM’ (i.e. ‘volume proper’) first applied to F (Category 8), and then to G (Categories 9 through 11). Finally, in Category 11 (inclusion), the dominant feature (the feature with most weight) is ‘3 DIM’. This category represents the best example (or the prototypical relation) of ‘control over force of gravity in terms of G’s volume’. In fact, in all those languages which activate the preposition ‘in’ for one category only, it is always the case of Category 11, i.e. ‘inclusion proper’ (e.g. Japanese naka). 
Summarizing, the domain DIMENSIONALITY – a domain relative to the number of axes of G that are taken into consideration for the purposes of linguistic expression – yields, for the purposes of explaining the range of prepositional usages under consideration, the following four features (see also Cuyckens 1993):
a) 1 DIM (one-dimensional);
b) 2 DIM (two-dimensional);
c) CIRCLE (crucial as separately lexicalized in many languages);
d) 3 DIM (three-dimensional, VOLUME, or ‘containment proper’).
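The assignment of these four features to the eleven categories, as described in the preceding paragraphs, can be written out as a small lookup. This is a minimal Python sketch for illustration only; the function names are hypothetical, and the category boundaries simply restate the description given above (Categories 1 and 2, 3 through 6, 7, and 8 through 11, with volume attributed first to F and then to G).

```python
def dimensionality(category: int) -> str:
    """Return the DIMENSIONALITY feature for a category (1-11) of the scale."""
    if not 1 <= category <= 11:
        raise ValueError("category must be between 1 and 11")
    if category <= 2:      # support from below, marks on a surface
        return "1 DIM"
    if category <= 6:      # clingy attachment up to point-to-point attachment
        return "2 DIM"
    if category == 7:      # encircle with contact
        return "CIRCLE"
    return "3 DIM"         # impaled / spitted on up to inclusion


def volume_bearer(category: int) -> str:
    """For '3 DIM' categories, say whether volume is attributed to F or to G."""
    if dimensionality(category) != "3 DIM":
        raise ValueError("only categories 8-11 carry the '3 DIM' feature")
    return "F" if category == 8 else "G"


# Feature sequence along the whole scale, from Category 1 to Category 11.
features = [dimensionality(c) for c in range(1, 12)]
print(features)
```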
Applied to Bowerman’s ‘on’ – ‘in’ gradience scheme, schematically, this looks as shown in Figure 5:

Dimensionality  Nr.  Category                     Scale
1 DIM           1    Support from below           ‘on’
1 DIM           2    Marks on a surface
2 DIM           3    Clingy attachment
2 DIM           4    Hanging over / against
2 DIM           5    Fixed attachment
2 DIM           6    Point-to-point attachment
CIRCLE          7    Encircle with contact
3 DIM           8    Impaled / spitted on
3 DIM           9    Pierces through
3 DIM           10   Partial inclusion
3 DIM           11   Inclusion                    ‘in’

Figure 5. The ‘on’ – ‘in’ gradience scheme analyzed in terms of DIMENSIONALITY

ORIENTATION
This domain is the only one which is not further subdivided into subcategories, in that it is based on the notion of Vandeloise’s (1998) ‘control’ (the uppermost notion in the hierarchy). The way this ‘control’ is interpreted and further treated in this work is from the point of view of what is suggested as the main aspect of our experience as ‘bodily’ entities, perceiving and interacting with the environment, i.e. the force of gravity. From the moment we are born we are exposed to the force of gravity, and it is difficult to think of another experience which would so consistently and strongly permeate our existence. Vandeloise’s (1998) control is then interpreted as ‘control with respect to the force of gravity’, i.e., more simply, it is termed ‘opposition to the force of gravity’. The embodied experience structure suggested within the domain ‘ORIENTATION’ is, basically, identical with the experience of the
‘opposition to the force of gravity’. The ‘ORIENTATION’ domain, in its capacity of having values ‘+’ or ‘–’, signals the human possibility or, rather, inclination (cf. e.g. Gregory 1998) to distinguish between cases of physical relations in which this ‘control’ is ‘oriented’ (in terms of the 90° / 180° angle existing between the main axes; cf. the ‘dimensionality’ structure) and cases in which it is not (as in e.g. ‘balloon (tied) on stick’, where the stick can be moved in many directions and the balloon has the tendency to float, so the angle between F and G is neither constant nor critical for lexicalization). Very interestingly, visual perception researchers (e.g. Gregory 1998) have observed that when normally upright things (such as walls, floors, furniture) lean over, people ignore this information, i.e. perceptually adjust the angles to the usual degree. Furthermore, and this is linked to the DIMENSIONALITY feature, lines seem to be the most basic features recognized by the visual system, and it has been shown (cf. Prinzmetal in Gregory 1998) that in the very first stage of visual processing the image is split up into lines, and that neurons in a cortical area called V1 monitor the visual information hitting specific regions of the retina and fire when they detect lines of a certain (usually right angle) orientation. It is then suggested (Gregory 1998) that certain neurons are sensitive to verticality with respect to gravity, no matter how the head is tilted. Furthermore, Sauvan and Peterhans (1999) report having isolated a group of cells in the visual cortex of monkeys that keep firing in response to a single line, even when the head, and so the retina, is tilted. It is believed that these cells must be integrating non-visual information, probably from the inner ear, to keep track of the line’s orientation with respect to gravity.
All the above should hopefully suffice to explain the relevance of the feature ‘opposition to the force of gravity’ within the context of our analysis, and this is why, while keeping the term ‘ORIENTATION’ for this feature, we shall treat the underlying embodied experience structure as something which could be verbally described as ‘opposition to the force of gravity’. This is the only domain i.e. structure which is not represented through presence alone, but through absence as well; ‘orientation present’ refers to the 90° or the 180° angle with respect to the Earth’s force of gravity (as exercised on the F). Thus the reading is ‘parallel or perpendicular to the force of gravity’ when the domain ORIENTATION is ‘+’, or just ‘inclined with angle irrelevant’ when the domain has the ‘–’ value. It is crucial to note at this point that the negative value with all other features is not represented as a discriminatory feature (when other features assume the negative value, they are simply treated as irrelevant, i.e. as not contributing to the lexical pattern).
ATTACHMENT
This domain is very similar, perhaps identical to the feature that e.g. Bowerman and Choi (2001) term ‘CONTACT’. This domain is also closely related to another ‘embodied experience domain’, which I first thought of keeping separate, but have later decided to conflate with ‘attachment’, i.e. the notion of ‘boundedness’. Conflated with boundedness – also since their separation seemed to complicate the picture without any gains at the explanatory level – the domain of ‘attachment’ is best understood as the quantity of attachment between G and F that seems to be relevant for lexicalization. Namely, in many languages (e.g. Dutch, Korean) the ‘quantity’ of attachment seems to play a crucial role in lexicalization. This fact, together with the awareness of general principles of language such as (cognitive) economy, made me decide that attachment should be internally subdivided on the basis of the notion of ‘boundedness’.9 The notion of ‘attachment’ is subsequently internally subdivided into the following features: ‘ATTACHMENT’ (e.g. in ‘sticker on cupboard’) and ‘1 SIDE BOUNDED ATTACHMENT’ (e.g. ‘cigarette in mouth’). The logically needed feature of ‘2 SIDE BOUNDED ATTACHMENT’ is conflated with the ‘3 DIM’ feature, since ‘volume containment’ includes ‘2 side bounded attachment’ (in fact ‘3 DIM’ or ‘volume containment’ can be seen as ‘3 side bounded attachment’). Furthermore, the distinction ‘2 side bounded attachment‘ vs. ‘3 side bounded attachment’ did not seem to be lexically critical in any of the languages under examination. Summarizing, the domain ‘ATTACHMENT’ yields two features i.e.: a) ATTACHMENT (simple contact or attachment via man-made means such as screws or glue); b) 1 SIDE BOUNDED ATTACHMENT.
These eight features now enable us to systematize the crosslinguistic variation in the ‘on’ – ‘in’ range of spatial usages, as shown in Figure 6 below.10

3.3. Weaving the theoretical web
At this point, it is crucial to note several parallels between the analysis that I have just proposed, and other empirical evidence as well as proposals of theoretical frameworks in contemporary (psycho)linguistic literature. First of all, let us observe that the domains of DIMENSIONALITY, ATTACHMENT, and
Nr.
Category
1 2 3 4 5 6 7 8 9 10 11
Support from below Marks on a surface Clingy attachment Hanging over / against Fixed attachment Point-to-point attachment Encircle with contact Impaled / spitted on Pierces through Partial inclusion Inclusion
Dimensionality 1 DIM 1 DIM 2 DIM 2 DIM 2 DIM 2 DIM CIRCLE 3 DIM
3 DIM 3 DIM 3 DIM
Orientation +OR -OR -OR +OR -OR +OR -OR -OR -OR -OR
Attachment
Scale
ATTCH
‘on’
ATTCH ATTCH ATTCH ATTCH
1 SBATTCH ATTCH ATTCH ATTCH
1 SBATTCH ‘in’
Figure 6. The ‘on’ – ‘in’ scale decomposed into basic features

ORIENTATION introduced in this work can be found in many cognitive analyses of spatial language in general and spoken language spatial schemes in particular. A very striking analogy can be observed between these domains and the work by Talmy (2000, this vol.), who has identified
a) points, lines and planes (comparable to the 0-DIM, 1-DIM and 2-DIM features from within the DIMENSIONALITY domain). It should be noted that the 0-DIM feature, which has not been explicitly analyzed in this work, pertains to points and is essential in the analysis of, e.g., the preposition at (cf. Brala 2000, 2002: 16–20);
b) parallelness and perpendicularity (comparable to the features within the domain of ORIENTATION); and
c) adjacency or contact (comparable to the domain of ATTACHMENT),

as being part of the universally available inventory of basic (primitive?) conceptual elements that constantly recombine in various patterns to constitute spatial schemas in spoken and signed languages, as well as in scene parsing in visual perception.
Another very interesting parallel can be observed between the domains proposed in this paper and work on spatial prepositions by Goddard (2002), as well as Lindstromberg (1997). Although the two approaches totally differ – in that Goddard works with verbal explications within the Natural Semantic Metalanguage Framework (NSM), and Lindstromberg uses verbal descriptions of the referential range of prepositions coupled with schematic representations of prepositional semantics – both authors seem to reach similar
conclusions, which see the semantics of prepositions ‘read’ in terms that can to a large extent be described as ‘dynamic’ rather than ‘locational’ and which, furthermore, closely parallel the domains i.e. features proposed in this paper. Lindstromberg’s (1997) analysis of on and in (and particularly of the on – in contrast), although perhaps too specific in nature, and as such more useful for practical, pedagogical purposes than for theoretical considerations, is, especially in its schematic part, extremely useful for our purposes, in that the schematizations (simplifications) of Figures and Grounds proposed by the author for a series of relations seem to ‘abstract’ reality by following exactly the domains (and obtaining the resulting features) introduced in this paper. Lindstromberg (1997) uses pictographs or icons, i.e. “highly schematic pictures whose forms suggest their meaning” (Lindstromberg 1997: 3). Abstracted in the icons are points, lines, planes, volumes, cyclical shapes etc. Our analysis in Section 3.2. is very compatible with such iconic representation of (prepositional) meaning. Namely, being highly schematic, icons (cf. also Lindstromberg 1997: 3) portray exactly those elements (horizontal and vertical axes, points, force dynamics i.e. vectors etc.) that very closely approximate the semantic elements that are here suggested as being the basic features of prepositional sense. Furthermore, they can be easily mapped between pictures (schemata) and referents (selected portions of the world that we talk about), thus helping to understand both prepositional choice and meaning. To clarify this point even further, we might wish to describe the ‘prepositional semantic mechanism’ as some sort of algorithm, explicated below:
1) SELECT SALIENT PORTION OF F11 (this becomes F’) AND SALIENT PORTION OF G (this becomes G’).
2) PUT F’ AND G’ INTO RELATION F’G’.
3) NAME THE RELATION F’G’ WITH APPROPRIATE CATEGORY LABEL.
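The three steps can be sketched in Python as follows. This is a toy illustration under stated assumptions, not a model of actual prepositional choice: the class and function names are hypothetical, the selection of salient portions (step 1) is supplied by hand rather than computed, relative size is reduced to a single invented number, and the only label implemented is ‘in’, using the minimal conditions given for that label in this section (F’ smaller than G’ and located internal to a one-, two- or three-dimensional G’).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Schema:
    """Schematized salient portion of a Figure or Ground (F' or G')."""
    name: str
    size: float   # hypothetical stand-in for relative size
    dims: int     # dimensionality of the schematized construct


@dataclass(frozen=True)
class Relation:
    """The relation F'G' built in step 2."""
    figure: Schema
    ground: Schema
    figure_internal: bool   # is F' located internal to G'?


def select_salient(name: str, size: float, dims: int) -> Schema:
    # Step 1: the salient portion is chosen by hand in this sketch.
    return Schema(name, size, dims)


def relate(figure: Schema, ground: Schema, figure_internal: bool) -> Relation:
    # Step 2: put F' and G' into the relation F'G'.
    return Relation(figure, ground, figure_internal)


def name_relation(relation: Relation) -> Optional[str]:
    # Step 3: name the relation; only the conditions for 'in' are implemented.
    smaller = relation.figure.size < relation.ground.size
    ground_ok = relation.ground.dims in (1, 2, 3)
    if smaller and relation.figure_internal and ground_ok:
        return "in"
    return None   # other category labels are outside the scope of this sketch


# Invented sizes and dimensionalities, for illustration only.
apple_in_bowl = relate(select_salient("apple", 1.0, 3),
                       select_salient("bowl interior", 5.0, 3),
                       figure_internal=True)
cup_on_table = relate(select_salient("cup", 1.0, 3),
                      select_salient("table top", 20.0, 2),
                      figure_internal=False)

print(name_relation(apple_in_bowl), name_relation(cup_on_table))
```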
E.g. the category label is ‘in’ if F’G’ comes to conceptually represent a relation where (at least): a) F’ is smaller than G’, and b) F’ is located internal to G’, which has to be either a one-, two- or three-dimensional geometric construct.12 We shall return to this point below.

Going back now to the correspondences between the analysis in Section 3.2. and theoretical proposals currently found in the literature, let us note some crucial parallels between the proposals spelled out in Section 3.2. and work by Goddard, especially his (2002) paper. Namely, Goddard’s (2002) analysis of the polysemic network of on makes it obvious that the NSM framework could not do without the primitive of TOUCHING, and the inclusion of this primitive within the list of NSM primes provides further support for the domain ‘ATTACHMENT’ as proposed in this paper.13 It is, however, crucial to observe at this point that neither Goddard’s nor Lindstromberg’s analysis, and more specifically their categorial divisions of the relations belonging to various ‘subdomains’ of on, holds at the crosslinguistic level. Namely, their analyses work only for the English prepositional system. In other words, when examples from their categories are translated into other languages, they frequently fall under another lexical category. As such, these analyses, or rather interpretations, are questionable when it comes to the issue of the cognitive basis of language (this is particularly surprising in the case of Goddard’s analysis, as the main feature of NSM is that it is universal, i.e. that its universals are present in all natural languages). One recent analysis of adpositions that does consider the (variation at the) crosslinguistic level is the study by Levinson and Meira (2003), who challenge the assumption that the spatial concepts instantiated in lexical spatial categories are uniform crosslinguistically. Using the TOPOLOGICAL RELATION PICTURE SERIES by Bowerman and Pederson (the same as the one described in Section 2.3.) as their elicitation tool, the authors compare the semantics of spatial adpositions in nine unrelated languages and reach the following conclusions: 1) that the crosslinguistic differences they observe in the usage of adpositions are so significant that they make a strong version of the ‘universal conceptual categories hypothesis’ (such as the one we propose in Section 3.2.)
untenable; but also 2) that in view of the fact that crosslinguistically the various meanings of spatial adpositions seem to emerge as compact subsets of an underlying semantic space (with certain central areas clearly identifiable), it might be possible that there exist some hierarchies for spatial adpositions. The authors propose that these be treated as successive divisions of semantic space (much like the treatment of basic color terms). Now, one of the first things that meets the eye when one considers the problems posed by Levinson and Meira (2003), particularly with respect to our conclusions in Section 3.2., is that the authors want to represent (the usage of) too many prepositions (i.e. IN, ON, UNDER, AT, NEAR etc.) in a two-dimensional space.14 Namely, if we take our analysis of the ‘on’ – ‘in’ range presented in Section 2.3. (Figure 2) and Section 3.2. (Figure 6), we see that in the central portions of the cline many languages introduce a third/fourth
etc. preposition, but always in a ‘compact’ manner. Taking as an example the Italian preposition a (partly translatable with the English at), we see that it can be used interchangeably with ON in Categories 3 (‘clingy attachment’ – i.e. in Italian a drop can be both su (‘on’) the window and a (‘at’) the window) through 7 (‘encircle with contact’ – i.e. in Italian a ring is either su (‘on’) the finger or a (‘at’) the finger). Now, Categories 3 through 7 in the case of Italian exemplify only some of the categorial uses of the preposition a. It might be possible, given this observation, that Categories 3–7 represent only a portion of the ‘on’ – ‘in’ (planar) semantic space which is ‘intersected’ by the (planar) space of the Italian a (which would, however, probably have some categories preceding Categories 3–7 of the ‘on’ – ‘in’ plane, and also some following them). The same reasoning applies to e.g. the very intriguing case of the Croatian preposition o (partly equivalent to the English against), which can be used interchangeably with na (‘on’) in Category 6 (‘point-to-point attachment’), with very clearly lexicalized features of ‘force dynamics’ and motion, or some sort of intrinsic vectorial representation (cf. Brala 2004). Notwithstanding all the objections, even Levinson and Meira do not conclusively state that it is impossible to systematize on a continuum all the 71 pictures (i.e. all the resulting prepositions) used in the study, only that this task is undermined by computational intractability (Levinson and Meira 2003: 499). The authors do, however, recognize that Bowerman and Pederson (1992) have managed to systematize a cline for the ‘on’ – ‘in’ spatial relations, where categories (shown in Figure 5) map continuously, no matter which one of the 33 languages studied one chooses.
Such coherence in the domain of spatial categories is ultimately allowed for by Levinson and Meira (2003), in the following terms:

    Languages may disagree on the ‘cuts’ through [the] semantic space, but agree on the underlying organization of space – that is, the conceptual space formed by topological notions is coherent, such that certain notions will have fixed neighbourhood relations. (Levinson and Meira 2003: 498)
Curiously, while Bowerman and Pederson, who first observed the ‘on’ – ‘in’ continuum, fail to account for its underlying mechanisms, Levinson and Meira, who basically challenge the idea of systematic categorial coherence (in the domain of space), do try to propose a view that could account for (at least part of) the underlying (or deep) organization of space, possibly that part of spatial information structure that can be posited as being shared between
language and other sub-systems of human cognition (or rather, that can be viewed as belonging to some sort of ‘spatial ontology’). In the light of the analysis proposed in Section 3.2. above, I strongly agree with the view of Levinson and Meira (2003: 508) when they state that “crosslinguistically certain extensional classes tend to be shared”. Furthermore, and very interestingly, these authors note that “these [classes] cluster around notions such as attachment, superadjacency, full containment, subjacency, and proximity” (Levinson and Meira 2003: 508), which is yet another finding in the current literature that parallels the analysis proposed in Section 3.2. Another such close parallel – in even stronger correlation with my own findings, given its organization of concepts at different levels or hierarchies – is the work by Skopeteas (this vol.), who maintains that the lexical (and conceptual) distinctions in the domain of ‘on’ (or rather, ‘onness’) can, in different languages, be reconciled when (crosslinguistic) lexical analysis is conducted at various levels of taxonomic relations (where the superordinate domain of SUPERPOSITION is interestingly split into two at the next, subordinate level, where we find the features +/–ATTACHMENT). Concluding this treatment of our findings in relation to current proposals in the literature, one last remark needs to be made: as far as the discussion of the cognitive basis of language is concerned, it is important to observe that the division proposed above and summarized in Figure 5 is interestingly paralleled by some results from studies of the brain, i.e. by plenty of neurobiological evidence (cf. Bloom et al. 1996; Deacon 1997).¹⁵
In ending this section let us also note that the advantage of the universal ‘domains’, with the language-specific realizations of features within these domains as well as the language-specific patterning of these features proposed in this work, is that they seem to allow for crosslinguistic diversity (i.e. they seem to justify semantic variability within the category of prepositions at the crosslinguistic level) while still – through the universality of domains – maintaining (and justifying) the cognitively based view of language. This view is further developed in the concluding remarks of this paper.
4. Conclusion
Following Piaget and Inhelder (1956), most analyses of spatial language have assumed that the simplest spatial notions are topological and universal. Furthermore, it has been generally assumed that languages provide a set of closed class words for the linguistic coding of these topological, universal concepts
(e.g. Talmy 2000). Recent psycholinguistic work (e.g. studies by the Max-Planck-Institute Space and Cognition Group), focusing on crosslinguistic semantic coding of spatial adpositions, has partly undermined these assumptions. The aim of this paper has been to a) propose a unified analysis (for the ‘on’ – ‘in’ range) that would account for crosslinguistic variation, i.e. categorical (semantic) differences, while not undermining linguistic universality, and b) suggest that the mechanism proposed for the analysis of the ‘on’ – ‘in’ range, or rather, the domains and features proposed in Section 3.2., should, if correct, work for other (closed-class) crosslinguistic semantic categorial distinctions (spatial but perhaps also non-spatial). One of the key questions of this volume is: what is the relation between the ontologies (systems of conceptualizations) in our minds and the various languages we speak? My contribution to the answer to this key question can be summarized in the following concluding points:
– Semantic knowledge would, at least in part, presuppose ontological knowledge (this is true at least at the deep, I-level, possibly the level of semantic primes). This knowledge exists in our nervous system, inside a body that lives in constant interaction with a physical and social world. However, this is not an ‘objective’ world, but rather a world registered through a series of sensory systems organized into modalities, which include vision, hearing, kinesthesis (i.e. limb position), vestibular senses (i.e. gravity and body acceleration) etc. The structure of (spatial) concepts in everyday ontologies of humans is determined by the structure of the channels through which we perceive space, or rather by the structure of the information perceived (overall).
– Elements that belong to both the linguistic and ontological knowledge (such as those of DIMENSIONALITY, ORIENTATION, and ATTACHMENT proposed in this paper) can restrict the codability (mechanisms of linguistic coding), both in terms of word class and in terms of the categories available.
– Linguistic universality (primes) and crosslinguistic variation can be reconciled once we posit two levels of lexicalization: the (deep) level of conceptual universals, and the surface level of lexical patterns (some sort of ‘molecular level’). The deep, ontological level, or the conceptual level, is not planar but rather hierarchical (cf. the ‘tree’ by Vandeloise in Figure 4), and different languages associate words with prelinguistic concepts at different levels of generality (e.g. the fact that the Spanish and Portuguese prepositional systems provide only one label for the whole range of ‘on’ – ‘in’ usages in Figure 3, while e.g. Dutch and German each need three lexical items for the same range, can be explained by the fact that in Spanish and Portuguese lexical mechanisms (see ‘selectional functions’ below) operate at a higher level (i.e. at the level of ‘control’ – cf. Figure 4) than in Dutch and German (which operate at lower levels)).
– Given the previous point, ontology is perhaps best investigated (and represented, at least for the purposes of linguistic ontologies) as a complex web of vertical and horizontal relations, which run in many directions, and the entirety of which cannot be represented in a single planar space.
Going back, once again, to Cooper’s (1968) analysis of 33 locational prepositions, aimed at providing the basis for the extraction of the semantic reading for PPs, we recall that she defines prepositions as complex relational markers of the form R(f(x), g(y)), where ‘f’ and ‘g’ are what Cooper calls ‘function markers’, R is the ‘relation marker’, and ‘x’ and ‘y’ are the objects to be related. Relating the formula to our ‘algorithm’ introduced in Section 3.3., I would like to suggest rewriting the above formula as R(f(F), g(G)), where F and G stand, as said earlier, for Figure and Ground; f and g are the ‘selectional functions’ operating on Figure and Ground and selecting their semantically salient part (with the output respectively F’ and G’); and R is the ‘relational function’ putting F’ and G’ into relation (a relation for which a given language provides a lexical label). Now, it is crucial to note that f-s and g-s are not to be confused with F’-s and G’-s. In fact, f-s and g-s are not parts of any concrete, real object, but rather selectional functions which have the potential to select parts of real objects and assign them salience for the purpose of lexicalization (select e.g. points (1 DIM), horizontal or vertical axes (2 DIM, + ORIENTATION), volumes (3 DIM) etc.). On the other hand, F’-s and G’-s are the output of the selectional function once it has operated on a concrete F and a concrete G. Since F and G can be anything at all (real or fictional, objects, people, abstract entities, entities yet to be invented . . . ), it follows that F’-s and G’-s are part of an unbounded set which cannot be cognitively predisposed. Only elements such as axes, circles, control, containment, support or attachment (tight fit), as part of a closed set of primitive traits associated with universal selectional functions (active for the purposes of lexicalization of relations established between F’-s and G’-s), could be cognitively predisposed.
Language specificity is from this perspective not due to the fact that languages associate words with different prelinguistic concepts,
but rather that they do so at different levels of generality (i.e. f-s and g-s do not operate at the same level in all languages; for a thorough exemplification of how the mechanism works and its relation to crosslinguistic differences see Brala 2000, Chapter 3). Finally, I would like to point out here that the ultimate objective of this study cannot be confined to providing a simple inventory of the spatial relations that are profiled by each preposition. Instead, the aim would ideally be to get, in the final analysis, to the core of the universality realm by eventually replacing the question ‘what are the profiled relations?’ with the more ambitious question ‘what relations are profilable?’¹⁶ Of course, looking only at a couple of prepositions does not justify making any definitive claims about the profilable relations. However, my intention here has been to propose a thorough exercise in cognitive semantic decomposition, and to suggest a set of primitives (DIMENSIONALITY, ORIENTATION, and ATTACHMENT) that could, or rather should, subsequently be probed in other linguistic contexts and explored in other languages. One of the many interesting questions in this context is: are the three features employed for the semantic decomposition and systematization of the ‘on’ – ‘in’ range in this paper encoded by (and able to account for crosslinguistic variation of at least some) other closed class lexical items (e.g. case endings, derivational morphemes etc.)? Current research (e.g. Brala 2004; Talmy 2000, this vol.) is yielding some very encouraging answers in this respect.
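The rewritten formula R(f(F), g(G)) discussed in this conclusion can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the names `Entity`, `selector` and `relate`, the three-item trait inventory, and the apple/bowl example are all hypothetical and are not taken from the analysis itself. What the sketch is meant to show is the division of labour: the selectional functions come from a closed set, whereas their outputs F’ and G’ depend on whatever Figure and Ground they are applied to.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical closed inventory of traits a selectional function may pick out.
TRAITS = ("point", "axis", "volume")


@dataclass(frozen=True)
class Entity:
    """A Figure or Ground; the set of entities is open-ended."""
    name: str
    traits: frozenset


def selector(trait: str) -> Callable[[Entity], str]:
    """Build a selectional function (an f or g) for one trait of the closed set."""
    if trait not in TRAITS:
        raise ValueError(f"{trait!r} is not in the closed trait set")

    def select(entity: Entity) -> str:
        # The output (F' or G') exists only once a concrete entity is supplied.
        if trait not in entity.traits:
            raise ValueError(f"{entity.name} offers no {trait}")
        return f"{entity.name}.{trait}"

    return select


def relate(label: str, f, g, figure: Entity, ground: Entity) -> str:
    """The relational function R: put F' = f(F) and G' = g(G) into relation."""
    return f"{label}({f(figure)}, {g(ground)})"


apple = Entity("apple", frozenset({"point", "volume"}))
bowl = Entity("bowl", frozenset({"volume"}))
print(relate("IN", selector("volume"), selector("volume"), apple, bowl))
# IN(apple.volume, bowl.volume)
```

On this rendering, crosslinguistic differences would correspond to languages choosing selectors at different levels of generality, while the inventory of possible selectors stays fixed.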
Notes
1. Although spatial relations are expressed both by prepositions and postpositions, given the fact that the language of this paper is English, which realizes the meanings which are the focus of our investigation by the closed class of prepositions, we shall be using this latter term. However, from the linguistic perspective, it needs to be pointed out that what is crucial is the meaning, and not the word class per se (so in Section 3.2. we shall not only be considering prepositions and postpositions but also spatial nominals and case endings – see also Note 5).
2. Sentence (1) is taken from Vandeloise (1986: 233), and Sentence (2) from Herskovits (1986: 16), see also Cuyckens (1993: 33).
3. For some of the languages under investigation there was no local counterpart for the (Western) cultural Figure object depicted (e.g. book), so a replacement parallel scenario had to be used. Although this remark relates to a limited number of drawings, it is important in the light of the fact that in this paper we assume the position that it is not only the features of G, but also the salient features of F that determine the choice of the lexical item used for coding the spatial relation between F and G.
4. The actual booklet by Bowerman and Pederson (1992, 2003) contains seventy-one drawings covering a large range of spatial relations coded in English by prepositions such as on, in, but also under, over, near, against, etc. However, out of the entire data set elicited using this so-called Topological Relations Picture Series (or TRPS), in this work we only consider the data relative to the relations coded by the English on and in, reason being that this set crosslinguistically yielded a particular and most puzzling ‘continuum’, which represents the core of the analysis proposed in this paper. For more details see Section 3. of this paper.
5. The study by Bowerman and Pederson is not about prepositions per se, but about the expression (or rather, semantic categorization) of ‘on’ – ‘in’ spatial relations in natural languages. Considered are: adpositions (as the lexical form most frequently used for the expression of the ‘on’ – ‘in’ relation), spatial nominals (used in, e.g., Japanese and Korean), and case endings (used, e.g., in Finnish). However, for the purposes of our paper and the language under immediate scrutiny i.e. English, we will talk only about prepositions. It should also be pointed out that the entire analysis by Bowerman and Pederson has been done for F (Figure) located in relation to G (Ground), and that reversing the relation often affects the lexical choice (frequently, the preposition needs to be changed when F and G are reversed).
6. E.g. in Hindi, Categories 5 – fixed attachment, and 6 – point-to-point-attachment, can be lexicalized by two prepositions: per or me. Categories before Category 5 are lexicalized by per only, categories from Category 7 by me.
7. Vandeloise (1998: 6) writes: “Even though some of the traits involved in the characterization of relationships container / content and bearer / burden like surrounding, contact, or order in the vertical axis are perceptually registered, the fundamental trait of control involved in containment and in support can only be noticed when it fails to work. In other words, while the kinetic mechanics is always noticeable, static mechanics involved in support and containment escapes the attention as long as the balance is respected” (i.e. as long as the function of control – be it containing or supporting – is ‘+’).
8. Following Lakoff’s (1987: 93) proposal of the ‘domain-of-experience-principle’, the term ‘domain’ is here used to refer to basic patterns of neural activation which ‘mean’ without being propositional. Domains are part of the set of most basic human ‘meaningful wholes’ (atomic units of cognitive patterning ‘wholes’, seen as being shared between perception and the language faculty) which do not ‘mean’ anything in particular, but, as domains, contain all possible patterns of neural activation i.e. the ‘patterning potential’ for all values the domain can assume.
9. Boundedness seems to play a very important role in semantic organization, with very clear manifestations on the syntactic level as well (see e.g. Talmy 2000, this vol.).
10. Which is exactly what we get if we first switch the places of the ‘support branch’ and the ‘attachment branch’ on the right hand-side of Vandeloise’s tree, and then switch the right and the left hand-side branch. In such a way we get a tree that can be perfectly mapped onto Bowerman’s ‘on’ – ‘in’ scheme.
11. This is not frequently noted, because most analyses of the FG relation focus on G, but it is crucial to observe that F’s position contributes greatly to lexicalization patterns, one of the best known examples being that of posture verbs (e.g. in Dutch it sounds pretty awkward to say ‘the book is on the shelf’. The book is either ‘standing’ or ‘lying’ on the shelf, depending on whether it is vertically or horizontally oriented; see Hellwig this vol.).
12. This sort of ‘prepositional definition’ is a ‘blend’ based on the different definitions of the prepositional meaning of in that I managed to find (Herskovits 1986; Quirk et al. 1985; Cooper 1968; Leech 1969; Bennett 1975; Bowerman 1996a, 1996b).
13. It should be pointed out here that our other two domains find ‘justification’ (from the NSM standpoint) in the primitives HERE ONE, TWO SIDE, INSIDE for the domain of ‘DIMENSIONALITY’, whereas the domain ‘ORIENTATION’ is here viewed as resulting from the set of the NSM primes relative to space coupled with the NSM intensifiers, partonomy and similarity, the combinations of which yield some sort of geometry which would include parallelness and perpendicularity i.e. at least the 90° angle, which is crucial for our analysis.
14. It should be pointed out here that Levinson and Meira (2003: 499) themselves do note that part of the problem they identify is due to the fact that their pictures (i.e. 71 drawings covering a large number of adpositions) are arranged and grouped in a two-dimensional space, and that the problem that they find with ‘lack of (semantic) contiguity’ in this space (or rather plane) is given exactly by this two-dimensionality. Levinson and Meira, unfortunately, dismiss the possibility of researching the arrangement of drawings (meanings) in a higher-dimensional space (not even as a combination of more two-dimensional planes, which might be the first option to test) as ‘a project that does not lend itself to a computational solution’.
15. It has been shown that a) spatial information in the brain is modal (we seem to have representations or maps of motor space, haptic space, auditory space, body space, egocentric space, and allocentric space; cf. Bloom et al. 1996). We note that the primitive, bodily based features proposed here as the bases of prepositional semantics seem to mirror the cognitive multimodality of spatial perception (i.e. ‘contact’ would mirror haptic space, ‘gravity’ body / motor space, and ‘orientation’ motor / visual space); and b) neural information about space does not include (detailed) representations of objects (in space), i.e. there seems to be a clear (although not total) separation between the neurobiological ‘what’ and ‘where’ systems (see Landau and Jackendoff (1993) who propose the divisions between the linguistic ‘what’ and ‘where’ systems, as well as Talmy’s (1983: 227) or Slobin’s (1985) proposals suggesting that the ‘what’ system is expressed by open class words, whereas the ‘where’ system is lexicalized by the closed class portion of language).
16. In fact, such a move is, in my view, indispensable (cf. also Hawkins’ (1993: 328) critique of Langacker’s (1987) Cognitive Grammar (CG) program for its self-imposed limitation boiling down to the claim that the search for universals among the inventory of profiled structures is outside the scope of CG).
Acknowledgements
I wish to thank Melissa Bowerman whose work on spatial language has inspired my own. The influence of her research is evident throughout this paper. I also wish to thank Henriette Hendriks and Keith Brown who, through fruitful discussions, have helped me refine many of the views expressed in this work.
References
Ameka, Felix
1995 The linguistic construction of space in Ewe. Cognitive Linguistics 6: 285–311.
Bennett, David C.
1975 Spatial and Temporal Uses of English Prepositions. London: Longman.
Bloom, Paul, Mary A. Peterson, Lyn Nadel, and Merrill F. Garrett (eds.)
1996 Language and Space. Cambridge, MA: MIT Press.
Bowerman, Melissa
1996a Learning how to structure space for language: A crosslinguistic perspective. In Bloom et al. (eds.), 385–436.
1996b The origins of children’s spatial semantic categories: Cognitive versus linguistic determinants. In Gumperz and Levinson (eds.), 145–176.
Bowerman, Melissa, and Soonja Choi
2001 Shaping meanings for language: Universal and language specific in the acquisition of spatial semantic categories. In Language Acquisition and Conceptual Development, Melissa Bowerman and Stephen C. Levinson (eds.), 475–511. Cambridge: Cambridge University Press.
Bowerman, Melissa, and Eric Pederson
1992 Cross-linguistic perspectives on topological spatial relations. Paper presented at the American Anthropological Association, San Francisco, CA, December 1992.
2003 Cross-linguistic perspectives on topological spatial relations. Eugene: University of Oregon, and Nijmegen: Max Planck Institute for Psycholinguistics, ms.
Brala, Marija M.
2000 English, Croatian and Italian prepositions from a cognitive perspective. When ‘at’ is ‘on’ and ‘on’ is ‘in’. Ph.D. diss., Research Centre for English and Applied Linguistics, University of Cambridge, U.K.
2002 Understanding and translating (spatial) prepositions: An exercise in cognitive semantics for lexicographic purposes. In Working Papers in English and Applied Linguistics Vol. 7, 1–24. Cambridge, U.K.: University of Cambridge.
2004 The story of ‘o’. Croatian prepositions as vectors. In prep.
Brown, Penelope
1994 The INs and ONs of Tzeltal locative expressions: The semantics of static descriptions of location. Linguistics 32: 743–790.
Choi, Soonja, and Melissa Bowerman
1991 Learning to express motion events in English and Korean: The influence of language-specific lexicalization patterns. Cognition 41: 83–121.
Choi, Soonja, Laraine McDonough, Melissa Bowerman, and Jean M. Mandler
1999 Early sensitivity to language specific spatial categories in English and Korean. Cognitive Development 14: 241–268.
Clark, Herbert H.
1973 Space, time, semantics and the child. In Cognitive Development and the Acquisition of Language, Timothy E. Moore (ed.), 27–63. New York: Academic Press.
Cooper, Gloria S.
1968 A semantic analysis of English locative prepositions. Bolt, Beranek, and Newman Report No. 1587.
Cuyckens, Hubert
1993 The Dutch spatial preposition ‘in’: A cognitive-semantic analysis. In Zelinsky-Wibbelt (ed.), 27–71.
Deacon, Terry
1997 The Symbolic Species: The Co-Evolution of Language and the Brain. New York/London: W. W. Norton & Co.
Goddard, Cliff
2002 On and on: Verbal explications for a polysemic network. Cognitive Linguistics 13 (3): 277–294.
Gopnik, Alison, and Andrew Meltzoff
1986 Words, plans, things and locations: Interactions between semantic and cognitive development in the one-word stage. In The Development of Word Meaning, Stan A. Kuczaj and Martyn D. Barrett (eds.), 199–223. New York: Springer.
Gregory, Richard
1998 Eye and Brain: The Psychology of Seeing. Oxford: Oxford University Press.
Gumperz, John J., and Stephen C. Levinson (eds.)
1996 Rethinking Linguistic Relativity. Cambridge: Cambridge University Press.
Hawkins, Bruce W.
1993 On universality and variability in the semantics of spatial adpositions. In Zelinsky-Wibbelt (ed.), 327–349.
Hellwig, Birgit
this vol. Postural categories and the classification of nominal concepts: A case study of Goemai.
Herskovits, Annette
1986 Language and Spatial Cognition. An Interdisciplinary Study of the Prepositions in English. Cambridge: Cambridge University Press.
Jackendoff, Ray
1996 The architecture of the linguistic-spatial interface. In Bloom et al. (eds.), 1–30.
Johnston, Judith R.
1984 Acquisition of locative meanings: Behind and in front of. Journal of Child Language 11: 407–422.
Johnston, Judith R., and Dan I. Slobin
1979 The development of locative expressions in English, Italian, Serbo-Croatian and Turkish. Journal of Child Language 6: 529–545.
Lakoff, George
1987 Women, Fire, and Dangerous Things. What Categories Reveal about the Mind. Chicago: University of Chicago Press.
Landau, Barbara, and Ray Jackendoff
1993 ‘What’ and ‘where’ in spatial language and spatial cognition. Behavioral and Brain Sciences 16: 217–265.
Langacker, Ronald W.
1987 Foundations of Cognitive Grammar. Vol. I: Theoretical Prerequisites. Stanford: Stanford University Press.
Leech, Geoffrey N.
1969 Towards a Semantic Description of English. London: Longman.
Levinson, Stephen C.
1996 Relativity in spatial conception and description. In Gumperz and Levinson (eds.), 177–202.
2003 Space in Language and Cognition: Explorations in Cognitive Diversity. Cambridge: Cambridge University Press.
Levinson, Stephen C., and Sérgio Meira
2003 ‘Natural concepts’ in the spatial topological domain – adpositional meanings in crosslinguistic perspective: An exercise in semantic typology. Language 79 (3): 485–516.
Levin, Beth, and Steven Pinker (eds.)
1991 Lexical and Conceptual Semantics. Oxford: Blackwell.
Lindstromberg, Seth
1997 English Prepositions Explained. Amsterdam: John Benjamins.
2001 Preposition entries in UK monolingual learner’s dictionaries: Problems and possible solutions. Applied Linguistics 22 (1): 79–103.
Mandler, Jean M.
1996 Preverbal representation and language. In Bloom et al. (eds.), 365–384.
Piaget, Jean, and Bärbel Inhelder
1956 The Child’s Conception of Space. London: Routledge & Kegan Paul.
Quirk, Randolph, Sydney Greenbaum, Geoffrey Leech, and Jan Svartvik
1985 A Comprehensive Grammar of the English Language. London: Longman.
Sauvan, Xavier M., and Esther Peterhans
1999 Orientation constancy in neurons of monkey visual cortex. Visual Cognition 6 (1): 43–54.
Skopeteas, Stavros
this vol. Semantic categorizations and encoding strategies.
Slobin, Dan I.
1985 Crosslinguistic evidence for the language-making capacity. In The Crosslinguistic Study of Language Acquisition. Vol. 2: Theoretical Issues, Dan I. Slobin (ed.), 1157–1256. Hillsdale, NJ: Lawrence Erlbaum.
1996 From ‘thought and language’ to ‘thinking for speaking’. In Gumperz and Levinson (eds.), 70–96.
Talmy, Leonard
1983 How language structures space. In Spatial Orientation: Theory, Research, and Application, Herbert L. Pick and Linda P. Acredolo (eds.), 225–282. New York: Plenum Press.
2000 Toward a Cognitive Semantics. Vol. I: Concept Structuring System & Vol. II: Typology and Process in Concept Structuring. Cambridge, MA/London: MIT Press.
this vol. The representation of spatial structure in spoken and signed language: A neural model.
Vandeloise, Claude
1986 L’espace en français: Sémantique des prépositions spatiales. (Travaux Linguistiques.) Paris: Editions du Seuil. [Translated as (1991). Spatial Prepositions: A Case Study from French. Chicago: The University of Chicago Press.]
1998 Containment, support and linguistic relativity, ms.
Weist, Richard M.
1991 Spatial and temporal location in child language. First Language 11: 253–267.
Zelinsky-Wibbelt, Cornelia (ed.)
1993 The Semantics of Prepositions. From Mental Processing to Natural Language Processing. Berlin/New York: Mouton de Gruyter.
Semantic categorizations and encoding strategies
Stavros Skopeteas
1. Introduction¹
Languages differ with respect to the lexicalization of the conceptual distinction between the region SUPERIOR & CONTACT (cf. English on), i.e. in a place which is higher than the place occupied by the landmark and in contact with it, and the region SUPERIOR & NON-CONTACT (cf. English above), i.e. in a place which is higher than the place occupied by the landmark and without contact with it. Both concepts are instances of the superordinate concept of SUPERIOR, i.e. the space that is in the positive domain of a vertical coordinate which originates at the landmark. The typological variation concerning the encoding of these concepts comprises: (a) languages that lexicalize the subordinate concepts, e.g. German and Russian; (b) languages that only lexicalize the superordinate concept, e.g. Korean and Yucatec Maya; (c) languages that lexicalize the superordinate concept and one of the subordinate concepts, e.g. Nanafwe; and (d) languages that lexicalize the superordinate concept in one paradigm of local relators and the subordinate concepts in another, e.g. Modern Greek. Speakers of these representative languages have participated in a number of interactive games, in which either the superordinate or the subordinate concept of the taxonomic relation at issue was required in order to fulfill the game tasks. The collected results show that the diversity in semantic categorizations partially determines the encoding strategy used. In particular, languages differ in (a) the lexicalization pattern they choose in order to encode identical concepts, (b) the concept they choose in order to conceptualize identical situations, and (c) the semantic vs. pragmatic conveyance of the same concept.
2. Preliminary remarks
2.1. Encoding taxonomical relations
A classical problem in semantic categorization is the mismatch between languages providing different exponents with respect to a taxonomical relation: given a taxonomy of a superordinate concept x and some subordinate concepts y and z, it is a usual situation that a language L1 lexicalizes x whereas some other language L2 lexicalizes y and z. The case study presented in this paper is concerned with the distinction between two spatial regions: the region of SUPERIOR & CONTACT and the region of SUPERIOR & NON-CONTACT (see definitions in Section 2.4.). English encodes this semantic distinction by the prepositions on and above. Setting aside the polysemy of these two prepositions, they both express location in the space in the positive domain of a vertical coordinate originating at the landmark (SUPERIOR). Furthermore, the preposition on expresses location in a place that is in contact with the place occupied by the landmark, whereas the preposition above expresses location in a place that is not in contact with it. The superordinate concept in this case is the region of SUPERIOR, which only carries one feature and is not specified with respect to CONTACT / NON-CONTACT. The concepts SUPERIOR & CONTACT and SUPERIOR & NON-CONTACT are conceptually subordinated to the concept SUPERIOR, since their instances necessarily instantiate SUPERIOR as well. The relation among the subordinate concepts is one of conceptual incompatibility in terms of Schalley and Zaefferer (this vol.), since they are complementary. From a cross-linguistic perspective, languages differ with respect to the encoding of this categorization: (a) the subordinate-encoding type includes languages that lexicalize the subordinate concepts SUPERIOR & CONTACT and SUPERIOR & NON-CONTACT (see Section 3.); (b) the superordinate-encoding type includes languages that do not make this distinction and provide only one element lexicalizing the concept of SUPERIOR (see Section 4.); and (c) the mixed type includes languages that lexicalize the superordinate concept of SUPERIOR and one of the subordinate concepts (see Section 5.). Furthermore, some languages may display internal variation as regards this typology, thus belonging to different types with respect to different constructions. This encoding type is treated as a split system (see Section 6.).
The purpose of this paper is to explore the impact of the typological diversity in semantic classifications upon the performance of communicative tasks. By means of an interactive game, outlined in Section 2.2., speakers of the above language types are exposed to identical discourse situations. The data gained on the basis of this experiment differ with respect to the syntactic, semantic, and pragmatic means that speakers of different languages use to perform identical tasks.
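The four encoding types distinguished in this section can be restated as a small decision procedure over the set of concepts that a paradigm of local relators lexicalizes. The Python sketch below is an illustration only: the function names `paradigm_type` and `language_type` and the input format are assumptions, and the sample inputs are schematic rather than analyses of particular languages.

```python
SUPERIOR = "SUPERIOR"
CONTACT = "SUPERIOR & CONTACT"
NON_CONTACT = "SUPERIOR & NON-CONTACT"


def paradigm_type(lexicalized):
    """Classify one paradigm by the concepts it lexicalizes."""
    concepts = set(lexicalized)
    subordinates = concepts & {CONTACT, NON_CONTACT}
    if SUPERIOR not in concepts and len(subordinates) == 2:
        return "subordinate-encoding"
    if SUPERIOR in concepts and not subordinates:
        return "superordinate-encoding"
    if SUPERIOR in concepts and len(subordinates) == 1:
        return "mixed"
    return "unclassified"


def language_type(paradigms):
    """A language whose paradigms fall into different types is a split system."""
    if not paradigms:
        raise ValueError("at least one paradigm is required")
    types = {paradigm_type(p) for p in paradigms}
    return types.pop() if len(types) == 1 else "split"


# Schematic inputs, one per type described in the text.
print(language_type([{CONTACT, NON_CONTACT}]))              # subordinate-encoding
print(language_type([{SUPERIOR}]))                          # superordinate-encoding
print(language_type([{SUPERIOR, CONTACT}]))                 # mixed
print(language_type([{SUPERIOR}, {CONTACT, NON_CONTACT}]))  # split
```

Treating the split system as a property of the language rather than of a single paradigm follows the description above, where a language belongs to different types with respect to different constructions.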
2.2. Experimental setting
The empirical data was gained through a series of interactive games that were designed to collect data on spatial descriptions concerning the distinction of CONTACT vs. NON-CONTACT in the domain of SUPERIOR. The games are performed by two consultants, one taking the role of the director and one taking the role of the matcher. The director holds a series of picture pairs in randomized order. In each pair one picture is highlighted. The matcher is given the same series in identical order, but his series lacks highlighting. The instruction to the director has been the following: “Describe to your partner the highlighted picture in each pair”. The matcher has been instructed to listen to his partner and to point to the highlighted picture. The pairs used in the games are presented in Table 1.
Table 1. Experimental conditions
              | highlighted pictures | background pictures
Condition 1   | SUPERIOR & CONTACT | SUPERIOR & NON-CONTACT
Game 1        | a bird sitting on the elephant | a bird flying above an elephant
Game 2        | a hand holding a candle on a table | a hand holding a candle above a table
Game 3        | a pot hanging on the fire | a pot hanging above the fire
Condition 2   | SUPERIOR & NON-CONTACT | SUPERIOR & CONTACT
Game 4        | a bird flying above an elephant | a bird sitting on the elephant
Game 5        | a hand holding a candle above a table | a hand holding a candle on a table
Game 6        | a pot hanging above the fire | a pot hanging on the fire
Condition 3   | SUPERIOR | NON-SUPERIOR
Game 7        | a bird flying above an elephant | a bird flying under an elephant
Game 8        | a hand holding a candle above a table | a hand holding a candle at the left side of a table
Game 9        | a pot hanging on the fire | a pot hanging at the right side of the fire
Game 10       | a pot hanging above the fire | a pot hanging at the right side of the fire
Condition 4²  | SUPERIOR & NON-CONTACT | NON-SUPERIOR
Game 11       | a bird sitting on a tree branch exactly above an elephant | a bird sitting on a tree branch not being above the elephant
Game 12       | a hand holding a candle on a stack of books that are on a table | a hand holding a candle at the left side of a stack of books that are on a table
Game 13       | a pot being on a table exactly above a fire which is underneath the table | a pot being on a table not above the fire which is underneath the table
Stavros Skopeteas
There are some consequences of the selected methodology that have to be mentioned in order to delimit the scope of the resulting generalizations. First of all, elicitation through pre-constructed discourse situations does not render naturalistic data about human communication. Speakers of different cultures are exposed to identical discourse situations and thus perform identical tasks, but it is possible that the situations compared do not occur with identical importance and frequency in the natural communication of different interacting communities. However, the exploration of cultural diversity is beyond the scope of the methodology of this paper. Second, the games are designed for situations that differ with respect to the distinction between CONTACT / NON-CONTACT between a localized object and a landmark. Consequently, the data gained supply generalizations about only one aspect of the meaning of the spatial relators used in this context. Further distinctions in the domain of SUPERIOR that possibly interact with the distinction CONTACT / NON-CONTACT may not be accounted for through these experimental items.³ Furthermore, the descriptions gained through this experimental setting are induced through the contrast between a highlighted situation and a background situation. The collected descriptions would not necessarily be the same if the highlighted pictures were presented in isolation, and hence the results do not allow for generalizations about the encoding of the highlighted situations without contrast.
2.3. Language sample
Speakers of representative languages for the types introduced in Section 2.1. participated in the production experiment. The representative languages include: (a) two languages that encode the subordinate concepts: German and Russian; (b) two languages that encode the superordinate concept: Korean and Yucatec Maya; (c) a language representing the mixed systems, namely Nanafwe (dialect of Baole; Kwa: Ivory Coast), which encodes the superordinate concept of SUPERIOR and the subordinate concept of SUPERIOR & NON-CONTACT; and (d) a language providing a constructional split, namely Modern Greek, which belongs to the subordinate-encoding type as regards the prepositional paradigm and to the superordinate-encoding type as regards the adverbial paradigm.
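The four-way typology applied to the language sample can be sketched as a small classifier. The Python fragment below is an illustration only: the function names, the paradigm labels, and the encoding of each paradigm as three boolean flags (superordinate, SUPERIOR & CONTACT, SUPERIOR & NON-CONTACT) are assumptions introduced for the example, and the flag values simply restate the classifications given in the text.

```python
def encoding_type(superordinate: bool, contact: bool, non_contact: bool) -> str:
    """Classify one paradigm by which concepts in the SUPERIOR domain it lexicalizes."""
    subordinate_count = int(contact) + int(non_contact)
    if not superordinate and subordinate_count == 2:
        return "subordinate-encoding"
    if superordinate and subordinate_count == 0:
        return "superordinate-encoding"
    if superordinate and subordinate_count == 1:
        return "mixed"
    return "unclassified"  # combinations that the typology in the text does not name


def language_type(paradigms: dict) -> str:
    """A language whose paradigms fall into different types is a split system."""
    types = {encoding_type(*flags) for flags in paradigms.values()}
    return types.pop() if len(types) == 1 else "split"


# Flags: (superordinate, SUPERIOR & CONTACT, SUPERIOR & NON-CONTACT).
# Paradigm labels are hypothetical; values follow the sample description.
sample = {
    "German": {"adpositions": (False, True, True)},
    "Korean": {"adpositions": (True, False, False)},
    "Nanafwe": {"adpositions": (True, False, True)},
    "Modern Greek": {
        "prepositions": (False, True, True),
        "adverbs": (True, False, False),
    },
}

for name, paradigms in sample.items():
    print(name, "->", language_type(paradigms))
```

On this encoding, a split system is not a fifth inventory type but a property of a language whose paradigms receive different classifications.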
2.4. Semantic representations
The local situations of the experiment and the semantics of the collected expressions are represented in predicate logic in terms of Wunderlich and Herweg (1991). The localization of an entity is treated as a predicate with two arguments: the localized object x and the place it occupies in space. This place is part of the space defined by a region u_j of a landmark y.
(1) λx λy LOC(x, u_j(y)) (see Wunderlich and Herweg 1991: 772)
The spatial region termed SUPERIOR contains the space in the positive domain of the vertical lines that fall within the outline of the landmark y. The semantics of a local relator which encodes the superordinate concept of SUPERIOR and is underspecified for CONTACT / NON-CONTACT, as for instance the relational noun šinì in Mixtec, is given in (2):
(2) šinì: λx λy LOC(x, SUPERIOR(y))
The subordinate concepts combine the concept of SUPERIOR with the concepts of CONTACT / NON-CONTACT. It should be mentioned that the concept of CONTACT does not necessarily imply physical contact between the localized object and the landmark (see Klein 1991: 96; Brala this vol.). The classical example is that the sentence the glass is on the table is true even if a tablecloth is on the table and hence intervenes between the surface of the table and the glass. Following Klein (1991), we assume that the place occupied by the table is conceptualized as contiguous to the place occupied by the glass, insofar as the place occupied by the tablecloth is not relevant enough to be chosen as a landmark. The concept of CONTACT will be treated in the context of our experiment as a spatial region and not as a predicate independent from the localizing function (see alternative representations in Wunderlich and Herweg 1991: 778). In the experimental stimuli, the contrasted pictures differ with respect to the localization of one entity. This contrast induces expressions that identify a certain search domain in space, in which the hearer should find the localized object. In this sense, CONTACT is relevant for this discourse task only as a spatial region. Defined as such, the concept of CONTACT is a part of the space in proximity of the landmark y, termed EXT(y). Within EXT(y), the part specified by the region of CONTACT is the space in which the place occupied by a localized object x and the place occupied by y are (conceptually and not necessarily physically) contiguous. This region will be represented as EXT^C(y) following Wunderlich and Herweg (1991: 778) and will be referred to in the running text simply as CONTACT. The representations of the English prepositions on and above in (3)–(4) illustrate the semantics of local relators that encode the subordinate concepts:
(3) on: λx λy LOC(x, [SUPERIOR(y) & EXT^C(y)])
(4) above: λx λy LOC(x, [SUPERIOR(y) & ¬EXT^C(y)])
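The representations of the superordinate relator in (2) and of on and above in (3)–(4) can be made concrete in a small sketch. The Python fragment below is an illustration only, not part of the formalism of Wunderlich and Herweg (1991): it reduces a scene to two hypothetical boolean features and treats each local relator as a predicate over such scenes. The names Scene, superior, on, and above are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scene:
    """Toy discourse situation: where the localized object x is relative to landmark y."""
    superior: bool  # x lies in the SUPERIOR region of y
    contact: bool   # x lies in the contact region of y (conceptually contiguous)


def superior(s: Scene) -> bool:
    # Superordinate relator: LOC(x, SUPERIOR(y)), unspecified for contact.
    return s.superior


def on(s: Scene) -> bool:
    # Subordinate relator: LOC(x, [SUPERIOR(y) & contact region of y]).
    return s.superior and s.contact


def above(s: Scene) -> bool:
    # Subordinate relator: LOC(x, [SUPERIOR(y) & NOT contact region of y]).
    return s.superior and not s.contact


pot_on_fire = Scene(superior=True, contact=True)
pot_above_fire = Scene(superior=True, contact=False)
pot_beside_fire = Scene(superior=False, contact=False)

for scene in (pot_on_fire, pot_above_fire, pot_beside_fire):
    print(scene, superior(scene), on(scene), above(scene))
```

Under these assumptions, on and above are mutually exclusive and each entails superior, which mirrors the subordination relation between the concepts.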
As already mentioned in Section 2.2., the semantic properties which are inspected through our experiment are induced through the contrast of situations which involve the concepts of SUPERIOR and CONTACT. It has been shown that spatial prepositions in different languages usually contain further semantic properties that restrict their use in identifying search domains, notably functional properties concerning the relation between the localized object and the landmark (see Aurnague and Vieu 1993: 419–422).⁴ Semantic properties beyond the concepts of SUPERIOR and CONTACT are not accounted for through the current experimental design. In this sense, the semantic representations in this paper should be treated as partial representations containing only the variables which are experimentally manipulated.
The semantic representations of the local relators in the object languages will be compared with the language-independent conceptual structure which is assumed by the experimental manipulation. Since the aim of this paper is to enable the comparison between language-independent discourse situations and language-specific expressions, the representations of the conceptual structure will have the same formal elements as the representations of language-specific semantics. We assume that the presentation of two pictures with two identical entities establishes the existence of two entities in the mental model of the informant. The design of the situations forces the choice of a localized object and a landmark through a twofold asymmetry of the involved entities. In terms of salience, all pictures present pairs of asymmetrical entities (bird/elephant, candle/table, pot/fire), whereby the entity which is located in a higher location in the picture is more salient (i.e. smaller and more movable) than the entity which is located in a lower location. In terms of information structure, the intended landmark occupies a given place in both pictures and the intended localized object occupies different places in the highlighted picture and the background one. These asymmetries induce descriptions in which the landmark is the less salient entity occupying a given location, i.e. the collected results only contain descriptions of the kind the candle is under the table and not descriptions of the kind the table is under the candle. Since this asymmetry is kept experimentally constant and since it uniformly induces the same role choice in all languages, it will be part of the language-independent representation of the discourse situation, which is illustrated for Condition 2 in (5). This representation contains the localization presented in the highlighted picture in the second line and the localization presented in the background picture in the next line.
(5) Condition 2: ∃x∃y
    highlighted picture: [LOC(x, [SUPERIOR(y) & ¬EXT^C(y)]) &
    background picture: ¬LOC(x, [SUPERIOR(y) & EXT^C(y)])]
3. Subordinate encoding languages
Languages like German and Russian encode the subordinate concepts and not the superordinate one, at least by means of prepositions, as illustrated in (6) for German.
(6) German
    superordinate (SUPERIOR) → –
    subordinate (SUPERIOR & EXT^C) → auf
    subordinate (SUPERIOR & ¬EXT^C) → über
Speakers of this language type include the concept of CONTACT / NON-CONTACT in their descriptions, irrespective of its relevance for the discourse situation. Thus, German speakers use expressions like (7) both for the discourse situation of Game 6, in which ‘the pot above the fire’ is contrasted with ‘the pot on the fire’, and for the discourse situation of Game 10, in which the concept of NON-CONTACT is not needed for the task, since the background picture shows the pot beside the fire rather than above it. The same holds for the Russian expression in (8),⁵ which applies to Game 4 and to Game 7 as well.
(7) Der Topf hängt über dem Feuer.
    DEF:NOM.SG.M pot(NOM.SG.M) hang:3.SG above DEF:DAT.SG.N fire(DAT.SG.N)
    ‘The pot hangs above the fire.’ (Games 6 & 10)
(8) ptíca nad slonóm.
    bird:NOM.SG.F above elephant:INSTR.SG.M
    ‘A bird is above an elephant.’ (Games 4 & 7)
The encoding strategy of this language type is represented in (9)–(10). The discourse situation in (9) represents the case in which NON-CONTACT is the crucial concept for the identification of the highlighted picture. In the discourse situation in (10) the concept of NON-CONTACT is not relevant, since the highlighted picture may be successfully identified through the concept of SUPERIOR. In both cases, languages of the subordinate-encoding type use an expression that includes the concept of NON-CONTACT.
(9) Games 4 and 6: ∃x∃y
    highlighted picture: [LOC(x, [SUPERIOR(y) & ¬EXT^C(y)]) &
    background picture: ¬LOC(x, [SUPERIOR(y) & EXT^C(y)])]
    local relator (subordinate encoding languages): λx λy LOC(x, [SUPERIOR(y) & ¬EXT^C(y)])
(10) Games 7 and 10: ∃x∃y
    highlighted picture: [LOC(x, [SUPERIOR(y) & ¬EXT^C(y)]) &
    background picture: ¬LOC(x, ¬SUPERIOR(y))]
    local relator (subordinate encoding languages): λx λy LOC(x, [SUPERIOR(y) & ¬EXT^C(y)])
4. Superordinate encoding languages
The second language type includes languages that encode the superordinate but not the subordinate concepts. This pattern occurs in Mixtec as shown in (11). The same relational noun šini occurs in examples that imply CONTACT to the landmark, e.g. ‘a person on the top of the tree’, and in examples that exclude CONTACT, e.g. ‘a bird flying above the tree’ (see Macaulay 1996: 173, 179).
(11) Mixtec, Oto-Manguean (Macaulay 1996: 173)
    superordinate (SUPERIOR) → šinì
    subordinate (SUPERIOR & CONTACT) → –
    subordinate (SUPERIOR & ¬CONTACT) → –
The language type exemplified in (11) is very widespread: it occurs in Native American languages like Mixtec, Yucatec Maya (cf. (13)–(14)), and Imbabura Quechua (Cole 1985: 122–123), in East Asian languages like Japanese (see Section 4.2.1.) and Korean (cf. (12)), in Altaic languages like Turkish (Kornfilt 1997: 246–247), and in Niger-Congo languages like Koromfe (Rennison 1997: 175–178).
4.1. Discourse situations profiling the superordinate concept
The expression of CONTACT / NON-CONTACT in these languages mainly depends on the relevance of this concept for the discourse situation, i.e. speakers disregard this information if it is not relevant for the task. This is illustrated for Korean in (12). In both Games 9 and 10, Korean speakers use the same expression to identify the highlighted picture, although in the former the pot is hanging in contact with the fire and in the latter it is hanging without contact with it. In both games, the background picture shows a pot that is hanging not directly above the fire. The description in (12) only expresses the concept of SUPERIOR.
(12) suphwu-ka pul uy-ey iss-ta
     soup-NOM fire on/above-LOC be-DECL
     ‘The soup is on/above the fire.’ (Games 9 & 10)
The same pattern occurs in Yucatec Maya: both (13) and (14) convey the highlighted location by means of the preposition yóok’ol ‘on/above’, although sentence (13) was produced in Game 9, which involves contact with the fire, and (14) in Game 10, which does not.
(13) te’l-a’ hun p’éel ch’óoy yàan ti’ le k’áak’-o’, yóok’ol.
     there-D1 one CL.INAN bucket EXIST LOC DEF fire-D2 on/above
     ‘There is a bucket at the fire, on/above.’ (Game 9)
(14) – te’l-a’ hun p’éel ch’óoy yàan yóok’ol le k’áak’-o’.
       there-D1 one CL.INAN bucket EXIST on/above DEF fire-D2
     – yóok’ol? hach yóok’ol?
       on/above really on/above
     – yóok’ol.
       on/above
     ‘– There is a bucket on/above the fire. / – Is it on/above it? Really on/above it? / – Yes, on/above it.’ (Game 10)
In these examples, the discourse situation profiles the superordinate concept. (15) and (16) represent the encoding strategy of superordinate-encoding languages in situations in which CONTACT / NON-CONTACT is not relevant for the communicative task; this feature is simply ignored (compare (10)).
(15) Game 10: ∃x∃y
    highlighted picture: [LOC(x, [SUPERIOR(y) & ¬EXT^C(y)]) &
    background picture: ¬LOC(x, ¬SUPERIOR(y))]
    local relator (superordinate encoding languages): λx λy LOC(x, SUPERIOR(y))
(16) Game 9: ∃x∃y
    highlighted picture: [LOC(x, [SUPERIOR(y) & EXT^C(y)]) &
    background picture: ¬LOC(x, ¬SUPERIOR(y))]
    local relator (superordinate encoding languages): λx λy LOC(x, SUPERIOR(y))
4.2. Discourse situations profiling the subordinate concepts
The crucial question with respect to this language type is how speakers deal with discourse situations that profile subordinate concepts. Languages of this type employ different strategies to resolve this task: (a) speakers encode similar semantic representations by using different lexicalization patterns, especially by using verbs rather than adpositions to encode the concepts of CONTACT / NON-CONTACT (see Section 4.2.1.); (b) speakers encode different semantic representations to perform identical discourse tasks, e.g. expressing absolute location in the vertical axis instead of CONTACT / NON-CONTACT (see Section 4.2.2.); or (c) speakers make use of inferential patterns instead of semantic representations to resolve the task, e.g. inferring CONTACT / NON-CONTACT from the information about MANNER of motion or POSTURE of the localized object (see Section 4.2.3.).
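The difference between obligatory encoding in subordinate-encoding languages and relevance-driven encoding in superordinate-encoding languages can be sketched as a small decision procedure. The Python fragment below is a simplification under stated assumptions: pictures are reduced to two hypothetical boolean features, the type labels "subordinate" and "superordinate" and the names Picture, Description, and describe are introduced for the example, and strategies (a)–(c) are collapsed into a single flag saying that some additional means is required.

```python
from typing import NamedTuple


class Picture(NamedTuple):
    superior: bool  # localized object is in the SUPERIOR region of the landmark
    contact: bool   # localized object is in the contact region of the landmark


class Description(NamedTuple):
    adposition: frozenset    # concepts carried by the adposition itself
    needs_extra_means: bool  # True if the task also calls for strategy (a), (b), or (c)


def contact_is_relevant(highlighted: Picture, background: Picture) -> bool:
    """Contact matters only if the pictures agree on SUPERIOR but differ in contact."""
    return (highlighted.superior == background.superior
            and highlighted.contact != background.contact)


def describe(language_type: str, highlighted: Picture, background: Picture) -> Description:
    contact_label = "CONTACT" if highlighted.contact else "NON-CONTACT"
    if language_type == "subordinate":
        # Obligatory: the adposition always carries the contact value.
        return Description(frozenset({"SUPERIOR", contact_label}), False)
    if language_type == "superordinate":
        # The adposition only carries SUPERIOR; further means are needed
        # exactly when the contrast between the pictures turns on contact.
        return Description(frozenset({"SUPERIOR"}),
                           contact_is_relevant(highlighted, background))
    raise ValueError(f"unknown language type: {language_type}")


# Game 6: pot above the fire vs. pot on the fire.
game_6 = (Picture(superior=True, contact=False), Picture(superior=True, contact=True))
# Game 10: pot above the fire vs. pot beside the fire.
game_10 = (Picture(superior=True, contact=False), Picture(superior=False, contact=False))

for game in (game_6, game_10):
    print(describe("subordinate", *game), describe("superordinate", *game))
```

The sketch deliberately leaves open which additional means a speaker chooses; the three strategies listed above differ precisely in that respect.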
4.2.1. Diversity in lexicalization pattern
The first case of language diversity to be discussed here concerns the use of different lexicalization patterns.⁶ As a case of difference in means of lexicalization, recall Example (12) from Korean. Whenever the concepts of CONTACT / NON-CONTACT are not relevant for the discourse situation, they are not expressed, as shown in the representation in (15). In Game 6, however, the discourse situation requires the concept of NON-CONTACT, since the highlighted picture differs from the background picture only with respect to this concept. Korean speakers resolve this task by using a converb with the meaning ‘disjoint’ (cf. (17)). The same converb also appears in other games that require the concept of NON-CONTACT (cf. Game 5 in (18)).
(17) yangtongi-ka pul uy-ey ttele-ci-e iss-ta
     bucket-NOM fire on/above-LOC disjoin-PASS-INF be-DECL
     ‘the bucket is above the fire (lit. is disjoined in the on/above domain of the fire)’ (Game 6)
(18) cho-ka teyibl uy-ey ttele-ci-e iss-ta
     candle-NOM table on/above-LOC disjoin-PASS-INF be-DECL
     ‘the candle is above the table (lit. is disjoined in the on/above domain of the table)’ (Game 5)
Thus, while German lexicalizes both SUPERIOR and NON-CONTACT through a single adposition (cf. (6)), Korean lexicalizes SUPERIOR through an adposition and NON-CONTACT through a verb (see (19)). Not taking into account the difference in obligatory/optional encoding of the subordinate concept, the Korean example in (17) and the German example in (7) show two different lexicalization patterns for the same information for Game 6.
(19) Korean
    superordinate (SUPERIOR) → uy-ey
    subordinate (SUPERIOR & ¬CONTACT) → uy-ey ttele-ci-e
In the examples under consideration, the superordinate-encoding language makes use of an additional element, namely a verb form, in order to provide additional information about NON-CONTACT. An interesting question is whether motion verbs in a language lacking the concepts of CONTACT / NON-CONTACT in the adpositional paradigm are specified with respect to these concepts. This would support the idea of functional complementarity of verbs and adpositions in the lexicalization of spatial relations. Without allowing a generalization concerning all motion verbs, there are instances of such complementarity, as exemplified by the verb ‘fly’ in Korean and English. English fly does not necessarily imply NON-CONTACT to the landmark encoded through the adjoined PP. Thus, it is possible to use the verb fly with either on or above as in (20a–b).
(20) a. Aladdin is flying on the carpet.
     b. Aladdin is flying above the carpet.
Contrary to English, the Korean verb for the concept FLY includes the concept of NON-CONTACT. Consider Example (21): the Korean postposition uy ‘on/above’ only encodes the concept of SUPERIOR. The compositional interpretation of a verb encoding the concept FLY with an adposition encoding SUPERIOR is expected to be ambiguous between the meanings illustrated by the English Examples (20a–b). However, it is not ambiguous in Korean. Example (21) can only mean ‘Aladdin is flying above the carpet’. Since the postposition does not contain the concept of NON-CONTACT, this concept must be carried by the verb (notice that the verb in this construction governs the Korean postposition with an accusative suffix). The meaning ‘flies on the carpet’ is only possible with the use of an additional verb that contains the concept of CONTACT and is a converbal dependent of the matrix verb (see (22)).
(21) aladin-un yangtanca uy-lul nal-n-ta.
     Aladdin-TOP carpet on/above-ACC fly.above-PRES-DECL
     ‘Aladdin is flying above the carpet.’ *‘Aladdin is flying on the carpet.’
(22) aladin-un yangtanca-lul tha-ko nal-n-ta.
     Aladdin-TOP carpet-ACC get.on-CON fly.above-PRES-DECL
     ‘Aladdin is flying on the carpet (lit. by being on the carpet).’
Thus, Korean descriptions involving the verb ‘fly’ (cf. (23)) contain the concept of NON-CONTACT and are semantically equivalent to corresponding descriptions in subordinate-encoding languages (cf. (24)), differing only with respect to the distribution of features over the syntactic constituents.
(23) say-ka khokkili uy-lul nal-n-ta.
     bird-NOM elephant on/above-ACC fly.above-PRES-DECL
     ‘The bird is flying above the elephant.’ (Game 4)
(24) Der Vogel fliegt über dem Elefanten.
     DEF:NOM.SG.M bird(NOM.SG.M) fly:3.SG above DEF:DAT.SG.M elephant:DAT.SG.M
     ‘The bird is flying above the elephant.’ (Game 4)
4.2.2. Diversity in semantic structure
In the examples considered so far, languages select a concept included in the discourse situation either obligatorily (subordinate encoding languages) or only if relevant (superordinate encoding languages). Another instance of language diversity in our corpus concerns cases in which languages lexicalize different concepts in order to fulfill the same task. A difference in lexicalization pattern may accompany this deviation, but the relevant issue is the difference in semantic structure. Consider Games 3 and 6, which oppose ‘a pot on the fire’ to ‘a pot above the fire’, highlighting the former and the latter situation respectively. There are at least two possible ways to express the location of the pot. The first way is to express the relative location of the bucket with respect to the landmark ‘fire’. Languages of the subordinate-encoding type do this by means of ‘on’ and ‘above’. An alternative way is to express the location in terms of ‘high’ and ‘low’, hence in absolute terms, by encoding position in the vertical axis as an absolute frame of reference originating from the ground.⁷ This solution occurs in the data from Yucatec Maya and is exemplified in (25) for CONTACT and in (26) for NON-CONTACT. In both examples the director of the game has given an ambiguous description using the preposition yóok’ol ‘on/above’. The matcher asks for clarification using the adverbs kàabal ‘low’ (etym. ‘ground’) and ka’nal ‘high’ (etym. ‘sky’).
(25) – te’l-a’ hun-p’éel ch’óoy yàan yóok’ol le k’áak’-o’.
       there-D1 one-CL.INAN bucket exist on/above DEF fire-D2
     – kàabal wáah ka’nal?
       low or high
     – kàabal.
       low
     ‘– There is a bucket on/above the fire. / – Low or high? / – Low.’ (Game 3)
(26) – hun-p’éel ch’óoy yóok’ol k’áak’ yàan.
       one-CL.INAN bucket on/above fire exist
     – ka’nal ti’?
       high LOC
     – ka’nal ti’.
       high LOC
     ‘– There is a bucket on/above a fire. / – High? / – Yes, high.’ (Game 6)
The encoding strategy used in Game 6 is represented in (27) (compare subordinate encoding languages in (9)). The adverb ka’nal ‘high’ is a one-place function with an internal argument for the zero point of the axial system, which is not lexicalized in the utterance.⁸ This is by default the ground, but it may also be another reference object retrieved from the context.
(27) Game 6: ∃x∃y
    highlighted picture: [LOC(x, [SUPERIOR(y) & ¬EXT^C(y)]) &
    background picture: ¬LOC(x, [SUPERIOR(y) & EXT^C(y)])]
    semantics of Yucatec local relator: λx LOC(x, HIGH(y))
4.2.3. Pragmatic inferences
A further varying parameter in the discourse situations under consideration concerns the different manners of motion and the different postures of the localized objects. Several manners of motion or postures often offer the basis for pragmatic inferences of the concepts of CONTACT / NON-CONTACT. For example, the fact that a localized object is sitting on/above a landmark implies that there is contact between localized object and landmark, and the fact that the localized object is flying on/above a landmark implies that there is no contact with the landmark. In contrast to manner/posture verbs that entail a value for CONTACT / NON-CONTACT, such as the Korean verb ‘fly’ (see Section 4.2.1.), the current section deals with verbs that do not entail this concept. Posture verbs in several languages may be used without specification of the concept of CONTACT by the locative adjunct, although this concept is part of the situation (cf. Enfield 2002: 32–33; Newman 2002: 5). In Lao, utterances with the verb ‘to sit’ and the landmark ‘chair’ without any overt marker of either SUPERIOR or CONTACT give rise to an interpretation that includes both concepts, as illustrated in (28a). Nevertheless, this inference is possible only insofar as this localization is a default situation in the common ground of the interlocutors, and it is cancelable by a change of the entities involved (cf. (28b)).
(28) a. man2 nang1 tang1
        3 sit chair
        ‘He sat/is sitting (on a) chair.’ (cf. Enfield 2002: 32)
     b. man2 nang1 toq2
        3 sit table
        ‘He sat/is sitting (at a) table.’ (cf. Enfield 2002: 32)
In our experimental data, languages that do not encode the subordinate concepts sometimes use pragmatic inferences on the basis of the encoded manners or postures. The positional verb ‘squat’ in Yucatec Maya is used for the posture of the bird on the elephant (Game 1), and also for the posture of the candle on the table (Game 2; see (29)). In both discourse situations, the concept of CONTACT is not encoded through the preposition yóok’ol ‘on/above’, although it is the identifying property of the highlighted picture.
(29) yàan esten hun-p’éel táasche’ hun-p’éel kib t’úuch-kin-ah yóok’ol.
     EXIST HESIT one-CL.INAN table one-CL.INAN candle squat-FACT-NR on/above
     ‘There is a table, a candle is put on/above it in a squatting position.’ (Game 2)
The description in (29) was accepted by the matcher of the game, who identified the target situation which involves the concept of CONTACT. In both the highlighted picture and the background picture the candle is presented as being held by a hand, in order to eliminate a difference in posture that would add a further difference between the situations. Nevertheless, the Yucatec speaker has imposed postural information in terms of squatting in order to introduce the concept of CONTACT. The question is whether the verb t’úuch- ‘squat’ includes the concept of CONTACT or whether this is pragmatically inferred as in the Lao Example (28a). Condition 4 in the experimental material (see Table 1) has been designed to test the defeasibility of such inferences. In Game 12, the same posture occurs, but the candle is not located immediately on the table; it is squatting on a stack of books that are on the table. The informant was asked whether the utterance in (30) is true for the situation in Game 12, and she judged it as acceptable.
(30) yàan hun-p’éel kib t’úuchukbal yóok’ol le táasche’-o’
     EXIST one-CL.INAN candle squat:POS on/above DEF table-D2
     ‘There is a candle put on/above the table in a squatting position.’ (Game 12)
Consequently, the concept of CONTACT is not a semantic feature of the verb t’úuch- ‘squat’ but a defeasible inference, as represented in (31). The pragmatic inference originates in the conjunction of the posture SQUAT and the concept of SUPERIOR: ‘if x is squatting somewhere in the vertical axis of y, it is highly probable that it squats on y because of gravity’.
(31) Game 2: ∃x∃y
    highlighted picture: [LOC(x, [SUPERIOR(y) & EXT^C(y)]) &
    background picture: ¬LOC(x, [SUPERIOR(y) & ¬EXT^C(y)])]
    semantics of Yucatec t’úuch-kin-ah yóok’ol: λx λy [LOC(x, SUPERIOR(y)) & SQUAT(x)]
    pragmatic inference: [LOC(x, SUPERIOR(y)) & SQUAT(x)] +> LOC(x, EXT^C(y))
5. Mixed systems
Languages with “mixed systems” combine the encoding of the superordinate with the encoding of one of the subordinate concepts. Babungo (Niger-Congo) employs a preposition which encodes the superordinate concept and another one which encodes the subordinate concept of SUPERIOR & CONTACT (see examples in Schaub 1985: 159).
(32) Babungo, Niger-Congo (Schaub 1985: 159)
    superordinate (SUPERIOR) → t
    subordinate (SUPERIOR & EXT^C) → fúu
    subordinate (SUPERIOR & ¬EXT^C) → –
The alternative mixed system is attested in Evenki (cf. Nedjalkov 1997: 74), where one postposition is used for the superordinate concept and another one for the subordinate concept of SUPERIOR & NON-CONTACT.
(33) Evenki, Altaic (Nedjalkov 1997: 74)
    superordinate (SUPERIOR) → -ojo
    subordinate (SUPERIOR & EXT^C) → –
    subordinate (SUPERIOR & ¬EXT^C) → -ugi
The characteristic of this language type with respect to the interaction between semantic categorizations and encoding strategy is that the elements denoting concepts in the domain of SUPERIOR are organized as an entailment scale (cf. Levinson 2000: 79). In our language sample, the language representing this language type is Nanafwe, which expresses unspecified SUPERIOR and SUPERIOR & NON-CONTACT. These concepts build an entailment scale of the form:
(34) ⟨SUPERIOR, SUPERIOR & NON-CONTACT⟩
where the concept on the right entails the concept on the left. Entailment scales allow for a particular type of conversational implicatures: “assertion of a lower ranking (rightwards) alternate implicates that the speaker is not in a position to assert a higher ranking one” (cf. Levinson 2000: 79). When the speaker uses a weaker expression, namely the superordinate concept, the hearer assumes that the speaker is not in a position to make a stronger assertion by means of the subordinate concept. The application of this inference is exemplified by the data in Nanafwe. In Game 5, the concept of NON-CONTACT has to be asserted. However, both expressions in (35) and (36) are judged as “possible” for the description of the highlighted situation. Although both assertions qualify as true for this situation, the explicit mention of SUPERIOR & NON-CONTACT in (36) is judged as preferable, since (35) generates a strong inference that the speaker does not have enough evidence to make explicit the concept of NON-CONTACT. The inverse discourse situation is tested in Game 2, in which the concept of CONTACT has to be asserted. In this context, (36) is false, and (35) may be successfully used since it implicates CONTACT in that it does not assert NON-CONTACT.
(35) lè bùzí-n nú táblí-n sú
     SBJ:3.SG have candle-DEF in table-DEF on/above
     ‘He holds the candle on/above the table.’ (Game 5: “true but not felicitous”; Game 2: “felicitous”)
(36) lè bùzí-n nú táblí-n sú nglō⁹
     SBJ:3.SG have candle-DEF in table-DEF above
     ‘He holds the candle above the table.’ (Game 5: “preferable”; Game 2: “false”)
The encoding strategy of Nanafwe in Game 2 is represented in (37). The assertion of the superordinate concept is felicitous, since the subordinate concept is pragmatically inferred on the basis of the entailment scale in (34).
(37) Game 2: ∃x∃y
    highlighted picture: [LOC(x, [SUPERIOR(y) & EXT^C(y)]) &
    background picture: ¬LOC(x, [SUPERIOR(y) & ¬EXT^C(y)])]
    semantics of Nanafwe sú: λx λy LOC(x, SUPERIOR(y))
    pragmatic inference on the basis of (34): LOC(x, SUPERIOR(y)) +> LOC(x, [SUPERIOR(y) & EXT^C(y)])
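The scalar inference in (37) can be sketched over a toy model with only two kinds of situations in the SUPERIOR region. The Python fragment below is an illustration under stated assumptions, not the author's formalization: denotations are modeled as sets of situation labels, the names DENOTATION and interpret are introduced for the example, and the pragmatic reading is computed by removing every situation covered by a strictly stronger alternative.

```python
# Situations inside the SUPERIOR region, distinguished only by contact (toy model).
CONTACT = "SUPERIOR & CONTACT"
NON_CONTACT = "SUPERIOR & NON-CONTACT"

# Assumed denotations of the two Nanafwe relators as sets of situations.
DENOTATION = {
    "sú": frozenset({CONTACT, NON_CONTACT}),  # superordinate: SUPERIOR
    "sú nglō": frozenset({NON_CONTACT}),      # subordinate: SUPERIOR & NON-CONTACT
}


def interpret(expression: str):
    """Return (semantic meaning, meaning strengthened by scalar implicature)."""
    semantic = DENOTATION[expression]
    # Stronger alternatives are those whose denotation is a proper subset.
    stronger = [d for d in DENOTATION.values() if d < semantic]
    # Asserting the weaker item implicates that the stronger ones do not hold.
    excluded = frozenset().union(*stronger)
    return semantic, semantic - excluded


for expression in DENOTATION:
    semantic, pragmatic = interpret(expression)
    print(expression, sorted(semantic), sorted(pragmatic))
```

In this model sú remains semantically true of both situations, which matches the judgment “true but not felicitous” for (35) in Game 5, while its strengthened reading singles out the contact situation of Game 2.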
Like the superordinate encoding languages, Nanafwe uses expressions that are not specified for CONTACT when this concept has to be asserted. In Game 1, ‘a bird sitting on an elephant’ is contrasted with ‘a bird flying above an elephant’. The Nanafwe expression in (38) is a felicitous expression in this discourse situation, which according to (37) and (34) allows for two pragmatic inferences: the inference on the basis of the entailment scale in (34) and the inference from the asserted posture SIT to the concept of CONTACT, as exemplified for superordinate encoding languages in Section 4.2.3.
(38) ánúmān-n tì swí-n sú.
     bird-DEF sit elephant-DEF on/above
     ‘The bird sits on/above the elephant.’ (Game 1)
The inferential basis of the utterance in (38) is shown through the results in Game 11, in which the bird is sitting on a tree branch above the elephant. In this context, (38) has been judged as “true” but “confusing”. The preferred expression is the utterance in (39).
(39) ánúmān-n tì swí-n sú nglō.
     bird-DEF sit elephant-DEF above
     ‘The bird sits above the elephant.’ (Game 11)
6. Split systems
Splitting introduces a further typological parameter that allows for multiple typological classifications of languages with respect to their superordinate/subordinate/mixed encoding properties in different constructions. A split system conditioned by semantic parameters is related to the opposition between static location and motion: Rapanui (Polynesian; Du Feu 1996: 126–127) distinguishes between CONTACT and NON-CONTACT in static events, but neutralizes the opposition in motion events (see (40)). A further example is German: the distinction between CONTACT and NON-CONTACT is neutralized in perlative situations (cf. ein Vogel fliegt über die Brücke ‘a bird flies over the bridge’ vs. ein Mann geht über die Brücke ‘a man goes over the bridge’).
(40) Rapanui, Polynesian (Du Feu 1996: 126–127)
    concept | location | motion
    superordinate (SUPERIOR) → | – | i ruŋa
    subordinate (SUPERIOR & EXT^C) → | a ruŋa | –
    subordinate (SUPERIOR & ¬EXT^C) → | i ruŋa | –
A split system conditioned by syntactic parameters occurs in Modern Greek, which distinguishes between the subordinate concepts in prepositional constructions, but encodes only the superordinate concept in adverbial constructions. The Modern Greek prepositions are complex elements formed by the spatial adverb páno ‘up’ and the simple prepositions se ‘LOC’ and apó ‘ABL’ (cf. Theophanopoulou-Kontou 1993).10
(41) Modern Greek
                                           preposition   adverb
     superordinate (SUPERIOR)           →  –             páno
     subordinate (SUPERIOR & EXT_C)     →  páno se       –
     subordinate (SUPERIOR & ¬EXT_C)    →  páno apó      –
The point at issue in languages with syntactic splits is that the relevant concept for a discourse situation may motivate the choice of syntactic structure. In discourse situations that profile the subordinate concepts such as Games 3
and 6, a prepositional expression is needed to give an unambiguous description of the relevant situation (see Examples (42)–(43)).
(42) to            íδio           katsaróli           íne       páno  s-ti                   fotiá
     DEF:NOM.SG.N  same:NOM.SG.N  stove.pot:NOM.SG.N  be:3.SG   up    LOC(=on)-DEF:ACC.SG.F  fire:ACC.SG.F
     ‘The same pot is on the fire.’ (Game 3)
(43) to            tsukáli            íne      páno  apó          ti            fotiá
     DEF:NOM.SG.N  fire.pot:NOM.SG.N  be:3.SG  up    ABL(=above)  DEF:ACC.SG.F  fire:ACC.SG.F
     ‘The fire-pot is above the fire.’ (Game 6)
In discourse situations that profile the superordinate concept, such as in Game 9, the use of adverbial constructions is also possible (see Example (44)). The use of the preposition apó ‘ABL’ in front of the adverb does not form a complex preposition as in (43). The preposition in this example governs the adverb, and its ablative function is interpreted as a static orientation (lit. ‘in a directed axis falling from the upper region’) that is not sensitive to the distinction between CONTACT and NON-CONTACT. The utterance in Example (44) cannot be used successfully in Games 3 or 6, in which the subordinate concepts need to be asserted.
(44) i             ikóna           δíxni      ti            fotiá          ke   to            tsukáli       apó  páno
     DEF:NOM.SG.F  picture:NOM.SG.F  show:3.SG  DEF:ACC.SG.F  fire:ACC.SG.F  and  DEF:ACC.SG.N  pot:ACC.SG.N  ABL  up
     ‘The picture shows the fire and the pot upon/above it.’ (Game 9)
The choice between an adverbial and a prepositional construction is conditioned by the inferability of the landmark. If the landmark is uniquely inferable from the context, an adverbial construction may be chosen. The split system in encoding spatial regions adds a further parameter in the choice of
syntactic construction: if the subordinate concept is relevant for the discourse situation, the prepositional construction must be used. The encoding strategy in Modern Greek is represented in (45)–(46). If the concept of CONTACT is relevant for the discourse situation, a prepositional construction is appropriate. If CONTACT is not relevant, it is possible to use an adverbial construction that encodes only the superordinate concept of SUPERIOR. The argument of the concept of SUPERIOR in the adverbial construction is a free parameter whose value has to be supplied by the context.
(45) Game 3 (see Example (42)):
     ∃x∃y highlighted picture: [LOC(x, [SUPERIOR(y) & EXT_C(y)]) & background picture: ¬LOC(x, [SUPERIOR(y) & ¬EXT_C(y)])]
     semantics of preposition páno se: λxλy LOC(x, [SUPERIOR(y) & EXT_C(y)])
(46) Game 9 (see Example (44)):
     ∃x∃y highlighted picture: [LOC(x, [SUPERIOR(y) & ¬EXT_C(y)]) & background picture: ¬LOC(x, ¬SUPERIOR(y))]
     semantics of adverb páno: λx LOC(x, SUPERIOR(y))
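The two conditions on the choice of construction in Modern Greek can be restated as a small decision procedure. The following Python sketch is an assumption-laden paraphrase of the prose above, not part of the analysis itself: the function name `felicitous_constructions` and its two Boolean parameters are hypothetical.

```python
def felicitous_constructions(contact_relevant: bool,
                             landmark_inferable: bool) -> list[str]:
    """Constructions compatible with a discourse situation (sketch).

    contact_relevant:   the subordinate concept (CONTACT / NON-CONTACT)
                        has to be asserted, as in Games 3 and 6.
    landmark_inferable: the landmark can be retrieved from the context.
    """
    # The prepositional construction names the landmark and encodes the
    # subordinate concepts, so it is treated as always available here.
    options = ["prepositional (páno se / páno apó)"]
    # The adverb only encodes SUPERIOR and leaves the landmark to the
    # context, so both conditions must be met for it to be an option.
    if landmark_inferable and not contact_relevant:
        options.append("adverbial (páno)")
    return options


# Game 3 or 6: CONTACT is at issue, so only the preposition is available.
print(felicitous_constructions(contact_relevant=True, landmark_inferable=True))
# Game 9: only SUPERIOR is at issue and the landmark is given in context.
print(felicitous_constructions(contact_relevant=False, landmark_inferable=True))
```

Treating the prepositional construction as always available is a simplification: the text states only that the adverbial construction is "also possible" when the superordinate concept is at issue.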
7. Conclusions
It has been shown that four possible language types may be distinguished with respect to the encoding of the superordinate concept of SUPERIOR and the subordinate concepts of SUPERIOR & CONTACT and of SUPERIOR & NON-CONTACT: (a) languages that only encode the superordinate concept; (b) languages that only encode the subordinate concepts; (c) languages that encode the superordinate concept and one of the subordinate concepts; and (d) languages with split systems that display the properties of one language type in one construction and the properties of another type in another construction. Speakers of representative languages of these four types have participated in a production experiment which was designed to reveal the encoding strategies that speakers with different semantic classifications use to resolve identical tasks. On this empirical basis, it has been shown that the possible language types with respect to the available semantic categorizations within
a taxonomy of concepts crucially differ in the encoding strategies they use, depending on which concept is relevant in the discourse situation. Languages that encode the subordinate concepts of this domain, namely SUPERIOR & CONTACT and SUPERIOR & NON-CONTACT, were shown to be characterized by obligatorily encoding the concept of CONTACT, i.e. irrespective of its relevance for the discourse situation. Languages that encode the superordinate concept were shown to ignore the concepts of CONTACT/NON-CONTACT whenever they are not relevant for the discourse situation. When these concepts are needed in order to perform the communicative task, languages of this type were shown to make use of three different means of conveying this information: (a) they make use of a different lexicalization pattern, e.g. subordinate verbs in Korean, giving the same semantic representation by alternative means; (b) they encode alternative concepts implied by the concepts of CONTACT/NON-CONTACT to describe the same situation, e.g. Yucatec Maya speakers used the concept of position in the absolute frame of reference (HIGH) in order to describe the situation of NON-CONTACT; or (c) they make use of different concepts that allow for inferences about the concepts of CONTACT/NON-CONTACT, e.g. Yucatec Maya speakers used the posture SQUAT, which allows for the pragmatic inference of CONTACT. Languages of the mixed type display a semantic classification involving one subordinate concept and the superordinate one. Concerning the impact of semantic classifications on encoding strategy, these languages provide a particular inferential means for the conveyance of the subordinate concepts, namely implicatures based upon the entailment scale formed by the abstract superordinate concept and the available subordinate one. This language type was represented in our sample by Nanafwe, which has a postposition encoding the superordinate concept and a postposition encoding SUPERIOR & NON-CONTACT. The use of the superordinate concept in this language allows for the inference that NON-CONTACT is excluded, since it is not explicitly mentioned. Some languages were shown to display constructional splits with respect to the encoding of SUPERIOR: the point at issue in the case of split systems is that the relevance of CONTACT/NON-CONTACT for the discourse situation can motivate the choice of construction. For instance, the condition that properly accounts for the choice between an adverbial and a prepositional construction is the inferability of the landmark: the adverbial construction is possible when the landmark is retrievable from the context. In a language
with a split system in encoding spatial regions, the relevance of a certain spatial region may influence the syntactic choice as well. Modern Greek has been shown to display such a constructional split: the prepositional paradigm encodes the distinction between the subordinate concepts, whereas the adverbial paradigm only encodes the superordinate concept. Thus, in conditions in which the subordinate concept has to be asserted, the adverbial construction is not felicitous, even if the landmark is inferable from the context. This construction is only possible when the superordinate concept is at issue. Finally, a comment on the ontology dealt with in this article is necessary here, especially with respect to the dimension of objectivity in typological perspective (Nickles et al., this vol.). The introduced concepts are treated as a minimal requirement for giving an adequate account of the differences between the object languages. Not all languages encode these concepts, but encoding is not the only way for a concept to get a place in grammar. As the discourse patterns in each language type show, the same conceptual relation serves as a resource for making inferences and as a basis for solving communicative tasks across languages.
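The four language types summarized in these conclusions can be stated as a classification over encoding inventories. The Python sketch below is a hypothetical restatement, not a tool used in the study: the names `EncodingType`, `classify_construction` and `classify_language`, and the representation of an inventory as a set of concept labels per construction, are assumptions introduced for illustration.

```python
from enum import Enum


class EncodingType(Enum):
    SUPERORDINATE = "encodes only the superordinate concept"
    SUBORDINATE = "encodes only the subordinate concepts"
    MIXED = "encodes the superordinate and one subordinate concept"
    SPLIT = "different types in different constructions"


SUPER = "SUPERIOR"
SUBS = frozenset({"SUPERIOR & CONTACT", "SUPERIOR & NON-CONTACT"})


def classify_construction(encoded: set[str]) -> EncodingType:
    """Classify the inventory of one construction (types a to c)."""
    has_super = SUPER in encoded
    subs = SUBS & encoded
    if has_super and not subs:
        return EncodingType.SUPERORDINATE
    if not has_super and subs == SUBS:
        return EncodingType.SUBORDINATE
    if has_super and len(subs) == 1:
        return EncodingType.MIXED
    raise ValueError(f"inventory not covered by the typology: {encoded}")


def classify_language(constructions: dict[str, set[str]]) -> EncodingType:
    """A language is a split system (type d) if its constructions differ."""
    types = {classify_construction(inv) for inv in constructions.values()}
    return types.pop() if len(types) == 1 else EncodingType.SPLIT


# Inventories as described in the text for Nanafwe and Modern Greek.
nanafwe = {"postposition": {SUPER, "SUPERIOR & NON-CONTACT"}}
modern_greek = {"preposition": set(SUBS), "adverb": {SUPER}}

print(classify_language(nanafwe))       # EncodingType.MIXED
print(classify_language(modern_greek))  # EncodingType.SPLIT
```

The classification says nothing about encoding strategies in discourse, which depend on which concept is relevant in a given situation.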
Notes
1. Cordial thanks to Amani Bohoussou, Marija Maya Brala, Sonia Cristofaro, Silke Fliess, Dagmar Haumann, Robin Hörnig, Johannes Helmbrecht, Christian Lehmann, Silvia Luraghi, Daniela Nienstedt, Yoko Nishina, Su-Rin Ryu, Maria Sepsa, Alkistis Skopetea, Olga Stralets, Irina Utjuznikova, Elisabeth Verhoeven, and Thomas Weskott, who have contributed to this paper as language and/or linguistics experts.
2. The highlighted pictures 11–13 are designed to be compared with the descriptions for the highlighted pictures 1–3: 11–13 differ from 1–3 with respect to the concept of CONTACT, but – in contrast to pictures 1–3 – they display the same POSTURE.
3. For instance, many languages specify properties of the landmark, e.g. localization on a surface or not in Koromfe (Rennison 1997: 175–178), localization on top of a tall object vs. on the flat surface of an object vs. on a surface that is conceived as the outside of an object, in Wari (cf. Everett and Kern 1997: 265).
4. See for instance the concept of SUPPORT for the French preposition sur in Vandeloise (1986: 195), the concept of STABILIZATION for the French preposition sur in Aurnague and Vieu (1993: 421), or the functional properties of the English preposition above in Carlson (2000: 100).
5. Glosses: 1=1st person; 3=3rd person; ABL=ablative; ACC=accusative; CL=classifier; CON=converb; D=deictic; DECL=declarative; DEF=definite; EXIST=existential; F=feminine; FACT=factitive; GEN=genitive; HESIT=hesitative; INAN=inanimate; INF=infinitive; INSTR=instrumental; LOC=locative; M=masculine; N=neuter; NOM=nominative; NR=nominalizer; PASS=passive; POS=positional; PFV=perfective; PL=plural; PRES=present; SBJ=subject; SG=singular; TOP=topic.
6. Cf. Talmy (2000; Vol. II, 21–146) on lexicalization patterns; see also Lehmann (1992) on central/decentral encoding of spatial functions.
7. Cf. Levinson (1996) on relative vs. absolute spatial regions; a further possibility would be a ‘deictic solution’: the location of a localized object in space is specified with respect to the perspective of the speaker (i.e. higher vs. lower than the speaker). Such a system is reported for Palestinian Arabic (cf. Regier 1991).
8. Cf. Stolz 1996 §6.5.3: ka’nal = “location on the major orientational plane or axis of VERT or OBS, location is further away from the zero point of the axial system”; kàabal = “location on the major orientational plane or axis of VERT or OBS, location is closer to the zero point of the axial system”.
9. The postposition sú nglō is formed on the basis of the postposition sú. The etymology and compositional meaning of the suffix -nglō are unclear.
10. Both simple prepositions combine with different spatial adverbs, giving rise to different semantic distinctions. Complex prepositions with se ‘LOC’ are only used in static and allative relations, whereas complex prepositions with apó ‘ABL’ are used in static, allative, ablative, and perlative relations. Thus, there is an opposition between complex prepositions with se ‘LOC’ and apó ‘ABL’ in static and allative relations, which renders different distinctions in different spatial regions. In combination with the adverb mésa ‘inside’, the complex preposition mésa se encodes a location in the interior of a bounded entity and the complex preposition mésa apó encodes a location in the inner side of a boundary. In combination with the adverb brostá ‘in front’, the complex preposition brostá se encodes a location in a place that is contiguous to the front side of the landmark and the complex preposition brostá apó encodes a location in a place which may be anywhere in the front axis of the landmark. In combination with the adverb páno ‘up’, the opposition between the two complex prepositions encodes the opposition between the concepts of CONTACT and NON-CONTACT. A compositional account for the opposition between se and apó could be based on the concept of a BOUNDED REGION for the preposition se and UNBOUNDED REGION / DIRECTION for the preposition apó (see Tachibana 1994). These concepts only hold for the use of the simple prepositions in combination with spatial adverbs and for their opposition in static and allative relations. In the glosses of this paper, the prepositions se and apó are glossed with their meanings as free morphemes and the meaning of the complex preposition is given non-compositionally in parentheses.
References
Aurnague, Michel, and Laura Vieu
    1993 A three-level approach to the semantics of space. In The Semantics of Prepositions, Cornelia Zelinsky-Wibbelt (ed.), 393–439. Berlin/New York: de Gruyter.
Brala, Marija M.
    this vol. Spatial ‘on’ – ‘in’ categories and their prepositional codings across languages: Universal constraints on language specificity.
Carlson, Laura A.
    2000 Object use and object location: the effect of function on spatial relations. In Cognitive Interfaces: Constraints on Linking Cognitive Information, Emile van der Zee and Urpo Nikanne (eds.), 94–115. Oxford: Oxford University Press.
Cole, Peter
    1985 Imbabura Quechua. London: Croom Helm.
Du Feu, Veronica
    1996 Rapanui. London/New York: Routledge.
Enfield, Nick J.
    2002 Semantics and combinatorics of ‘sit’, ‘stand’, and ‘lie’ in Lao. In Newman (ed.), 25–41.
Everett, Daniel L., and Barbara Kern
    1997 Wari: The Pacaas Novos Language of Western Brazil. London/New York: Routledge.
Klein, Wolfgang
    1991 Raumausdrücke. Linguistische Berichte 132: 77–114.
Kornfilt, Jaklin
    1997 Turkish. London/New York: Routledge.
Lehmann, Christian
    1992 Yukatekische lokale Relatoren in typologischer Perspektive. Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung 45: 626–641.
Levinson, Stephen C.
    1996 Language and space. Annual Review of Anthropology 25: 353–382.
    2000 Presumptive Meanings: The Theory of Generalized Conversational Implicature. Cambridge, MA: MIT Press.
Macaulay, Monica
    1996 A Grammar of Chacaltongo Mixtec. Berkeley: University of California Press.
Nedjalkov, Igor
    1997 Evenki. London/New York: Routledge.
Newman, John
    2002 A cross-linguistic overview of the posture verbs ‘sit’, ‘stand’, and ‘lie’. In Newman (ed.), 1–24.
Newman, John (ed.)
    2002 The Linguistics of Sitting, Standing, and Lying. Amsterdam/Philadelphia: Benjamins.
Nickles, Matthias, Adam Pease, Andrea C. Schalley, and Dietmar Zaefferer
    this vol. Ontologies across disciplines.
Regier, Terry
    1991 Learning spatial concepts using a partially-structured connectionist architecture. International Computer Science Institute (Berkeley) Technical Report 91–050.
Rennison, John R.
    1997 Koromfe. London/New York: Routledge.
Schalley, Andrea C., and Dietmar Zaefferer
    this vol. Ontolinguistics – An outline.
Schaub, Willi
    1985 Babungo. London: Croom Helm.
Stolz, Christel
    1996 Spatial Dimensions and Orientation of Objects in Yucatec Maya. (Bochum-Essener Beiträge zur Sprachwandelforschung 29). Bochum: Brockmeyer.
Tachibana, Takashi
    1994 Spatial expressions in Modern Greek. Studies in Greek Linguistics 14: 525–539.
Talmy, Leonard
    2000 Toward a Cognitive Semantics. Vol. I: Concept Structuring Systems. Vol. II: Typology and Process in Concept Structuring. Cambridge, MA/London: MIT Press.
Theophanopoulou-Kontou, Dimitra
    1993 The complex Modern Greek prepositions and their structure [I sinthetes prothesis tis Neas Ellinikis ke i domi tus]. Studies in Greek Linguistics 13: 311–331.
Vandeloise, Claude
    1986 L’espace en français: Sémantique des prépositions spatiales. (Travaux Linguistiques.) Paris: Editions du Seuil.
Wunderlich, Dieter, and Michael Herweg
    1991 Lokale und Direktionale. In Semantik: Ein internationales Handbuch der zeitgenössischen Forschung, Arnim von Stechow and Dieter Wunderlich (eds.), 758–785. Berlin/New York: de Gruyter.
Part IV: Categories with open-class coding
Taxonomic and meronomic superordinates with nominal coding
Wiltrud Mihatsch
1. Lexical hierarchies – The classic approach1
Vertical taxonomic hierarchies such as the one represented by the English nouns thing – object – garment – skirt – miniskirt are generally considered to provide the most important structure for the noun lexicon and a fundamental organizing principle of ontologies (see Miller 1998: 24). Since Plato and Aristotle, hyponyms have been defined on the basis of their modified superordinate or genus proximum; garment, for example, would be defined as “object that clothes” (cf. Klix 1993: 345–347). This structure allows the inheritance of information as well as the easy creation of new concepts and therefore offers a very economical, flexible and elegant type of knowledge organization, for instance in WordNet (see Fellbaum this vol.). I will analyse noun taxonymies in the domain of clothing in French and Spanish, complemented by data from other languages, and show that the coding of superordinates reveals the conceptual structure of everyday hierarchies, which is very different from the traditional account of taxonymies based on genus proximum and differentiae specificae.
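The economy of the classic approach is easy to see in a small data structure. The following Python sketch is a minimal illustration, assuming that each concept stores only its genus proximum and its own differentiae; the class `Concept`, its methods, and the feature labels are hypothetical and are not taken from WordNet or any other resource.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """A node in a classic taxonomy: genus proximum plus differentiae."""
    name: str
    parent: Optional["Concept"] = None            # genus proximum
    differentiae: dict[str, str] = field(default_factory=dict)

    def features(self) -> dict[str, str]:
        """All information about the concept, inherited from above."""
        inherited = self.parent.features() if self.parent else {}
        return {**inherited, **self.differentiae}

    def is_a(self, other: "Concept") -> bool:
        """True if `other` is this concept or one of its superordinates."""
        node: Optional[Concept] = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


# The chain thing – object – garment – skirt – miniskirt; the feature
# labels paraphrase the definitions quoted in the text.
thing = Concept("thing")
obj = Concept("object", thing, {"kind": "concrete entity"})
garment = Concept("garment", obj, {"function": "clothes the body"})
skirt = Concept("skirt", garment, {"worn": "hanging from the waist"})
miniskirt = Concept("miniskirt", skirt, {"length": "very short"})

print(miniskirt.features())     # own differentia plus everything inherited
print(miniskirt.is_a(garment))  # hyponymy follows from the parent links
```

A new hyponym needs only a parent and its own differentiae. Everything else is inherited, which is the economy the classic account relies on.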
2. Evidence against the classic approach
In the Western world traditional hierarchies are still considered basic for knowledge organization (cf. Murphy 2002: 39). They are propagated by the model of scientific taxonomies and Western-style schooling, which creates “a bias toward taxonomy” (Iris, Litowitz, and Evens 1988: 285). However, psychological as well as linguistic research has shown that logical hierarchies are refined categorization principles that do not correspond to everyday reasoning (Ungerer and Schmid 1996: 60–63; Oerter 1988: 345). For instance, a taxonymic organization of nouns would probably, at least to some extent, produce a corresponding morphological structure in which the label of the superordinate is part of the labels of the hyponyms. We do observe morphological relations between some nouns of the more specific levels, for instance between skirt and miniskirt, but hardly ever above the level
of skirt, for instance between nouns meaning GARMENT and SKIRT. Thus the formal level at least does not manifest a uniform hierarchical organization of concepts. Furthermore, hyponymy (at least above basic level) only plays a minor role in word-association tests, where antonymy, co-hyponymy and other associations prevail (cf. Aitchison ³2003, Chapter 8). This means vertical relations are probably not stored but rather computed (see Mihatsch 2006 for a detailed analysis) and are therefore not part of our stored lexical knowledge. Similar results come from aphasia, slips of the tongue and child language acquisition (Aitchison ³2003, Chapter 8; Klix 1984: 18–20, 42, 56). Thus hyponymy and therefore taxonymies seem to play a minor role in the mental lexicon:
    . . . it seems that scientific taxonomies are neither mind-sized nor mind-oriented. The question is what a more subject-related alternative of organizing our knowledge of the word would look like . . . Yet so powerful has been the impact of logical taxonomies on modern Western thinking that it is difficult for anyone who has been educated in the Western tradition to imagine such an alternative. (Ungerer and Schmid 1996: 63)
What would an alternative account look like? How are lexemes organized if not hierarchically? The key to the organization of the lexicon, at least in the domain of concrete nouns, where taxonomies seem to be most developed, is imagery. It is well known that images play an important role in long-term memory and that words which can be represented by imagery are more easily memorized than other types of words (cf. Kintsch 1982: 206–208). It is therefore very probable that concrete nouns are represented by images.2 Psychological research has shown that the most salient nouns of a hierarchy are those found on an intermediate level, the so-called basic level (Rosch et al. 1976; for an overview see Murphy 2002, Chapter 7). The basic level is the highest level of generalization where nouns can still be represented by an overall shape, which explains why a definition such as “equine animal” for horse (cf. Cruse 1986: 140) is redundant and a definition such as “a garment hanging from the waist” for skirt (CED) is a useful lexicographical definition, but probably does not correspond to the mental representation of skirt. Basic-level nouns such as skirt or horse are better represented by schematic images than by verbal definitions. On that level, lexemes are usually the shortest, morphologically simplest items of a noun hierarchy (Taylor ²1995: 46–48; Rosch et al. 1976). Lexemes on the other levels tend to be morphologically more complex, and it is assumed that the other levels are conceptually derived from
the basic level via “parasitic categorization”, i.e. by conceptually, sometimes also morphologically, elaborating on basic-level gestalts (Ungerer 1994; cf. also Lakoff 1987: 282):
[Figure: skirt (basic level) is linked upwards to garment by generalization, marked “?”, and downwards to miniskirt by specialization]
Figure 1. Basic level and parasitic categorization
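Parasitic categorization reverses the direction of derivation assumed by the classic approach: the starting point is the basic-level gestalt, not the superordinate. The Python sketch below illustrates this for the downward direction only. The classes `BasicLevel` and `Subordinate` and the string standing in for a schematic image are hypothetical; how, or whether, a superordinate can be built in the same way is deliberately left open, mirroring the question mark in Figure 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasicLevel:
    """A basic-level concept, represented by a schematic image."""
    label: str
    gestalt: str  # placeholder for an overall shape, not a definition


@dataclass(frozen=True)
class Subordinate:
    """A concept elaborated from a basic-level concept."""
    base: BasicLevel
    modifier: str

    @property
    def label(self) -> str:
        # The elaboration may also show up morphologically.
        return self.modifier + self.base.label

    @property
    def gestalt(self) -> str:
        # The subordinate keeps the basic-level gestalt.
        return self.base.gestalt


skirt = BasicLevel("skirt", "shape hanging from the waist")
miniskirt = Subordinate(skirt, "mini")

print(miniskirt.label)                     # miniskirt
print(miniskirt.gestalt == skirt.gestalt)  # True
```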
Conceptually, a subordinate corresponds to the basic-level image plus additional properties and thus preserves the overall gestalt of the basic-level concept. Therefore subordinate concepts such as MINISKIRT can be defined by a modified basic-level lexeme such as “very short skirt”, which means that the traditional intensional inclusion relation seems to work between basic level and subordinate level. This is often reflected in the linguistic form of the corresponding nouns, as in the case of miniskirt. However, so far it is not clear in what way superordinate concepts can be derived from basic-level concepts. Hyperonyms such as garment or French vêtement GARMENT cannot be based on a simple gestalt like skirt and miniskirt or French jupe SKIRT or minijupe MINISKIRT. There is in fact ample evidence for a conceptual gap between the basic level and the superordinate level. Above basic level we find many lexical gaps. Often only awkward paraphrases are possible (Cruse 1986: 149–150; Taylor ²1995: 46–47). Many languages such as Gbaya and Warlpiri do not have nouns for PLANT or ANIMAL (Berlin 1972: 78; Roulon-Doko 1997: 345; Wierzbicka 1992: 8). Quite striking is the great number of learned words above basic level. The superordinates animal, plant and object as well as the corresponding nouns in Spanish, French and many other languages were originally scientific terms. These are in fact defined on the basis of their genus proximum and differentiae specificae, i.e. in the Aristotelian tradition (see Mihatsch 2006, Chapter 4). Besides these characteristics, the syntactic and semantic deviations found on the superordinate level will be shown to be the key to the conceptual properties of everyday superordinates. Strikingly, many superordinates in count noun hierarchies such as English clothing, French fringues CLOTHING and
Spanish ropa CLOTHING are collective nouns, i.e. meronymic superordinates, since subordinate and superordinate are linked by a part-whole relation.3 Finally, there is a second type of everyday superordinate that deviates from the superordinates presented so far. Next to the less frequent, morphologically complex and often syntactically deviant nouns, a second group of superordinates consists of highly frequent, very short words such as thing (cf. Vossen 1995: 377). It will be shown that these are more grammatical than lexical and therefore conceptually very different from lexical superordinates.
3. Collective nouns
The great number of collective nouns above basic level is impressive. Markman (1985: 39) has examined 19 languages and observes that an average of 34% of all analyzed superordinates are mass or collective nouns. If we look at one conceptual domain such as clothing we discover a very similar result. We find a high number of collectives roughly meaning OUTFIT or CLOTHING such as French accoutrement, costume, fringues, fripe(s), frusques, garderobe, habillement, habits, mise, nippes, sape(s), tenue, vestiaire, vêtement (which is polysemous and means both GARMENT and CLOTHING) and many others; in Spanish we find atavío, atuendo, equipo, gala(s), guardarropa, indumentaria, ropa, traje, vestido(s), vestuario and many others. The same observation holds for German and English. Compared with this, the number of count hyperonyms meaning GARMENT is very limited. In French we find vêtement, in Spanish prenda, in English there is the rather formal noun garment, and in German there exists the slightly awkward and not very common derived compound Kleidungsstück, next to very colloquial, often pejorative nouns such as English rag. In Sections 3.1. to 3.5. it will be shown that collectives are the key to the conceptual make-up of everyday hierarchies both synchronically and diachronically.4 They reveal both the conceptual properties of everyday superordinates and their relations with basic-level nouns.
3.1. Group collectives
In the domain of clothing we find a great number of collectives meaning OUTFIT that designate all items of clothing, shoes, headgear and accessories a person wears at some point.5 In Spanish we find arreglo, atavío(s), atuendo,
gala(s), equipo, indumentaria, traje, vestido (which goes back to Lat. vestitus OUTFIT, see OLD) and others. In French this concept is expressed by habillement, mise, tenue, toilette and others (PR). In German we find Aufzug, Kleidung, Kluft, Outfit, in English apparel, attire, dress, gear, getup, outfit or wear. Unlike the hyponyms in traditional hierarchies, nouns denoting members of these collectives do not depend semantically on the superordinate, in this case the collective, i.e. they are not defined on the basis of their genus proximum. Not all items of clothing belong to an outfit, but the concept OUTFIT is conceptually based on diverse garment types on the basic level such as trousers, shirt, skirt etc. (cf. Cruse 1986: 176–177).6 Leisi (⁴1971: 31–32) calls such collectives “group collectives”. These collectives often diachronically derive from nouns meaning EQUIPMENT, which can be paraphrased as “collection of things that have the function x in a certain situation” (cf. Vossen 1995: 335). They usually stress contingent, temporary functions of objects, not inherent properties. There is no semantic relation with any subordinate nouns (cf. Wierzbicka 1985: 267); for instance, a watch is not a type of pledge. These nouns are not based on stable imagery, but rather on a temporary categorization of referents. Spanish prenda PLEDGE, thus PAWNED THINGS, became prenda GARMENT via a collective CLOTHING. English garment stems from Old French garniment EQUIPMENT, again via a collective meaning CLOTHING (OED). French habillement CLOTHING is also derived from a noun meaning EQUIPMENT (DHLF). Spanish ropa and French robe can be traced back to a Germanic noun meaning ANYTHING OBTAINED BY PILLAGING, BOOTY (DCECH, DHLF). In Old Spanish ropa probably soon came to mean PILLAGED CLOTHES, but then also OUTFIT; this sense still exists today. In Quebec French we find the same path with butin CLOTHING / FURNITURE (Guillemard 1991: 111), derived from French butin BOOTY. Thus, as these nouns become more conventionalized they are reinterpreted. The temporary function is complemented and in some cases, such as Spanish ropa, eventually replaced by a stable representation on the basis of fixed types of subcategories. Virtually everything can become part of a booty, whereas only certain types of referents can be designated by clothing. Conceptually, these collective nouns are then conjunctions of a selection of spatially and temporally contiguous basic-level concepts (cf. Vossen 1995: 173), which can thus be represented by imagery and are therefore good units of long-term memory (cf. Kintsch 1982: 206–208). Many group collectives such as an older sense of French vêtement (DHLF) are closed collections:
(1) une petite robe de laine, un tablier, une brassière de futaine, un jupon, un fichu, des bas de laine, des souliers, un vêtement complet pour une fille de huit ans (Hugo) (PR)
    ‘a little woollen dress, a pinafore, a flannel undershirt, an underskirt, a scarf, woollen stockings, shoes, a complete outfit for an eight-year-old girl’
In some cases the conceptual structure founded on basic-level items is even reflected on the morphological level. A closer look at cross-linguistic data reveals morphological evidence in favour of a representation based on the conjunction of basic-level concepts. In everyday language speakers often prefer syntagms such as knives and forks to superordinates such as cutlery (Aitchison ³2003: 96; Haspelmath in press). Many languages have a compound type that directly reflects this structure, i.e. coordinative compounds consisting of two representative basic-level concepts, as in Mordvin ponks.t-panar.t SHIRT.PL + TROUSER.PL meaning OUTFIT / CLOTHING (Wälchli 2005: 139; Haspelmath in press). In American Sign Language we find a complex sign meaning CLOTHING that is a combination of the signs meaning DRESS, BLOUSE and PANTS (Newport and Bellugi 1978: 58).
3.2. Generic nouns
If group collectives are further conventionalized they follow a clear-cut path of change. Most strikingly, the members of the collective noun do not have to be contiguous any longer. Thus words designating the outfit of a person often come to mean CLOTHING, which no longer designates all the items of clothing that are worn by a person at a time, but items of clothing in general, such as French habillement in industrie de l’habillement (PR). French vêtement CLOTHING also goes back to a collective noun meaning OUTFIT (DHLF, OLD, PR) – see Example (1) – as do French habit(s), fringues from fringue LUXURY OUTFIT (DHLF) and others. Spanish ropa CLOTHING, TEXTILES, vestido as in Historia del vestido (MOL) and prenda GARMENT must have gone through the same process. Similarly the English collectives wear and gear meaning OUTFIT have developed additional senses meaning CLOTHING as in men’s wear or head gear. English clothing and German Kleidung probably also went through a stage with the meaning OUTFIT (OED, GDW).
At the same time that the temporary contiguity of several items of clothing as part of the meaning disappears – i.e. they are no longer closed collections – the degree of similarity of these items increases, since they acquire increasingly inherent perceptual properties that are similar to some extent. In the domain of clothing such collectives do not include shoes, headgear and accessories any longer. French habit used to include shoes and headgear (DHLF). French vêtement GARMENT shows the old meaning OUTFIT in didactic uses where shoes and headgear are also included; today the current collective sense of vêtements corresponds to OUTER CLOTHING (DHLF). A similar development can be observed in the case of Spanish ropa, which still shows both senses. The older sense is illustrated by the following example:
(2) El astro estaba indemne, pero de su ropa [emphasis mine] sólo conservaba íntegros los zapatos y los calzoncillos. (CREA, Vargas Llosa (1977): La tía Julia y el escribidor, Perú)
    ‘The star was unharmed, but of his clothing only his shoes and his underpants were left intact.’
In Example (3) the referent types are more restricted; here no complete outfit is referred to, but rather an unordered set of clothes:
(3) En el almacén se vendía tanto alimentos, ropa [emphasis mine], sombreros y zapatos . . . (CREA, Silvestrini (1987): Historia de Puerto Rico: trayectoria de un pueblo, Puerto Rico) ‘The shop sold food as well as clothing, hats, and shoes . . . ’
Leisi (⁴1971: 32–34) calls these collective nouns “generic nouns”. They typically designate classes of individuals which are not related through contiguity, i.e. mostly spatial proximity, but through the similarity of their members. Unlike the collectives discussed in Section 3.1., the relation between the elements and the collective is stable and obligatory. Every shirt is an item of clothing because clothing is based on intrinsic properties of the members; thus the relation is permanent and seems very close to hyponymy. But due to the great salience and stability of basic-level gestalts, these collectives are still based on different basic-level concepts with distinct shapes and not just on a few common properties. Unlike classic hyponymy, and like the members of group collectives, the members of generic nouns do not conceptually depend on the collective. Shirts are not conceptualized as a type of clothing, but independently, as gestalts.
Wiltrud Mihatsch
Syntactically, these nouns behave like mass nouns, since they stand for a class, not a (closed) set of individuals. Therefore in principle they can refer to one individual (Leisi ⁴1971: 32):
(4) He bought a shirt → He bought clothing⁷
But although generic nouns can in principle refer to one individual, and thus correspond to disjunctions such as SHIRT OR TROUSERS OR JUMPER ETC., it is more common for them to refer to a plurality of heterogeneous items (see Murphy and Wisniewski 1989). Native speakers of Arabic have confirmed that most Arabic transnumerals (which behave like generic nouns) require plural referents, and Welsh unmarked collectives, which also correspond to generic nouns, require plural agreement (Kuhn 1982: 62, 66). If we compare generic nouns and group collectives we see that generic nouns are conceptually more autonomous, less context-dependent and semantically more stable than group collectives, which designate temporary groupings of fixed but heterogeneous referent types. Generic nouns are therefore better lexical items, which is confirmed by typological observations according to which group collectives tend to be marked, generic nouns unmarked lexical items (cf. Kuhn 1982: 79). The smaller number of generic nouns compared to the quite numerous group collectives suggests that the former are usually the result of a further entrenchment of group collectives.
3.3.
From plurale tantum to count hyperonym
If the (rather vague) similarity of the referents is felt to be strong enough, generic nouns can be conceptualized as a homogeneous but comparatively vague plurality of members of one common class complemented by a common function. Therefore we find many pluralia tantum based on generic nouns such as French fringues, habits, Spanish ropas and vestidos (Clara Pérez, pers. comm.), English clothes or German Kleider. In colloquial French we find a whole series of pluralia tantum in this conceptual domain meaning CLOTHES such as fringues, nippes, hardes, fripes. Many pluralia tantum are no longer discernible as such since they are based on the Latin neuter plural, e.g. Spanish prenda GARMENT from Latin pignora (cf. Morreale 1973: 121–131). In some cases the plurale tantum can be reinterpreted as an inflectional plural and can then be freely combined with all types of numerals and quantifiers. However, this seems to be a slow process (cf. Mihatsch 2006, Chapter 3.): strongly individuating quantifiers and numerals, in particular small numerals, are only allowed at the end of this process. Thus *dos ropas ‘*two clothes’ is considered ungrammatical (cf. Bosque Muñoz 1999: 29). It is difficult to form an analogical singular, even if small numerals can modify the plurale tantum, since the individuation is greatest in the singular form. Thus there is a gradual movement from plurale tantum to count hyperonym. Spanish ropa is only sporadically found in the singular:
(5) . . . alguna ropa vieja [emphasis mine]; un pantalón de su niño, por ejemplo. (CREA, Tudela/Herrerías (1988): Costura para la familia. Mexico) ‘any old garment; a pair of trousers of your little son, for example.’
Well established count nouns without any pejorative connotations (cf. Section 3.5.) are the former collectives (and pluralia tantum) Spanish prenda (DCECH; Morreale 1973: 127–131), French vêtement, and English garment; thus there seems to be only one unmarked count superordinate per language in this domain. It will be argued that count superordinates preserve the imagery of the collectives and that they conceptually correspond to a vague disjunction like SHIRT OR TROUSERS OR JUMPER ETC. rather than to an exclusively verbal definition such as THING TO WEAR (cf. Murphy and Wisniewski 1989: 583). If asked to list attributes for count superordinates, informants often give names of several basic-level categories (Miller and Johnson-Laird 1976: 281, 298). Count hyperonyms are therefore much like generic nouns, which tend to be disjunctive, too. At first sight it is clear that group collectives are by far the most frequent senses, which are also diachronically earlier than generic nouns, let alone count hyperonyms. Here we find most cases of innovation, whereas the other senses seem to develop through the increasing entrenchment and conventionalization of these collectives. However, count superordinates develop very rarely and seem to be highly unstable once they have emerged (see Section 3.5.). These observations point to a very different semantic organization from traditional logical hierarchies.
3.4.
Disjunction vs. conjunction: The theory of mental models
Mental model theory (e.g. Johnson-Laird and Byrne 1991) can explain the preference for collective over disjunctive uses.
Group collectives, for instance outfit, are conjunctions since they refer to a collection of several subordinate referents. Generic nouns such as clothing and hyperonyms such as garment can refer disjunctively. The proposition “This is a garment” can be paraphrased as “This is a shirt or a skirt or a pair of trousers, etc.” and thus logically corresponds to an exclusive disjunction and is a lot vaguer than the conjunction on the conceptual level. The illustration in Figure 2 (cf. Johnson-Laird and Byrne 1991: 6–7, 119–121) contrasts conjunctions and disjunctions in formal logic and mental model theory.

formal logic

conjunction
p   q   p and q
t   t   t
t   f   f
f   t   f
f   f   f

exclusive disjunction
p   q   either p or q but not both
t   t   f
t   f   t
f   t   t
f   f   f

inclusive disjunction
p   q   p or q or both
t   t   t
t   f   t
f   t   t
f   f   f

mental models
[rectangular model diagrams]

Figure 2. Disjunction and conjunction in formal logic and mental model theory
In mental model theory everyday reasoning is based on concrete imaginable models (represented as rectangles in Figure 2) rather than on truth tables based on propositions that list all cases, even false ones. Mental model theory explains processing differences. The greater the number of models to process, the more difficult a task is; thus conjunctions, which can be represented by one image, are easier to process than exclusive disjunctions. Exclusive disjunctions are easier to process than inclusive disjunctions, which allow a huge number of models (Johnson-Laird and Byrne 1991: 43–45; Johnson-Laird, Schaeken, and Byrne 1992: 427). Due to the limited storage capacity of the working memory only as many models as necessary are explicitly used for reasoning (Johnson-Laird, Schaeken, and Byrne 1992: 421). This also explains why lexemes based on conjunctions are preferred to disjunctive concepts where several images have to be checked and compared (cf. Miller and Johnson-Laird 1976: 298). Children learn collective superordinates earlier than count superordinates (Markman, Horton, and McLanahan 1980). The conjunctive model, which also corresponds to one of the mental models of inclusive disjunction, therefore seems to be the default model of concepts with a possible inclusive disjunctive use such as the generic nouns clothing or clothes, which are halfway between group collectives and hyperonyms, or even count nouns such as garment, whereas group collectives always correspond to conjunctions. Another piece of evidence for the preference for the conjunction of basic-level gestalts over disjunctions comes from the frequent pluralization of generic nouns. In Section 3.5. I will show that even count superordinates such as garment are used conjunctively rather than disjunctively.
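The processing argument can be made concrete with a small sketch. It rests on two simplifying assumptions that are not part of the text: each explicit model is equated with one true row of the corresponding truth table, and an exclusive disjunction over more than two basic-level concepts is read as "exactly one of them holds". The function names are illustrative only.

```python
from itertools import product


def true_rows(connective, n):
    """Return all truth-value assignments over n propositions
    for which the connective holds (one candidate model each)."""
    return [row for row in product([True, False], repeat=n) if connective(row)]


def conjunction(row):
    # OUTFIT-like reading: shirt AND trousers AND ...
    return all(row)


def exclusive_disjunction(row):
    # Assumption: read as "exactly one of the alternatives"
    return sum(row) == 1


def inclusive_disjunction(row):
    # at least one of the alternatives, possibly several
    return any(row)


def model_counts(n):
    """Number of models needed for n basic-level concepts."""
    return {
        "conjunction": len(true_rows(conjunction, n)),
        "exclusive_disjunction": len(true_rows(exclusive_disjunction, n)),
        "inclusive_disjunction": len(true_rows(inclusive_disjunction, n)),
    }


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, model_counts(n))
```

Under these assumptions a conjunction always needs a single model, an exclusive disjunction needs one model per alternative, and an inclusive disjunction needs 2^n − 1 models, which is one way of reading the claim that inclusive disjunctions allow a very large number of models.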
3.5.
The status of disjunctive concepts in the lexicon
According to mental models theory, disjunctions should be less suitable units of long-term memory than conjunctions. Even conventionalized count superordinates are not very appropriate for the reference to one individual based on a disjunction since speakers would have to activate a series of representations and then decide whether a referent belongs to a superordinate (Wisniewski and Murphy 1989: 256). Therefore, even count superordinates are preferably used conjunctively to refer to several heterogeneous basic-level concepts by children as well as by adults in informal contexts (Markman, Horton, and McLanahan 1980: 227,
238–240; Waxman and Hatch 1992: 163). Wisniewski and Murphy (1989) classified the uses of count superordinates and basic-level lexemes in a corpus analysis and found that non-biological basic-level nouns occur in the singular in 73% of all occurrences, in the plural in only 15% and in generic use in 12%, whereas non-biological superordinates occur in the singular in 26% of all occurrences, in the plural in 46% and in generic use in 28%. The preference for the plural is sometimes stated in dictionaries, as the entries for English garment (CED) or French vêtement (PR) show. A look at corpora confirms these observations:
Table 1. Plural frequency in superordinates and basic-level nouns⁸
Language   Noun       Gloss      singular   plural
Spanish    prenda     GARMENT           6       13
           camisa     SHIRT            91       13
French     vêtement   GARMENT          15       76
           chemise    SHIRT            45       10
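The percentages that the text reports from Wisniewski and Murphy (1989) can be compared directly by relating plural to singular uses. The following is a minimal sketch of that arithmetic; the dictionary layout and the function name are illustrative assumptions, and only the six percentages come from the text.

```python
# Shares (in percent) of singular, plural and generic uses, as reported
# in the text from Wisniewski and Murphy (1989) for non-biological nouns.
usage = {
    "basic-level": {"singular": 73, "plural": 15, "generic": 12},
    "superordinate": {"singular": 26, "plural": 46, "generic": 28},
}


def plural_to_singular_ratio(shares):
    """How many plural uses occur per singular use."""
    return shares["plural"] / shares["singular"]


if __name__ == "__main__":
    for level, shares in usage.items():
        # Each distribution should account for all occurrences.
        assert sum(shares.values()) == 100
        print(level, round(plural_to_singular_ratio(shares), 2))
```

A ratio below 1 means that singular uses dominate, a ratio above 1 that plural uses dominate. On these figures basic-level nouns fall well below 1 and superordinates well above it.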
The fact that singular uses of superordinate hyperonyms are relatively rare explains why these are diachronically very unstable. The meaning SINGLE GARMENT of English cloth and clothing (OED), French habit (PR), Spanish ropa (Morreale 1973: 131), vestido (Clara Pérez, pers. comm.), and German Kleid is now extinct; only the plurale tantum or collective noun is preserved. Despite the problematic use of singular hyperonyms there are of course singular occurrences. They can be useful in some special contexts. Sometimes the exclusive disjunctive reading is exploited in unspecific reference that offers a free choice among basic-level concepts as in Example (5) alguna ropa vieja ‘any old garment’. Singular hyperonyms can also be justified in informal contexts. A speaker can show his or her contempt for an object by choosing not to use a more specific noun. Interestingly, pejorative hyperonyms such as rag or the Spanish equivalent trapo and many others start as count superordinates, not as collective nouns as most other superordinates do. As soon as they become conventionalized and lose their pejorative connotation, the plural is again the more frequent form. The case of trapo is one of clear polysemy. The older pejorative sense is frequent in its singular meaning TORN OLD RAG. The plural
trapos is the informal, but more conventionalized noun meaning CLOTHING (MOL). Nouns with a similar history are, for example, Argentinian Spanish pilchas and French guenilles and haillons (MOL, Guillemard 1991: 26) and a few others. The singular use is also common in learned contexts. Here, very general but precise ontological distinctions can be required. As a consequence count superordinates such as garment (OED) often have formal connotations and are not very common in everyday usage. They are useful when exact categorizations and definitions are needed:
(6) El antiguo y tradicional quechquemitl deriva su nombre del nahuatl quechtli . . . Es una prenda [emphasis mine] de vestir femenina . . . (CREA, Mompradé/Gutiérrez (1981): Indumentaria tradicional indígena, Tomo I, México) ‘The ancient and traditional quechquemitl derives its name from Nahuatl quechtli . . . It is a ladies’ garment . . . ’
Interestingly, originally learned superordinates such as animal, plant or vehicle are more frequently found in the singular form than the hyperonyms created in everyday language. The better such terms are integrated into everyday language, the less frequent the singular is (see Mihatsch 2004, 2005, 2006, Chapter 4.). The influence of learned knowledge organization models such as logical taxonomies probably causes the emergence of most singular hyperonyms in the first place by (superficial) analogies to scientific superordinates. Thus the emergence of count hyperonyms is not a consequence of natural lexical change via entrenchment as in the case of group collectives and generic nouns, but of processes influenced by cultural scientific models.⁹
4.
Short frequent superordinates
Everyday superordinates such as garment with a strong dependence on basic-level lexemes tend to be morphologically complex and not very frequent. Now there is a set of everyday superordinates such as English thing or stuff, German Sache and Ding, Spanish cosa and chisme, and French chose, truc and machin, which semantically belong to the uppermost categories of an ontology (cf. Goddard this vol.).¹⁰ These are morphologically simple, short and very frequent (cf. Vossen 1995: 377–378). Unlike everyday superordinates such as clothing that reinforce their links with basic-level nouns as they become more conventionalized, lexical items such as thing go the other way: they are subject to generalization.

Latin res THING, PROPERTY
  > Old French ren THING (> Fr. rien NOTHING)
  > Spanish res HEAD OF CATTLE
Germanic rauba BOOTY
  > Italian roba STUFF
  > Spanish ropa CLOTHING

Figure 3. Superordinates between generalization and specialization
They do not establish any conceptual relations with more specific lexemes. The mental representation of skirt certainly does not correspond to a specification of thing, and thing is not based on a conjunctive or disjunctive representation founded on subordinates such as skirt, hammer, box, etc. (cf. Kleiber 1994: 21), although there is an extensional inclusion relation and a unilateral implication relation between phrases containing nouns of different levels of generalization such as thing and skirt. These relations exist, but they do not tell us anything about the mental representations of the underlying concepts. Nouns such as thing must be based on completely different conceptual representations. In fact, these are elements with grammatical rather than lexical functions: they are pure placeholders and do not refer independently (see Koch and Oesterreicher 1990: 104–109). The origins of most placeholder nouns can be traced back to certain discourse strategies. On the one hand, in informal situations, nouns with a pejorative potential are often employed to hide word-finding problems by pretending that the referent is not worth being designated by the correct label (see Mihatsch 2006). Thus, speakers can hide their ignorance of a correct word. This is how placeholders such as French truc and machin or Spanish chisme and trasto must have emerged. The older meanings of French machin and truc (< COMPLEX STAGE MACHINERY) (DHLF), or European Spanish chisme (< BUG, i.e. A SMALL, USELESS, ANNOYING THING) or Spanish trasto (< BULKY PIECE OF FURNITURE) (DCECH) point to such a strategy.
On the other hand, in more formal contexts, a speaker can choose a learned word that stresses general distinctions in order to hide word-finding problems. This explains the development of placeholder nouns such as English thing, German Sache and Ding, Spanish cosa, or French chose and rien on the basis of nouns meaning LEGAL AFFAIR, MATTER. Some of these (weakly) grammaticalized noun placeholders can be the starting point for the grammaticalization of indefinite pronouns, which are really pro-NPs (a thorough account of these processes can be found in Mihatsch 2006, Chapter 5.).
5.
Conclusion
In everyday language the mental representation of taxonomies such as thing – clothing – skirt – miniskirt does not correspond to a uniform organization on the basis of taxonomic inclusion relations on the level of intension, at least not above basic level, where the origins and the synchronic properties of the superordinate nouns point to a different conceptual organization. I have shown that everyday superordinates¹¹ are split between placeholders with pragmatic and grammatical functions like thing, which do not establish any conceptual relations with more specific nouns, and lexical items based on the conjunction and disjunction of basic-level schemas. Since conjunctions are easier to process than disjunctions, we find many collective nouns such as clothes or clothing, rather than count superordinates such as garment, in everyday language.
Notes
1. I would like to thank Christiane Fellbaum, Andrea Schalley and Dietmar Zaefferer for their useful comments on an earlier version of this paper as well as Paul O’Neill (Madrid) for the stylistic revision of that paper.
2. At least on the basic level this also explains why co-hyponymy is more primitive than hyponymy, since in the case of co-hyponymy several gestalts can be globally compared by mental juxtaposition, whereas hyponym and superordinate belong to different levels of abstraction and cannot be compared on the basis of imagery.
3. As Talmy (this vol.) points out, sign languages share many, but not all, properties with spoken languages. Interestingly, ASL superordinates show the same properties as those of spoken languages: they tend to be loanwords from spoken language based on the finger alphabet, or are collectives uniting several basic-level concepts (Newport and Bellugi 1978: 52–58).
4. A detailed study that covers more conceptual domains can be found in Mihatsch (2006, Chapter 3.).
5. Many of these collective nouns undergo specialization and mean SUIT, for instance English suit, Spanish traje, German Anzug, French costume, or LADY’S DRESS, for example French robe, Spanish vestido, English dress, German Kleid, since both a suit and a dress can function as a whole outfit of a person.
6. There are other types of collectives with similar properties in this domain, e.g. wardrobe, which designates all items of clothing belonging to one person. These collectives show the same subsequent development as those meaning OUTFIT (Mihatsch 2006, Chapter 3.).
7. Sam Featherston, pers. comm.
8. The Spanish corpus data come from CREA, subcorpus 1994–1999, libros, novelas, Spain. The French data are taken from Frantext, subcorpus 1990–2000.
9. Maybe the adaptation to the prevailing count nouns (‘system congruity’, cf. Mayerthaler 1987: 52) in Germanic and Romance languages also plays a certain role here.
10. Hellwig (this vol.) shows that grammatical elements such as nominal classifiers also exploit higher ontological levels.
11. Many nouns on the higher levels of generalization such as object are technical terms, which can be more or less integrated into everyday language. Technical terms tend to be analytical and they are often based on a specification of a superordinate term (see Mihatsch 2004, 2005, 2006).
References
Aitchison, Jean ³2003 Words in the Mind. Oxford: Blackwell.
Berlin, Brent 1972 Speculations on the growth of ethnobotanical nomenclature. Language in Society 1: 51–86.
Bosque Muñoz, Ignacio 1999 El nombre común. In Gramática descriptiva de la lengua española, Vol. 1, Ignacio Bosque Muñoz and Violeta Demonte (eds.), 3–75. (Real Academia Española: Colección Nebrija y Bello.) Madrid: Espasa Calpe.
Cruse, D. Alan 1986 Lexical Semantics. (Cambridge Textbooks in Linguistics.) Cambridge: Cambridge University Press.
Fellbaum, Christiane this vol. The ontological loneliness of verb phrase idioms.
Goddard, Cliff this vol. Semantic primes and conceptual ontology.
Guillemard, Colette 1991 Les mots du costume. (Collection le français retrouvé.) Paris: Belin.
Haspelmath, Martin in press Coordination. In Language Typology and Linguistic Description, Timothy Shopen (ed.). 2nd ed. Cambridge: Cambridge University Press.
Hellwig, Birgit this vol. Postural categories and the classification of nominal concepts: A case study of Goemai.
Iris, Madelyn Anne, Bonnie E. Litowitz, and Martha W. Evens 1988 Problems of the part-whole relation. In Relational Models of the Lexicon: Representing Knowledge in Semantic Networks, Martha W. Evens (ed.), 261–287. (Studies in Natural Language Processing.) Cambridge: Cambridge University Press.
Johnson-Laird, Philip N., and Ruth M. J. Byrne 1991 Deduction. (Essays in Cognitive Psychology.) Hove/Hillsdale/London: Lawrence Erlbaum.
Johnson-Laird, Philip N., Walter Schaeken, and Ruth M. J. Byrne 1992 Propositional Reasoning by Model. Psychological Review 99 (3): 418–439.
Kintsch, Walter 1982 Gedächtnis und Kognition. Translated by Angelika Albert. Berlin/Heidelberg/New York: Springer.
Kleiber, Georges 1994 Nominales: essais de sémantique référentielle. Paris: Armand Colin.
Klix, Friedhart 1984 Über Wissensrepräsentationen im menschlichen Gedächtnis. In Gedächtnis, Wissen, Wissensnutzung, Friedhart Klix (ed.), 9–73. Berlin: Deutscher Verlag der Wissenschaften.
Klix, Friedhart 1993 Erwachendes Denken: geistige Leistungen aus evolutionspsychologischer Sicht. Heidelberg: Spektrum.
Koch, Peter, and Wulf Oesterreicher 1990 Gesprochene Sprache in der Romania: Französisch, Italienisch, Spanisch. (Romanistische Arbeitshefte 31.) Tübingen: Niemeyer.
Kuhn, Wilfried 1982 Formale Verfahren der Technik KOLLEKTION. In Apprehension. Das sprachliche Erfassen von Gegenständen. Teil 2: Die Techniken und ihr Zusammenhang in Einzelsprachen, Hansjakob Seiler and Franz Josef Stachowiak (eds.), 55–83. (Language Universal Series 1, II.) Tübingen: Narr.
Lakoff, George 1987 Women, Fire and Dangerous Things: What Categories Reveal About the Mind. Chicago/London: University of Chicago Press.
Leisi, Ernst ⁴1971 Der Wortinhalt: Seine Struktur im Deutschen und im Englischen. (UTB 95.) Heidelberg: Quelle & Meyer.
Markman, Ellen M. 1985 Why superordinate category terms can be mass nouns. Cognition 19: 31–53.
Markman, Ellen M., Marjorie S. Horton, and Alexander G. McLanahan 1980 Classes and collections: Principles of organization in the learning of hierarchical relations. Cognition 8: 227–241.
Mayerthaler, Willi 1987 System-independent morphological naturalness. In Leitmotifs in Natural Morphology, Wolfgang U. Dressler, Willi Mayerthaler, Oswald Panagl, and Wolfgang Ullrich Wurzel (eds.), 25–58. (Studies in Language Companion Series 10.) Amsterdam/Philadelphia: John Benjamins.
Mihatsch, Wiltrud 2004 Labile Hyperonyme. In Historische Semantik in den romanischen Sprachen, Franz Lebsanft and Martin-Dietrich Gleßgen (eds.), 43–54. (Linguistische Arbeiten 483.) Tübingen: Niemeyer.
Mihatsch, Wiltrud 2005 Desterminologización y cambio semántico. In Lingüística en el texto – contribuciones de Argentina y Alemania, Guiomar Ciapuscio, Konstanze Jungbluth, and Dorothee Kaiser (eds.), 263–285. (Neue Romania 32.) Berlin: Freie Universität.
Mihatsch, Wiltrud 2006 Kognitive Grundlagen lexikalischer Hierarchien untersucht am Beispiel des Französischen und Spanischen. (Linguistische Arbeiten 506.) Tübingen: Niemeyer.
Miller, George A. 1998 Nouns in WordNet. In WordNet. An Electronic Lexical Database, Christiane Fellbaum (ed.), 23–46. (Language, Speech and Communication.) Cambridge, MA/London: The MIT Press.
Miller, George A., and Philip N. Johnson-Laird 1976 Language and Perception. Cambridge/London/Melbourne: Cambridge University Press.
Morreale, Margherita 1973 Aspectos gramaticales y estilísticos del número (segunda parte). Boletín de la Real Academia Española 53: 99–206.
Murphy, Gregory L. 2002 The Big Book of Concepts. Cambridge, MA: MIT Press.
Murphy, Gregory L., and Edward J. Wisniewski 1989 Categorizing objects in isolation and in scenes: What a superordinate is good for. Journal of Experimental Psychology: Learning, Memory, and Cognition 15 (4): 572–586.
Newport, Elissa L., and Ursula Bellugi 1978 Linguistic expression of category levels in a visual gestural language: A flower is a flower is a flower. In Cognition and Categorization, Eleanor Rosch and Barbara B. Lloyd (eds.), 49–71. Hillsdale, NJ: Lawrence Erlbaum.
Oerter, Rolf 1988 Wissen und Kultur. In Wissenspsychologie, Heinz Mandl and Hans Spada (eds.), 333–356. München/Weinheim: Psychologie Verlagsunion.
Rosch, Eleanor, Carolyn B. Mervis, Wayne Gray, David M. Johnson, and Penny Boyes-Braem 1976 Basic objects in natural categories. Cognitive Psychology 8: 382–439.
Roulon-Doko, Paulette 1997 Structuration lexicale et organisation cognitive: L’exemple des zoonymes en gbaya. In Les zoonymes. Actes du colloque international tenu à Nice les 23, 24 et 25 janvier 1997, Sylvie Mellet (ed.), 343–367. Nice: Publications de la faculté des lettres, arts et sciences humaines de Nice.
Taylor, John R. ²1995 Linguistic Categorization. Prototypes in Linguistic Theory. Oxford: Clarendon Press.
Talmy, Leonard this vol. The representation of spatial structure in spoken and signed language: A neural model.
Ungerer, Friedrich, and Hans-Jörg Schmid 1996 An Introduction to Cognitive Linguistics. (Learning about Language.) London: Longman.
Ungerer, Friedrich 1994 Basic level concepts and parasitic categorization. Zeitschrift für Anglistik und Amerikanistik 42: 148–162.
Vossen, Piek 1995 Grammatical and Conceptual Individuation in the Lexicon. (Studies in Language and Language Use 15.) Amsterdam: IFOTT.
Wälchli, Bernhard 2005 Co-Compounds and Natural Coordination. (Oxford Studies in Typology and Linguistic Theory.) Oxford: Oxford University Press.
Waxman, Sandra R., and Thomas Hatch 1992 Beyond the basics: preschool children label objects flexibly at multiple hierarchical levels. Journal of Child Language 19: 153–166.
Wierzbicka, Anna 1985 Lexicography and Conceptual Analysis. Ann Arbor: Karoma.
Wierzbicka, Anna 1992 Semantics, Culture, and Cognition: Universal Human Concepts in Culture-Specific Configurations. New York/Oxford: Oxford University Press.
Wisniewski, Edward J., and Gregory L. Murphy 1989 Superordinate and basic category names in discourse: A textual analysis. Discourse Processes: A Multidisciplinary Journal 12 (2): 245–261.
Motion events in concept hierarchies: Identity criteria and French examples
Achim Stein
1.
Introduction
This contribution has the following goals: On the practical side, it will give a critical account of the representation of verb meanings in existing concept hierarchies. It will also try to pin down the critical points of these representations. On the theoretical side, it will make suggestions for more general principles of verb meaning representation in concept hierarchies. The prerequisite is a detailed semantic analysis: verb meanings are not considered as atomic formulas, but as having a composite internal structure. This approach is based on and inspired by work on ontology engineering (Guarino 1998) and different lexical semantics approaches (Jackendoff 1990; Pinker 1989; Parsons 1990). A lexicon should be a formally precise and efficiently accessible inventory of the meanings of the individual words of a language. For each of the lexical items it contains, it should specify its meaning in a transparent form which is compatible with the way in which the theory characterises linguistic content or which the system can readily use for its semantic computations. Concept hierarchies can represent such lexical information efficiently. From a linguistic point of view, the ideal representation of verb meanings contains information about (i) argument structure: what are the selectional restrictions of the arguments? (ii) syntactic behaviour: do the verbs participate in alternations? (iii) polysemy: what are the potential changes of meaning in context? and (iv) inferences: what kind of information can be drawn from the represented meaning? The most efficient hierarchy of verb meanings would, of course, be a representation structured according to principles which allow for the inheritance of a maximum of these types of information. But it is obvious that the four parameters listed above can not be combined in a satisfactory way. 
Verbs with similar meanings allow similar inferences, but they do not participate in the same alternation types (Levin 1993) nor do they necessarily realise their arguments in the same way, and it is even more difficult to make predictions about potential changes of meaning.
Nevertheless, I claim that efficient structuring of verbal meanings is not impossible if the criteria operate on an appropriate level of semantic description. An analysis of verb meanings in terms of semantic decomposition into sub-categories can provide these criteria. Work in various fields has revealed the importance of such subcategories for the cognitive structures of the lexicon and the syntactic representation of arguments (Jackendoff 1983), for the lexicalisation patterns in various languages (Talmy 2000), and for the influence of subevents on syntactic behaviour (Pinker 1989). In contrast to the decompositional approach, most ontologies operate on purely relational principles: normally, they interlink word meanings disregarding their internal structure (although the system may reflect some aspects of decomposition, such as the “unique beginners” in WordNet (cf. Fellbaum 1999: 93)). However, I do not want to suggest that the hierarchy has to represent the components in a particular way, but rather that an in-depth semantic analysis using decomposition will provide the criteria which are necessary for a coherent structure of verb meanings. Recent works in formal ontology express the need for a comprehensive analysis of lower hierarchical levels of the event domain (e.g., Gangemi, Guarino, and Oltramari 2001). In this contribution, I will address linguistic methods of semantic analysis and ontological principles in order to shed some light on current ways of representing verb meanings in concept hierarchies (Section 2.) and make suggestions for alternatives (Section 3.), especially with respect to the representation of polysemy and vagueness.
2.
Verbs in existing concept hierarchies
First of all, I will introduce some linguistically relevant concept hierarchies and focus on how they represent the meanings of two French “motion verbs”, pousser ‘push’ and glisser ‘slide’, which have been selected because their internal structure represents different degrees of complexity. I will briefly explain how the subsumption relation is encoded in existing representation systems and how the motion domain is structured, and I will also have a look at the analysis and classification of the concepts ‘push’ and ‘slide’.
1) In the Mikrokosmos ontology¹ PUSH is defined as ‘to move in a direction away from a force’ and classified as an instance of APPLY-FORCE in the domain PHYSICAL-EVENT. This analysis sets off push, pull and similar verbs from CHANGE-LOCATION verbs like arrange. There is no concept for ‘slide’. Most verbs of combining (cf. Levin 1993: 159–164) are in the CHANGE-POSITION domain, which is immediately subordinated to CHANGE-LOCATION and contains a surprisingly heterogeneous set of concepts: typical instances are COIL, ENTWINE and ROTATE, but the category also contains PUT ‘to move an object to a location where it is to remain in equilibrium’, and CLOSE ‘to close an object that can be opened or closed’. It is obvious that the change of position is true only of subevents of these situations, i.e. of the intended result of putting (at least if we assume the correctness of the given definition), or of parts of the affected object, as in close the lid.
2) FrameNet has entries for more than 1700 verbs. Lexical units are linked to frames. Frame inheritance is a relationship in which a child frame is a more specific elaboration of a parent frame. In such cases, all of the frame elements, subframes, and semantic types of the parent have equally or more specific correspondents in the child frame (Baker, Fillmore, and Lowe 1998). The lexical unit for push is linked to three Frames:² MANIPULATION, CAUSE-MOTION and INFLUENCE OF EVENT ON COGNIZER. The manipulation reading is defined as ‘exert force on (someone or something) in order to move them away from oneself’, a satisfying definition which, however, does not match the example quoted from the FrameNet website in (1a). The cause-to-move reading is defined as ‘An Agent causes a Theme to undergo directed motion. The motion may be described with respect to a Source, Path and/or Goal’ (1b). The mental reading is not an issue here.
(1) a. She closed the wooden gate and PUSHED a brick against it with her foot.
    b. She PUSHED his chair clear of the desk.
The lexical unit for slide is linked to the Frame CHANGE POSITION ON A SCALE, which accounts only for abstract (non-local) interpretations of slide and represents verbs like increase, diminish, triple etc.

3) The Upper Model (UM, cf. Bateman, Magnini, and Fabris 1995) is largely inspired by systemic-functional linguistics. The top level of the hierarchy of situations (in UM terms “process hierarchy”) is an elaboration of the process types distinguished by Halliday and others (Halliday 1985). The characteristic feature of the UM is the close relation between the process and its “participants”, e.g. for shoot: shoot (murderer:A, murdered:B, time:T, place:P).
Motion processes are material processes in the domain of “Doing and Happening Configurations”. They are characterised by the roles ACTOR and ACTEE. Constructions which are related by syntactic alternations are located in different subdomains of “Doing and Happening”, e.g. causative constructions in directed and inchoative constructions in non-directed actions. Since the Upper Model has been conceived as an “interface ontology” (Bateman 1992), it does not describe the semantics of words, and therefore no concrete lexical entries for our sample verbs are available (cf. Farrar, this vol., Section 2.).

4) In WordNet, the relation between verbal expressions realizing a concept and its superordinate concept is named “troponymy” and paraphrased “To V1 is to V2 in some particular manner” (Fellbaum 1999: 79). Still in WordNet terminology, a concept “entails” its superordinate (e.g. marching entails walking), and both processes are temporally coextensive. According to WordNet principles, the verbal domains have a very flat structure. “MOVE, DISPLACE” is the beginner of the domain and has over ninety immediate subnodes. It is typical for WordNet that concepts do not have an inherent definition, but are defined by their relation to other concepts. CAUSE TO MOVE:1 has a causal link to its result (MOVE:14, CHANGE POSITION:1). A closer look at the synsets which form the motion domain reveals that the notion of motion is rather broadly defined. Strictly speaking, motion results in the change of location of the object that is moved, and this central notion is indeed present in the synsets referring to the manner of motion (pull, carry, tug etc.), to the direction (expel, propel, lower etc.), and to object-specific motion (wave, pour, spill etc.). It is not central to the synsets referring to change of position or configuration (agitate, stir, unwind etc.), nor to those referring to changes in the environment of the affected object (wrap, encircle etc.).
Figure 1 shows how the different readings of pousser and glisser are accounted for in the French EuroWordNet.3 The hierarchy goes from left to right, with the two beginners, causative DÉPLACER and inchoative SE DÉPLACER, in the left column, and the subordinated nodes in the column to the right. For clarity, we indicate the corresponding English synset along with the French synset. The local readings of pousser and glisser (as opposed to the mental readings) extend over three domains headed by MAKE MOVE, MOVE, and TOUCH.

(A) The MAKE MOVE-domain contains agent-controlled situations: POUSSER:5 (no example given) is superordinate to POUSSER:11, which is synonymous with donner une chiquenaude ‘to flick one’s finger at s.b.’, as well
[Figure 1. pousser and glisser in the French EuroWordNet. Tree diagram read from left to right; nodes are listed by level as French synset (English synset).
Top: déplacer:5 (make move:1); se déplacer:1 (move:15); toucher:5 (touch:18).
Level -1: pousser:5 (move forward:1); pousser:6 (push:8, force:15); pousser:10 (force to move:1); glisser:3 (slide:10); pousser:17 (move heavily:1); monter:7 (rise:17); glisser:4 (glide:4); glisser:2 (slide:9); presser:8 (press:17).
Level -2: pousser:11 (flip:8, flick:4 . . .); pousser:4 (push forward:1/out:1); pousser:7 (push away:1/aside:2); pousser:14 (thrust:7); pousser du coude:1 (prod:5, nudge:2); pousser:21 (shove:3, jostle:2); pousser:19 (go up:3, be built:1); glisser:1 (slide:8); glisser:6 (skim:7, plane:7); pousser:8 (push:9).
Level -3: pousser:12 (stab:2 . . .); pousser:20 (jog:3).]
as six other concepts (not cited here) which refer to combinations of bodypart movement and displacement of objects (pousser avec le pied ‘kick’, frapper ‘hit’, lancer ‘throw’ etc.). POUSSER:6 (ex. He pushed the table into a corner) is the central concept with six subordinated readings. These are more specific with respect to (a) the direction of pushing as in POUSSER:4 ‘push forward’ and POUSSER:7 ‘push out of the way’, (b) the manner of pushing as in POUSSER:14 (ex. He thrust his chin forward), which in turn is superordinate to thrusting events like POUSSER:12 (synonyms: stab, prod, poke, jab, dig) or POUSSER:20 ‘give a slight push to’, and (c) the intensity or acceleration of pushing, as in POUSSER:21 (synonyms shove, jostle). Finally, POUSSER:10 (synonyms force to move, force out, displace) as well as GLISSER:3 (ex. He slid the money over to the other gambler) refer to situations where the object is displaced.

(B) The MOVE-domain contains POUSSER:18 (‘of vehicles, such as streetcars’, synonym trundle) and POUSSER:19, which both refer to situations of uncontrolled motion. This use of pousser is not an alternation of the causative readings cited above, contrary to GLISSER:1 (‘slide sideways’) and GLISSER:6 (‘travel on the surface of water’), both subsumed by GLISSER:4, which is the inchoative construction corresponding to GLISSER:3.

(C) The TOUCH-domain contains one single reading of pousser: POUSSER:8 (ex. She pushed against the wall with all her strength) is defined as ‘press against forcefully without being able to move’.
3. Towards a more consistent representation of verb meanings
The concept hierarchies discussed in Section 2. are not comparable: WordNet and FrameNet are structured mainly according to cognitive principles, the Mikrokosmos ontology is a part of a system for knowledge-based machine translation, and the Upper Model has not been designed for specific applications at all and provides a general framework for NLP instead of a description in lexical semantics. It is therefore not surprising that most of the hierarchies give a poor account of our two verbs, without regarding polysemy or facets of meaning (we shall differentiate between types of ambiguity below). The most detailed account of polysemy is given by the relational lexical semantics approach of WordNet, which, on the other hand, pays less attention to issues which matter for application-oriented systems, like argument structure. The following subsections will discuss two questions: the status of argument structure as a link between lexicon entries and concepts (Section 3.1.), and the relations between concepts within hierarchies (Section 3.2.).
3.1. Argument structure
Our view of how argument structure can be represented in hierarchical systems has been realised in SIC, a semantic lexicon for Italian, where lexicon entries are interlinked with nodes in a concept hierarchy.4 The top levels of
process classification have been inspired by the Upper Model approach, and similarly to Mikrokosmos, the lexicon is independent of the concept hierarchy. Each entry has a syntactic frame and is linked to a concept in the concept hierarchy. Thus, SIC attempts to combine semantic and syntactic structuring principles: whereas the syntactic realization is specified in the lexicon, the semantic restrictions are specified at the conceptual level by links between the verbal concept and nodes of an object hierarchy. However, and this distinguishes this approach from the Upper Model and from FrameNet, entries for selectional restrictions are not thematic roles but concepts taken from an object hierarchy. Whenever a verb meaning is instantiated, compatibility rules check if a given word satisfies the sortal restrictions in the argument slot. A sortal restriction is satisfied if the concept related with the argument is equal to or subsumed by the concept of the argument slot, or else if such a concept can be accessed via a regular polysemy rule. The top nodes of the SIC hierarchy are MATERIAL (with the subnodes DOING and HAPPENING), RELATIONAL, MENTAL, and VERBAL (similar to the Upper Model and to the DOLCE upper level ontology),5 followed by nodes which may be abstract (i.e. need not have verb meanings assigned to them). Due to this structure of the top level, syntactic alternations are represented like verbal polysemy. For example, the causative construction is attached to a subnode of DOING, whereas the inchoative construction is attached to a subnode of HAPPENING. Note that although theoretical considerations concerning ontology design were taken into account, the main concern while constructing the SIC hierarchy was the creation of a system which automatically determines the correct meaning of a verb from its context (i.e. from its arguments).
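The compatibility check for sortal restrictions described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SIC implementation: the concept names, the toy object hierarchy (`PARENT`), the polysemy table (`POLYSEMY`) and the function names `subsumes` and `satisfies` are all hypothetical.

```python
# Hypothetical toy object hierarchy (child -> parent); not taken from SIC.
PARENT = {
    "FURNITURE": "ARTIFACT",
    "BOOK_OBJECT": "ARTIFACT",
    "ARTIFACT": "PHYSICAL_OBJECT",
    "BOOK_CONTENT": "INFORMATION",
}

# Hypothetical regular polysemy rule: a book as an object gives access
# to the concept of its content.
POLYSEMY = {"BOOK_OBJECT": {"BOOK_CONTENT"}}


def subsumes(general, specific):
    """Return True if `specific` is equal to or subsumed by `general`."""
    node = specific
    while node is not None:
        if node == general:
            return True
        node = PARENT.get(node)  # climb one level; None at the top
    return False


def satisfies(slot_concept, arg_concept):
    """Check a sortal restriction: subsumption first, then polysemy rules."""
    if subsumes(slot_concept, arg_concept):
        return True
    alternatives = POLYSEMY.get(arg_concept, ())
    return any(subsumes(slot_concept, alt) for alt in alternatives)
```

In this toy setting, `satisfies("PHYSICAL_OBJECT", "FURNITURE")` holds by plain subsumption, `satisfies("INFORMATION", "BOOK_OBJECT")` holds only via the polysemy rule, and `satisfies("INFORMATION", "FURNITURE")` fails.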
The next issue to be addressed is therefore that of the criteria for defining the subsumption relation between subordinated verbal concepts in a hierarchy.

3.2. Identity and unity of verbal concepts

3.2.1. Theoretical prerequisites
Identity and unity, the two major criteria for ontological construction, are defined as follows:

Strictly speaking, identity is related to the problem of distinguishing a specific instance of a certain class from other instances by means of a characteristic property, which is unique for it (that whole instance). Unity, on the other hand, is related to the problem of distinguishing the parts of an instance from the rest of the world by means of a unifying relation that binds them together (not involving anything else). (Guarino and Welty 2001: 112)
The important issues with respect to verbal concepts are therefore the definition of their invariant properties and their external delimitation. Properties are named “essential” if they hold irrespective of time and in all possible worlds, and “rigid” if they are essential to all the instances of an individual. These theoretical assumptions leave many problems of linguistic verb analysis unresolved.6 The notions of identity and unity both require a clear distinction of the essence and status of subevents. More precisely, we have to distinguish the subevents which are a proper part of the concept from those which are merely related to the situations represented by the concept. Only the former have to be considered for the definition of subsumption conditions; the latter belong to encyclopaedic knowledge, which could be included in a frame-like representation without being a proper part of the concept. However, it is not always easy to draw this distinction clearly. Linguistic, cognitive and empirical evidence may yield different results.
3.2.2. Subevents of verbal concepts
In what follows, we will concentrate on the concept PUSH and first give an intuitive account of its essential parts. Three questions will make clear what exactly happens when A pushes B:

(I) Does B move from X to Y? Not necessarily: pousser can refer to situations where A pushes B without making B move. Levin (1993) puts the verb in the group “Verbs of Exerting Force”, very similar to the WordNet classification (cf. Figure 1) of POUSSER:8 under TOUCHER as in (2):

(2) Max a poussé la table (mais elle n’a pas bougé).
    ‘Max pushed at/on/against the table (but it didn’t move).’
(II) Does A move from X to Y? Again, this is not a necessary part of the situation. Levin puts this reading in the group of “Carry verbs”, and for WordNet, PUSH:8 is a particular way of moving (MAKE MOVE:1). Sentence (3), which in Talmy’s terms would be an example of onset causation (cf. Talmy 2000: 473), expresses this situation:

(3) Assis dans son fauteuil, il a poussé la boîte dans un coin.
    ‘Sitting in his armchair, he pushed the box into the corner.’
(III) Does A make a movement (with his body or a bodypart)? Yes, and this seems to be an essential part of situations expressed by push. Contradicting this subevent produces unacceptable sentences (4a), and adverbials which focus on this subevent are acceptable (4b).

(4) a. ??Il a poussé la boîte sans bouger.
       ‘He pushed the box without moving.’
    b. Il a poussé la boîte énergiquement (mais elle n’a pas bougé).
       ‘He vigorously pushed (at) the box (but it didn’t move).’
To sum up, this first intuitive analysis shows that the concept of pushing denoted by pousser essentially belongs to the domain of bodypart movement. Neither the motion of the object nor the movement of the agent is an essential part of the situation. If they are present, they associate the situation with its result (object motion), with the manner of causation (accompanied motion) or both. However, since they are not essential to the situation, they cannot define the concept as such.

For Jackendoff, verbs like push describe a continuous causation of motion (which, according to Pinker (1989), accounts for the ungrammaticality of double object constructions like *Sam pushed Bill the ball). Contrary to pure causatives like break, a pushing event has “no determined outcome”7 and therefore push cannot express the result of the action in an inchoative construction. This analysis coincides with the definition of pousser in the Petit Robert, which focuses on the exertion of force, of which the motion event is only a possible consequence (“Soumettre (qqch., qqn) à une force agissant par pression ou par choc et permettant de mettre en mouvement, et de déplacer dans une direction.”, Petit Robert 1993). This explains why (4b) is acceptable despite the explicit negation of the motion result. Therefore, the minimal intension of the concept denoted by pousser should be subsumed by a concept ‘exert pressure’ (which is the case for WordNet PUSH:9).

An alternative approach to this analysis, which has isolated a kind of archiseme (the meaningful common part of all the situations pousser denotes), is the decomposition of the situation into its subevents. We adopt a neo-Davidsonian notation similar to Parsons (1990)8 for a more complete representation of ‘push’: the initial movement ($e_m$) of the Agent or a bodypart of the Agent culminates in a contact ($e_c$) with the Theme, whose displacement
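($e_d$) culminates in a resulting state (Be-at). In (5) the situation is instantiated for the sentence Max pushed the box behind the wall. Culm is the culmination of an event and Hold the holding of a durative event or a state at a time t.9

(5)
\[
\begin{aligned}
\exists e\,[\,&\mathrm{Push}(e) \mathbin{\&} \mathrm{Ag}(e,\mathrm{Max}) \mathbin{\&} \mathrm{Th}(e,\mathrm{box}) \mathbin{\&} \mathrm{Culm}(e,t) \mathbin{\&} {}\\
&\exists e_m\,[\,\mathrm{Move}(e_m) \mathbin{\&} \mathrm{Ag}(e_m,\mathrm{Max}) \mathbin{\&} \mathrm{Culm}(e_m,t_m) \mathbin{\&} {}\\
&\exists e_c\,[\,\mathrm{Contact}(e_c) \mathbin{\&} \mathrm{Ag}(e_c,\mathrm{Max}) \mathbin{\&} \mathrm{Th}(e_c,\mathrm{box}) \mathbin{\&} \mathrm{Hold}(e_c,t_c) \mathbin{\&} {}\\
&\exists e_d\,[\,\mathrm{Displace}(e_d) \mathbin{\&} \mathrm{Ag}(e_d,\mathrm{Max}) \mathbin{\&} \mathrm{Th}(e_d,\mathrm{box}) \mathbin{\&} \mathrm{Culm}(e_d,t_d) \mathbin{\&} {}\\
&\exists s_b\,[\,\mathrm{Be\text{-}at}(s_b) \mathbin{\&} \mathrm{Th}(s_b,\mathrm{box}) \mathbin{\&} \mathrm{Behind}(s_b,\mathrm{wall}) \mathbin{\&} \mathrm{Hold}(s_b,t_b)\,]]]]]
\end{aligned}
\]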
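As a reading aid, the nesting in (5) can be mirrored by a small data structure. The following Python sketch is only an illustration: the class `Eventuality`, its field names and the helper `chain` are hypothetical and not part of the notation adopted here, and the locative predicate Behind is stored alongside the thematic roles purely for convenience.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Eventuality:
    pred: str    # predicate of the event or state, e.g. "Push"
    var: str     # bound variable, e.g. "e_m"
    time: str    # time variable, e.g. "t_m"
    mode: str    # "Culm" (culmination) or "Hold" (holding)
    roles: Dict[str, str] = field(default_factory=dict)  # Ag, Th, Behind
    sub: Optional["Eventuality"] = None  # embedded subevent or state


# Built from the innermost conjunct of (5) outwards.
state = Eventuality("Be-at", "s_b", "t_b", "Hold", {"Th": "box", "Behind": "wall"})
displace = Eventuality("Displace", "e_d", "t_d", "Culm", {"Ag": "Max", "Th": "box"}, state)
contact = Eventuality("Contact", "e_c", "t_c", "Hold", {"Ag": "Max", "Th": "box"}, displace)
move = Eventuality("Move", "e_m", "t_m", "Culm", {"Ag": "Max"}, contact)
push = Eventuality("Push", "e", "t", "Culm", {"Ag": "Max", "Th": "box"}, move)


def chain(ev: Optional[Eventuality]) -> List[str]:
    """List the predicates from the outermost event down to the final state."""
    preds = []
    while ev is not None:
        preds.append(ev.pred)
        ev = ev.sub
    return preds
```

Calling `chain(push)` walks the embedding from Push down to Be-at. Like (5) itself, the structure records a time variable for each subevent but says nothing about how those times are ordered.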
The sentence Max pushed the box behind the wall is true if the event it refers to contains minimally the following subevents: (i) Max’s movement, (ii) his contact with the box, and (iii) the displacement of the box, such that (iv) the box comes to be behind the wall (the final location of the box need not be reached for the sentence to be true, but telicity is not an issue here). Although the analysis in (5) contains Time, it makes no assertion as to the temporal relation between the subevents, and this reflects the vagueness of pousser demonstrated in (6) below. The contact can hold before Max moves, and it can still hold after the box has stopped moving. In the two most frequent situations of pushing, the contact is either already established when the movement of Max begins and maintained as long as the box is moving (accompanied motion), or limited to a moment of transition between the accomplishment of the movement and the beginning of the displacement (onset causation, as in (3)). However, the time interval in which the contact holds can occupy any intermediate position on the scale between the beginning of Max’s movement and the end of the motion of the box.10 The application of Lakoff’s ambiguity test (cf. Lakoff 1970) clearly shows that pousser is not ambiguous but vague with respect to these temporal relations: in (6a), both events of pushing can refer to either of the two readings independently of each other (with ambiguous verbs, the same reading must be present in both propositions, as in John hit the wall, and so did Bill). The introduction of modifiers which favour one or the other reading in (6b) does not produce a contradiction.

(6)
a. Max a poussé le carton dans un coin, et la chaise dans l’autre.
   ‘Max pushed the box into one corner, and the chair into the other one.’
b. Max a prudemment poussé le carton dans un coin, et la chaise avec son pied dans l’autre.
   ‘Max carefully pushed the box into one corner, and the chair with his foot into the other one.’

On the other hand, vagueness is restricted by the general assumption that whenever two subevents of a movement event are temporally adjacent, then their paths are spatially adjacent, and vice versa (Krifka 1998). This excludes, for example, situations with detached initial or final events. Consequently, ‘push’ cannot refer to situations where the agent makes no movement, where no contact is present at all, or where the contact ceases before the displacement starts. The culmination of ‘push’ ($t$) must occur within the interval delimited by $t_c$ and $t_d$. In order to state the temporal relations between the subevents more clearly I use the following ordering relations: