English Pages 240 [244] Year 1993
Linguistische Arbeiten
299
Edited by Hans Altmann, Peter Blumenthal, Herbert E. Brekle, Gerhard Helbig, Hans Jürgen Heringer, Heinz Vater and Richard Wiese
Ralf Meyer
Compound Comprehension in Isolation and in Context The contribution of conceptual and discourse knowledge to the comprehension of German novel noun-noun compounds
Max Niemeyer Verlag Tübingen 1993
Die Deutsche Bibliothek - CIP-Einheitsaufnahme Meyer, Ralf : Compound comprehension in isolation and in context : the contribution of conceptual and discourse knowledge to the comprehension of German novel noun-noun compounds / Ralf Meyer. -Tübingen: Niemeyer, 1993 (Linguistische Arbeiten ; 299) NE:GT ISBN 3-484-30299-2
ISSN 0344-6727
(D 700 Fachbereich Sprach- und Literaturwissenschaft) © Max Niemeyer Verlag GmbH & Co. KG, Tübingen 1993. This work, including all of its parts, is protected by copyright. Any use outside the narrow limits of copyright law without the consent of the publisher is prohibited and punishable by law. This applies in particular to reproductions, translations, microfilming, and storage and processing in electronic systems. Printed in Germany. Printing: Weihert-Druck GmbH, Darmstadt. Binding: Hugo Nadele, Nehren.
Contents

Preface

0 Introductory Remarks
0.1 Types of Noun-Noun Compounds
0.2 Overview of the Work

1 Some Requirements for a Semantic Theory of Novel Noun-Noun Compounds
1.1 Dynamics of Compound Meaning
1.2 Knowledge Dependence of Compound Meaning
1.3 Discourse Dependence of Compound Meaning
1.4 Underdeterminacy of Compound Meaning
1.5 The Choice of the Representational Language
1.6 Discussion and Conclusion

2 Previous Work on Word Semantics and Compounding
2.1 Theories of Word Meaning
2.1.1 Default Information as a Basis for Dynamics of Word Meaning
2.1.2 Lexical Stereotypes as a Basis for Dynamics of Word Meaning
2.1.3 The Two-Level-Approach for Semantics
2.2 Conceptual Categories for Compounds
2.3 The Interpretation of Nominal Compounds in Montague-Grammar
2.4 Other Semantic Approaches
2.5 Novel Noun-Noun Compounds as Anaphorical Elements: Discourse Influence on Compound Interpretation
2.6 Psycholinguistic Models on Concept Combination as a Basis for Compound Interpretation
2.7 The Interpretation of Compounds in Artificial Intelligence
2.8 Discussion and Conclusion

3 Categorization of the Compounding Mechanism into a General Theory of Word Formation
3.1 Word Formation as a Part of the Syntax
3.2 Word Formation as an Autonomous Module
3.3 Word Formation as a Property of the Conceptual System
3.4 Discussion and Conclusion
3.4.1 Conclusion One: The Structure of Lexical Knowledge
3.4.2 Conclusion Two: On the Morphology of Noun-Noun Compounds

4 A Semantic Model for the Integration of Conceptual and Discourse Knowledge
4.1 On Genericity of the Modifier
4.2 Discourse Representation Theory DRT
4.3 Syntax and Semantics of DRLC
4.4 Knowledge Representation in DRSs
4.4.1 Properties of Knowledge Representations
4.4.2 Knowledge Representation in KL-ONE Lookalikes
4.4.3 Translation of TBox-Expressions into DRLC
4.5 Lexical Meaning of Nouns: Approaching the Two-Level Semantics in DRT
4.6 Syntax and Semantics of DRLlex
4.7 Lexical DRSs for Nouns and NN-Compounds
4.8 Conceptual Shifts of Single Nouns
4.9 Conclusion

5 Relational Ambiguity of Isolated Novel Noun-Noun Compounds
5.1 Relational Nouns and Sortal Nouns in NN-Compounds
5.2 Possible Sources for Relations in NN-Compounds
5.3 Relations from Lexical Representations
5.4 The Conceptual Basis
5.4.1 Prototypical Properties of Substances
5.4.2 Spatial Functions of Three-Dimensional Objects
5.4.2.1 A Short Digression into Semantics of Local Prepositions
5.4.2.2 Location Relations in Novel NN-Compounds
5.4.3 The Made-Of, Part-Of and Has-Part Relations
5.4.4 Object-Specific Relations
5.4.5 Conjunctive Compounds
5.4.6 Summary and Conclusion
5.5 A Network of Compounding Rules
5.6 Conceptual Shifts of Novel NN-Compounds
5.7 Summary and Conclusion

6 Utterance Meanings of Novel Noun-Noun Compounds in Discourse
6.1 Current Assumptions on Discourse Comprehension and Their Implications for Semantics of Novel NN-Compounds
6.1.1 Local Discourse Constraints
6.1.2 Global Discourse Constraints
6.2 Some Preceding Remarks
6.2.1 Extensions of DRLlex
6.2.2 The Treatment of Definite NPs
6.2.3 The Principle for Novel NN-Compound Interpretation in Context
6.3 Conjunctive Compounds in Discourse
6.4 Script-Driven NN-Compound Interpretation
6.5 Giving of a Relation in Discourse
6.5.1 Anaphoric Compounds
6.5.2 Cataphoric Compounds
6.6 Compounds in Discourse without Anaphoric Links
6.7 Genericity of the Modifier Revisited
6.8 Conceptual Shifts of Novel NN-Compounds Revisited
6.9 Discussion and Conclusion

7 Summary and Outlook
7.1 The Model for Computing Utterance Meanings of Novel Noun-Noun Compounds
7.2 Open Problems

References
Preface

This work was written during my three-year stay at the Max Planck Institute for Psycholinguistics in Nijmegen, the Netherlands, and was made possible by a scholarship from the Max-Planck-Gesellschaft zur Förderung der Wissenschaften. Thanks are due to the many people who accompanied the writing of this work with advice and criticism. First among them is Prof. Dr. Siegfried Kanngießer, for his general professional support and for reviewing the work. Not to be forgotten are Manfred Börkel, Peter Bosch, Veronika Ehrich, Ino Flores d'Arcais, Helmar Gust, Henk van Jaarsveld, Antje Roßdeutscher and Heinz Vater, for their advice and valuable suggestions, as well as the whole PhD group at the MPI in Nijmegen, for the sociable atmosphere at the institute. Jane Welsh took on the final proofreading of the English, which was surely not always easy. My special thanks, however, go to Prof. Dr. Manfred Bierwisch for his willingness to supervise this work in Nijmegen and for his constant interest in my ideas, which found expression in many intensive discussions and helped to shape the foundations of this work.

Heidelberg, April 1993
Ralf Meyer
0 Introductory Remarks

"A theory of the sort I have tried to develop [i.e. the DRT] brings to bear on the nature of mental representation and the structure of thought, a large and intricate array of data relating to our (comparatively firm and consistent) intuitions about the truth conditions of the sentences and sentence sequences we employ. I very much hope that along these lines it may prove possible to gain insights into the objects of cognitive operations, as well as into these operations themselves, which are unattainable if these data are ignored, and which have thus far been inaccessible to psychology and the philosophy of mind precisely because those disciplines were in no position to exploit the wealth of linguistic evidence in any systematic fashion."
Kamp (1984:6)
My decision to write this work arises from two contrasting facets I became familiar with during my studies. One is the notorious difficulty of explaining the meanings of nominal compounds. Existing explanations are based on problematic and unsatisfying notions, an issue found in the field of word semantics in general. The other is the fascinating machinery offered by semantics and by knowledge representation, the latter developed within Artificial Intelligence (AI), with their strong connection to logic. I believe that both fields, semantics and knowledge representation, provide important information on various aspects of meaning through their use of logical languages. Their application might even take psychological aspects into account if they are considered as models of human concept representation and reasoning. Hence there is no need to presume a distinction in principle between logic (or mathematics) and psychology[1].

The goal of my work is to demonstrate that the problem of noun-noun compound meaning calls for the tools developed in modern semantics and AI research in order to obtain an adequate representation of compound denotations as well as a context-sensitive compound interpretation mechanism. Every serious semantic theory must work with a logical language in order to get a grasp of the central concept of semantics, i.e. the truth of an expression. Logic and semantics are inseparably connected, and some semanticists even claim that semantics is logic. However, over the last ten years or so there has been a general tendency to replace powerful semantic theories, developed in the tradition of Montague's work, by semantic theories based on simpler logics that enable deduction and make considerable use of structured domains and of knowledge of the internal structure of the language to be modelled.

One reason for turning from these complex logics to simpler artificial languages as a basis for semantics, and for the ideas I want to develop the most important one, is the belief that a theory as strong as Montague-Grammar actually cannot deal with the general problem of semantics: one has to distinguish the language-determined view of the world from the categorization of what is comprehended in the conceptual system. There is a difference between what is said and what is meant, and this gap has to be filled by contextual information.

[1] A problem discussed by Partee (1979).
One of these new theories is Discourse Representation Theory (DRT). Its original version is an attempt to replace the traditional notion of truth by the more dynamic concept of embedding a representation, which is a partial model built by means of syntactically driven construction rules, into a complete model. This two-step process makes it possible to go beyond the borders of single sentences and to take into account information given in real discourses, so that a more adequate representation and processing of cognitive entities may be possible. However, while DRT might turn out to be a development in semantics that points the way ahead, it currently still suffers from an inadequacy inherited from traditional approaches[2]: it represents words as unanalyzable items which are not exposed to any contextual influence. The necessity of taking detailed analyses of word meanings into account, in order to draw correct inferences from the meaning of the sentence in which the word occurs, is recognized by AI researchers working on language comprehension as well as by more cognitively oriented semanticists. AI research is characterized by the axiom that knowledge of the entities existing in the world, of the properties they have and of how they change in various states, is the crucial source for language comprehension in general and word meaning in particular.

The time is ripe for combining principles of DRT with developments in AI and related work on concepts and conceptual structures in order to arrive at a theory of discourse comprehension that takes into account the systematic, contextually induced meaning variability of the discourse's basic constituents, i.e. the words. First and foremost, such a unification requires the replacement of the notion of 'model'. The main idea behind DRT is the construction of an intermediate level, called discourse representations, between the discourse and what is called the model or domain. Starting with the discourse, the intermediate level is constructed and afterwards embedded in the model. The 'model' is thought to be an abstract representation of our world (and sometimes even of all possible likely worlds). Generally, it consists of all existing entities as well as all existing relations holding between them. A model in the traditional sense, as the representation of reality, however, does not tell us anything about the assumptions a person has about reality. Human beings often form theories about the world that do not match reality. Hence, a second intermediate level should be introduced, representing a 'projected world' or a 'conceptual system' that is organized by experience, perception and other kinds of knowledge acquisition:

Lexical Items
    | (related to)
Conceptual System
    | (mapping)
Model (abstraction of the world)

[2] To my knowledge, there is only one work under the paradigm of Montague-Grammar in which lexical decomposition is carried out, namely Dowty's (1979) 'Word Meaning and Montague Grammar'.
The extensions of lexical items are concepts (more specifically, families of concepts), and the extensions of concepts are entities within the model. Discourse representations are built by syntax-driven rules operating on lexical items. Therefore their extensions will be complex configurations of concepts. Discourse representations are, as it were, embedded in the conceptual system.

This work is an attempt to develop a theory of meaning variability for a particular class of words that is presumably exposed to contextual influence like no other class of words: novel noun-noun compounds (NN-compounds). German in particular is very productive in generating novel NN-compounds. Each issue of any newspaper contains several NN-compounds that the reader has never read before but is able to comprehend in the given context. In isolation, most of these compounds are ambiguous. For example, a compound like Museumsbuch ('museum book') can mean 'book about museums' as well as 'book bought in a museum' or 'book published by a museum'. Büchermuseum ('books museum') can mean, among others, 'museum exhibiting books' or 'museum described in a certain book'. The possible relations are, however, ordered with respect to applicability; some are more salient than others. These relations are based on different kinds of knowledge, each having implications for the prominence of the relation. Within a certain text, however, the ambiguity mostly disappears; processes interacting between text and compound lead to a context-specific meaning of the compound. The kinds of processes and their interactions with the knowledge types responsible for possible compound meanings are explained in this work.
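The two-step extension chain described here (lexical item to family of concepts, concept to entities of the model) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the function names, the dictionary layout and all entity identifiers are hypothetical, and the concept labels are borrowed from the Museum and Buch examples discussed in chapter 1. It is not the formal apparatus developed later in the work.

```python
# Toy illustration only: all identifiers and data below are hypothetical.

# Level 1: a lexical item is related to a family of concepts.
CONCEPT_FAMILY = {
    "Museum": ["institution", "building", "principle"],
    "Buch": ["physical object", "book information"],
}

# Level 2: each concept is mapped to entities of the model.
# The entity names are invented placeholders.
MODEL = {
    "institution": {"inst_1"},
    "building": {"bldg_1", "bldg_2"},
    "principle": set(),
    "physical object": {"obj_1"},
    "book information": {"info_1"},
}


def concepts_of(item):
    """Return the family of concepts a lexical item can denote."""
    return list(CONCEPT_FAMILY.get(item, []))


def entities_of(item, concept=None):
    """Return the model entities reachable from a lexical item.

    Without a contextually selected concept, every concept of the
    family contributes; with one, only that concept does.
    """
    family = concepts_of(item)
    if concept is not None:
        if concept not in family:
            raise ValueError(f"{item!r} has no concept {concept!r}")
        family = [concept]
    result = set()
    for c in family:
        result |= MODEL.get(c, set())
    return result
```

Calling `entities_of("Museum", "building")` restricts the extension to one concept, which mimics in a very crude way how context narrows a family of concepts to a single one.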
0.1 Types of Noun-Noun Compounds
There are several kinds of noun-noun compounds (NN-compounds), and I will define them briefly. Noun-noun compounds belong to the class of nominal compounds, i.e. all compounds with a noun as head. Hence this term also includes adjective-noun compounds and verb-noun combinations, among others[3]. NN-compounds are combinations of two nouns, each either simple or deverbal. Combinations of two simple nouns are called root compounds. Combinations with a head derived from a verb by affixation are called synthetic, primary or sometimes verbal compounds. In this work I will mainly be concerned with root compounds.

Lexicalized NN-compounds are NN-compounds which are stored permanently in the lexicon. They have a fixed meaning that does not have to be computed each time the compound is used. Their meaning is stored as the semantic component of the lexical entry. Whether a compound becomes lexicalized depends on its frequency in everyday use and its importance for the hearer. For example, technical terminologies contain many NN-compounds. People working in a particular area with specialized terminology store the meanings of particular NN-compounds occurring in this terminology for reasons of economy. Laypeople, however, must try to compute the meanings of such NN-compounds without specialized knowledge in that area. One important distinction between lexicalized and non-lexicalized NN-compounds is the ability of the former to have unpredictable meanings as their semantic component. Fixed meanings of lexicalized NN-compounds can differ from meanings that can be computed on the basis of knowledge of the constituent extensions. For example, Doktorvater (lit.: 'doctor father'; its meaning is 'supervisor') in the academic world means 'a professor who is in charge of a graduate student studying for a doctoral degree'; he looks after him 'like a father'. The transparent and non-lexicalized meaning would be 'father of a doctor'. So lexicalization can fix non-transparent meanings of compounds; transparent meanings, of course, can be lexicalized as well.

Deictic NN-compounds are non-lexicalized NN-compounds that get their meaning by means of the non-verbal context. This can be an ostensive act or other situational factors. For example, one can point to a chair with a puddle of juice on the seat and say: 'Setz dich nicht auf den Saftstuhl da!' (Don't sit on the 'juice chair' there!). In this case the visible location of juice on the seat in the hearer's field of vision enables the interpretation of the compound. Another example of a deictic compound: on a Sunday I walked with friends in a park on a crowded path of normal width. After a while one of them said: 'Laßt uns doch die Fußgängerautobahn hier verlassen und über die Wiese gehen.' (Let us leave the 'pedestrian highway' here and walk across the grass.) The compound was immediately understood by all of us due to the situational context: the walk on the crowded path is analogous to the (almost) normal situation on highways and, by that, the compound becomes comprehensible.

Novel NN-compounds are non-lexicalized noun-noun compounds appearing as names for a certain concept in the context provided by the text.

[3] Which kinds of combinations exist is discussed in chapter two.
In contrast to deictic NN-compounds, the verbal context given by the text provides complete support for compound comprehension. This definition differs from those given in other work. For example, Ryder (1990:12) defines novel NN-compounds as 'interpretable without recourse to a single immediate context'. They should be detachable from the context in which they are presented. To back this definition, she describes a unique event where a novel NN-compound was created by a speaker in a group of persons in order to characterize a particular individual. Afterwards the compound was used in this group as a name for a certain kind of person. However, in this case the compound becomes lexicalized for the group. It is no longer a novel NN-compound! So novel NN-compounds, if they are defined as a certain type of non-lexicalized compound, are bound with respect to comprehension to a particular context. They can become lexicalized, either for a certain group of speakers or for the whole speech community, but then they are no longer novel NN-compounds. Verbal context provides all kinds of knowledge necessary for novel NN-compound interpretation.

Although the meanings of novel NN-compounds are bound to unique contexts, they are not arbitrary at all. Their meanings must be strictly predictable in a particular verbal context. The context-dependent meanings they get arise from specific interactions of different knowledge systems. The kinds of knowledge systems and their interaction in the context-sensitive semantics of novel noun-noun compounds are explained in the following chapters.
0.2 Overview of the Work
Before I describe the content of the following chapters, some remarks on the examples I employ might be useful. German compounds are either written simply as concatenations of both words, sometimes with an additional internal inflectional morpheme (as e.g. Pferdeschwanz 'horse tail'), or they are connected by a hyphen (as e.g. Berlin-Fan 'Berlin fan'). In this work German examples are printed in italics. Texts are translated into English, but the compounds are always translated literally, because an attempt to translate the compounds into correct English would bypass the semantic problem. So although the compound translations are odd or even completely unacceptable, the German counterparts are not semantically ill-formed at all.

Chapter one enumerates the main requirements a context-sensitive theory of novel noun-noun compounds must meet. Additionally, the fundamental assumptions for semantics in general are pointed out. In chapter two, current theories of word semantics in general, and of compound interpretation in particular, are reviewed, and it is checked whether they meet the requirements given in chapter one. Chapter three is about the possible place of word formation. Based on problems with certain assumptions and their implementation in theories, a postulate on the structure of lexical knowledge is developed. In chapter four I lay the formal foundations for the context-sensitive semantics of NN-compounds that I have developed. Chapter five deals with the possible interpretations of isolated novel NN-compounds. The structures of their denotations and the underlying processes for searching and inferring relations are explained. Their interpretation without contextual influence results in the set of possible denotations of novel NN-compounds.
In chapter six, discourses containing novel NN-compounds, taken from the German magazine 'Der Spiegel' and the weekly paper 'Die Zeit', are analyzed in order to develop a knowledge-dependent approach to the interpretation of novel noun-noun compounds. This approach is meant to explain the possible interpretations of these compounds as well as the underlying processes in texts that make novel NN-compounds unambiguous. In chapter seven the complete algorithm for computing context-dependent NN-compound meanings is given and open problems are pointed out.

Finally, a short comment on my manner of writing with respect to concepts. The text will not always reflect the distinction between concepts and the real world that I pointed out. For reasons of simplicity I will sometimes talk about the extension of a noun and mean objects in the external world rather than the concept. This should, however, create no confusion.
1 Some Requirements for a Semantic Theory of Novel Noun-Noun Compounds

German is very productive in forming new complex words. Once I had started to pay attention to them in the mass media and in normal conversation, I was surprised at what kinds of compounds occur. German is so productive in forming ad-hoc compounds that there seem to be no restrictions at all[1]. Among the already productive word formation rules of German, the process of combining two nouns into a novel noun-noun compound (NN-compound) is the most productive one. Each issue of any newspaper or magazine contains a whole range of novel NN-compounds. The reader is unlikely to have read them before, but is able to understand them immediately within the context of the corresponding text. However, while novel NN-compounds in context seem to be comprehensible without any problems, in isolation they can cause problems of understanding. Taken as single items, they can have many meanings on the basis of suitable relations holding between the two extensions. So novel noun-noun compounds are ambiguous, but they can get a specific meaning depending on the context.

Accounting for the semantics of novel noun-noun compounds is a challenge to every semantic theory. NN-compounds are formed by a rather trivial morphological rule, but they exhibit various semantic properties that prevent a compositional approach to their interpretation. On the one hand, these properties arise from the general fuzziness of word meaning, which is related to knowledge-dependent interpretation. On the other hand, there are also properties exclusive to noun-noun compounds; these properties arise from the ambiguity of compounds. At first sight, the problem of the semantics of novel noun-noun compounds seems to be that of finding a suitable relation that holds between the extensions of the constituents. On closer inspection, however, more problems arise.

Within this chapter I will argue for the necessity of handling some particular problems concerning the semantics of novel compounds. Without considering these problems, no theory of compound interpretation can be developed that reflects the variability of word meaning and the knowledge-dependent character of compound interpretation. It will be pointed out that a knowledge-sensitive interpretation approach has to specify which kinds of knowledge lead to which kind of compound interpretation. Here knowledge is to be understood not just as general world knowledge of the properties of certain entities and the relationships between them, but as discourse knowledge as well. Discourse knowledge means information obtained from the preceding text, i.e. referential as well as conceptual knowledge.

[1] Compounds appear that seem to contradict our intuition about the semantic properties of the denoted objects. An example (reported to me by Claus Heeschen) is erdbeerblond ('strawberry blond'). In isolation this [NA]-compound does not make sense. It was, however, used by the reporter during the live broadcast of the French Open quarter finals. He attributed this property to Jim Courier, but it was not clear what he actually meant.
1.1 Dynamics of Compound Meaning
What novel noun-noun compounds and all major lexical items have in common is the crucial influence of contextual information on the determination of their extension. Lexical items refer to concepts, but there is no one-to-one mapping from items to concepts. This is the reason why it is hard to define what the meaning of a certain word is. While it is possible to determine the truth value of a single sentence by compositionality, the constituents of that sentence seem to elude a successful formal treatment. Let us assume that concepts are built from simple properties[2], and that concepts are the basis for classification. Then the properties of a concept which is a possible candidate for an extension cannot be characterized as features of word meaning, since this would presuppose a one-to-one function between concepts and word meanings. We can only state that the word can denote a concept with these properties, but this is not necessarily the case. So without contextual support the relevant concept cannot be determined. The conclusion is: defining word meaning is not possible without contextual information.

But now the question arises of what kind of semantic information is to be stored for lexical items within the lexicon. A theory of word meaning has to specify the semantic component of a lexical item in a reasonable way. It has to explain why a single noun such as Museum (museum) gets different extensions in different contexts and how these extensions are related to each other:

1. Das Museum kaufte ein Bild von Turner. ('Institution')
   (The museum bought a painting by Turner.)
2. Das Museum befindet sich in der Stadtmitte. ('Building')
   (The museum is located in the centre.)
3. Das Museum zeigt diesen Monat eine Ausstellung über die Geschichte des Buchdrucks. ('Institution' and 'Building')
   (This month the museum shows an exhibition on the history of printing.)
4. Das Museum wurde in der griechisch-römischen Zeit aus einfachen Sammlungen entwickelt. ('Principle')
   (The museum was developed in the Greco-Roman period out of simple collections.)

In (1), museum is described as an institution; in (2) it refers to a building. In (3) at least two concepts are involved, since an exhibition is normally located in a building but organized by an institution. In (4) the underlying principle is the correct extension. This comes close to a generic reading.

It is important to differentiate this example, and nouns like Buch (book), Zeitung (newspaper), Bild (picture), Parlament (parliament), Kaffee (coffee) and so on, from ambiguous nouns such as bank or ball. The latter denote different entities that are unrelated to each other. Thus, corresponding to the number of extensions, there is the same number of lexical entries that contain the same phonological form but different semantic information. The former, however, are used for referring to different extensions that stand in particular relationships to each other. For that, following Bierwisch (1983), I will introduce the concepts 'lexical meaning' and 'utterance meanings'. Lexical meaning is the semantic representation that belongs to a lexical entry; utterance meanings are context-specific denotations in an actual discourse.

The same dynamics in word meaning, and therefore the same distinction between lexical meaning and utterance meanings, play a role in the semantics of novel noun-noun compounds. An example is Museumsbuch ('museum book'). Being ambiguous is a characteristic but not a necessary feature of compounds. Very often, many relations can be introduced for interpreting a compound[3]. A particular referent of the constituents is selected depending on the relation chosen. For Museumsbuch ('museum book') there exist meanings like:

1. book ('physical object') located in a museum ('building')
2. book ('book information') about a museum ('building' or 'institution' (or both))
3. book ('physical object') published by a museum ('institution')
4. book ('physical object') with a cover showing a museum ('picture of a building')

Some of these paraphrases seem to be more natural descriptions of the compound meaning than others. The most natural interpretation seems to be (2), so 'informing' might be a relation strongly tied to 'book'. If we interchange the constituents, the modifier Buch has to be marked by its plural morpheme for certain reasons, so we get Büchermuseum ('books museum'). Again several relations can be used:

1. museum ('whole conceptual complex') exhibiting books ('physical objects')
2. museum ('institution') buying books ('physical objects')
3. museum ('building') in which books ('physical objects') are located
4. museum ('institution') publishing books ('physical object' and 'book information')

For this compound, too, there is a salient relation: 'exhibiting' seems to be strongly related to 'museum'. A semantic theory of novel noun-noun compounds has to explain what the lexical meanings of the constituents contribute to the utterance meaning of the compound. The existence of salient relations has to be taken into consideration for this question as well.

[2] Fodor argues against this view. He regards concepts as not strictly definable by simple properties. See Fodor et al. (1980).

[3] Although several relations are usually accessible, the logical operations of negation and disjunction are excluded in principle from NN-compound interpretation. An NN-compound will never be interpreted as 'an A that is not B' or 'something that is either A or B'. Presumably the reason is the exclusive character of such interpretations. The underlying concepts are not linked but declared not to be combinable. This would lead to arbitrariness in compounding; restrictions on compounding would disappear.
Novel NN-compounds are not ambiguous in principle, however. Museumsbuch and Büchermuseum are compounds with several possible interpretations in isolation. But Bücherfan ('book fan') or Schrankhändler ('cupboard trader') can hardly be interpreted in any other way than as 'fan of books' and 'trader of cupboards'. These compounds are not ambiguous. Their meaning can only be changed in marked contexts. Although unambiguous novel NN-compounds are quite rare, the distinction between ambiguous and unambiguous novel NN-compounds must be explained by a semantic theory of novel noun-noun compounds.

In summary, the dynamics of compound meaning are based on the variability of constituent meaning and the number of relations possible within the compound. Therefore the first requirement a semantic theory of novel noun-noun compounds has to meet can be stated as follows:

Novel noun-noun compounds are subject to variability in their meaning. The patterns of variability and the acceptability of possible relations have to be explained by a semantic theory of novel noun-noun compounds in view of the semantic information belonging to the lexical entries and the concepts denoted by the constituents.
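To make the first requirement more tangible, the readings listed in this section can be written down as data. The following Python sketch is purely illustrative and is not the representation developed in this work: the class `Reading`, the table `READINGS` and both functions are hypothetical names, and the head facet given for Bücherfan is an assumption. Each reading pairs a relation with the facets of head and modifier it selects; salient readings are listed first, and a compound counts as ambiguous in isolation if it has more than one reading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One possible reading of a compound in isolation (hypothetical structure)."""
    relation: str
    head_facet: str
    modifier_facets: tuple
    salient: bool = False


# Readings taken from the paraphrases given in the text.
READINGS = {
    "Museumsbuch": (
        Reading("located in", "physical object", ("building",)),
        Reading("about", "book information", ("building", "institution"), salient=True),
        Reading("published by", "physical object", ("institution",)),
        Reading("cover shows", "physical object", ("picture of a building",)),
    ),
    "Büchermuseum": (
        Reading("exhibiting", "whole conceptual complex", ("physical objects",), salient=True),
        Reading("buying", "institution", ("physical objects",)),
        Reading("containing", "building", ("physical objects",)),
        Reading("publishing", "institution", ("physical object", "book information")),
    ),
    # The head facet "person" is an assumption made for this sketch.
    "Bücherfan": (Reading("fan of", "person", ("physical objects",)),),
}


def readings_in_isolation(compound):
    """Return the readings of a compound, salient ones first (stable order)."""
    return sorted(READINGS.get(compound, ()), key=lambda r: not r.salient)


def is_ambiguous(compound):
    """A compound is ambiguous in isolation if it has more than one reading."""
    return len(READINGS.get(compound, ())) > 1
```

A flat table like this only records the observations; it does not explain why 'about' is salient for Museumsbuch or why Bücherfan has a single reading, which is exactly what the requirement asks a semantic theory to do.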
1.2 Knowledge Dependence of Compound Meaning
For most novel NN-compounds the utterance meaning is determined mainly by a relation deduced from conceptual knowledge. Without this assumption the relation between the constituents often cannot be inferred, so access to conceptual knowledge is a necessary condition for interpretation. Compounds like Schreibtischtasse ('desk cup'), Schneepfeiler ('snow pillar'), Schneestab ('snow bar'), and Wasserauto ('water car') require knowledge about the objects and masses, respectively, that are denoted by the constituents. A 'desk cup' can be a cup located on a desk, since desks have a horizontal tabletop and cups can be placed on tables. A snow pillar can be a pillar made of snow, since snow can be shaped to a certain degree. However, a made-of relation is excluded for snow bar, since bars are too thin to be formed out of snow; this compound can instead be interpreted as a bar located on the surface of snow. Finally, a water car can be a car that floats on water or a car transporting water. These compounds can get other meanings as well, but the paraphrases show that conceptual knowledge is involved in compound interpretation. So a reasonable distinction between semantic knowledge and conceptual knowledge has to be made, and it has to be explained how both types of knowledge interact in compound interpretation.
Hobbs (1989) claims that there is no useful distinction between semantic knowledge and conceptual knowledge because both kinds of knowledge are used in language comprehension as well as in language generation. This argument could be transferred to compound interpretation as well. Most novel compounds can only be interpreted using knowledge of the properties of the objects denoted by the constituents. Exceptions are compounds with a relational or derived head, e.g. Blumenfan ('flower fan') or Blumenmaler ('flower painter'). If the semantic restriction of one of the head's internal arguments is compatible with the modifier, the argument will be satisfied by the modifier. So the internal argument of the head can be applied to the modifier, but it does not have to be. This means that, although the head is a relational or derived noun, object knowledge is still necessary for compound interpretation, since the modifier must meet selectional restrictions. Indeed there seems to be no clear-cut boundary between semantic information and conceptual information. Denying a distinction between semantic knowledge and world knowledge, however, makes it hard to explain the difference between lexical meaning and utterance meaning. Semantic knowledge is a necessary part of compound meaning, since it is the basis for the interpretation of lexical items.
Conceptual knowledge for compound interpretation should be located on particular levels, namely levels provided by prototypical properties. For each assumed domain, subdomains formed by certain properties should be identifiable. These prototypical properties of domains are relevant for compound interpretation. For example, the domain of substances might be divided into five subdomains which are the cornerstones for a theory of substance properties: liquids, pastes, powders, solids and gases. Each subdomain has its own inherent characteristics, provided by common features of the molecular structure. So we know that liquids cannot provide a place for physical objects above a certain specific weight. They also take the shape of the inner space of their container, they flow and cannot be formed, and so on. Knowledge of this kind influences the interpretation of a novel NN-compound with a modifier denoting a liquid.
Thus, Wassertisch ('water table') cannot be interpreted as a 'table that floats on water', since tables are not objects that float on water, and Wasserstatue ('water statue') cannot be interpreted as a 'statue made of water', because water cannot be formed. Powder, on the other hand, is typically characterized by granularity, formability, and localizability in a container object or on a horizontal surface. So compounds such as Mehltisch ('flour table') or Zuckerhaufen ('sugar heap') will most probably be interpreted as a 'table with flour on it' and a 'heap of sugar', respectively. Powder can also be adhesive, but this is not a prototypical property: stickiness is not shared by most powders and is therefore not important in compounds with a modifier denoting a powder. Thus, I assume that interpreting novel compounds is based mainly on prototypical features of objects and of certain domains.
But domain-specific knowledge is not all that is necessary for compound interpretation; knowledge of particular entities plays a significant role as well. For example, interpreting compounds such as Rheinschiff ('Rhine ship'), Bananenschiff ('banana ship'), Dieselschiff ('diesel ship'), or Stahlschiff ('steel ship') requires many properties connected with the concept 'ship'. It must be known that ships transport goods and float on water. Nowadays they are mainly driven by engines that need fuel, whereas earlier sails and/or oars were used. Ships are made of solids, need a crew, and so on. This special knowledge of ships is related directly to knowledge of space and time, of marketing, ports, and docks, or of leisure, sports, and so on. Human knowledge seems to be extremely interwoven, and every concept property can be necessary for interpreting a particular novel compound. This becomes obvious when
interpreting compounds in discourses on expert domains where very specific concepts are developed. We can derive from the considerations above the second requirement for a semantic theory of novel noun-noun compounds: A semantic theory of novel noun-noun compounds has to explain which prototypical concept properties are involved in forming NN-compound extensions and why certain concept combinations are impossible as NN-compound extensions.
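The way prototypical properties admit or exclude relations in the examples of this section can be pictured with a small sketch. It is only an illustration under strong assumptions: the property names (`formable`, `supports_objects`, `too_thin_to_form`) and the two rules are hypothetical simplifications of the snow and water examples, not a representation proposed in this work.

```python
# Illustrative sketch only: the property names and the two rules below are
# assumptions distilled from the examples in the text, not part of the theory.

# Prototypical properties of substances (modifier concepts).
SUBSTANCES = {
    "Wasser": {"formable": False, "supports_objects": False},  # liquid
    "Schnee": {"formable": True, "supports_objects": True},    # formable to a degree
}

# Prototypical properties of objects (head concepts).
OBJECTS = {
    "Statue": {"too_thin_to_form": False},
    "Pfeiler": {"too_thin_to_form": False},
    "Stab": {"too_thin_to_form": True},
}


def admissible_relations(modifier: str, head: str) -> list[str]:
    """Return the relations that the prototypical properties do not rule out."""
    substance = SUBSTANCES[modifier]
    obj = OBJECTS[head]
    relations = []
    # 'made-of' needs a formable substance and a head that can be formed from it.
    if substance["formable"] and not obj["too_thin_to_form"]:
        relations.append("made-of")
    # 'located-on' needs a substance whose surface can carry an object.
    if substance["supports_objects"]:
        relations.append("located-on")
    return relations


for modifier, head in [("Schnee", "Pfeiler"), ("Schnee", "Stab"), ("Wasser", "Statue")]:
    print(modifier + head.lower(), admissible_relations(modifier, head))
```

In this toy setting the made-of relation survives for Schneepfeiler but is filtered out for Schneestab and Wasserstatue, in line with the judgements given above. A table this small obviously cannot capture how interwoven the relevant knowledge is.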
1.3
Discourse Dependence of Compound Meaning
Representing lexical meaning and having access to general knowledge is only part of the story of interpreting novel NN-compounds. Novel NN-compounds appear in discourses and have a particular utterance meaning. To interpret them, it must therefore be known in which context which relation is determined in the conceptual basis or created in the discourse. The ambiguity of compounds is based on the existence of various conceptual relations. But within a certain discourse, the ambiguity of a novel compound disappears or is irrelevant. Two examples of novel compounds in a discourse will show the difference between both interpretation strategies. The first one is taken from a gloss on the former Berlin Wall in the newspaper Die Zeit, the second one is from an article on musicals in the magazine Der Spiegel: (a) ...seitdem die Mauer in Stücke ging, gibt es in West-Berlin die Mauertrauer.. (Since the Wall broke down, there is the 'Wall mourning' in West-Berlin) In isolation, this compound does not have a preferred meaning. It can mean at least 'mourning about the Wall' and 'mourning about events related to the Wall'4. The inferences drawn from the previous cotext rule out the second interpretation, since the Wall does not exist anymore. The context also states the meaning of the compound more precisely. The gloss describes the special life that was possible in West-Berlin under the legal requirements of the Allies and the changes of life-style due to the demolition of the Wall. So 'mourning about the Wall' can mean 'mourning about the existence of the Wall', 'about the people shot at the Wall', etc., but here the precise meaning is 'mourning about the demolition of the Wall'. Thus, in this case the context supports the creation of a precise utterance meaning of a novel compound.
Contrary to this example, in the article on musicals no precise meaning of the compound Kinder-Musical ('children musical') seems to be necessary: (b) Tibor Rudas machte Walt Disneys "Schneewittchen" zu einem 18 Jahre lang bestsellernden Kinder-Musical. (Tibor Rudas turned Walt Disney's "Snow White" into a 'children musical' that was a best-seller for 18 years)
4 The deverbal noun Trauer is derived from trauern. The verb has two meanings: mourning for animates ('trauern um') and mourning about events ('trauern über'), as the prepositions indicate. Both readings are inherited by the noun; therefore the compound gets at least the interpretations given above. I thank Antje Roßdeutscher for this advice.
The compound is ambiguous. The preferred interpretation is that of a musical for children, because it is known that "Snow White" is a fairytale and Walt Disney is known as a producer of movies and stories for children. But the discourse does not exclude a meaning like 'a musical made by children', or even 'a musical made by children for children'. Thus, context can play an important role in novel compound understanding but it does not have to. The principles of discourse understanding and the determination of discourse interpretation based on these principles have to be related to interpretation strategies for novel compounds to get insights into the interface between both areas. This leads to the third requirement for a theory of compound semantics: A semantic theory of novel noun-noun compounds has to explain what the contextual factors for compound interpretation are and why a novel NN-compound gets a specified meaning depending on the discourse.
1.4
Underdeterminacy of Compound Meaning
Three sources of information on the utterance meaning of a compound have been named: semantic knowledge, world knowledge, and discourse knowledge. They are related to each other and determine the utterance meaning. This could lead to the idea that every compound could get a very specific interpretation. But this is not the case; instead there are two kinds of possible underspecification. The first one is related to the generality versus specificity of the relation in a novel compound. The second one is concerned with quantification of the modifier: it often gets a generic interpretation. It is sometimes argued that all relations possible in noun-noun compounds can be clustered in classes called conceptual categories (e.g. Levi (1978)). These classes might reflect general principles of concept combination such that all(!) compound meanings can be derived from these classes. However, until now all attempts to establish a satisfactory classification have failed. If the classification is too coarse, the compounds are still ambiguous because abstract categories do not provide enough information for a sufficient classification. But if a fine-grained classification based on a corpus is developed, there are always compounds that do not fit into this system. This is admitted by many descriptive linguists who develop compound classifications5. So how precisely can a compound meaning be indicated if no exhaustive classification can be determined? Many compounds can be interpreted precisely due to the concept knowledge involved. For example, plastic car can be interpreted approximately as 'a car made of plastic'. But this natural-language paraphrase, although using an explicit relation, is not precise enough: a 'car made of plastic' normally is a car with a bodywork of plastic.
The necessity for specialization very often holds for the 'made-of' relation: Holzhammer ('wood hammer') is a hammer with a wooden stick; Latexschuh ('latex shoe') is a shoe with a latex sole; Ledersitz ('leather seat') is a seat with a shell made of leather. However, while the made-of relation involves specializations of this kind, there are relations that could be specialized further but are not applied to NN-compounds in specialized forms. It is also possible that relation specialization involves specializations for its arguments. For instance, an onion knife probably is a knife for cutting onions. But this paraphrase can be specified to 'a knife for cutting onion rings', 'a knife for cutting whole onions or onion cubes' and so on (ad absurdum). The cut-relation can also be specified by different modifiers, since different masses need different kinds of cutting. Thus, Brotmesser ('bread knife'), Käsemesser ('cheese knife'), and Papiermesser ('paper knife') are characterized by different kinds of cutting. This distinction is not relevant for assigning the compounds to a certain class.
5 See Ryder (1990) for a discussion of various classification systems and criticisms of these attempts.
We are confronted with two problems in naming a suitable relation for a compound: it is known that a relation exists, but it is not known how many relations exist at a particular level and how much further we should specialize a relation:
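[Figure: diagram of relations R, labelled 'How many?' (relations at a particular level) and 'How much further?' (specialization of a relation)]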
The problem of finding a relation that provides sufficient information for compound interpretation is a specialization of a well-known problem in knowledge representation: all assumed classes can only be approximations to the richness of conceptual structures and therefore they cannot exhaust all possibilities of relationship. Compound interpretation needs access to knowledge. Representing knowledge is always a specific task for one general reason: there is no context-independent classification since concepts as classificatory devices are based on partial experience. This becomes clear in knowledge representation for a knowledge-based system: the structure is determined by the specific domain. Therefore all attempts for an exhaustive classification of compounds by using their internal relations have to fail. Classifying compounds is strongly related to knowledge representation and this can only be an approximation to the complex conceptual structure of the mind. Surely, there are main classes of relations, based on prototypical functions, spatial relations, and other basic distinctions. The use of prototypical properties fully specifies NN-compound denotations. But we cannot expect to give the full range of possible relations for novel compounds, just for the reasons above.
The second point in underdeterminacy of compound meaning is related to the quantification of the modifier. In isolation, many NN-compounds tend towards a generic quantification of the modifier. The generic meaning allows exceptions. For example, Blumenfan ('flower fan') denotes the class of entities related to a set of flowers by the fan-of relation. So in 'Peter is a flower fan' Peter is not a fan of all existing flowers in the world but of flowers in general. The same is true for compounds with other underlying relations, such as Tischsäge ('table saw'), Theaterstuhl ('theatre chair'), Skulpturenausstellung ('sculpture exhibition') or Direktorenversammlung ('director meeting'). In all cases the modifier denotes an unspecified number, greater than one, of representatives of a kind. Compounds with a proper name as a modifier do not get a generic interpretation for the modifying noun. So an SPD-Fan is someone who stands in a fan-of relation to the party 'SPD'. A Münchentourist ('Munich tourist') probably is a tourist who visits Munich. The same also holds if there is no problem with quantification: an Institutsdirektor ('institute director') is the head of a certain institute. It has to be pointed out what this generic interpretation means and when quantification can be stated more precisely. So the fourth requirement for a semantic theory of novel noun-noun compounds can be stated as follows: Within an adequate semantic theory of novel NN-compounds, plausible relations based on prototypical concept properties must be computed for a compound. This theory also has to explain why the modifier tends to generic quantification and what genericity means.
1.5
The Choice of the Representational Language
The last point does not concern the semantics of novel noun-noun compounds but the choice of the meta-language for such a semantic theory, which has to deal with intensional and ontological aspects. The former is one of the main problems of semantics, while the latter arises from the use of the logical language. Both are problematic not only for clauses or discourses but for words as well. Intensionality was introduced into semantics by Gottlob Frege to explain why coreferential expressions can differ in their informativeness. He introduced the concepts 'reference' and 'sense': expressions can have the same reference but a different sense (e.g. morning star and evening star differ in sense but denote the same object). In intensional logics intensions are handled as functions from sets of indices to extensions. For example, within the coordinate approach, Lewis (1974) mentions some indices or coordinates for intensional semantics: a possible-world coordinate, time coordinate, place coordinate, speaker coordinate, hearer coordinate, previous-discourse coordinate, indicated-objects coordinate, prominent-objects coordinate, causal-history-of-acquisition-of-names coordinate, and a delineation coordinate (for vagueness of adjectives). It is obvious that this set of coordinates is not complete for a general intensional treatment of meaning. After all, context is such a wide field of different factors that it should not be treated by an approach working with indices but by one more closely related to information inferable from lexical and situational knowledge. As Barwise (1989:89) puts it: "If there is one thing we have learned in semantics in
the past ten years, it is the enormous and infinitely varied effect context has on the interpretation of utterances. It is now totally implausible to suppose that there is any fixed set of contextual features that can be set up once and for all." A second point concerns the use of an unstructured ontology, consisting of a set of possible worlds and a set of objects. In such an ontology, predicates of the same arity are treated as belonging to the same class, although natural language expresses them in different categories (e.g. schlafen (to sleep) and Tasse (cup) are both one-place predicates; sehen (to see) and Rest (rest) are both two-place predicates). A more realistic semantic theory from a cognitive point of view has to work with a structured ontology supporting inferences, considering context, and enabling deduction6. These objections hold especially for Montague-Grammar. In addition to these challenges, the use of type logic in Montague-Grammar also leads to the problem of computational complexity. The use of variables for every type, hence the use of higher-order logic, is too complex for deduction, so that it cannot be the basis for an implementation or a model of human reasoning. Thus, intensionality should not be handled by indices, and type logic should be avoided for semantics. Instead of indices, a more cognitively oriented approach is used. As already indicated, actual context can be any kind of information. Therefore the ad-hoc manner of representing context by indices should be replaced by a more systematic treatment of contextual elements7. Discourse types determine relevant contextual factors. By definition, novel noun-noun compounds appear in texts. Hence, both the linguistic context and background knowledge give sufficient information for their interpretation. Certain contextual factors, e.g. background knowledge of the producer or the possible influence of gestures, are excluded as factors for interpretation.
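For orientation, the index-based treatment of intension that is rejected here can be pictured as follows. This is a minimal sketch under stated assumptions: the `Index` type keeps only two of the coordinates mentioned above, and the worlds `actual` and `w1` together with their extensions are invented for illustration.

```python
from typing import Callable, NamedTuple


class Index(NamedTuple):
    """A deliberately tiny index: only two of the coordinates listed above."""
    world: str
    time: str


# An intension maps an index to an extension (here simply a name).
Intension = Callable[[Index], str]


def make_intension(extension_by_world: dict[str, str]) -> Intension:
    """Build an intension that looks up the extension for the index's world."""
    return lambda index: extension_by_world[index.world]


# Hypothetical worlds 'actual' and 'w1'; the extensions in w1 are invented.
morning_star = make_intension({"actual": "Venus", "w1": "Mars"})
evening_star = make_intension({"actual": "Venus", "w1": "Venus"})

actual = Index(world="actual", time="t0")
other = Index(world="w1", time="t0")

# Same reference at the actual index, but different functions overall.
same_reference = morning_star(actual) == evening_star(actual)
same_sense = all(morning_star(i) == evening_star(i) for i in (actual, other))
print(same_reference, same_sense)
```

The sketch also makes the objection concrete: whatever is not a field of `Index` cannot influence the extension, so every further contextual factor would require extending the tuple by hand.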
The domain the text is about has to be represented because background information is always implicitly available in language comprehension in general. Novel compounds in most cases are interpretable only with the help of world knowledge. I will support the hypothesis of the unity of context and content (Hans Kamp). This means, represented knowledge of a previous text is the background not only for a representation of a current sentence but also of a current word. Previous knowledge does not only include representation of the cotext but inferred knowledge as well. Although this is a modified and enlarged version of what Kamp might have in mind, it does not contradict the original idea of interpreting discourses with the help of contextual relations to elements of a previous text. From now on I will use the concepts 'cotext' and 'context' interchangeably. As an additional requirement we can state that a semantic 6
See also Pinkal (1989) for this topic.
7 The general discontent with a representation of context by conceptually less satisfying parameters leads e.g. to a radically new treatment of context in situation semantics (Barwise & Perry (1983)), where situations are interpreted as small portions of the world. Barwise (1989) argues that the set of contextual elements or indices just represents a situation. Situations are seen as individual elements that stand in some relations to other elements (like ordinary objects). Contextual reference means involving all information that is associated with an utterance. But how this works and interacts with background knowledge in situation semantics is not yet quite clear.
theory should work with a structured ontology that enables inferences and allows deduction. From a cognitive point of view, intensional aspects should be handled by ontological differentiations and inferences, not by intensional type logic. The logical language I will use exhibits these features; it will follow the language used in Discourse Representation Theory (DRT).
1.6
Discussion and Conclusion
I have pointed out diverse problems that are all related to compound interpretation and that have to be handled by a semantic theory of novel NN-compounds. Like other words, NN-compounds are context-sensitive with respect to their reference. Their constituents do not denote single concepts but families of concepts that are related to each other. Contextual information selects the relevant concept. The connection between the concepts of the constituents is established by a relation that in most cases is based on world knowledge. Based on the observation that NN-compounds need a prototypically induced relation rather than very specialized ones, it was argued that prototypical features should be represented within a knowledge base. On the other hand, the representation of prototypical features does not guarantee every possible interpretation, since knowledge representation involves an approximation of conceptual structures. Finally, a close look has to be taken at the relationship between discourse semantics and compounds in order to determine which inferences rule out possible but irrelevant relations, or even create a relation. Taking all requirements for a semantic theory of novel NN-compounds together, one gets a certain pattern of what utterance meanings of novel NN-compounds should look like. The constituents are subject to meaning variability. Isolated NN-compounds get possible meanings by a set of fitting relations, but in context the relational ambiguity of the compounds disappears. Additionally, the relation chosen can determine specific extensions of the constituents, but this is not a necessary corollary. So utterance meanings of novel noun-noun compounds are fixed by two factors: Utterance meanings of novel NN-compounds consist of the contextually specified extensions of their constituents and a contextually relevant relation. The two parts are not independent of each other but interact in a systematic way.
Developing a theory that meets the given requirements and corroborates the statement above is the aim of this work.
2
Previous Work on Word Semantics and Compounding
The requirements for a semantic theory of novel NN-compounds set out in the previous chapter will be the basis for an assessment of existing theories of word semantics and of compound interpretation, respectively. The structure and meaning of compounds are investigated in virtually all fields dealing with natural language, namely linguistics, psycholinguistics, and language-oriented AI research. But the focus of attention differs considerably between the fields. The phenomenon of unpredictability of compound meaning is the core of all semantic theories in linguistics as well as all conceptual models proposed in psycholinguistics. Researchers working on natural language comprehension systems in Artificial Intelligence try to determine rules which use a knowledge base in order to enable a system to compute the underlying meaning of compounds. Within this field, compounding is regarded principally as a conceptual problem. In this chapter, the 'state of the art' in all three fields will be discussed, showing the problems of prevailing approaches and judging them with respect to the requirements given above. I will start with general approaches to word meaning and then turn to the problem of compound interpretation.
2.1
Theories of Word Meaning
The relationship between semantic information in the lexicon and contextual variation of word meaning is explained in the literature in two principal ways. Either the semantic component of a lexical entry is a representation of default knowledge, such that context is able to overwrite the information, or the semantic component contains context-independent information, such that context is able to specialize the meaning of that word. The former direction and its consequences for compound semantics are discussed in chapters 2.1.1 and 2.1.2, the latter is discussed in chapter 2.1.3. Two kinds of default representations are proposed. They differ with respect to the status of this representation. It can have the status of a preferred meaning, such that non-monotonic inferences cause changes of that meaning, or it represents a lexical stereotype which does not reflect a preferred meaning. The former idea is proposed by Franks and Braisby (1990) and Braisby (1989). They assume a default representation as semantic information in the lexicon that can be overwritten by contextual information. The latter idea is proposed by Bosch (1985) and Bosch (1988).
2.1.1
Default Information as a Basis for Dynamics of Word Meaning
As pointed out in Braisby (1989), any theory of word meaning has to take into consideration that the meaning has to be represented economically and that categories cohere. A further point is the existence of central tokens and peripheral ones as suggested in
prototype theory. The last issue is the context-dependent character of word meaning. Therefore, all meanings of a word cannot be represented in one data structure or single lexical representation and we have to distinguish between core meanings and peripheral ones. In Braisby (1989); Myers, Franks, Braisby (1989), and Franks and Braisby (1990) the distinction between lexical concepts - this is the semantic contribution of a lexical entry - and the different specialized meanings -they call it 'generated sense'- is discussed. According to Myers, Franks and Braisby (1989) only one general lexical entry for all possible meanings of a word seems to be implausible, because on the one hand meanings can be very specific and on the other hand also indeterminate. Extending the lexical concept generates the sense. The lexical concept is a partial description of entities the word describes. It contains the central properties of the entity denoted. The semantic information is stored for each item in the lexicon. But any of these properties can, in principle, be defeated in a generated sense. This is the non-monotonic aspect of their theory: the generated sense can have properties that contradict those of the lexical concept. The generated word meanings define the so-called 'perspectives', which are constrained by situational factors, "local context and the informational requirements of the agents involved." (Myers et al.:10) Now there are two views of lexical representation. The first view has been developed by Braisby (1989): word meaning is presented in feature-value pairs. The core meaning or lexical concept is represented in a 'WORM', which is a combination of feature-value pairs. The different peripheral meanings are related to the core meanings in a particular way. They are represented by combinations of WORMs, so-called COWORMs. They capture theories underlying word uses and are constructed in context. So context-sensitivity of word meaning is pointed out. 
For example, the word meaning of lion is represented by a feature-value set P1: WORM(LION, P1). Now the word meaning of stone lion is given by the combination of the WORMs for lion and statue (WORM(STATUE, P1, P2)): this relates two kinds of data structures. This leads to a COWORM where some values of the WORM of lion (e.g. 'animate') are changed. The meaning of a word is described by different data structures on different occasions of use. Another view of concept combination is supported by defeating some properties of the head concept in a head-modifier relationship. In this view, central-essence properties are assumed to be a subset of all properties. They are related to the ontological or functional essence of an entity. Furthermore, the 'perspective' is defined as a pair, consisting of the categorized sortal noun and the properties suitable for categorization. A phrase that describes an entity leads to two perspectives: either the categorized entity has the central-essence properties of the categorized sortal noun, or no central-essence properties are incorporated and the categorization is based on the narrow range of properties connected with appearance. An example is fake gun. The central-essence properties of gun are 'internal mechanism for propelling bullets', 'colour and weight of metal', 'barrel for directing bullets' and so forth. These properties are stored in the lexical concept. The lexical concept of fake will defeat all these properties and some appearance properties will be assigned. But the exact kind of property cannot be specified since the concept is not
precise enough; a fake gun might be a toy gun, replica gun, or model gun. A perspective without central-essence properties is called a Type II perspective, in contrast to Type I perspectives, which contain these properties. The mechanisms of changeable extension and of evaluating properties with respect to a different noun concept capture the semantic flexibility and specificity of a wide range of combinations in noun phrases. While these approaches very clearly show the dynamics of word meaning, there is a problem with the procedure of sense generation. Let us take an example. Franks and Braisby (1990) claim that a shift of the meaning of newspaper from the physical object to the institution that produces this object can be explained by generating the institution sense from the underlying lexical concept. But these are two different entities from different domains, although they are related to each other. Defeating properties and adding new ones to the information structure cannot explain this relationship, since both entities are characterized by different features. The 'producing' relation between newspaper as an information mediator and the institution reading is responsible for the shift of meaning; other features are not considered. Consequently, it cannot be concluded from the article why a lexical item gets a particular sense and not a different one, i.e. the interaction of contextual influence and world knowledge with a lexical representation is not spelled out. The proposal also runs into difficulties as an approach to compounding, even though the authors use, among others, compounds as examples of concept combination. Stone lion is paraphrased as 'a lion made of stone'. In Braisby's representation the relation between the constituents is not given clearly. But it is mainly the relation that determines the meaning of the compound. Our world knowledge determines what kind of relations can be correct for a compound.
If we consider a non-lexicalized compound like water lion, Braisby is not able to predict that it is clearly not a lion made of water (whatever it could be) because he cannot use a WORM for statue which explains the meaning of stone lion. This means, there is an inference from the representation of the compound extension to 'statue' and not the other way round. Furthermore, the interpretation of compounds is a knowledge dependent process and Braisby can only use the features included in the word meaning, no extraordinary knowledge or inference is drawn from inherent properties. But Braisby's approach elegantly shows the effect of concept combination by means of the properties of lexical concepts and hence the way of deriving different word meanings. The final point concerns the distinction between central-essence and other properties. This distinction is drawn intuitively, there is no exact definition for them. Therefore it is not possible to distinguish between both types of perspectives in every case. A plastic rose can be a rose with respect to appearance (type II) or a rose with all properties of this flower (type I), although this seems to be not very plausible for me. But what is a plastic car if the compound would be used for denoting a car with a motor made of special kinds of synthetic material? It does not have all properties of normal cars. The phrase could also be used for referring to cars that need synthetic material as fuel (e.g. in a science fiction story). Again probably a central property is violated, but we are not referring to an entity that appears as a prototypical car. So it is problematic to distinguish between different sorts of properties without having grounds for that.
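The defeasible character of the lexical concept, and the objection raised above, can be pictured with a small sketch. It is only an illustration: plain dictionaries stand in for feature-value structures, the feature names and the helper `generate_sense` are hypothetical, and nothing here reproduces the actual WORM/COWORM formalism.

```python
# Illustrative sketch only: plain dictionaries stand in for feature-value
# structures; this is not Braisby's WORM/COWORM formalism.

# Assumed default (defeasible) properties of the lexical concept 'lion'.
LION = {"animate": True, "shape": "lion", "material": "flesh"}


def generate_sense(lexical_concept: dict, overrides: dict) -> dict:
    """Copy the lexical concept and let the overrides defeat its defaults."""
    sense = dict(lexical_concept)
    sense.update(overrides)
    return sense


# 'stone lion': the default value of 'animate' is defeated.
stone_lion = generate_sense(LION, {"animate": False, "material": "stone"})

# Nothing in the mechanism itself blocks an implausible combination:
water_lion = generate_sense(LION, {"animate": False, "material": "water"})

print(stone_lion)
print(water_lion)
```

Overriding defaults yields a sense for stone lion, but the same operation succeeds just as readily for water lion. Which overrides are admissible has to come from knowledge outside the feature structure, which is the point of the criticism.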
2.1.2
Lexical Stereotypes as a Basis for Dynamics of Word Meaning
The second approach working with default information is proposed by Bosch (1985, 1988). Bosch's proposal is characterized by the lack of a core meaning or preferred meaning forming the basis for derived and specialized meanings. According to him, the problem of context-dependent reference makes the number of meanings of a word potentially infinite. The main idea of his proposal is to replace the context-independent 'concepts' as a classification mechanism by context-dependent 'contextual notions'. These are context-dependent constructions which contain the result of an interaction of the semantic information of the current lexical item with other items occurring in the clause. Furthermore, situational knowledge, general world knowledge and knowledge of the preceding text are involved. The reason for the replacement is the lack of identifiability. Bosch demonstrates this by contrasting an object with other ones that differ from it in various properties, so that the object is named (hence classified) in different ways. Contextual notions are thus always single representations, constructed anew for each context. Since there is an infinite number of contexts, there is also an infinite number of contextual notions. The information stored in the lexicon is called a 'lexical stereotype'; this is a set of properties learned by experience. Bosch regards all properties related to a lexical item as default ones. So in principle every property can be overwritten by further information; there is no analytic truth. This is the common feature of Bosch's proposal and the approach reviewed above. However, word meaning is "the contribution the word makes toward the inferences that can be drawn from the sentences in which the word occurs in a particular context of use" (Bosch 1988:62). Thus, word meaning is a contextual notion, and therefore a context-dependent representation.
To be more concrete, let us have a short look at the following event: A speaker at a conference on nuclear physics is giving a talk. He promises to hand out a paper and then distributes a paper with recipes. If we do not take into consideration the relationship between the mass noun 'paper' and countable 'papers', the semantic specification in the lexical entry of paper will specify what a stereotypical paper is. These might be properties such as 'having text on it', 'having a rectangular form', 'being thin', and 'being writeable on'. But what is the contextual notion of 'paper' with regard to this discourse? It is a representation that differs from the stereotypical one, though it is related to it, and that is built stepwise during parsing. The context of being a speaker at a conference on nuclear physics leads to a certain interpretation of 'paper' in the clause: it is expected that an abstract of the talk is distributed to the audience. Presumably the talk is related to the topic of the conference. But this expectation is violated by the act of handing out recipes. Thus, in this example the contextual notion of paper is based on the conference context plus the knowledge of the stereotypical behaviour of speakers at conferences. Bosch's approach is based on the idea of creating, in principle, an indefinite number of word meanings (= contextual notions). Within each context - which is a single event - a
new word meaning exists for a lexical item. Although his argumentation on the interaction of different knowledge types resulting in a specialized, contextually relevant meaning is cogent, there are some problems. The system seems to be too dynamic; restrictions are necessary. If a lexical stereotype is based on experience, it could be changed each time the item is used in a discourse. But it is not clear at which point contextual information gets such a weight that it induces a change of properties. So what is stored as relevant information and what is judged as irrelevant? Furthermore, within this theory it is possible that a lexical item gets a completely different lexical meaning after enough relevant usages within discourses, because every property can be changed; there is no stable semantic information in the lexicon. So a word can actually be used for denoting everything; there is an infinite number of word meanings after all. The conclusion is: Bosch's proposal has to be restricted in order to avoid an unrestricted creation of meaning that leads to unwarranted arbitrariness in meaning definition. Another problem arises if the proposal is applied to novel noun-noun compounds. Object knowledge is assumed to be the main factor that determines a relation for the compound. But the actual context still plays a role. However, leaving unanswered for the moment the question of the number of possible relations for a compound, there are certainly standard interpretations for each compound. So Fensterfabrik (window factory) will be interpreted most naturally as 'factory (as the whole conceptual complex) that produces windows'. In most discourses this compound will be used with this relation because the stereotypical relation of producing is a salient one. There may be variations for the constituents, but the relation is not affected by them. For example, in (1) and (2) respectively, the head gets a building-reading (1), or it is interpreted as an institution (2).
Nevertheless, the relevant relation is 'producing'.
1. Gestern besichtigte der Bürgermeister von Astadt die neue Fensterfabrik. (Yesterday the mayor of Atown had a look round the new 'window factory'.)
2. Gestern stimmte der Bürgermeister von Astadt für die Subventionierung der neuen Fensterfabrik. (Yesterday the mayor of Atown voted for subsidizing the new 'window factory'.)
The relatively static behaviour of the relation, in contrast to the dynamic meaning shift of the constituents, can hardly be explained with 'contextual notions'. However, the interaction of various knowledge sources in arriving at a specialized meaning in a certain context seems in principle to be a useful working hypothesis for novel NN-compounds as well.
2.1.3
The Two-Level-Approach for Semantics
Contrary to proposals that suggest a treatment of the dynamics of word meaning by changes of default information, one approach has been developed that presupposes the existence of a core meaning as semantic information in the lexicon (e.g. Bierwisch (1983), Bierwisch and Lang (1987)). The existence of internal structures of basic grammatical
elements and their variation in different contexts lead to the assumption of an intermediate level between the logical form (the output of the grammar) and the conceptual structure. This is the level of semantic form SF. The interaction of semantic information and conceptual representations was clearly pointed out for the first time in Bierwisch (1983). He shows intersentential variations of the extension of nouns like theatre, museum, or parliament. In certain contexts they can be used for denoting different concepts that are related to each other. For example, the noun Theater (theatre) can be used as an argument of predicates that can be applied to entities of different domains:
1. Das Theater brannte ab/ wird renoviert (The theatre burned down/ will be renovated)
2. Erich will ans Theater / Erich geht ans Theater (Erich wants to go to the theatre)
3. Das Theater war langweilig/ dauerte drei Stunden (The theatre was boring/ went on for three hours)
4. Das Theater macht eine Tournee durch Belgien (The theatre is touring through Belgium)
5. Das Theater tobte/ jubelte/ lachte (The theatre went wild/ cheered/ laughed)
6. Wir gehen heute abend ins Theater (We are going to the theatre tonight)
7. Ich vergaß im Theater meinen Regenschirm (I forgot my umbrella in the theatre)
8. Das Theater zahlt schlecht/ will keine Gewerkschaftler beschäftigen (The theatre does not pay well/ is not willing to employ unionists)
Theater seems to refer to four main concepts: 'theatre-building', 'theatre-institution', 'theatre-performance', and 'theatre-audience'. But, as (4)-(8) show, these four concepts are either linked to other concepts or have an internal structure. In (4), only the stage performers are touring; in (5), only the audience is laughing.
If we are going into the theatre, we are entering the building, more specifically we are going into the auditorium (the whole phrase ins Theater gehen has a specialized meaning which describes a complex event, and the word Theater just contributes its part to the meaning of the phrase). If someone forgot his umbrella in the theatre, he normally forgot it in the cloakroom. And if the theatre does not pay well, it is the management of the institution that does not pay much. This is not an exhaustive list of all possible denotations, but it shows how the four main concepts are related to other concepts. The context-dependent behaviour of this noun cannot be generalized to all common nouns. While some nouns have a high tendency for meaning shifts, e.g. Buch (book), Oper (opera), Kaffee (coffee) and others, nouns at the other end of the 'meaning shift scale' only have a distinction between a generic and a non-generic reading. Most natural kind terms are examples of this class. This strong variation of nouns is different from the contextual variation of verbs: while the former denote concepts of different domains, the latter are specialized for certain events.
So to cut can be used, among others, as in 'to cut hair', 'to cut grass', or 'to cut paper'. These are all cutting-events which are specialized for the argument. For these reasons Bierwisch argues for two operations that map the semantic form of nouns and verbs, respectively, onto a contextually relevant concept: conceptual shifts for nouns and conceptual differentiations for verbs. These operations are based on the semantic information that represents the semantic part of a lexical entry. This is the invariant, underspecified meaning common to all possible non-metaphorical meanings. Conceptual operations map this information onto fully specified concepts. These operations are given as conceptual schemas which are directly related to knowledge of certain entities. On the other hand, similar variations appear for nouns belonging to the same word field. So some shifts seem to represent more general knowledge structures. The schemas belonging to the school-concept can be represented as:
1. λP λx [BUILDING(x) & P(x)]
2. λP λx [INSTITUTION(x) & P(x)]
3. λP λx [PROCESS(x) & P(x)]
4. λP λx [PRINCIPLE(x) & P(x)]
with 'P' as variable for the semantic form. The selection of the relevant schema is driven by contextual factors. In any case, the systematic variation can hardly be explained without the assumption of a level of semantic form.¹ Other facts support the assumption of SF being distinct from conceptual structures: (i) Proper names are represented at SF as labels for particular male or female persons, but in the conceptual system CS all knowledge about a particular person is stored. This knowledge individuates the person with his/her name and separates him/her from other persons with the same name. (ii) Norm-relatedness is another reason for differentiating between conceptual structures and semantic form. Adjectives applied to nouns often compare the extension of the noun with a conceptually determined norm that is either a standard norm of an entity or related to other entities. For example, elephants are tall compares elephants with the standard height of animals. The scale is determined by human perception. But Clyde is a tall elephant compares the particular elephant Clyde with the normal height of grown-up elephants. Within the semantic form SF, the schema for a comparison is given, and the comparison is carried out within the conceptual system CS. (iii) The final point concerns the difference between knowledge systems and lexical knowledge with respect to modification and flexibility. While knowledge of entities and the relationships between them is in principle flexible - it can be changed by new information - lexical knowledge is more rigid and not necessarily affected by these changes. SF is represented by semantic primitives that are no longer analyzable from a semantic point of view. They are constituents of a lambda-categorial language and constitute the semantic part of lexical items. So functor-argument structures constitute compositionality. Syntactically motivated thematic roles are represented as sequences of lambda-operators binding variables in the semantic form. The advantage of semantic decomposition over other kinds of representation is that it explains the semantic structure of the lexicon. The complex structure is defined by the primitives lexical units share with each other. For example, the semantic representation of aufwachen (to wake up) can be described roughly by using the primitives CHANGE, TO, SLEEP, and NOT in a certain configuration, provided by the categories the primes belong to (category indices: 0/1 for SLEEP, 0/0 for NOT):
λx [CHANGE (SLEEP x) TO (NOT (SLEEP x))]
This semantic description shows the direct connection between aufwachen and schlafen (to sleep). A lexicon in which the semantic part of the lexical entries is built from semantic primitives needs no explicitly given semantic relations which explain the semantic structure, as is done e.g. by meaning postulates.² The semantic structure is explained by the lexical entries themselves. So renouncing semantic decomposition means renouncing the expression of regularities by the lexical entries. But decomposing the semantic representation requires an answer to the question of what the semantic primitives could mean. They have to be interpreted as well. This means that, without a mapping into conceptual knowledge, the primes are meaningless and only stand in certain relations to other primes via the category they belong to. A second remark is necessary on the status of the primitives within a lexical theory. The assumption of certain primes is intuitive and ad hoc. There is no system for a theory of lexical meaning yet, although there is some work on the lexical decomposition of verbs which suggests operators such as CAUSE or BECOME (e.g. Bierwisch (1989)). The existence of certain primitives should be grounded on empirical results and/or theoretical considerations. What kind of information is provided by the logical form (LF), semantic form (SF), and conceptual structure (CS), and how are they related to each other? LF represents the syntactic structure that is relevant for semantic interpretation. CS represents the intermediate level between different information systems, e.g. the linguistic system, the perceptual system, and the motoric system. It is the central system for cognition. SF represents the grammatically determined structure for semantic interpretation. It differs from LF in three respects: first, it consists of configurations of primes. Second, SF is determined by LF, namely by the assignment of thematic roles (θ-roles) in syntax. Thus, the relation between LF and SF is given by the internal structure of lexical entries and θ-role assignment. Finally, SF is delimited from LF by the kind of configurationality: while SF is determined by functor-argument relationships, LF is determined by X-bar syntax. Therefore concepts such as head-of, government and complement-of are defined on this level but not in the semantic form. SF will be interpreted in CS, i.e. by a mapping from primes to concepts. Usually this is not a one-to-one mapping, but rather a function that maps primes onto configurations or complexes of conceptual units.
1
Jackendoff (1984, 1989) denies the existence of SF. Syntactic structures are directly mapped onto conceptual categories. However, if he tried to explain possible shifts and differentiations, he would run into problems with a direct mapping from syntax into conceptual structures.
2
Alternatives to this approach could be the use of meaning postulates or semantic nets. However, the use of meaning postulates is uneconomical, since a lot of rules in addition to the lexical entries are necessary to define the semantic structure of the lexicon. Another point is the fact that the semantic description of a word is fixed. The dynamic aspects of word meaning cannot be handled in this representation. Meaning postulates are ad-hoc notations; there is no system for explaining relations between word meanings. Semantic nets could be used for representing lexical meaning. Within these nets, words are regarded as nodes in a graph and are defined by links to other nodes. But, as Johnson-Laird et al. (1984) correctly criticize, these networks are 'only connections' that cannot explain why a certain word has a meaning. It is just related to other words, and there is no relation to concepts or the external world. Apart from the lack of reference to extensions, this way of representing word meaning is also inadequate because of its rigid representation. But recent knowledge representation formalisms, although based on the same idea, are terminological models of a domain and therefore different from the approaches criticized above. With these formalisms, a certain domain can be modelled by composing basic expressions into complex ones. The basic expressions are mapped into a domain, so all expressions have a defined meaning. Knowledge representation formalisms based on the representation formalism KL-ONE will be discussed in chapter four.
Evidence for distinguishing between semantic form and conceptual structures is given above. So SF is autonomous because the elements and configurations of elements in SF represent the interaction of conceptual structures that is relevant for the meaning of natural-language expressions. Hence SF mediates a mapping of these configurations onto constituents in the syntactic structure (see Bierwisch and Lang 1987:650). What does the semantic form of the constituents provide for the semantics of NN-compounds in a two-level-approach? First of all, the core meaning of the constituents is given. This may also include a relation in the SF of the head noun that is used for compound interpretation. In that case, this relation should lead to a preferred reading of the compound, since world knowledge does not have to be taken into account. So if a relation is salient in many modifier-head combinations with the same head, this can be taken as evidence that the relation is part of the semantic form of the head. Examples might be thief with a 'stealing'-relation or knife with a 'cut'-relation. Such a distinction between SF and CS should explain strong differences between preferred reading(s) and other possible readings of novel compounds. Utterance meanings of compounds are determined by a contextually relevant relation and a contextually relevant reading of the constituents. Since the shifts from lexical meaning to utterance meanings are explained by conceptual operations, there is no arbitrariness in meaning determination as in approaches handling default representations, but well-defined variations.
To sum up, several approaches to the phenomenon of having a more or less stable semantic representation in the lexicon together with contextually induced semantic variations were discussed. The two-level-approach seems to be the most promising candidate as a framework for the semantics of novel NN-compounds. Within this approach, semantic variations can be explained by mappings from semantic primes to conceptual structures. Moreover, patterns of variation can be assumed that extend semantic information by conceptual knowledge. Furthermore, the distinction between semantic form and conceptual structure allows a prediction of whether a relation leads to a salient interpretation of a novel compound. After discussing the advantages and disadvantages of different approaches to word semantics, I come to various approaches to the interpretation of noun-noun compounds. All proposals that will be discussed below deal with the problem of finding a suitable relation. Compounding rules are given such that each compound can be interpreted by at least one rule. Variability of word meaning - the topic of the previous subchapters - is not taken into consideration in these approaches. Therefore they cannot meet the requirement of explaining the variability of compound meaning. But the approaches are based on different assumptions about the status of world knowledge for compound interpretation. There is also some work on contextual influence in compound interpretation. Therefore it will be discussed whether the approaches meet the requirements for a semantic theory of novel NN-compounds other than the requirement concerning variability of word meaning.
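The conceptual shift schemas of the two-level-approach, λP λx [SORT(x) & P(x)], can be read as higher-order functions that take a semantic form P and return a sort-restricted predicate. The following Python sketch is only an illustration under strong assumptions: entities are modelled as plain dictionaries, and the names `make_schema`, `school_sf` and the attribute `purpose` are hypothetical and do not come from Bierwisch's work.

```python
from typing import Callable, Dict

Entity = Dict[str, object]            # toy entity: a bag of attributes (assumption)
Predicate = Callable[[Entity], bool]

def make_schema(sort: str) -> Callable[[Predicate], Predicate]:
    """Return the schema  lambda P lambda x [SORT(x) & P(x)]  for one sort."""
    def schema(p: Predicate) -> Predicate:
        # The shifted concept holds of x iff x is of the given sort and P(x) holds.
        return lambda x: x.get("sort") == sort and p(x)
    return schema

# The four schemas given for the school-concept.
SCHEMAS = {s: make_schema(s)
           for s in ("BUILDING", "INSTITUTION", "PROCESS", "PRINCIPLE")}

def school_sf(x: Entity) -> bool:
    """Hypothetical stand-in for the underspecified semantic form of 'school'."""
    return x.get("purpose") == "teaching"

# Conceptual shifts: one semantic form, several fully specified concepts.
school_building = SCHEMAS["BUILDING"](school_sf)
school_institution = SCHEMAS["INSTITUTION"](school_sf)

some_entity = {"sort": "BUILDING", "purpose": "teaching"}
print(school_building(some_entity), school_institution(some_entity))
```

The design choice mirrors the text: the semantic form stays invariant, and only the schema selected by the context changes the resulting concept.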
2.2
Conceptual Categories for Compounds
Many attempts have been made to classify patterns of compound meanings or to use a fixed set of categories for deriving possible compound meanings, e.g. in Downing (1977), Levi (1978), van Lint (1983). These classes might mirror general principles of concept combination. Within descriptive linguistics, the patterns found in lexicalized or novel compounds are documented (e.g. van Lint (1983)). Within generative treatments, rules for creating compounds from underlying structures by certain operations are given (Lees (1960), Levi (1978)). Levi suggests a set of 'recoverably deletable predicates' (RDPs) as the set of underlying relations for all(!) 'complex nominals' (CNs). This term includes noun-noun compounds as well as [N[N + er]] compounds and AN-phrases with a nonpredicative adjective. All meanings are drawn from this set of relations (p. 76f.):
cause (causative): tear gas, onion tears, drug deaths
have (possessive): picture book, apple cake, lemon peel
make (productive): honeybee, snowball, daisy chain
be (essive/appositional): soldier ant, snowball, cactus plant
use (instrumental): steam iron, nuclear weapon, machine translation
for (purposive): horse doctor, arms budget, picture album
in (locative (temporal or spatial)): morning prayers, field mouse, family problems
about (topic): tax law, love song, linguistic journal
from (source): olive oil, sea breeze, store clothes
'Similarity' is explicitly ruled out by Levi because of its pragmatic nature: only in a given context can it be determined in which way the head concept is similar to the modifier concept. Can the predicates above be the basis for all possible compound meanings? In many cases the categorization alone does not give enough information to determine the meaning of a compound. For example, the 'in'-relation designates a location of the head object with respect to the modifier object. But it is still an underdetermined relation. It can be stated more precisely as a location on, at, in, or near to an object, and some other locative relations are possible. Tischtelefon ('table telephone'), for example, could primarily mean 'telephone located on a table', since a table is a supporter. Still, other location meanings are also possible. Thus, we see: the more abstract a category, the more ambiguous a compound assigned to it can be. It seems to be impossible to eliminate the ambiguity of compounds as long as abstract categories are used. The abstraction from compound relations to categories also leads to some strange clusters of compounds: temporal and spatial locatives are united in one RDP. This leads to a classification in which morning prayers and field mice belong to the same category with regard to their underlying relation. Thus, Levi's set of basic relations is too general and therefore too abstract to be the basis of all compound meanings. Abstract categories cannot really help in interpreting compounds. Rather, a fine-grained analysis of conceptual properties is a promising approach for avoiding ambiguity and for getting interpretations that are based on prototypical properties.
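The underdeterminacy of an abstract category such as Levi's 'in' can be made concrete with a small Python sketch. It assumes, purely for illustration, that the abstract relation is refined into the four locative relations named in the text (on, at, in, near); the function name `locative_readings` and the paraphrase template are hypothetical and not part of Levi's proposal.

```python
from typing import List

# The nine recoverably deletable predicates listed in the text.
RDPS: List[str] = ["cause", "have", "make", "be", "use",
                   "for", "in", "about", "from"]

# Refinements of the abstract 'in' relation mentioned in the text.
# The text notes that still other locative relations are possible.
LOCATIVE_REFINEMENTS: List[str] = ["on", "at", "in", "near"]

def locative_readings(modifier: str, head: str) -> List[str]:
    """Enumerate candidate paraphrases for a compound classified under 'in'."""
    return [f"{head} located {rel} a {modifier}" for rel in LOCATIVE_REFINEMENTS]

# Tischtelefon ('table telephone'): one category, several candidate readings.
readings = locative_readings("table", "telephone")
for r in readings:
    print(r)
```

The sketch only enumerates candidates; choosing 'telephone located on a table' as the primary reading requires the object knowledge that a table is a supporter, which the category itself does not supply.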
2.3
The Interpretation of Nominal Compounds in Montague-Grammar
A treatment of NN-compounds in Montague Grammar is proposed in Fanselow (1981). Synthetic compounds (i.e. nominal compounds with a derived head) are analyzed within the same logical framework in Hoeksema (1985). Since I am interested mainly in the interpretation of compounds which are formed by a combination of two simple nouns, Hoeksema's approach will not be discussed here. Fanselow's proposal is based on an assignment of German nouns to types according to Montague's approach. The nouns are subcategorized by morphosyntactic and other features. A quadruple of features is associated with each noun. Its components are features of the morphological form (M); features of the semantic type (S); an element of the set {0, 1}; and a set of pairs ⟨a, b⟩ with a ∈ {0, 1, 2} and b ∈ N. The features of the morphological form are:
ws: stem
vs: modifier in singular form
vp: modifier in plural form
v: modifier in general
pl: plural
g: genitive
The features of the semantic type are:
i: individual common noun
gr: group common noun
ma: mass noun
en2: proper name (group)
en1: proper name (individual)
x: place holder
The third and fourth components are defined for one- and two-place stereotypical relations, respectively. These relations are part of the denotation of a noun. According to Fanselow, the third component is '0' if there are no one-place stereotypical relations, and it is '1' if these relations are defined. The fourth component is more difficult to explain. For this element Fanselow defines three kinds of stereotypes: stereotypical two-place relations that do not affect an object are marked by the pair ⟨1, {1}⟩; stereotypical two-place relations that result in an object are marked by the pair ⟨1, {2}⟩. If two-place relations have both properties, they are marked by ⟨1, {1, 2}⟩. Stereotypical relations belonging to a certain word are acquired while learning the meaning of that word. As an example, learning the meaning of 'water' involves learning properties such as 'colourless', 'tasteless', 'liquid' and so on. Two-place stereotypical relations of a noun are represented such that they can be applied to the denotation of the second noun in a compound by functional application. For each concatenation of two nouns, a syntactic rule and an interpretation rule are given, as usual in Montague-Grammar. For example, compound formation of two relational nouns which do not bind the same argument is supplied by the 15th rule in Fanselow's system (p. 111): If A is from P [...]
[...] K2, where K1 contains the syntactic structure of S1 and K2 contains the syntactic structure of S2.
(2) If a configuration of the form [[Det N]NP VP]S is being processed, it triggers this rule application: Introduce in U_K a new discourse referent 'x'. Introduce a condition 'N(x)' in Con_K. Substitute the NP by 'x'.
(3) If a configuration of the form [[Pro]NP VP]S or [V [Pro]NP]VP is being processed, it triggers this rule application: Look for an accessible and suitable antecedent 'x'⁷. Introduce in U_K a new discourse referent 'y'. Introduce in Con_K a condition 'y = x'. Substitute the NP by 'y'.
(4) Lexical insertion is defined as substituting a configuration of the form 'N(x)' by 'noun(x)'.
Applying these rules stepwise to the conditional (A) leads to DRSs K_i with reducible DRS-conditions and results finally in a DRS K:
7
Accessibility is defined below. However, while there is a clear definition of accessibility in DRT, it is not quite clear what a suitable referent is. Although agreement seems to be an important factor for determining suitable referents, there are also other factors. The problem of finding suitable referents does not only hold for pronouns but for anaphoric noun phrases in general. This topic is a difficult area still requiring fundamental insights. In chapter six, in the discussion of NN-compound utterance meanings, anaphoric references of the compound's constituents to antecedents will play a crucial role in determining context-dependent meanings. How the antecedent is selected as a suitable referent cannot be explained in this work, however. Therefore I will not deal with this problem but just state that an anaphoric link between constituent and antecedent is established.
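[Figure: stepwise DRS construction for the conditional 'a farmer owns a donkey ⇒ he beats it'. Intermediate DRS: [x | farmer(x), x owns a donkey] ⇒ [he beats it]. Resulting DRS K: [x y | farmer(x), donkey(y), owns(x,y)] ⇒ [z w | z = x, w = y, beats(z,w)].]
According to the DRS-construction algorithm, K is the abridged form of this structure:
[Figure: unabridged DRS with the universes x y and z w, the conditions farmer(x), donkey(y), z = x, w = y, and the VP structures for owns and beats; the layout is not recoverable.]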
The central definition for correct equations of discourse referents is 'accessibility'. Referents of pronouns can only be equated with accessible referents⁸, so the location of referents determines what a possible antecedent for a pronoun is. The definition of accessibility is given in the next subchapter. The lexical insertion rule has to be modified slightly, since words are no longer regarded as simple predicates; rather, their meaning variability is the focus of this work: instead of simple predicates, lexical DRSs of single or complex words are inserted. I will characterize lexical insertion in chapter 4.5. Since the universe of lexical DRSs may contain more than one discourse referent, it has to be defined where discourse referents are located. The semantics of the DRS K can be stated informally as follows: whenever the DRS-conditions in K1 of a condition of the form K1 ⇒ K2 are true - i.e. an extension function f maps the referents in U_K1 onto their extensions such that the conditions in Con_K1 are satisfied - there is an extension function g that is an extension of f such that g also maps the referents of U_K2 onto their extensions in the model in view of the DRS-conditions in Con_K2. So DRS-construction means building a partial model that will be embedded into a total model.
8
It will turn out, however, that the notion of accessibility, as defined below, is too restrictive because it cannot account for several data. One general problem is described in v. d. Sandt and Geurts (1991:25f., fn. 8): new processing units only have access to referents in the main DRS, so that they cannot get markers from subordinated DRSs. This issue also holds for the processing of definite NPs. The problem will appear for DRSs given in chapter six as well, but it is avoided there by a simple trick in order to keep this nevertheless useful constraint.
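The informal semantics of a condition K1 ⇒ K2 stated above can be checked mechanically on a small model. The Python sketch below is a minimal illustration, not the formal definition: the model (the individuals and the predicate extensions) is invented for the example, a DRS is simplified to a pair of a referent list and a list of atomic conditions, and the names `extensions`, `satisfies` and `verifies_conditional` are hypothetical.

```python
from itertools import product

# Invented toy model: a universe and predicate extensions (assumption).
UNIVERSE = {"pedro", "juan", "chiquita", "platero"}
EXT = {
    "farmer": {("pedro",), ("juan",)},
    "donkey": {("chiquita",), ("platero",)},
    "owns": {("pedro", "chiquita"), ("juan", "platero")},
    "beats": {("pedro", "chiquita"), ("juan", "platero")},
}

def extensions(g, referents, universe):
    """Yield every embedding function that extends g to the new referents."""
    new = [r for r in referents if r not in g]
    for values in product(sorted(universe), repeat=len(new)):
        yield {**g, **dict(zip(new, values))}

def satisfies(g, conditions, ext):
    """Check atomic conditions (predicate, referents) under embedding g."""
    for pred, args in conditions:
        values = tuple(g[a] for a in args)
        if pred == "=":
            if values[0] != values[1]:
                return False
        elif values not in ext[pred]:
            return False
    return True

def verifies_conditional(k1, k2, universe, ext, g=None):
    """K1 => K2 holds iff every verifying extension h of g for K1
    can itself be extended to some j that verifies K2."""
    g = g or {}
    for h in extensions(g, k1[0], universe):
        if satisfies(h, k1[1], ext):
            if not any(satisfies(j, k2[1], ext)
                       for j in extensions(h, k2[0], universe)):
                return False
    return True

# The DRS K for 'a farmer owns a donkey => he beats it'.
K1 = (["x", "y"], [("farmer", ("x",)), ("donkey", ("y",)), ("owns", ("x", "y"))])
K2 = (["z", "w"], [("=", ("z", "x")), ("=", ("w", "y")), ("beats", ("z", "w"))])

print(verifies_conditional(K1, K2, UNIVERSE, EXT))
```

Removing one pair from the extension of `beats` makes the conditional fail, because one farmer-donkey embedding can then no longer be extended to verify K2.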
4.3
Syntax and Semantics of DRLC
DRLC is a modified version of the standard DRL as developed by Kamp (1981). It provides the framework for knowledge representation in conceptual DRSs. The task of lexical and discourse representation will be fulfilled by another language developed below, namely DRLlex. Syntax and semantics of DRLC are explained in this chapter.⁹
Syntax of DRLC:
Let V be a set of variables, the discourse referents.
Let P be a set of n-ary predicates.
The connective symbols are: ¬, ⇒, &, ∨
The function symbol is: Card
DRLC is the set of all conceptual DRSs. A conceptual DRS K is a pair ⟨U_C, Con_C⟩, with
U_C ⊆ V: a finite set of discourse referents
Con_C: a finite set of DRS-conditions
A conceptual DRS-condition c is of one of the following forms:
P(x1, ..., xn); n ≥ 1 is a DRS-condition
Card(x) < n; n ∈ N = {1, 2, 3, ...} is a DRS-condition
Card(x) > n; n ∈ N = {1, 2, 3, ...} is a DRS-condition
if K' is a DRS, ¬K' is a DRS-condition
if K1, K2 are DRSs, K1 ⇒ K2 is a DRS-condition
if K1, K2 are DRSs, K1 & K2 is a DRS-condition
if Ki (1 ≤ i ≤ n) are DRSs, K1 ∨ ... ∨ Kn is a DRS-condition
9
In developing DRLC, I benefited from Kamp and Reyle (1988) and Roberts (1989).
These are the only DRS-conditions.

Semantics of $DRL_C$:

Definition 1 A model $\mathcal{M}$ for $DRL_C$ is a pair $\langle U, [\cdot] \rangle$ consisting of:
(i) $U$: a non-empty set; the universe
(ii) $[\cdot]$: a function mapping predicates $p$ to their extensions in $\mathcal{M}$

Definition 2 An embedding function $g$ from a DRS $K$ to $\mathcal{M}$ is a function with $Dom(g) = U_K$ and $Ran(g) \subseteq U$.

Definition 3 A conceptual DRS-condition $c$ is satisfied in a model $\mathcal{M}$ for an embedding function $g$ ($\mathcal{M} \models_g c$) iff $c$ is verified by $g$ in $\mathcal{M}$.

Definition 4 An embedding function $g$ of $K$ to $U$ is extended to an embedding function $h$ of $K_1$ to $U$ iff $h: (Dom(g) \cup U_{K_1}) \to U$.

An embedding function $g$ verifies a DRS $K$ in $\mathcal{M}$ iff $g$ verifies each of the conceptual DRS-conditions $c$ in $Con_c$. $g$ verifies a conceptual DRS-condition $c$ in $\mathcal{M}$ iff:
$\mathcal{M} \models_g p(x_1, \ldots, x_n)$ iff $\langle g(x_1), \ldots, g(x_n) \rangle \in [p]$
$\mathcal{M} \models_g Card(x) \leq n$ iff $g(x) = a$ and $|\{a \in U\}| \leq n$
$\mathcal{M} \models_g Card(x) \geq n$ iff $g(x) = a$ and $|\{a \in U\}| \geq n$
$\mathcal{M} \models_g \neg K'$ iff there is no extension $h$ of $g$ to $U_{K'}$ such that $\mathcal{M} \models_h K'$
$\mathcal{M} \models_g K_1 \Rightarrow K_2$ iff for every extension $h$ of $g$ to $U_{K_1}$ such that $\mathcal{M} \models_h K_1$ there is an extension $j$ of $h$ to $U_{K_2}$ such that $\mathcal{M} \models_j K_2$
$\mathcal{M} \models_g K_1 \Leftrightarrow K_2$ iff for every extension $h$ of $g$ to $U_{K_1} \cup U_{K_2}$, $\mathcal{M} \models_h K_1$ and $\mathcal{M} \models_h K_2$
$\mathcal{M} \models_g K_1 \vee \ldots \vee K_n$ iff for some $i$ ($1 \leq i \leq n$) there is an extension $h_i$ of $g$ such that $\mathcal{M} \models_{h_i} K_i$

The concepts 'subordination' and 'accessibility' can be defined (see Kamp and Reyle (1988:104)):

Definition 5 (Subordination of DRSs) Subordination is a relation between DRSs:
1. A DRS $K_1$ is immediately subordinate to a DRS $K_2$ ($ISUB(K_1, K_2)$) iff either
(a) $K_1$ is an element of the DRS-condition $K_m \vee \ldots \vee K_n$ in $Con_{K_2}$, or $K_1$ is identical with $K_3$ in a DRS-condition $\neg K_3$ in $Con_{K_2}$, or $K_1$ is identical with $K_m$ in a condition $K_m \Rightarrow$ [...] is a DRS-condition in some $Con_K$, or
(b) $K_1 \Leftrightarrow$ [...] is a DRS-condition in some $Con_{K_2}$.
2. A DRS $K_1$ is subordinate to a DRS $K_2$ ($SUB(K_1, K_2)$) iff either $ISUB(K_1, K_2)$ or $ISUB(K_1, K_3)$ and $SUB(K_3, K_2)$.

Definition 6 (Accessibility of discourse referents) Accessibility is a relation between discourse referents. In a DRS $K$, a discourse referent $x$ is accessible for a discourse referent $y$ ($ACC(x, y)$) iff
1. $x, y \in U_K$ or
2. there is a DRS $K_1$ such that $SUB(K_1, K)$ and $x \in U_K$ and $y \in U_{K_1}$.

Definition 7 (Accessible set of DRSs) The definition of the accessible set makes use of the partial order on sets of discourse referents induced by ACC: the accessible set $U_{K^*}$ of a DRS $K$ is the set of all discourse referents in $U_K$ and in the universes of DRSs subordinate to $K$:
$U_{K^*} = \{x \mid x \in U_K \vee (x \in U_{K_i} \text{ and } SUB(K_i, K))\}$
This implies an accessibility tree for discourse referents from subordinate DRSs to the main DRS. By way of illustration, in the DRS $K$ below there are the following immediate subordination relations and accessible sets:
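[Figure: box diagram of the example DRS $K$ with its sub-DRSs. The layout is not recoverable from the source; legible fragments include the conditions v = x, Q(z), R(w), P(t) and the connectives $\vee$, $\Rightarrow$, $\neg$.]

[...] r' is the role that has to be defined.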
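$\varepsilon[(comp\ r_1 \ldots r_n)] = \varepsilon[r_1] \circ \varepsilon[r_2] \circ \ldots \circ \varepsilon[r_n]$
$\tau[(comp\ r_1 \ldots r_n)] = \langle \{x_1, x_n\}, \{\langle \{\}, \{r'(x_1, x_n)\} \rangle$ [...]
$\varepsilon[(range\ r\ c)] = \varepsilon[r] \cap (\mathcal{D} \times \varepsilon[c])$
$\tau[(range\ r\ c)] = \langle \{x, y\}, \{r(x, y),$ [...]
$\varepsilon[(notrole\ r)] = (\mathcal{D} \times \mathcal{D}) \setminus \varepsilon[r]$
$\tau[(notrole\ r)] = \langle \{x, y\}, \{\neg \langle \{\}, \{r($ [...]
$\varepsilon[self] = \{\langle x, y \rangle \in \mathcal{D} \times \mathcal{D} : x = y\}$
$\tau[self] = \langle \{x\}, \{r(x, x)\} \rangle$
$\varepsilon[($ [...] $r_1$ [...] $)] = \{(x, y) \in \mathcal{D} \times \mathcal{D} :$ [...] $(x, z)$ [...] $(y, z) \in \varepsilon[r_2]$ [...]
$\tau[($ [...] $r_1$ [...] $)] = \langle \{x, y\},$ [...]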
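The legible extensional clauses for the role-forming operators can be checked on finite sets. The following is a minimal sketch, assuming roles are represented as sets of pairs over a small hypothetical domain; the function names, the role `built_by` and the data are illustrative and not part of KL-ONE or of the text. Composition is read left to right here, which is an assumption about the intended order.

```python
def comp(*roles):
    """Compose roles left to right: (x, z) is in the result iff there is a
    chain of pairs leading from x through r_1, ..., r_n to z."""
    result = set(roles[0])
    for role in roles[1:]:
        result = {(x, z) for (x, y1) in result for (y2, z) in role if y1 == y2}
    return result

def range_restrict(role, concept, domain):
    """(range r c): the pairs of r whose second member belongs to the concept c."""
    return role & {(x, y) for x in domain for y in concept}

def not_role(role, domain):
    """(notrole r): the complement of r relative to D x D."""
    return {(x, y) for x in domain for y in domain} - role

def self_role(domain):
    """self: the identity relation on D."""
    return {(x, x) for x in domain}

# Hypothetical data: car "a" is driven by engine "f", which is built by "p".
domain = {"a", "f", "p"}
driven_by = {("a", "f")}
built_by = {("f", "p")}

print(sorted(comp(driven_by, built_by)))
print(sorted(range_restrict(driven_by, {"f"}, domain)))
print(len(not_role(driven_by, domain)))
print(sorted(self_role(domain)))
```

Representing roles as plain sets keeps each operator a one-line set expression, which mirrors the set-theoretic form of the clauses.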
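The translation function provides complex DRSs as constituents for definitions of terms and roles. For example, the definition of 'car' in the previous subchapter can be directly translated into an extensionally equivalent DRS $K_{car}$:

[Figure: box diagram of $K_{car}$ with the referent x in its universe. The exact nesting of the boxes is not recoverable from the source; the legible conditions are:
car(x) $\Leftrightarrow$ [description of the concept]
vehicle(x)
driven-by(x, y), Card(y) = 1 $\Rightarrow$ combustion-engine(y)
steered-by(x, z) $\Rightarrow$ driver(z)
transporting(x, v) $\Rightarrow$ person(v)
steered-by(x, w) $\Rightarrow$ transporting(x, w)]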
Card(x) = 1 is the abbreviation of the two conditions Card(x) $\leq$ 1 and Card(x) $\geq$ 1. The extension function $\varepsilon$ for U and the verification function $g$ for $DRL_C$ both denote the same extension in a domain $\mathcal{D}$. For instance, let $\mathcal{D}$ = {a, b, c, d, e, f, g, h, i, d1, d2, d3, p1, p2, p3, p4, p5} and
$\varepsilon$[vehicle] = {a, b, c}
$\varepsilon$[driven-by] = {[...], [...]}
Thus, the set of combustion engines has at least f and g as its members.
$\varepsilon$[steered-by] = {[...], [...], [...], [...]}
Thus, the set {d1, d2, d3} is a subset of the set of drivers.
$\varepsilon$[transporting] = [...] iff $[\cdot]$ is equal to $\varepsilon$.

As mentioned above, an embedding is a function $f$ whose domain is included in the set of reference markers and whose range is included in the universe: $f: U_{K_{car}} \to \mathcal{D}$. $f$ verifies $K_{car}$ iff $f$ verifies each of the conditions belonging to $Con_{K_{car}}$ in $\mathcal{M}$. And $f$ verifies the condition $c$ in $\mathcal{M}$ iff it maps it into elements or n-tuples of elements of $\mathcal{D}$. Therefore we get f(x) = {a, b} for the same reasons as above.

The distinction between primitive roles and concepts and defined ones is reflected in the locus of a simple DRS-condition in a DRS $K_c$:

Definition 8 (primitive and defined concepts):
1. A primitive concept $c_p$ in a DRS $K_c$ is
(a) a simple DRS-condition in $Con_{K_c}$ or
(b) a simple DRS-condition in a DRS $K_1$ of a DRS-condition $K_1 \Rightarrow K_2$ in $Con_{K_c}$.
2. A defined concept $c_d$ in a DRS $K_c$ is a simple DRS-condition in a DRS $K_1$ of a DRS-condition $K_1 \Leftrightarrow K_2$ in $Con_{K_c}$.

Modelling a domain in DRSs is different from constructing a DRS step by step while processing a discourse. A crucial feature of the latter is accessibility of discourse referents. Anaphoric relationships between the discourse referent introduced by a pronoun and discourse referents of antecedents are restricted by the accessibility relation. A logical reconstruction of a domain in DRSs does not violate this condition, but it has no real use for it: whenever a concept is not introduced as a primitive one, it is defined in a DRS $K$. The set of conditions in $Con_K$ never contains conditions with arguments of superordinate DRSs. But accessibility to the referent introduced by the described concept is necessary in DRS-conditions of the form $K_1 \Rightarrow K_2$ or $K_1 \Leftrightarrow K_2$.

To sum up, knowledge representation is the logical representation of aspects of a domain.
If a monotonic logical language is used for knowledge representation, prototypical properties of concepts can be described. DRLC is powerful enough to be the logical framework for knowledge representation. Therefore defining the translation function τ from KL-ONE expressions to DRSs lays the foundations for relating discourse representations to conceptual representations. The conceptual basis for the interpretation of lexical
items can be represented in conceptual DRSs. It is verified by an embedding into a total model. This view is strongly related to the view of concepts as partial representations of real objects, based on experience. Once the two-level approach is transferred to DRT, conceptual DRSs are the extensions that a confirmation relation assigns to the semantic information of lexical entries. This semantic information is represented in lexical DRSs. The formal treatment of this aspect will be explained in the next subchapters.
4.5
Lexical Meaning of Nouns: Approaching the Two-Level Semantics in DRT
Nouns, like other lexical categories, have context-dependent extensions. This work attempts to explain the dynamics of single noun and NN-compound meanings by lexical decomposition. Lexical meaning is, as the semantic component of the corresponding lexical entry, the context-independent representation that subsumes all other non-metaphorical meanings. It is represented as a lexical DRS. Utterance meanings, on the other hand, are context-specific extensions of the lexical meaning.

But what might a lexical DRS for nouns look like? It is thought to be the contribution of the lexical entry to the utterance meaning of this lexical item. Therefore the information in the lexical DRS is the minimal information that will always be used while processing an utterance which contains the word. Its meaning is, so to speak, underdetermined with respect to a contextually specified utterance meaning. The set of lexical DRS-conditions thus has to represent the context-invariant core meaning of that noun. The universe of a lexical DRS contains, among other referents, discourse referents for the external role and for any existing internal roles. The domain of all referents is confined by selectional restrictions.

At first glance, conceptual shifts for nouns in single sentences, as described in Bierwisch (1983), seem to be determinations of particular concepts. The determined concept is related to other concepts to form the complex concept that provides all concepts a noun may refer to. In this case lexical meanings of nouns are extended to utterance meanings by relating them to one selected concept. But this picture is simplified for two reasons: often a single concept cannot be assigned, and additional relations are necessary between concepts and the extensions of simple lexical DRS-conditions.¹⁷
Hence contextual variations for nouns involve the restriction of a set of principally possible denotations to a set of actually possible ones by naming relations between these actual extensions and the denotations of simple lexical DRS-conditions. This means that there are contexts that determine a particular denotation plus a relation, but there are also contexts with several possible denotations for the reading of a noun. Examples are:
1. Der Kaffee (Getränk) tropft auf den Boden. The coffee (drink) is dripping onto the floor.
17
Simple lexical DRS-conditions are conditions of the form $p(x_1, \ldots, x_n)$.
2. Der Kaffee (Pulver? Getränk? Bohnen? Frucht?) ist teuer. The coffee (powder? drink? beans? fruit?) is expensive.
3. Der Kaffee (Pulver? Bohnen?) wird im Supermarkt immer billiger. The coffee (powder? beans?) becomes cheaper and cheaper in the supermarket.

Depending on the activated scenario¹⁸, certain readings for coffee are ruled out. For example, (2) is so general in its proposition that no selectional restriction can be given. In contrast to (2), the supermarket scenario in (3) leads to the assumption of getting cheap coffee powder or beans, but not the drink or the plant. So the utterance meaning of single nouns is determined by a set of actual usages of these nouns. Hence disjunctions of concepts are denoted by the lexical DRS and corresponding disjunctions of relations. In the ideal case this may be just one concept name; then the noun has a fixed reading.

My hypothesis is that the lexical DRS of single nouns denoting artefacts is crucially determined by a purpose-operator $\chi$. This operator relates the referential marker of the noun to a representation of the primary function(s) of the objects. The purpose an object is designed for is represented as lexical meaning. This view seems to be justified for two reasons. First, there is evidence from psycholinguistic research that concepts are based on context-independent and context-dependent properties (Barsalou 1982). As Barsalou remarks, context-independent properties may be those frequently relevant to human interaction. Examples he gives are apple with the context-independent property 'edible' and wallet with 'can contain money'. Artificial objects are made for a certain purpose, so human interaction is frequently given by primary functions of these objects. Secondly, as a short anticipation of chapter 5.3, the assumption of primary functions as context-independent properties leads to an interesting prediction on the meaning of novel noun-noun compounds.
If these properties are context-independent information, they are part of the compound meaning. Moreover, context-independent two-place relations may serve as relations for interpreting the compound. However, as context-independent properties they should not determine possible utterance meanings of the compound, especially not particular concepts. This seems to be right. By way of illustration, let us have a look at the noun museum. There might be a context-independent property of 'exhibiting'. If this property is used for the interpretation of a compound such as Marmormuseum (marble museum), it does not determine a particular reading of the head. Of course Marmormuseum might most naturally be interpreted as 'museum (building) made of marble', but 'museum exhibiting marble' is possible as well. The latter does not determine a fixed sort for the head noun:
1. Das Marmormuseum (Gebäude, in dem über Marmor ausgestellt wird) liegt an einem See. (The 'marble museum' (building where an exhibition on marble takes place) is located at a lake.)
18
Understanding a discourse, irrespective of its length, means activating frames or models with default expectations on what comes next. For a short discussion on recent assumptions of text processing see Eschenbach et al. (1990) and chapter 6.1.2.
2. Das Marmormuseum (Institution, die über Marmor ausstellt) kauft alte ägyptische Meißel. (The 'marble museum' (institution exhibiting on marble) buys old Egyptian chisels.)

The assumption of primary functions as context-independent properties might explain why conceptual shifts in noun-noun compounds are possible if this primary function is the relevant relation in a certain context. This consideration just scratches the surface of the basic idea. A more elaborate formulation and its consequences for a theory of novel noun-noun compound interpretation will be discussed in detail in chapter 5.

The lexical meaning of nouns denoting artefact concepts is crucially determined by an operator $\chi$ relating the referential marker of that noun to a representation of the primary functions the object is designed for.¹⁹ This is represented as a lexical DRS. Lexical DRSs are pairs $\langle U_{lex}, Con_{lex} \rangle$. However, lexical DRSs differ from conceptual DRSs in two respects. First, reference markers in $U_{lex}$ may be bound by a preceding sequence of lambda-operators. Its order is motivated by the relationship between semantic arguments and their realization in syntax. Thus, this sequence just represents the θ-grid of a lexical item with all the features thematic roles have, as described in chapter 3.5. Second, discourse referents in $U_{lex}$ are to be limited to certain domains. Often $U_{lex}$ of nouns does not contain referents restricted to specific domains since nouns tend to be sortally underdetermined with respect to their utterance meanings. But context makes the domain(s) explicit by selectional restrictions. Finally, there is a particular DRS-condition $p = K'$ in $Con_{lex}$ of artefact nouns, equating a variable $p$ with a DRS $K'$. $K'$ consists of a set $U_{K'}$ of discourse referents and a set $Con_{K'}$ of conditions representing just the primary functions. Now there is a relation from lexical DRSs into conceptual DRSs for identifying the complete concept the lexical DRS denotes.
The complete concept consists of all available knowledge of the artefact. This includes all principally possible domains as well as all known typical properties. So the set of properties in a conceptual DRS is much wider than the set of properties in $Con_{lex}$ of a lexical DRS. There is a simple relation between primes in $Con_{lex}$ and properties in the corresponding conceptual DRS: every prime has as its extension a set of tuples, determined by predicates in conceptual DRSs. Therefore the primes may also be considered as salient relations for the definition of word meaning which are 'transferred' from conceptual knowledge into the semantic part of a lexical entry. This view also supports the general consideration that there is no clear distinction between semantic and conceptual knowledge. The link between primes and conceptual DRSs is given by the semantics of $DRL_{lex}$, the language that provides the constituents for lexical DRSs. Its syntax and semantics are given in the next subchapter.
19
A similar idea is reported in Pustejovsky and Anick (1988). They use a metalogical operator in order to represent the purpose of an object. However, one problem arises in their formulation: lexical representations of artefacts are bound to specific domains so that conceptual shifts do not occur.
4.6
Syntax and Semantics of $DRL_{lex}$
Lexical DRSs will be inserted into the resulting DRS of a discourse whenever a lexical insertion rule is applied. Lexical DRSs are formulas of $DRL_{lex}$. $DRL_{lex}$ differs from $DRL_C$ in these respects: lambda-expressions are possible and some lexical DRS-conditions are different from conceptual DRS-conditions. The semantics of $DRL_{lex}$ is given by a confirmation relation between lexical DRSs and the conceptual basis. The relation must provide all possible concepts the lexical item can denote in different contexts. The weaker concept of a relation, rather than a function, is the price for using a two-level semantics that separates context-independent from context-dependent information. Without contextual information, only a relation can be given between a lexical DRS and conceptual DRSs. I will come back to this issue below. The conceptual basis modelled in $DRL_C$ is the model for lexical DRSs, and context supports the selection of certain concepts from the complete concept denoted by a lexical item. The semantics of the $\chi$-operator provides a mechanism for generating the set of possible concepts the noun can denote. This mechanism is based on abduction. The consequences of leaving deduction as a basis for semantics are far-reaching. The necessity of abduction in a two-level semantics is also discussed below.

Syntax of $DRL_{lex}$:
Let $V_{lex}$ be a set of variables.
Let $P_{lex}$ be a set of n-ary predicates.
The relation symbol is: $=$
The operator symbols are: $\chi$, $!$, $\lambda$, $[\,]$
$DRL_{lex}$ is the set of all lexical DRSs for nouns and NN-compounds.
A lexical DRS $K_{lex}$ is a pair $\langle U_{lex}, Con_{lex} \rangle$ consisting of
$U_{lex}$: a finite set of variables $\{x_1, \ldots, x_n\}$
$Con_{lex}$: a finite set of lexical DRS-conditions
A lexical DRS-condition (= $DRS_{lex}$-condition) is one of the following forms:
$A(x_1, \ldots, x_n)$, $n \geq 1$, is a lexical DRS-condition
$x = y$ is a lexical DRS-condition
$\chi(x, p)$ is a lexical DRS-condition
If $K'$ is a lexical DRS and $p$ is bound by the operator $\chi$, $p = K'$ is a lexical DRS-condition
If $K'$ is a lexical DRS with a referent $x \in U_{K'}$, $![x]K'$ is a lexical DRS-condition
If $K'$ is a lexical DRS with a referent $x \in U_{K'}$, $[x]K'$ is a lexical DRS-condition
These are all $DRS_{lex}$-conditions. They are the constituents of $Con_{lex}$. A lexical DRS-condition $r(x_1, \ldots, x_n)$ is also called a simple lexical DRS-condition. If $x_1, \ldots, x_n \in U_K$ of a lexical DRS $K$, $\lambda x_n \ldots \lambda x_1 K$ is the semantic component of a lexical entry.
Semantics of $DRL_{lex}$:
The semantics of $DRL_{lex}$ is given by a relation $r$ from lexical DRSs to a conceptual basis $K_c$ in $DRL_C$. In what follows, the notion of accessibility plays a crucial role in interpretation. Accessibility (ACC) is defined in chapter 4.3 as a partial relation on referents in DRSs. This relation holds for conceptual DRSs as well as lexical DRSs.

The relation to be defined needs some comments on the structure of a knowledge base $K_c$ modelled by means of conceptual DRS-conditions. $K_c$ is a conceptual DRS. The set of conditions $Con_c$ of $K_c$ is built from the following DRS-conditions: $P(x)$, $A(x, y)$, $K_i \Rightarrow K_j$, $K_i \Leftrightarrow K_j$. The first two are simple conditions, introduced whenever a primitive concept or role is determined. The other two are complex conditions introduced for specifying and defining concepts and roles respectively. Therefore the universe $U_c$ contains only those referents which are introduced by the primitive concepts and roles. Other referents are located in universes $U_{K_i}$ of subordinate DRSs $K_i$. Hence the range of the relation is included in the accessible set $U_{K^*}$ rather than in $U_K$. $U_{K^*}$ is structured by accessibility, as defined in Definition 6. This means that the elements of $U_{K^*}$ can be represented in tree structures given by the partial order ACC.
A confirmation relation $r$ is a relation with $dom(r) = U_{lex}$ and $ran(r) \subseteq U_{K^*}$. $r$ confirms a DRS $K_{lex}$ in $K_c$ iff $r$ confirms each of the DRS-conditions $c$ in $Con_{lex}$.
$s(x_1, \ldots, x_n)$ is confirmed by $r$ iff $\langle r(x_1), \ldots, r(x_n) \rangle \in [s]$
$x = y$ is confirmed by $r$ iff $r(x) = r(y)$
$![x]K'$ is confirmed by $r$ if by default for every extension $g$ of $r$ to $\{x\}$, $g$ confirms $K'$
$[x]K'$ is confirmed by $r$ if there is an extension $g$ of $r$ to $\{x\}$ such that $g$ confirms $K'$
$\chi(x, q)$, $q = K'$ is confirmed by $r$ iff there is a relation $r'$ with $Dom(r') = U_{K'}$ that confirms $K'$ and there are conceptual conditions $K_i \Rightarrow K_j$ or $K_i \Leftrightarrow K_j$ in $Con_c$ of the knowledge base $K_c = \langle U_c, Con_c \rangle$ such that the range of $r'$ is included in $U_{K_j}$ or its subordinate conceptual DRSs, and there is an $a_i$ with $r(x) = a_i$ and $U_{K_i} = \{a_i\}$ and $Con_{K_i} = \{P_i(a_i)\}$ such that $ACC(a_i,$ [...]
$\lambda x_n \ldots \lambda x_1 K = \langle a_n \ldots a_1 \rangle$ and $K$ is confirmed by $r$.
The operator $\chi$ requires a certain syntactic structure in $K_c$: the extension of $K'$ in the lexical DRS-condition $p = K'$ has to be located in conceptual DRSs $K_m$ which are subordinate to the conceptual DRS $K_i$ containing the extension of the external referent of the lexical DRS. Hence conceptual DRS-conditions $K_i \Rightarrow K_j$ and $K_i \Leftrightarrow K_j$ are taken into consideration where conceptual DRSs $K_m$ are located in $K_j$. $U_{K_i}$ contains as its only element an extension of the external referent of the lexical DRS, and $Con_{K_i}$ contains a term as its only condition. The semantics of the purpose-operator $\chi$ ignores the semantics of the relation symbols $\Rightarrow$ and $\Leftrightarrow$. It rather takes some referents and conditions in $K_j$ as minimal conditions in order to generate the hypothesis that the referent in $U_{K_i}$ is denoted. This is not deduction but an abductive inference: a hypothesis is generated that might turn out to be false.

The reason why abduction becomes the crucial mechanism for determining extensions lies in the use of the two-level semantics with its decomposition requirement. The link from $DRL_{lex}$ to $DRL_C$ cannot be a function because, in the framework given by the two-level semantics, context-dependent extensions cannot be represented by a mapping. Furthermore, since lexical meanings are assumed to represent context-invariant information that is related to concept descriptions, a semantics for lexical meanings must take this information into consideration in order to get the domains the noun can denote. Information in lexical meaning is by definition just partial information. Finally, current formal models of concepts are based on the paradigm developed for KL-ONE, i.e. the description of necessary and (in the ideal case) sufficient properties of concepts. These concept models support certain reasoning mechanisms on conceptual knowledge, but the link from lexical knowledge to concept knowledge is not addressed by them.
The transformation of KL-ONE operators into DRS-conditions given above results in a certain structure which knowledge bases modelled in $DRL_C$ have. This structure, in combination with the partiality of lexical meanings and the non-applicability of a homomorphic function, backs the use of an abductive inference for generating possible extensions.

As I mentioned above, a function from lexical DRSs into conceptual DRSs cannot be given for denotation assignment in a two-level semantics because one lexical meaning is linked to several different conceptual denotations. Such a link is given by a relation. The postulation of a two-level semantics has turned out to be useful in a number of analyses, for example Lang's (1989, 1990) investigations of dimensional adjectives and Herweg's (1991) analysis of temporal conjunctions.²⁰ Both authors make fundamental assumptions about representations as lexical meaning and the corresponding parts of the conceptual system. In Lang (1989) the link between lexical meaning and concepts is given by a matching procedure that, under certain conditions, matches parameters in lexical meanings with entries of the same name in conceptual representations. In Lang (1990:76), he characterizes the procedure as ein-mehr-deutige [...] Abbildung (one-to-many mapping) of semantic primes to their conceptual correspondents. This is not a function but a relation. Herweg's presupposition about how lexical representations are related to conceptual structures seems to be based on the same idea. This brief characterization of other approaches working with the same underlying idea demonstrates that replacing extension functions by the weaker notion of relations seems to be necessary if abstract models are replaced by complex conceptual structures. As a matter of fact, isolated lexical items do not denote one concept; they do not necessarily denote exactly one concept even in context.

Context-dependent extensions of single nouns are determined by a process called conceptual shift. This mechanism selects the context-relevant domains from the possible domains a noun can denote. This is explained below, and I will argue that shifts can be triggered by script-conditions, i.e. expectations on certain courses of events and the objects involved. Instantiating a script, however, is equivalent to an abductive inference, as Charniak and McDermott (1985:555ff.) point out. In general, it is assumed that at least parts of discourse comprehension are founded on abductive inferences because causal reasons must be inferred from a discourse in order to generate explanations for courses of events. Hence abductive inferences play a role in natural language comprehension. In this approach, such an inference is applied on the conceptual level in order to search for possible denotations nouns can have. Selecting contextually relevant domains from the set of possible domains can be an abductive inference as well.
20
See also the literature cited by Herweg (1991:54, fn. 5) on additional work in the paradigm of the two-level semantics.

Definition 9 (complete concept of an artefact noun): The complete concept a lexical DRS for artefact nouns denotes is determined by the set $\{\langle a_{1n} \ldots a_{11} \rangle, \ldots, \langle a_{mn} \ldots a_{m1} \rangle\}$ that the $\lambda$-expression has as its value. The predicates $P_i$ ($1 \leq i \leq m$) having $a_{1n}, \ldots, a_{11}$ or ... or $a_{mn}, \ldots, a_{m1}$ as arguments in $K_c$ then form the set of principally possible denotations.
Suppose a lexical DRS has the form $\lambda x\, K$ with $K = \langle \{x\}, \{\chi(x, p),\ p = K'\} \rangle$ and $K' = \langle \{z_1, \ldots, z_n\}, Con_{K'} \rangle$ and $r(x) = a$, $g(z_i) = b_i$ ($1 \leq i \leq n$). Then all relations having $\langle a, b_i \rangle$ as elements are links between extensions of simple lexical DRS-conditions and the domains of possible utterance meanings. A concrete example may be the relation from the lexical DRS for Buch (book) to the conceptual DRS $K_c$ below. $K_c$ represents some knowledge of books. The lexical DRS $K_{book}$ is:

$\lambda x\ \langle \{x\}, \{\chi(x, p),\ p = \langle \{y, z, e_1, e_2\}, \{\text{publishing-company-institution}(z), \text{mediating}(e_2), \text{publishing}(e_1), \text{theme}(e_2, y), \text{agent}(e_1, z)\} \rangle \} \rangle$
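Outline of the conceptual DRS $K_c$:

[Figure: box diagram of $K_c$, given here in linear form. The connective that links each concept box to its description is not legible in the source.]

object(a), book-object(a), with the description:
$\langle \{y\}, \{\text{info-carrier}(a, y)\} \rangle \Rightarrow \langle \{\}, \{\text{entity}(y)\} \rangle$
$\langle \{v\}, \{\text{serves-for}(a, v)\} \rangle \Rightarrow \langle \{\}, \{\text{mediating}(v),\ \langle \{z\}, \{\text{theme}(v, z)\} \rangle \Rightarrow \langle \{\}, \{\text{all}(z)\} \rangle \} \rangle$
$\langle \{w\}, \{\text{theme-of}(a, w)\} \rangle \Rightarrow \langle \{\}, \{\text{publishing}(w),\ \langle \{j\}, \{\text{agent}(w, j)\} \rangle \Rightarrow \langle \{\}, \{\text{publ-comp-inst}(j)\} \rangle \} \rangle$
$\langle \{u\}, \{\text{distributed-by}(a, u)\} \rangle \Rightarrow \langle \{\}, \{\text{publ-comp-inst}(u)\} \rangle$
$\langle \{m\}, \{\text{sold-in}(a, m)\} \rangle \Rightarrow \langle \{\}, \{\text{building}(m)\} \rangle$

information(b), book-info(b), with the description:
$\langle \{c\}, \{\text{written-by}(b, c)\} \rangle \Rightarrow \langle \{\}, \{\text{person}(c)\} \rangle \vee \langle \{\}, \{\text{institution}(c)\} \rangle$
$\langle \{s\}, \{\text{available-by}(b, s)\} \rangle \Rightarrow \langle \{\}, \{\text{publishing}(s),\ \langle \{g\}, \{\text{agent}(s, g)\} \rangle \Rightarrow \langle \{\}, \{\text{publ-comp-inst}(g)\} \rangle \} \rangle$
$\langle \{e\}, \{\text{agent}(b, e)\} \rangle \Rightarrow \langle \{\}, \{\text{mediating}(e),\ \langle \{f\}, \{\text{theme}(e, f)\} \rangle \Rightarrow \langle \{\}, \{\text{all}(f)\} \rangle \} \rangle$
$\langle \{r\}, \{\text{selected-by}(b, r)\} \rangle \Rightarrow \langle \{\}, \{\text{publ-comp-inst}(r)\} \rangle$
$\langle \{d\}, \{\text{theme}(b, d)\} \rangle \Rightarrow \langle \{\}, \{\text{all}(d)\} \rangle$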
The conceptual DRS $K_c = \langle U_{K_c}, Con_{K_c} \rangle$ functions as a model $\langle U_{K^*}, [\cdot] \rangle$ providing all known entities by $U_{K^*}$ and the relationships between them by the predicates in $Con_{K_c}$. The elements of $U_{K^*}$ are partially ordered by accessibility. Thus it has a certain tree structure. It consists of {a, b, c, d, e, f, g, j, r, s, u, v, w, y, z}, which is ordered as follows:
ACC(v, z), ACC(w, j), ACC(a, {y, v, w, u, m, z, j}), ACC(b, {c, d, e, r, s, g, f}), ACC(s, g), ACC(e, f),
and $[\cdot]$ consists of the pairs:
$\langle \text{book-object}, \{a\} \rangle$
$\langle \text{object}, \{a\} \rangle$
$\langle \text{book-info}, \{b\} \rangle$
$\langle \text{info-carrier}, \{\langle a, y \rangle\} \rangle$
$\langle \text{entity}, \{y\} \rangle$
$\langle \text{serves-for}, \{\langle a, v \rangle\} \rangle$
$\langle \text{mediating}, \{v, e\} \rangle$
$\langle \text{theme}, \{\langle v, z \rangle, \langle b, d \rangle, \langle e, f \rangle\} \rangle$
$\langle \text{institution}, \{c\} \rangle$
$\langle \text{all}, \{z, d, f\} \rangle$
$\langle \text{theme-of}, \{\langle a, w \rangle\} \rangle$
$\langle \text{publishing}, \{w, s\} \rangle$
$\langle \text{agent}, \{\langle w, j \rangle, \langle s, g \rangle, \langle b, e \rangle\} \rangle$
$\langle \text{publ-comp-inst}, \{j, u, g, r\} \rangle$
$\langle \text{building}, \{m\} \rangle$
$\langle \text{distributed-by}, \{\langle a, u \rangle\} \rangle$
$\langle \text{person}, \{c\} \rangle$
$\langle \text{information}, \{b\} \rangle$
$\langle \text{available-by}, \{\langle b, s \rangle\} \rangle$
$\langle \text{selected-by}, \{\langle b, r \rangle\} \rangle$
$\langle \text{written-by}, \{\langle b, c \rangle\} \rangle$
$\langle \text{sold-in}, \{\langle a, m \rangle\} \rangle$

Actually only some of the predicates are necessary in order to determine the extensions of the lexical DRS $K_{book}$. The $\lambda$-operator provides the set of elements or tuples denoted by the DRS. The operator $\chi$ restricts denotations by conditions on certain conceptual structures. According to the confirmation relation, we get: r(x) = a, b; g(e2) = v, e; g(e1) = w, s; g(z) = j, g; g(y) = z, f. All relations having $\langle \alpha, v \rangle$, $\langle \alpha, w \rangle$, $\langle \alpha, j \rangle$ and $\langle \alpha, z \rangle$ ($\alpha = a \vee b$) as elements are links between extensions of simple lexical DRS-conditions and the domains of possible utterance meanings. For instance, the utterance meaning of book as physical object requires these additional conditions: serves-for(a, v) and theme-of(a, w). The complete concept denoted by Buch (book) with respect to the knowledge base $K_c$ is given by the predicates 'book-object' and 'book-info'.
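The values reported for the confirmation relation can be reproduced by a brute-force search over the model. The following is a minimal sketch, assuming that the lexical predicate publishing-company-institution corresponds to the conceptual predicate publ-comp-inst and that the candidate values for the variables of $K'$ are exactly the referents listed as accessible from a and from b; the extensions are transcribed from the listing, and all names in the code are hypothetical. In line with the abductive reading of $\chi$, the search only looks at simple conditions and ignores the connectives of $K_c$.

```python
from itertools import product

# Extensions of the predicates needed for K', transcribed from the model of K_c.
UNARY = {
    "mediating": {"v", "e"},
    "publishing": {"w", "s"},
    "publ-comp-inst": {"j", "u", "g", "r"},
}
BINARY = {
    "theme": {("v", "z"), ("b", "d"), ("e", "f")},
    "agent": {("w", "j"), ("s", "g"), ("b", "e")},
}

# Referents accessible from the two candidate values of the external referent x.
ACCESSIBLE = {
    "a": {"y", "v", "w", "u", "m", "z", "j"},
    "b": {"c", "d", "e", "r", "s", "g", "f"},
}

# Variables and simple conditions of K' in the lexical DRS for Buch.
VARIABLES = ["e1", "e2", "y", "z"]
CONDITIONS = [
    ("publ-comp-inst", ("z",)),
    ("mediating", ("e2",)),
    ("publishing", ("e1",)),
    ("theme", ("e2", "y")),
    ("agent", ("e1", "z")),
]

def holds(pred, values):
    """Check a simple condition against the extension of its predicate."""
    if len(values) == 1:
        return values[0] in UNARY[pred]
    return tuple(values) in BINARY[pred]

def confirmations(root):
    """All assignments of the variables of K' to referents accessible from
    root that satisfy every simple condition of K'."""
    candidates = sorted(ACCESSIBLE[root])
    found = []
    for values in product(candidates, repeat=len(VARIABLES)):
        g = dict(zip(VARIABLES, values))
        if all(holds(p, [g[v] for v in args]) for p, args in CONDITIONS):
            found.append(g)
    return found

for root in ("a", "b"):
    print(root, confirmations(root))
```

Under these assumptions each root admits exactly one assignment: for a, the events v and w with theme z and agent j; for b, the events e and s with theme f and agent g. This matches the pairs of values given for g(e2), g(e1), g(z) and g(y).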
4.7
Lexical DRSs for Nouns and NN-Compounds
According to chapter 3.5, a lexical entry for nouns is a quadruple consisting of a phonological form, a set of grammatical features, a θ-grid and the semantic form SF. I call SF lexical meaning. The interface between syntax and semantics is crucially determined by the θ-grid. This is a sequence of lambda-operators binding variables in SF that have to be satisfied by arguments in syntax or morphology. A θ-grid $\langle \lambda x_n \ldots \lambda x_1 \rangle$ of a noun consists of usually optional internal arguments $\langle \lambda x_n \ldots \lambda x_2 \rangle$ and the external argument $\lambda x_1$. θ-grid and lexical meaning of nouns are represented as sequences of lambda-bound variables and a lexical DRS $K_{lex}$. The general schema for a lexical DRS of a noun, together with its θ-grid, looks as follows:

$\lambda x_n \ldots \lambda x_1$ [set of $DRS_{lex}$-conditions]

The DRS-conditions involved in a lexical DRS for nouns are determined by the ontological type of the noun extensions as well as by the form of the θ-grid. For example, the difference between natural kind terms and nouns denoting artefacts is mirrored in the occurrence of $\chi$ as a DRS-operator. But the existence of internal θ-roles in the universe of a DRS $K_{lex}$ has implications for the set of conditions as well: since sortal nouns are used for classifying entities, they have just an external θ-role, as e.g. chair, car, flower, or sugar. Therefore there are no conditions in $Con_{lex}$ expressing relations between the external θ-role and internal ones. Relational nouns, on the other hand, do not classify entities but denote relations between a specific entity and a second one. Examples are rest, side and many deverbal nouns. $Con_{lex}$ of their lexical DRSs contains conditions for expressing this relation.

The distinction between sortal and relational nouns has implications for the DRS-construction algorithm for NN-compounds: whenever the θ-grid of a head noun contains internal θ-roles, the modifying noun may satisfy one of them by discharging a θ-role. If it does not satisfy the outermost role $\lambda x_n$ but a role $\lambda x_i$, $i \neq n$, all preceding roles $\langle \lambda x_n \ldots \lambda x_{i+1} \rangle$ must be satisfied on the syntactic level or they have to be existentially bound. The relationship between morphology and syntax is strongly linked to these operations. However, I will not discuss this subject (I have already touched on it in chapter 3) but focus on non-derived relational nouns. Their θ-grid is constituted by two θ-roles so that the modifier may satisfy the internal role.

The DRS-construction for NN-compounds is based on morphological rules proposed by Selkirk (1982) and described in chapter 3.2. Selkirk's approach is characterized by simple phrase structure rules for generating complex words. Synthetic and root compounds are generated by the same rule. They differ with respect to satisfaction of a head argument by the modifier. If the modifier satisfies an argument of the head noun, the compound is synthetic. So tree eater is a synthetic or a root compound, depending on whether the modifier satisfies the argument of the head ('eater of trees') or not (e.g. 'eater of something who is localized in a tree'). Feature percolation is defined from the righthand terminal node to the top node for NN-compounds.
I adopt the use of one phrase structure rule for DRS-construction of NN-compound representations with sortal or relational head. The interpretation of NN-compounds with relational head makes use of its relational nature. Satisfaction of the internal argument of the head noun by the modifying noun will trigger a particular compounding rule. The word formation rules are given in chapter five. The structure of NN-compounds is the following:

mother node: [+N -V; num = $\alpha$, gen = $\beta$]
left daughter (modifier): [+N -V; num = $\delta$, gen = $\gamma$]
right daughter (head): [+N -V; num = $\alpha$, gen = $\beta$]

The rule generating this morphological structure triggers the embedding of the lexical DRS of the modifying noun into the lexical DRS of the head noun. The default template is:

[Schema: a lexical DRS that contains the lexical DRS-conditions of the head noun and, as an embedded DRS-condition, the lexical DRS-conditions of the modifier.]
By embedding the modifier-DRS into the head-DRS as a particular DRS-condition, the modifier gets generic quantification by default. This is marked by the genericity-operator '!'. In discourse, however, genericity can be replaced by context-dependent kinds of quantification, as will be shown in chapter six. The precise construction is subject to NN-compound interpretation rules that are given in the next chapter. For example, non-satisfaction of an internal head argument by the modifier forces the use of a semantically or conceptually motivated relation in order to interpret the compound. The schema above just provides a lexical DRS $K_{lex} = \langle U_{lex}, Con_{lex} \rangle$, demanding a particular lexical DRS-condition that is built from the lexical DRS of the modifier. The modifier-DRS contains a distinguished marker in its universe that was bound as external marker.

The definition of lexical DRSs for NN-compounds allows the construction of lexical DRS patterns as representations of longer concatenations of nouns. Structural differences, e.g. between [[NN]N]- and [N[NN]]-compounds, are reflected in different embeddings of the lexical DRSs of their constituents. By way of illustration, both possible lexical DRSs for three concatenated sortal nouns are given.

[[NN]N]-compounds:
λz <{z}, {lexical DRS-conditions of head noun of [[NN]N], ![<{[y]}, {lexical DRS-conditions of head noun of [NN], ![<{[x]}, {lexical DRS-conditions of modifying noun of [NN]}>]}>]}>
[N[NN]]-compounds:
λz <{z}, {lexical DRS-conditions of head noun of [N[NN]], ![<{[y]}, {lexical DRS-conditions of modifier of [NN]}>], ![<{[x]}, {lexical DRS-conditions of modifier of [N[NN]]}>]}>
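The difference between the two embeddings can be made concrete with a small sketch. It is only an illustration under stated assumptions: the class `LexDRS`, the functions `compound` and `show`, and the placeholder nouns A, B, C are hypothetical names, and the sketch models nothing beyond the default template (the modifier DRS becomes a '!'-marked condition of the head DRS).

```python
from dataclasses import dataclass, field

@dataclass
class LexDRS:
    """Toy lexical DRS: a universe of markers plus a list of conditions.
    A condition is a string or a ('!', LexDRS) pair for a generic sub-DRS."""
    universe: list
    conditions: list = field(default_factory=list)

def compound(modifier, head):
    """N -> N N: embed the modifier DRS as a '!'-condition of the head DRS."""
    return LexDRS(list(head.universe), [("!", modifier)] + list(head.conditions))

def show(drs):
    """Render a DRS in the linear <universe, conditions> notation."""
    parts = []
    for cond in drs.conditions:
        parts.append("!" + show(cond[1]) if isinstance(cond, tuple) else cond)
    return "<{" + ", ".join(drs.universe) + "}, {" + ", ".join(parts) + "}>"

# Three placeholder sortal nouns (hypothetical).
a = LexDRS(["x"], ["A(x)"])
b = LexDRS(["y"], ["B(y)"])
c = LexDRS(["z"], ["C(z)"])

left = compound(compound(a, b), c)    # [[A B] C]: modifier DRSs are nested
right = compound(a, compound(b, c))   # [A [B C]]: modifier DRSs are siblings
print(show(left))
print(show(right))
```

In the left-branching case the DRS of A sits inside the DRS of B, which in turn sits inside the DRS of C; in the right-branching case the DRSs of A and B are both direct conditions of the DRS of C.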
Thus, the rule N → N N triggers this lexical DRS combination rule: If the lexical DRS $K_{N_2}$ of the head noun has the form $\langle U_K, Con_K \rangle$ with $U_K = \{y_n, \ldots, y_1\}$ and $Con_K = \{\Delta_1, \ldots, \Delta_s\}$, and the lexical DRS $K_{N_1}$ of the modifying noun has the form $\langle U_{K'}, Con_{K'} \rangle$ with ... $Con_{K'} = \{\Sigma_1, \ldots, \Sigma_r\}$, the lexical DRS $K_{N_1 N_2}$ has the form $\langle U_{K''}, Con_{K''} \rangle$ with $U_{K''} = \{y_n, \ldots, y_1\}$ and $Con_{K''} = \{\,![\ldots]\,\ldots\}$.

This template provides the most general form of lexical DRSs of NN-compounds. Their actual forms are determined by additional conditions. Lexical DRSs of simple and complex words are subject to lexical insertion rules. Since θ-role assignment is a crucial feature for morphological and syntactic theories, simple lexical insertion rules as sketched in chapter 4.2 are not sufficient for inserting lexical DRSs into a main DRS while constructing a representation of a discourse. Lambda-operators must be discharged by assigning θ-roles to governed phrases. Unassigned optional roles have to be changed into existentially bound variables. Therefore lexical DRSs cannot be inserted into a main DRS as a whole; elements of their universes have to be shifted into universes of sub-DRSs such that θ-role assignment is possible and accessibility still holds. A complete theory of insertion of lexical DRSs would go far beyond this work. Therefore I will give the minimal requirement on lexical insertion of simple and complex words: Reference markers bound by θ-roles are shifted into universes of DRSs such that they can be used for assignment.

Some lexical DRSs of simple nouns are presented below. Their meaning should be self-explanatory. These are lexical DRSs for museum and fan: museum: λx
<{x}, {χ(x,P), P = <{y, e1, e2}, {exhibiting(e1), informing(e2), theme(e1,y), theme(e2,y)}>}>
fan: λx λy
4.8
Conceptual Shifts of Single Nouns
I argued for the necessity of drawing a distinction between lexical representations (lexical meanings; given as lexical DRSs) and contextually determined denotations. Within the two-level approach, utterance meanings are extensions of lexical meanings by contextually specified information. An approach to integrating the two-level semantics into the current DRT has been made. The semantics of lexical DRSs is given by a relation between referential markers of lexical DRSs and the accessibility set of a knowledge base built from conceptual DRSs. The set of possible concepts a lexical DRS can denote is determined by the purpose-operator χ. Conceptual knowledge is interpreted by the embedding into a total model. A lexical DRS is a pair $\langle U_{lex}, Con_{lex} \rangle$, consisting of a set of referents and a set of lexical DRS-conditions. The latter set represents the context-invariant kernel of all contextually specified utterance meanings. So to speak, lexical DRSs represent an underdetermined meaning that has to be established. In a discourse, the set of domains will be specified. Contextual variation of single nouns means mainly variation in the set of contextually possible domains that may be assigned to a noun. Fixation of contextually possible domains also fixes relations between domains and extensions of lexical DRS-conditions. In this chapter I shall explore how domains are contextually determined. The confirmation relation between lexical DRSs and conceptual DRSs just identifies all possible domains that might be contextually relevant. Further information is needed in order to identify contextually relevant domains. Establishing contextually relevant domains is done by two kinds of conditions that make use of selectional restrictions:
1. Subsumption of a required domain in the semantic form of θ-role assigning elements and the domain of the noun. θ-role assigning lexical categories are verbs, prepositions, adjectives and relational nouns.
2. Subsumption of a domain given in a script or frame and possible domains of the noun. A script represents relationships between concepts and lays out default orders of courses of events.
I will explain both conditions on utterance meanings of single nouns. (1) Contextual variation of [-N]-categories is fundamentally different from variations of nouns. Verb meanings are not conceptually underdetermined as referring to different domains but they can be specified within one domain. For example, to cut refers to the event of cutting, but it can be specified by different objects that can be cut: cutting hair, grass, metal plates or paper requires specialized cutting events which can also involve different instruments. The same argument holds for prepositions as well if we assume that local prepositions are not a basis for temporal usage of them. For example, in is ambiguous: it can be used for temporal and spatial descriptions, as in Hans lives in Berlin and Hans will come in two hours. Both kinds of prepositional use however allow variation in their domain with respect to their application. So the in-localization of The flowers in the vase is different from The carpet in the room. These examples show that nouns behave differently from
verbs and prepositions in one fundamental aspect: they can denote different domains while verbs and prepositions can be specified within one domain. Adjectives and relational nouns might specify the domain of their arguments as well. For example, only persons are able to be miserly and only acids are corrosive. A Fan (fan of) might be a fan of everything; there are no selectional restrictions. But a chancellor is the chief of an institution, and a player is a player of games or music. These are intersentential conditions on utterance meanings of single nouns. General conditions on utterance meanings may be provided by scripts: (2) Scripts and frames²¹ respectively determine stereotypical courses of events. They represent conceptual knowledge of events and scenarios. Scripts do not only order events but they also have influence on the set of possible inferences that can be drawn from sentences and their lexical material for coherence of a discourse. A script consists of three components: it declares the entities involved in a particular course of events; it determines the default order of events; and it gives conditions for entering the script. An example of a discourse whose coherence is violated by the violation of a stereotypical expectation is described by Bosch (1985): There is an announcement at a conference that there will be coffee at half past ten in the entrance hall of the building. When the break starts, coffee beans are located in the hall. From a logical point of view the text is true but there is a pragmatic fault. Obviously the conference-script is violated. But what does the violation mean for the meaning of the lexical item coffee? Coffee might have a semantic form characterizing the purpose of providing a liquid. The concept which goes with it represents the plant with its fruit and all the mediate concepts that are related to coffee powder, beans, the drink and so on.
There is no selectional restriction in the first sentence of the discourse for the meaning of coffee but we had a selected concept in mind when the sentence was processed, namely the coffee-drink concept. In this case, background knowledge of stereotypical conferences drives the selection of the relevant concept. So general knowledge of what might happen in a course of events determines what kind of inferences are drawn and which ones have to be avoided. Scripts can be represented in slightly different DRSs as well, as shown by Bartsch (1987)²². According to Bartsch's translation of frames and scripts into DRSs, a conference-script might be represented as follows (< is the temporal order relation between events and scenes):

²¹ Scripts (e.g. Schank and Abelson (1977)) are more event-oriented while frames (Minsky (1981)) are object-centered.
²² Bartsch's proposal also handles problems of word meaning, their underlying concepts, and the relationship to discourse representations. However, there are some fundamental differences between Bartsch's approach and the integration of the two-level semantics into DRT I am working out. First, Bartsch is not dealing with contextual variation of word meanings but with possible inferences in a discourse based on word meanings. Second, her main goal is to contribute to a compositional semantics for DRT, partly based on scripts and frames.
<{e}, {conference(e)}> ⇒ default: <{a, b, c, d, e1}, {participating(e,a), location(e,b), participant(a), conference-room(b), talk(c), hearing(a,c), coffee-break(d), ...}>

... Seite, und zwar des Schrankes, ist gelb.
Seite (side) is relational with an optional internal argument.
1. Der Biologieprofessor klonte seine Studenten. (The biology professor cloned his students)
2. Der Professor für Biologie klonte seine Studenten.
3. Für Biologie der Professor klonte seine Studenten.
4. Der Professor aus München für Biologie klonte seine Studenten.
5. Für was ist er Professor?
6. Der Professor, und zwar für Biologie, klonte seine Studenten.
Professor is a sortal noun and therefore has no internal argument. The compound Biologieprofessor is not formed by functional application. Thus, German relational nouns can in most cases be distinguished from sortal nouns by a detachment test that separates arguments from adjuncts. Compounds may be generated by means of argument satisfaction if the head noun is relational. The lexical meaning of the head noun determines possible candidates for argument satisfaction by selectional restrictions. If a relational head noun is combined with a modifier that does not meet the selectional restrictions, the compound still is relational and a relation between head concept and modifier concept has to be inferred. The internal argument will not be discharged.

... detached from the head while arguments have to be governed by the head:
1. Der Eintopfkoch hat es nur gut gemeint mit dem Salz
2. Der Koch des Eintopfes hat es nur gut gemeint mit dem Salz
3. *Vom Eintopf der Koch hat es nur gut gemeint mit dem Salz
4. *Der Koch aus Ungarn des Eintopfes hat es nur gut gemeint mit dem Salz
5. *Von was ist der Koch, der es nur gut gemeint hat mit dem Salz?
6. *Der Koch, und zwar des Eintopfes, hat es nur gut gemeint mit dem Salz
According to this test, Koch is relational with an optional internal argument. So tests on word order and insertion of adjuncts seem to correspond to the intuitions on arguments and adjuncts. Deverbal nouns such as Koch (cook) or Bäcker (baker) are more likely to be relational nouns than sortal ones.

For example,
the lexical meaning of Computerbruder (computer brother) may be sketched in this way: $\lambda y\, \lambda x\, [\textit{brother-of}(x, y) \wedge [\exists z, R: \textit{computer}(z) \wedge R(x, z)]]$. Satisfaction of the internal argument is still possible:
1. Peters Computerbruder (Peter's 'computer brother')
2. Das Aluminiumteil des Autos (The 'aluminium part' of the car)
3. Der Eisenkopf der Laterne (The 'iron head' of the streetlight)
So the compounding rule for novel NN-compounds with a relational head can be stated as follows: If the lexical DRS of the head noun $N_2$ is $K_{N_2} = \lambda y_2\, \lambda y_1\, \langle \{y_1, y_2, \ldots\}, \{\text{lexical DRS-conditions of } K_{N_2}\} \rangle$ and the lexical DRS of the modifying noun is $K_{N_1} = (\lambda x_2)\, \lambda x_1\, \langle \{(x_2), x_1, \ldots\}, \{\text{lexical DRS-conditions of } K_{N_1}\} \rangle$ and $K_{N_2}$ refers to the complete concept $C_2 = \{Q_1(b_1, c_1), \ldots, Q_m(b_m, c_m)\}$ and $K_{N_1}$ refers to the complete concept $C_1 = \{P_1(a_1), \ldots, P_n(a_n)\}$ and there are selectional restrictions $R_i(c_i)$ ($1 \le i \le m$) and for some $i, j$ ($1 \le j \le n$), $\textit{subsumption}(R_i, P_j)$ holds, then the lexical DRS $K'$ of $[N_1 N_2]$ is:

<{y1, y2}, {lexical DRS-conditions of K_2, ![<{[x1], x2, ...}, {lexical DRS-conditions of K_1, x1 = y2}>]}>
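The subsumption test at the heart of this rule can be sketched as follows. This is a minimal illustration, not the rule itself: the lattice `PARENT`, the concept labels, and the assumption that Bruder requires an animate internal argument are all hypothetical; only the top-node restriction of Fan is taken from the text.

```python
# Hypothetical toy concept lattice (child -> parent); "top" subsumes everything.
PARENT = {
    "book": "artefact", "computer": "artefact", "artefact": "top",
    "person": "animate", "animate": "top",
}

def subsumes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    node = specific
    while node is not None:
        if node == general:
            return True
        node = PARENT.get(node)
    return False

def compound_relational(restriction, modifier_concepts):
    """Try to satisfy the head's internal argument with the modifier.

    restriction: concept the head requires of its internal argument.
    modifier_concepts: concepts in the modifier's complete concept.
    Returns the fitting concepts; an empty list means the argument is not
    discharged and a relation R has to be inferred instead.
    """
    return [c for c in modifier_concepts if subsumes(restriction, c)]

# Fan (fan of): no selectional restriction, i.e. the top node of the lattice.
buecherfan = compound_relational("top", ["book"])
# Bruder (brother of): assumed here to require an animate internal argument.
computerbruder = compound_relational("animate", ["computer"])
print(buecherfan)      # non-empty: x1 = y2, argument discharged
print(computerbruder)  # empty: argument stays open, as with Computerbruder
```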
Information used for this compounding rule is provided by the θ-grid of the head noun and selectional restrictions on the concepts denoted. This rule results in the only representation of NN-compounds that is mainly triggered by grammatical features. All other representations are constructed by means of conceptual and discourse information. The lexical DRSs of Fan (fan of) and Buch (book) are given in the previous chapter. Bücherfan (book fan) is then representable by means of the compounding rule given above because the head provides an internal argument. Concepts denoted by the modifier are subsumed by the concept of the head's internal argument. Thus, the lexical DRS $K_{bookfan}$ is:
<{x, y}, {fan-of(x,y), ![<{[z], ...}, {χ(z,P), P = <{v, w, e3, e4}, {publ-comp-inst(v), publishing(e3), mediating(e4), theme(e4,w), agent(e3,v)}>, z = y}>]}>
Notice that this lexical DRS denotes all modifier readings provided by its complete concept. The concepts provided by the complete concept denoted by Buch are all subsumable by the concept of the internal argument of the head, which is the top node of the concept lattice. Fans can be fans of everything; there are no selectional restrictions.
5.2
Possible Sources for Relations in NN-Compounds
Throughout all chapters I argued for a distinction between the representation of lexical meaning and representations of utterance meanings. I also gave the compounding rule for novel NN-compounds with a relational head. This rule is based on the argument structure of the head and selectional restrictions for the modifier; the rule results in the lexical meaning of the compound. If argument satisfaction takes place, selectional restrictions in the lexical meaning of the head noun determine the reading of the modifying noun. Conceptual shifts in the sense defined above are not possible for non-derived relational nouns⁴. So Museumsseite (museum side), Bücherfan (book fan), Bäckermutter (baker mother) with the meaning 'x-of y' do not allow contextual variation from outside the compound for the extension of the constituents: the domain of the modifier concept is determined by selectional restrictions provided by the head noun and the lexical meaning of the head noun is its utterance meaning at the same time. As we will see below, other meanings are possible on conceptual grounds rather than θ-role assignment. But if the head noun assigns its internal role to the modifier, no contextual variation is possible anymore. θ-role assignment is just one kind of compound formation. It is even the only grammatical process supporting the interpretation of novel NN-compounds.

⁴ However, pragmatic and quasi-metaphoric interpretations of non-derived nouns are still possible. For example, if I know that someone invites my girlfriend to a party for becoming better acquainted with her and I meet him, I might say: Du bist also der Partyeinlader (So you are the 'party inviter'). Then I do not mean 'the person who invites to a party' but 'the guy who invites my girlfriend to a party for certain intentions'. In that case there is a certain meaning of the compound that is fixed to a very specific situation. Such pragmatic meaning variability is not subject to my work although pragmatic and conceptual meaning shifts of course overlap. An example of quasi-metaphoric interpretations is Soldatenvater (soldier father) with the meaning 'someone who cares for soldiers like a father'. A typical behaviour of fathers is adopted to the extension of the head noun. However, this is no conceptual shift but transmission of attributes. Moreover, as Bierwisch (1989) points out, derived nouns, e.g. event nominalizations, allow a shift to the result of the event and there is also a shift from agent nominalizations to the tool used for the action of an agent. An example of the former is Isolierung (isolation); an example of the latter is Bohrer (driller). Often compounding cannot make the relevant extension explicit. For example, Rohrisolierung ('tube isolation') can denote the material used for tube isolation as well as the event of isolating tubes. The underlying mechanisms for conceptual shifts in nominalization are quite unclear and I will not go into this problem.

Now within the previous chapters I argued for the existence of three levels of representation: lexical meaning as representation of context-independent information a lexical item contributes to a discourse; representation of typical properties entities have at a conceptual level; and representation of discourse where lexical meanings and discourse structure come together. These three representational levels may serve as sources for interpretations of NN-compounds. Each level has its specific implications for saliency of a relation and contextual variability of the constituent extensions. Taking these three levels and the grammatical property of being a relational noun together, there are four sources the relation in a compound can derive from:
1. The grammatical process of thematic role assignment
2. The relation belongs to the lexical meaning of the head
3. The relation is based on conceptual knowledge
4. The relation is based on discourse knowledge
Level 1 as a grammatical process results in the most salient interpretation. Level 2 comprises relations that are involved in forming lexical meanings of lexical items. Therefore these relations lead to the most salient interpretations of compounds with a sortal head.
An additional consequence: such a relation does not determine context-specific extensions of the constituents. Since lexical meaning is defined as the context-independent representation of all possible non-metaphorical meanings, relations from this representation are not subject to contextual variations. Such relations may not cause conceptual shifts. Reasons for these shifts are thus founded in discourse. Level 3 provides information on typical properties of entities. By its large amount of knowledge it results in many relations possible in NN-compounds. Selectional restrictions for the arguments of these relations lead to specific extensions of the constituents so that conceptual shifts are determined by these sortal restrictions. Interpretations based on conceptual properties are ordered with respect to saliency as well. The final level is the level of discourse representation (4). Discourse representation (DR) is understood as script- and syntax-driven representation with lexical meanings of lexical items and additional knowledge of discourse structure. DRs are, parallel to lexical items, mapped into conceptual structures in order to determine utterance meanings and inferences to be drawn. Therefore discourse representation and concept representation can have characteristic functions for the interpretation of novel NN-compounds:
(a) In DRs, antecedents of the compound's constituents may be given and a relation holding between the antecedents. The context-dependent conceptual representation of this relation overwrites all other meanings possible in the compound. It is the actual meaning of the compound in the discourse.
(b) In the conceptual system, relations founded on conceptual information are triggered by discourse information and inferred conceptual knowledge. Inferences on previous knowledge result in certain relations that have to be chosen from all possible sources.
Chapter 5.4 is about conceptual properties of objects and substances and their contribution to certain compound interpretations. Conceptual shifts of novel NN-compounds are discussed and explained in 5.6. The contribution of discourse representation to NN-compound interpretation is explained in chapter 6.
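The interplay of the four sources can be summarized in a small sketch. It rests on one reading of the passage above, namely that a relation supplied by the discourse overrides everything else, and that otherwise the sources are tried in the order θ-role assignment, lexical meaning, conceptual knowledge. The function name, the source keys and the relation labels are hypothetical.

```python
# Context-free saliency order, read off the four sources listed above.
SALIENCY = ["theta_role", "lexical_meaning", "conceptual_knowledge"]

def choose_relation(candidates):
    """Pick the relation for a compound.

    candidates maps a source name to the relation that source offers;
    sources that offer nothing are simply absent from the dict.
    """
    # A relation holding between discourse antecedents overrides the rest.
    if "discourse" in candidates:
        return "discourse", candidates["discourse"]
    for source in SALIENCY:
        if source in candidates:
            return source, candidates[source]
    return None

# A compound with a sortal head in isolation (relation labels are made up):
isolated = {"lexical_meaning": "exhibiting", "conceptual_knowledge": "located-in"}
print(choose_relation(isolated))

# The same compound when the discourse already supplies a relation:
in_context = dict(isolated, discourse="financed-by")
print(choose_relation(in_context))
```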
5.3
Relations from Lexical Representations
Based on the level ordering proposed above, I will focus my attention on each level the relations may derive from. The process of θ-role assignment was explained in 5.1 so that in this chapter I am able to turn to relations that belong to the lexical meaning of nouns. I pointed out that lexical DRSs of artefact nouns represent context-invariant core meaning by means of the purpose the artefact is designed for. Two-place relations in lexical DRSs may be used in compound interpretation. For example, Büchermuseum (book museum), Tassentisch (cup table) and Wassertasse (water cup) are interpretable by means of the purpose the head noun concept is designed for. Then the first compound means 'museum exhibiting on books'; the second one gets the meaning 'table where cups are located on' and the third one means 'cup with water in it'. These meanings are always the preferred ones. So the conditions for usage of relations from lexical DRSs for compound interpretation must be stated in this chapter. Primary functions of artefact objects are not part of conceptual knowledge proper because they are constituted by language-specific conditions. Labels for artefacts are only usable as long as it is known what the purpose of the artefact is. For instance, an object may be named hammer if it is known that the object maximizes power for pushing a different thing into some material. Therefore the lexical meaning of artefact nouns is determined by the primary functions of that object. Since these functions are located on the language-specific level of lexical representations rather than the level of culturally determined conceptual structures, they are more salient than conceptually founded relations. If the head noun does not assign a θ-role to the modifier, they lead to the most natural interpretation of NN-compounds. By way of illustration, let us have a look at the lexical DRSs of Tasse (cup) and Tisch (table):
Tasse: λx <{x}, {χ(x,P), P = <{y, s}, {containing(s), contained(s,y), container(s,x)}>}>
Tisch: λx <{x}, {χ(x,q), q = <{y, s}, {supporting(s), supporter(s,x), supported(s,y)}>}>
Two-place relations may be used for compounding if selectional restrictions are met. Compounds such as Teetasse (tea cup) and Wassertasse (water cup) are interpretable by the use of the containing-relation. For reasons of simplicity, I will sketch the lexical meaning of water as a simple lexical DRS without decomposition. Its lexical DRS is λx <{x}, {water(x)}>. The combination process is triggered by subsumption. If the domain of a referent appearing in the purpose-box subsumes the domain of the modifier's external discourse referent, the referent of the modifier will be equated with the referent that appears also as argument of the subsuming predicate. However, for a uniform representation meeting the accessibility condition for discourse referents, the suitable discourse referent also has to be equated with a referent that is newly introduced in the universe of the lexical DRS. Thus one gets for Wassertasse: λx
<{x, v}, {χ(x,P), P = <{y, s}, {containing(s), container(s,x), contained(s,y), y = v}>, ![<{[w]}, {water(w), w = v}>]}>
If we have a closer look at an example with a modifier represented by a more complex lexical DRS, the same mechanism is applicable. For instance, the lexical DRS of Tassentisch (cup table) is:
λx <{x, u}, {χ(x,q), q = <{y, s}, {supporting(s), supporter(s,x), supported(s,y), y = u}>, ![<{[w]}, {χ(w,P), P = <{a, b}, {containing(a), contained(a,b), container(a,w)}>, w = u}>]}>
Hence I can state the compounding rule for NN-compounds with a sortal head where a two-place relation from the lexical meaning of the head noun constitutes a meaning of the compound: If the lexical DRS of the head noun $N_2$ is $K_{N_2} = \lambda y\, \langle \{y\}, \{\chi(y, p),\ p = \langle \{z_1, \ldots, z_n\}, \{\text{conditions with: } r(y, z_i)\} \rangle\} \rangle$ and the lexical DRS of the modifier $N_1$ is $K_{N_1} = (\lambda x_2)\, \lambda x_1\, \langle \{x_1, x_2\}, \{\text{lexical DRS-conditions of } K_{N_1}\} \rangle$ and $K_{N_2}$ refers to the complete concept $C_2 = \{Q_1(b_1), \ldots, Q_m(b_m)\}$ and $K_{N_1}$ refers to the complete concept $C_1 = \{P_1(a_1), \ldots, P_n(a_n)\}$ and for some $Q_i(b_i)$, their concept description contains the denotation of $r(y, z_i)$ that requires a concept $C(c_i)$ for its second argument and for some $j$ ($1 \le j \le n$), $\textit{subsumption}(C, P_j)$ holds, then the lexical DRS $K'$ of $[N_1 N_2]$ is:

<{y, w, ...}, {χ(y,p), p = <{z_1, ..., z_n}, {conditions with: r(y,z_i), z_i = w}>, ![<{[x1], x2, ...}, {lexical DRS-conditions of K_1, x1 = w}>]}>
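A minimal sketch of this rule, under loudly stated assumptions: the lattice `PARENT`, the table `PURPOSE` and in particular the restrictions placed on the second argument of each purpose relation ("liquid" for containing, "object" for supporting) are hypothetical; the text only states that such a restriction exists. The returned equations mirror the Wassertasse example (y = v in the purpose box, w = v in the embedded modifier DRS).

```python
# Hypothetical toy lattice (child -> parent).
PARENT = {
    "water": "liquid", "tea": "liquid", "liquid": "substance",
    "substance": "top", "cup": "object", "object": "top",
}

def subsumes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    node = specific
    while node is not None:
        if node == general:
            return True
        node = PARENT.get(node)
    return False

# Purpose relation of each head and an ASSUMED restriction on its second argument.
PURPOSE = {
    "Tasse": ("containing", "liquid"),
    "Tisch": ("supporting", "object"),
}

def compound_sortal(modifier_concept, head):
    """Apply the head's purpose relation to the modifier if subsumption holds."""
    relation, restriction = PURPOSE[head]
    if not subsumes(restriction, modifier_concept):
        return None  # the relation has to come from another source
    # y: second argument of the relation, w: modifier referent, v: new marker
    return {"relation": relation, "equations": ["y = v", "w = v"]}

print(compound_sortal("water", "Tasse"))  # Wassertasse
print(compound_sortal("cup", "Tisch"))    # Tassentisch
print(compound_sortal("water", "Tisch"))  # None in this toy lattice
```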
Apart from the saliency of the relation chosen, another implication becomes visible. Dynamics of noun meaning is treated in this approach mainly by domain assignment. Domains are determined either by the role-assigning element or selectional restrictions of
the relation. Hence in this case the domain of the modifier is determined by the relation from the lexical DRS of the head noun but the domain for the head still is open for specialization. For instance, compounds like Tassenmuseum (cup museum) and Aztekenmuseum (Aztec museum) are interpretable by the 'exhibiting' and 'informing'-relations belonging to the lexical meaning of Museum. The value of this relation is not restricted to a specific domain. Therefore the lexical meaning of the compound may still be subject to conceptual shifts. For instance, the first compound may be inserted with the same meaning of 'exhibiting and informing on cups' into clauses that fix different concepts of the head noun:
1. Das Tassenmuseum entschied, eine neue Plastik zu erwerben. (The cup museum decided to acquire a new sculpture): museum (institution) exhibiting on cups (complete concept).
2. Das Tassenmuseum wird renoviert. (The cup museum will be renovated): museum (building) exhibiting on cups (complete concept).
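The two readings can be mimicked with a small sketch in which the compound-internal relation stays constant while the clause fixes the domain of the head. The tables `HEAD_DOMAINS` and `VERB_REQUIRES` are hypothetical stand-ins for the complete concept of Museum and for the selectional restrictions of the two verbs; they cover only the two example sentences.

```python
# Domains offered by the head's complete concept (only the two from the examples).
HEAD_DOMAINS = {"Museum": {"institution", "building"}}

# ASSUMED selectional restrictions of the verbs in the two example clauses.
VERB_REQUIRES = {"entscheiden": "institution", "renovieren": "building"}

def utterance_meaning(head, relation, verb):
    """Fix the head's domain from the verb; the compound relation is unchanged."""
    required = VERB_REQUIRES[verb]
    if required not in HEAD_DOMAINS[head]:
        return None
    return (required, relation)

relation = "exhibiting and informing on cups"
print(utterance_meaning("Museum", relation, "entscheiden"))
print(utterance_meaning("Museum", relation, "renovieren"))
```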
5.4
The Conceptual Basis
In this chapter I will characterize conditions for NN-compound interpretation that are given by the structure of their denotations, i.e. knowledge of prototypical properties of certain entities. The knowledge-dependent character of NN-compound interpretation makes it hard to determine fixed rules because object knowledge is fundamentally different from linguistic knowledge. Object knowledge is subject to much more intersubjective variation than linguistic knowledge, depending for the most part on education and intelligence. It is highly interwoven, which makes it hard to fix relevant properties on the knowledge representation level. Therefore this subchapter has a strongly hypothetical character. Furthermore, only those properties and relations are indicated that are used for certain compound interpretations. This procedure naturally leads to limitations with respect to applications of interpretation rules because complete concept representations, covering all properties an object has, are impossible to give. I will however have a close look at several conceptual domains. Furthermore, we will focus on location relations for compound interpretation because detailed analyses of this domain are provided in the area of preposition semantics. The development of a conceptual basis for compound interpretation is also intended to be a first attempt to go beyond the ad-hoc nature of knowledge representation. I will try to give typical properties in order to determine properties of entities that are essential for them. The conceptual basis however should not reflect scientific theories of properties of materials or objects, but underlying knowledge coming from our experience. This is exactly what is postulated under the paradigm of 'naive physics' (Hayes 1978, Hayes 1985a). We should not write down scientific theories of the world but naive theories
of how objects are located to each other, how they are made, how they can be moved, destroyed, grasped and so on; i.e. all the knowledge people have, based on perception. To start with, I will assume that the conceptual system is structured into basic domains, each subject to specific principles and clustering specific subdomains. The conceptual system may consist at least of the basic domain $O$ of (abstract and concrete) objects, a domain $M$ of masses or substances, a domain $T$ of time intervals, a domain $E$ of events and a domain $C$ of location units⁵. Apart from the domain of time intervals, each domain might be structured as a semi-lattice in order to be modelled in term-describing languages such as KL-ONE. However, domains like $O$, $T$ and $C$ have a ground level of smallest elements whereas the domains $M$ and $E$ might be ungrained, i.e. they always have elements that have proper parts of the same kind of element. For instance, the domain of vehicles, a subdomain of $O$, provides a ground level of individual vehicles that do not have parts of the same sort. So a bicycle is made of a frame, wheels, rims and so on but not of bicycles. Contrary to this example, masses do not have a level of smallest elements in a naive theory. In addition to the assumption of certain basic domains, inference patterns are postulated in order to represent causal relationships. The postulated basic domains are interwoven with each other by fundamental ontological links. Among these links, there are the important functions of spatial and temporal location, mapping elements of certain domains into places and time intervals respectively. I will assume that the function of spatial location loc is defined time-dependently for every basic domain except $T$: $loc: T \times (O \cup E \cup C \cup M) \to C$. This means that objects, events, masses and places are localized at a certain time interval at a place. The temporal localization function may be defined for events and time intervals.
Furthermore, there may be a function mapping substances into objects, and a function mapping objects to events. Although this is a rather general characterization that leaves a lot of questions open, I will not go into a more detailed description⁶. The purpose of this short characterization of basic domains of the conceptual system is to fix the fundamental framework for the characterization of subdomains with respect to inherent properties for NN-compound interpretation. For instance, we will see that location plays an important role in interpreting NN-compounds. Then the localization function in question just mirrors the general link between objects and the place they occupy (which can also be the surface of another object or part of the space it occupies). Eventually the question might come up whether writing down conceptual knowledge is scientific work at all. Hayes' (1985a:35) answer is: "Doing this job is necessary, important, difficult and fun. Is it really scientific? Who cares?" I think I will follow this statement, especially because NN-compound interpretation is based largely on conceptual knowledge.
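The time-dependent localization function can be pictured with a typed sketch. It assumes only what the gloss above states: objects, events, masses and places are localized at a time interval at a place, and time intervals themselves are not. The names `Domain`, `Entity`, `loc` and the `facts` table are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

class Domain(Enum):
    OBJECT = "object"
    MASS = "mass"
    TIME = "time interval"
    EVENT = "event"
    PLACE = "place"

@dataclass(frozen=True)
class Entity:
    name: str
    domain: Domain

def loc(time, entity, facts):
    """Return the place `entity` occupies at `time`, looked up in `facts`."""
    if time.domain is not Domain.TIME:
        raise TypeError("first argument must be a time interval")
    if entity.domain is Domain.TIME:
        raise TypeError("loc is not defined for time intervals")
    place = facts[(time, entity)]
    if place.domain is not Domain.PLACE:
        raise ValueError("loc must map into the domain of places")
    return place

t1 = Entity("t1", Domain.TIME)
cup = Entity("cup", Domain.OBJECT)
table_top = Entity("table_top", Domain.PLACE)
facts = {(t1, cup): table_top}  # made-up world knowledge

print(loc(t1, cup, facts).name)
```

The place returned may itself be the surface of another object, which is what the location-based compound readings discussed later rely on.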
⁵ I will not go into the discussion whether events are entities or configurations of space and time units. For a short discussion, see Bierwisch (1988).
⁶ As a matter of fact, this would also lead to an endless discussion.
5.4.1
Prototypical Properties of Substances
This subchapter is concerned with mass concepts and their contributions to novel NN-compound interpretation. I will not go into the literature on mass expressions because I am interested in prototypical properties of substances and not in the semantics of mass expressions⁷. Novel NN-compounds with both constituents denoting substances are preferably interpretable by various kinds of mixing-relations. The mixing-relation chosen depends on properties of the substance concepts denoted. For example, Zuckermilch (sugar milk) denotes milk with sugar dissolved in it, but Wassermilch (water milk) denotes milk that is watered down. The contribution of various prototypical properties of substances to certain compound interpretations is to be worked out. Substances have various characteristic properties that distinguish them from classes of individual objects. Physical objects can be divided into their proper parts while substances are in principle divisible into smaller amounts of the same substance: every part of sand is also sand but a part of a computer is not a computer. Substances can be materials that objects are made of. They can be roughly subdivided into solids, pastes, powders, liquids and gases. Each subtype has various properties resulting from physical laws and separating them from each other. Solids are typically formable only with tools and with enormous effort, depending on degree of hardness. The effort for forming objects out of substances decreases from solids to pastes. Pastes also have a certain degree of formability. Powders are no longer stable but they still can form heaps while liquids always take on the form of the space they occupy in their container. Substances can be mixed by various processes: diffusing(liquid, liquid); separating(liquid, liquid); dissolving(liquid, solid); forced-mixing(paste, paste) or forced-mixing(oil, water)⁸.
With respect to novel NN-compounds with two constituents denoting substances, the composition of substances and the typicality of substance combinations determine the relation between them. Composition means that substances can be made of other substances, natural or man-made ones. For instance, let us have a closer look at Mehlteig (flour dough) and Wassermilch (water milk). The former has a preferred interpretation as 'dough made with flour', while the latter would probably be interpreted as 'milk that is watered down'. Although dough consists of flour and milk consists of water, the compounds get different interpretations. The difference is founded in composition: doughs are made by human beings with certain substances, whereas milk is perceived as a substance not made by human beings. Our naive physical knowledge determines the interpretation rather than scientific theories.
7
There are analyses working with structured domains however. With respect to semantics of plurals and mass terms, Link (1983) considers the domain of interpretation structured as semi-lattice. He provides a materialization function from individuals and sums to their material in order to distinguish between individuals and their 'portions of matter'. Link also introduces a part-whole relation. For semantics of mass terms, see also Pelletier (1979). 8 See Hager (1985) for a formalization approach to properties of substances and Hayes (1985b) on properties of liquids.
Other kinds of mixing relations that can be used for compound interpretations are 'included-in' and 'dissolved-in'. 'Included-in' is used if two atomic substances are combined: Zuckermehl (sugar flour), Sandmehl (sand flour) and Honigsaft (honey juice) will be interpreted in this way; the constituents denote substances of the same type. Sandteig (sand dough) and Sandmilch (sand milk) will also get this interpretation because sand is not a proper part of doughs and it cannot be dissolved in liquids. Thus, features for distinguishing the mixing relations are at least: composition of substances by means of typical combinations of atomic substances, dissolvability of solids and heaviness of liquids.

Relations other than mixing relations are applicable if one of the constituents denotes an object concept. These are either object-specific relations between the object class represented by the concept and the substance concept, or localizations. For example, factories are designed for producing artefacts or man-made substances from raw material. Therefore Zementfabrik (cement factory) is interpretable as 'factory producing cement' as well as 'factory building made with cement' and 'factory using cement'. Sandfabrik (sand factory), however, is not interpretable as 'factory producing sand' because sand is not a man-made substance.

Pieces of substances occupy a place in space; they have an inner region and a surface with characteristic features. The inner region may be a place for objects, as may the surface. The surface can be smooth or rough, plain or uneven etc. Therefore a mass noun in modifier position can also be used for denoting the place the head extension occupies. Then we might get a location-in or location-on relation: Wassermüll (water garbage), Sandkrebs (sand crab) and Schieferabdruck (slate imprint) are interpretable as 'x located in/on the mass y'.
Whenever the substance cannot be a mass the head extension might be made of, a 'made-of' relation is unrealistic and a location relation can be used. Our knowledge of whether an object is made of a certain substance is related to properties of the object (its function and typical form) and to physical properties of the substance: whether it is rigid or flexible, whether it has high or low density. By way of demonstration, let us take an example: hammers. We know that there are hammers of various forms and made of various materials, depending on their function. Hammers are used for knocking nails or holes into objects that are made of specific materials. The prototypical hammer has a wooden handle and an iron head as proper parts. But there are also hammers used for, let's say, aligning pavement slabs. Appropriate to this function, such a hammer has a head made of rubber. We use knowledge of this kind for interpreting compounds like Holzhammer (wooden hammer) as opposed to Betonhammer (concrete hammer) or even Gummihammer (rubber hammer). The first compound and the third one get their interpretation by a made-of relation that exists between a part of the object and the material. The made-of relation between different materials and parts of the object is possible because there are different purposes the object is designed for. The second compound gets a different interpretation. Concrete is not a material typically used for making parts of hammers. Concrete is more likely a material of the objects (e.g. walls) that are treated with hammers. Therefore we get an interpretation like 'hammer for knocking against concrete'.
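The reasoning in the hammer example can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not part of the DRLC formalism: the dictionaries `PART_MATERIALS` and `TARGET_MATERIALS` and the function `interpret_material_modifier` are hypothetical names, and their entries merely encode the examples discussed above.

```python
# Hypothetical toy knowledge, taken from the examples in the text.
# Materials that parts of an artefact are typically made of.
PART_MATERIALS = {"hammer": {"wood", "iron", "rubber"}}
# Materials of the objects an artefact is typically used on.
TARGET_MATERIALS = {"hammer": {"concrete"}}


def interpret_material_modifier(material: str, head: str) -> str:
    """Choose a relation for a compound 'material modifier + object head'."""
    if material in PART_MATERIALS.get(head, set()):
        # made-of holds between a part of the object and the material
        return f"{head} with a part made of {material}"
    if material in TARGET_MATERIALS.get(head, set()):
        # the purpose of the object links it to the material
        return f"{head} for working on {material}"
    # made-of is unrealistic: fall back to a location relation
    return f"{head} located in/on {material}"
```

With these entries, Holzhammer and Gummihammer receive the made-of reading, Betonhammer the purpose reading, and Sandkrebs falls back to the location reading.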
Conceptual DRSs representing this knowledge must provide a specific structure of domains in order to guarantee correct usage of relations. First, certain subtypes of substances and relationships between them, as described above, must be given. If particular properties are given for both constituent concepts representing substances, there is a relation between both substances that expresses their typical relation to each other. Second, functions of objects can determine relations between these objects (or parts of them) and the materials in question. We should keep in mind that the lexical meaning of an artefact noun such as hammer is crucially determined by the purpose of this object. The predicates making up the purpose are mapped onto their counterparts in the conceptual structure such that all other conditions given in the lexical meaning are fulfilled. The mapping yields the typical material that object parts are made of, because this is stored knowledge. The representation of knowledge about substances and the relationships between substances and objects is given by these conceptual DRS-conditions:

atomic-substance(x)

composed-substance(y): [z | made-with(y,z)] ⇒ [substance(z)]

substance(v): atomic-substance(v) ∨ composed-substance(v); [w | provide-place(v,w)] ⇒ [surface(w)]; [u | provide-space(v,u)] ⇒ [inner-region(u)]

absorbing-substance(a1): substance(a1); [b1 | absorbs(a1,b1)] ⇒ [liquid(b1)]

substance(a): [b | formed-with(a,b)] ⇒ [tool(b)]; [c | formed-to(a,c)] ⇒ [reproduction(c)]; [d | reproduction-of(c,d)] ⇒ [object(d)]

substance(e): [f | formed-with(e,f)] ⇒ [tool(f) ∨ hand(f)]; [g | formed-to(e,g)] ⇒ [reproduction(g)]; [h | reproduction-of(g,h)] ⇒ [object(h)]

substance(i): [j | formed-with(i,j)] ⇒ [tool(j) ∨ hand(j)]; [k | formed-to(i,k)] ⇒ [heap(k)]

substance(l): [n | located-in(l,n)] ⇒ [container(n)]

dissolving-liquid(l1): liquid(l1); [m1 | dissolves(l1,m1)] ⇒ [substance(m1), ¬ liquid(m1)]
These conceptual DRSs split the domain of substances into various subdomains and provide links from subdomains to other domains. The relations expressed in these concept descriptions are rather general but usable in NN-compounds if modifier concepts are subsumed by the relation domains. In addition to these concept attributes, entailments for concluding relations are parts of conceptual knowledge as well. Rule-based knowledge of this kind, however, goes beyond the power of terminological logics, which are used for structural analyses of concepts rather than for providing inferences on relations. Inferences of this kind are nevertheless also involved in NN-compound interpretation9. Conditions provided by the underlying concepts of the constituents have to be met for inferring a particular relation. We should note that these are not specific concept formation rules for NN-compounds, but general conditions for using the relation in the conceptual system. Exceeding the borders of terminological languages for drawing inferences is unproblematic in DRLC. The reason for using a KL-ONE language as a starting point for knowledge representation is that it makes it feasible to represent typical properties of objects. Thus far a terminological language is justified.
9
The possibility of drawing inferences is a crucial factor in language understanding in general. Therefore knowledge representation languages such as L-LILOG (see Pletat and v. Luck (1990)), for example, offer language elements for term descriptions but also elements for representing inferences.
Extending such a language by representations of inferences is possible in DRLC. One only has to state the conditions for drawing the conclusion that a certain relation holds. Mixing relations of two substances may depend on various properties of the substances involved. Let us have a look at these compounds: Mehlteig (flour dough), Wassersand (water sand), Eimerwasser (bucket water) and Metallsäure (metal acid). All four compounds are interpretable by the same procedure: the head concept provides a link, inherited from general concepts as given above, to generalizations of the modifier concept. For instance, the first compound has a head denoting a composed substance which typically contains flour. Therefore it is interpretable as 'dough made with flour'. Likewise for the other compounds: sand is an atomic absorbing substance, so that the second compound can mean 'sand that absorbs water'; a bucket is a container water can be located in, so that the third compound can mean 'water located in a bucket'; and the last compound can mean 'acid dissolving metal'. The procedure sets up the search for suitable relations between two concepts. Its precise formulation is given at the end of chapter 5.4.

In addition to searching for relations, inferences are necessary as well. I will develop certain inferences that must be drawn in order to interpret NN-compounds with at least one mass noun as a constituent. To start, let us consider the meanings of these compounds: Rosinenteig (raisin dough) vs. Sandteig (sand dough); Lederbuch (leather book) vs. Sandbuch (sand book) and Malzbier (malt beer) vs. Limonadenbier (lemonade beer). In all three cases, the former compound is easier to interpret than the latter. Although the latter is not very likely, compounds of this type appear, as examples in chapter six will demonstrate. The first pair shows the difference between a compound expressing a composed substance made with another substance and a compound for which this relationship is impossible.
Sandteig (sand dough) does not denote dough made with sand, simply because sand is not a typical component of dough. One can imagine, however, that the compound denotes dough including sand. Hence a conclusion must be drawn on the basis that the modifier extension is not a component of the head extension. The head noun, however, does not have to denote a composed substance. It can also denote an atomic one and the same relation holds, as Goldsand (gold sand) illustrates. The compound denotes sand with gold included in it. The inference for drawing the inclusion relation is given in (1):

(1) [x y | composed-substance(x) ∨ powder(x), atomic-substance(y), ¬ made-with(x,y)] ⇒ [included-in(y,x)]
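As a rough illustration, the inclusion inference in (1) can be written as a small Python predicate. This is a sketch under stated assumptions rather than the author's formalization: the class `Substance`, its fields `kinds` and `made_with`, and the function `included_in` are hypothetical names, and the facts about sand, gold, flour and dough are only the ones suggested by the examples in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Substance:
    name: str
    kinds: frozenset = frozenset()       # e.g. {"atomic", "powder"}
    made_with: frozenset = frozenset()   # names of typical components


def included_in(modifier: Substance, head: Substance) -> bool:
    """True if 'modifier included in head' may be inferred, following (1)."""
    # premise: the head is a composed substance or a powder
    head_ok = "composed" in head.kinds or "powder" in head.kinds
    # premise: the modifier is not a typical component of the head
    not_component = modifier.name not in head.made_with
    # premise: the modifier is an atomic substance
    return "atomic" in modifier.kinds and head_ok and not_component


# Assumed toy entries for the compounds discussed in the text.
sand = Substance("sand", frozenset({"atomic", "powder"}))
gold = Substance("gold", frozenset({"atomic"}))
flour = Substance("flour", frozenset({"atomic", "powder"}))
dough = Substance("dough", frozenset({"composed"}), frozenset({"flour"}))
```

Under these assumptions the predicate licenses the inclusion reading for Sandteig and Goldsand, but not for Mehlteig, where made-with already applies.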
The second pair of compounds demonstrates application and non-application of the made-of relation. Its application is discussed in chapter 5.4.4, but its non-application implies drawing inferences for an alternative interpretation10. Sandbuch (sand book) might denote books located on or in sand because the modifier concept provides places where the head concept might be located. The conclusion that a location relation holds is drawn on the premises that made-of is not applicable to a non-decomposable part of the object and that the modifier concept provides a certain place for location:

[x y z w | physical-object(x), has-part(x,y), undecomposable-object(y), substance(z), ¬ made-of(y,z), provide-place(z,w), surface(w)] ⇒ [location-on(x,w)]

[x y z w | physical-object(x), has-part(x,y), undecomposable-object(y), substance(z), ¬ made-of(y,z), provide-space(z,w), inner-region(w)] ⇒ [location-in(x,w)]
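The two location inferences can be sketched as one small Python function. The function name `location_readings` and its parameters are assumptions made for illustration only; the sketch simply checks the premises named in the text (made-of not applicable, and the kind of place the substance provides).

```python
def location_readings(part_can_be_made_of: bool, places: set) -> list:
    """Return the location relations licensed by the two inferences.

    part_can_be_made_of: whether made-of holds between an undecomposable
        part of the head object and the modifier substance.
    places: kinds of place the substance provides
        ("surface" and/or "inner-region").
    """
    if part_can_be_made_of:
        return []  # made-of applies, so no alternative reading is inferred
    readings = []
    if "surface" in places:
        readings.append("location-on")
    if "inner-region" in places:
        readings.append("location-in")
    return readings
```

For Sandbuch, where no part of a book is made of sand and sand provides both a surface and an inner region, the call `location_readings(False, {"surface", "inner-region"})` yields both location relations; for Lederbuch the made-of relation applies and nothing is inferred.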
The last pair of NN-compounds demonstrates the mixing of two substances that are not impervious, as opposed to one substance being a component of the other. Malzbier (malt beer) denotes beer made with malt, but Limonadenbier (lemonade beer) can denote beer diffused with lemonade because beer is not known to be made with lemonade and neither liquid is impervious. Thus the inference one gets is:

[x y | liquid(x), liquid(y), ¬ impervious(x,y), ¬ made-with(x,y)] ⇒ [diffusing(x,y)]

10 Of course both compounds are preferably interpreted by means of the purpose books are designed for, namely informing about the topic given by the modifier. But this is no alternative concept-based interpretation.
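The diffusion inference for pairs such as Malzbier and Limonadenbier can be illustrated with a short Python sketch. All names (`LIQUIDS`, `MADE_WITH`, `IMPERVIOUS`, `diffusing`) are hypothetical, and treating oil and water as an impervious pair is an assumption added for the example; only the beer facts come from the text.

```python
# Hypothetical toy facts; only the beer examples come from the text.
LIQUIDS = {"beer", "lemonade", "oil", "water"}
MADE_WITH = {("beer", "malt")}               # (head, typical component)
IMPERVIOUS = {frozenset({"oil", "water"})}   # assumed non-mixing pair


def diffusing(head: str, modifier: str) -> bool:
    """True if 'head diffused with modifier' may be inferred."""
    both_liquid = head in LIQUIDS and modifier in LIQUIDS
    mixable = frozenset({head, modifier}) not in IMPERVIOUS
    not_component = (head, modifier) not in MADE_WITH
    return both_liquid and mixable and not_component
```

Under these assumptions, `diffusing("beer", "lemonade")` holds, while `diffusing("beer", "malt")` fails because malt is not a liquid and is a typical component of beer.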
To sum up, there are namings of necessary properties belonging to a concept as well as entailments for concluding relations holding between two concepts. Both kinds of information are usable for NN-compound interpretation. A two-place relation in the concept description belonging to the head noun is usable for NN-compound interpretation if the domain of its argument subsumes the domain of the concept that goes with the modifier. By way of illustration, let us have a look at the underlying concepts of the mass nouns Mehl (flour), Teig (dough) and Saft (juice). I assume that flour and juice are perceived as man-made atomic substances, while dough is a granulated man-made substance. Object-specific properties of the concepts and more general concepts are:

atomic-substance(x), powder(x), basis-for-food(x), made-from(x,y), miller(z) ∨ factory(z)

person(o) ∨ factory(o)

granulated-substance(a), paste(a), basis-for-food(a), made-with(a,b), made-with(a,d), yeast(d) ∨ baking-powder(d), made-for(a,c), bread(c) ∨ cake(c) ∨ pastry(c)
Several novel NN-compounds are interpretable by access to these concept descriptions. There are explicitly given relations holding between the concept being defined and other concepts. If the head noun denotes the concept defined or one of its generalizations and the modifier denotes the other concept or one of its generalizations, this relation can be used for interpretation. For example, Roggenmehl (rye flour), Birnensaft (pear juice), Fabrikmehl (factory flour), Kuchenteig (cake dough) and Backpulverteig (baking powder dough) are all interpretable by this procedure. Their interpretations can virtually be read off the concept descriptions.

Additional relations can be inferred. More general concepts subsuming the defined concept are either defined or described as well. The relations used for definition and description, respectively, are used for interpretation. For example, the 'dough'-concept is a specification of the 'paste'-concept described above, which contains a 'formed-with' relation to tools. A compound such as Mixerteig (mixer dough) is then interpretable as 'dough formed with mixer' because the modifier concept is subsumed by the 'tool'-concept. The same underlying procedure ensures the interpretation of Zuckersaft (sugar juice) as 'juice dissolving sugar'.

However, not every relation in a concept characterization or concept definition is a candidate for NN-compound interpretation. I pointed out that a compound with a modifier denoting a necessary part of the head noun extension is odd, since compounds have to express a specialization of the head noun extension. So only relations between the head concept and either a disjunction of concepts or a concept with subconcepts are usable for compound interpretation. For instance, 'flour', as a specialization of powder, is formed to heaps. 'Heap' does not subsume any other concepts. Therefore a compound such as *Haufenmehl (heap flour) is odd. If no relation of the head noun concept fits the modifier concept, other interpretations might be provided by inferences of the kind given above.
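The search procedure just described can be sketched in Python. This is a minimal sketch under stated assumptions, not the precise formulation announced for the end of chapter 5.4: the tables `ISA` and `RELATIONS` and the functions `ancestors`, `has_subconcepts` and `interpret` are hypothetical, and the entries only encode the dough, paste, powder and mixer examples from the text.

```python
# Hypothetical concept hierarchy: concept -> more general concept.
ISA = {"dough": "paste", "flour": "powder", "mixer": "tool"}

# Hypothetical relations per concept: relation -> admissible range concepts
# (more than one entry stands for a disjunction).
RELATIONS = {
    "dough": {"made-for": ["bread", "cake", "pastry"],
              "made-with": ["yeast", "baking-powder"]},
    "paste": {"formed-with": ["tool", "hand"]},
    "powder": {"formed-to": ["heap"]},
}


def ancestors(concept: str) -> list:
    """The concept itself followed by all of its generalizations."""
    chain = [concept]
    while chain[-1] in ISA:
        chain.append(ISA[chain[-1]])
    return chain


def has_subconcepts(concept: str) -> bool:
    return concept in ISA.values()


def interpret(modifier: str, head: str):
    """Return the first head relation whose range subsumes the modifier."""
    for concept in ancestors(head):
        for relation, ranges in RELATIONS.get(concept, {}).items():
            # Only a disjunctive range or a range with subconcepts can
            # specialize the head; otherwise the compound is odd.
            specializing = len(ranges) > 1 or has_subconcepts(ranges[0])
            if specializing and any(r in ancestors(modifier) for r in ranges):
                return relation
    return None
```

With these toy entries, Kuchenteig is read off the 'dough' description directly, Mixerteig is interpreted through the inherited 'formed-with' relation of 'paste', and *Haufenmehl receives no interpretation because 'heap' neither is a disjunction nor has subconcepts.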
5.4.2
Spatial Functions of Three-Dimensional Objects
Some novel NN-compounds are primarily interpretable by location relations. As I pointed out above, localization is a time-dependent function mapping objects, events, location units and substances to location units. Actually, each NN-compound with constituents denoting concrete objects may be interpreted by a location of one object with respect to the other, although this might not be the primary interpretation. Schranktasse (cupboard cup), for example, might denote a cup located with respect to a cupboard. In this case, the cup is the located object and the cupboard is the reference object. Moreover, the location relation can be specified. There are places provided by the modifier concepts that are more natural places for cups than others. Presumably the most natural interpretation is given by an in-location. But the cup might also be located on the cupboard: then it is on the top. Preferences of certain localizations with regard to other localizations may be represented in a hierarchy of localization relations: location-in < location-on. A Sofamaus (sofa mouse) may be a mouse located on, under, behind, or in a sofa. The latter would constitute a specific 'in'-region: a hollow space in the object 'sofa'. Here the probable hierarchy of relations might be: location-under < location-behind < location-on, location-in. Such a weak hierarchy exists for other compounds with location relations as well. For instance, Büchertasse (book cup) is interpretable by these localizations: location-on < location-at.
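The weak hierarchies of localization relations can be stored as ordered tiers, as in the following Python sketch. It assumes that the leftmost relation in each hierarchy is the most preferred one (as the Schranktasse example suggests) and that relations separated by a comma share a rank; `PREFERENCES` and `ranked_readings` are hypothetical names.

```python
# Hypothetical encoding of the hierarchies given in the text:
# (modifier, head) -> tiers of relations, most preferred tier first.
PREFERENCES = {
    ("Schrank", "Tasse"): [["in"], ["on"]],
    ("Sofa", "Maus"): [["under"], ["behind"], ["on", "in"]],
    ("Bücher", "Tasse"): [["on"], ["at"]],
}


def ranked_readings(modifier: str, head: str) -> list:
    """List (rank, relation) pairs; relations in one tier share a rank."""
    tiers = PREFERENCES.get((modifier, head), [])
    return [(rank, f"location-{rel}")
            for rank, tier in enumerate(tiers, start=1)
            for rel in tier]
```

For Sofamaus this gives location-under at rank 1, location-behind at rank 2, and location-on and location-in together at rank 3.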
Why does such a hierarchy exist? The hierarchy is not completely based on spatial properties of objects and possible static locations of these objects (or places of the objects), depending on these properties. It is rather based on spatial functions of objects in combination with the kind of location, dynamic or static. Paraphrases describing the meaning of the compounds above will make this explicit: Sofamaus can mean 'a mouse that lives in/under/behind a sofa, runs on/behind a sofa, has a nest in/behind a sofa' and so on. So if the meaning of a compound is described in a paraphrase, local prepositions in combination with verbs denoting states or actions are used. What I want to demonstrate with these examples is that two kinds of conceptual information are needed for location relations in compounds:
1. What kind of place may be provided by the reference object for the located object? It has to be checked whether the former object is conceptualizable as a supporter of the latter object or whether it is a container or protector.
2. How is the located object located with respect to the reference object? It can be a static or dynamic location. The choice between a static and a dynamic location is determined by the properties of the located object. Animates are localizable dynamically by their kind of locomotion, while non-animates are dynamically localizable only by physically determined movements such as flowing or falling.
Hence possible places, provided by objects, must be represented in the conceptual system as well as kinds of locomotion. I assume that knowledge of typical actions of animates and locations of objects determines the kind of location. This approach is completely concept-based11. For developing such an approach I will start with general spatial properties related to concrete object concepts. Spatial properties of concrete object concepts are constituted by size, shape, position, and place of an object. I will explain these properties briefly12.
The size of an object can be determined in principle in three ways. An object can be compared with another object, or with its inherent normal size, or it can be measured by the use of a metric system. Size plays an important role in compound interpretation since possible characteristic regions of objects may be occupied by objects of a certain size. Shape is characterized by topological properties, canonical positions, parts, and extensions of dimensions. The position of an object is determined by a relation to another object or surface. If there is a default position of an object, it is called the canonical position. Canonical positions are only possible if the object has an internally structured shape. The precondition for canonical positions is having an inherent vertical axis in order to determine top and bottom perceptually. The inherent vertical axis may coincide with the maximal extension but it does not have to. Furthermore, a maximal extension may exist without an inherent vertical axis. If there is an inherent horizontal axis, there
11 Concept-based analyses of spatial expressions are developed in AI research on natural language understanding systems. For instance, a concept-based approach to the semantics of the German local prepositions in and bei is proposed by Pribbenow (1989). 12 I benefited from Wunderlich (1986b).
is also an inherent vertical axis and an inherent in-front-of/behind distinction. If there are such inherent distinctions, parts of objects can be localized as well. The place where an object is located is also determined relative to another object. Often the object is located in a characteristic neighbourhood of another object. This is explained below.

Now it is of interest for NN-compound interpretation how spatial properties of objects constitute location interpretations and what the conditions are for selecting different locations. For this a short look into the semantics of a certain type of spatial expressions, namely local prepositions, will provide insights into spatial concepts. The reason for this is that the semantics of local prepositions is assumed to come very close to the structures of local concepts. So if it is known how locations are expressed and conceptually represented, these insights may be transferred to the analysis of locations in NN-compounds. Hence the digression coming next may help in determining spatial properties of concepts.

5.4.2.1 A Short Digression into Semantics of Local Prepositions.

I will review the main theories of the semantic form (i.e. lexical meaning) of local prepositions and their implications for the structure of spatial concepts. A critical evaluation of recent theories of local prepositions and spatial cognition, however, cannot be given. This would go far beyond my work. The semantics of local prepositions cannot be used directly for localization relations in novel NN-compounds either: while the former provides lexical items for expressing spatial relations, the latter are completely determined by conceptual knowledge. But the mapping of semantic forms of local prepositions into the conceptual system might give some insights into spatial properties of space and objects. There are three slightly different proposals on the semantic form of local prepositions13:
1. λyλx[Loc(x) ⊂ REG(y)] Bierwisch (1988)
2. λyλx[Loc(x) ∈ REG(y)] Klein (1990, 1991)
3. λyλx[Loc(x, REG(y))] Wunderlich (1986b, 1990), Herweg (1988)
The differences in characterizing localizations are founded on two differing views:
• Objects are localized with respect to places (3) vs. places of objects are localized with respect to places of objects (1, 2).
• The place of the localized object is a part of the place of the reference object (1) vs. the place of the theme is an element of the set of possible REGions provided by the preposition (2).
13
For sake of simplicity, I ignore differences in the time-dependent nature of localizations in these proposals.
For evaluating these views, some comments on conceptual structures of space are helpful. REG is a variable that is satisfied by preposition-specific regions. So on constitutes ON-regions of places, in IN-regions and so forth. But how are these regions defined in space? Certainly there is more than one concept of 'space', as is visible in phrases such as:
1. Die Tasse ist im Schrank (The cup is in the cupboard)
2. München liegt in Bayern (Munich is in Bavaria)
3. 'in' ist in Beispiel (2) kursiv gedruckt und unterstrichen ('in' is printed in italics and underlined in example (2))
Klein (1990:34), Klein (1991:94) defines a three-dimensional basic space concept from which all other space concepts should be derived. The basic space concept enables the definition of spatial relations for characterizing extensions of local prepositions. These are dimensional relations such as HIGHER-THAN and LEFT-OF and topological ones such as CONTAINED-IN, IN-CONTACT-WITH and NEAR-TO. The meaning of local prepositions can then be determined by these spatial relations, combined by Boolean operations (see Klein (1990:35), Klein (1991:97) and Wunderlich (1982:10ff.)). Here are some examples:
1. in (≈ in): CONTAINED-IN
2. an (≈ at): IN-CONTACT-WITH
3. auf (≈ on): HIGHER-THAN & IN-CONTACT-WITH
4. über (≈ above): HIGHER-THAN & ¬(IN-CONTACT-WITH)
Klein argues that the place the localized object covers is an element of all possible prepositionally fixed regions, since differences between local prepositions are determined by these properties. The difference between auf and an, for example, is determined by necessity of contact between theme and reference object as opposed to no specification of contact. Therefore the AUF-region is a proper subplace of the AN-region such that everything that is on an entity is also at this entity. But we cannot say *Die Garderobe ist auf der Wand (*The wardrobe is on the wall) or *Die Hecke ist auf dem See (*The hedge is on the sea).
Replacing inclusion by the element-of relation solves this problem: there is not one AUF-region and one AN-region but sets of regions such that the place of the theme is an element of one of these sets. Thus, Die Lampe auf dem Tisch (The lamp on the table) is represented as Loc(Lampe) ∈ AUF(Tisch) with:
AUF(y) = {o | o HIGHER-THAN Loc(y) & o IN-CONTACT-WITH Loc(y)}
Die Lampe an dem Tisch (The lamp at the table) is represented as Loc(Lampe) ∈ AN(Tisch) with:
AN(y) = {o | o IN-CONTACT-WITH Loc(y)}

Wunderlich (1990) argues against Klein's and Bierwisch's conception of semantic forms of local prepositions as representing places of objects standing in particular relations to other regions. In his view it is a necessary condition of concrete entities to be located in time and space. Therefore the place of an object does not have to be explicitly mentioned in the semantic form of localizations. Moreover, every object has characteristic neighbourhoods. They are either inherent neighbourhoods or they are determined by the person's perspective, and these neighbourhoods are used for localizations. So, according to Wunderlich, the semantic form of a localization is [Loc(x, REG(y))]. This means: x is in the prepositionally characterized region of y. Wunderlich also points out some problems of Klein's approach. In my opinion the most striking problem of Klein's work is the localization of 'entities' that are not individuals, e.g. masses and states. Klein would have to determine the place of being dark in a clause like Unter dem Tisch ist es dunkel (Under the table it is dark). In Wunderlich's analysis it is just stated that the property of being dark exists in the UNDER-region of a table.

Before I end this short - and therefore in many parts incomplete - excursion into the semantics of local prepositions and space concepts and draw conclusions for location relations in novel NN-compounds, I will point out some problems of the analysis of local prepositions in terms of spatial relations like IN-CONTACT-WITH. Approaches suggested for handling these problems are relevant for interpreting NN-compounds as well. The definition of the meaning of local prepositions in terms of spatial relations, which in turn are derived from a basic space concept, does not cope with all uses of these prepositions. Über (above) is defined as the located entity being HIGHER-THAN the reference object and not IN-CONTACT-WITH the reference object. However, while the former spatial relation seems to be a necessary condition, the latter can be violated:
1. Die Decke über dem Bett (The cloth 'above' the bed)
2. Die Schüssel über den Äpfeln (The dish 'above' the apples (if the dish is turned upside down and has contact with the apples))
Klein (1991) proposes to interpret these non-standard uses of local prepositions as functional or visual reinterpretations of the standard use. Functional reinterpretation would mean that a typical function corresponds to a position. So related to auf there is the function of 'support' and related to unter there is the function of 'cover' or 'protection'. Visual reinterpretation would be constituted by the line of vision: visible objects are on a reference object, non-visible objects are under a reference object, if the reference object is in the line of vision14.

5.4.2.2 Location Relations in Novel NN-Compounds.

To summarize, spatial constellations are describable in terms of simpler spatial relations. Functional and visual reinterpretations of these relations in terms of e.g. support and protection are possible. Finally, characteristic neighbourhoods of objects are used for localizations as well. The excursion into the semantics of local prepositions now leads to four observations related to spatial functions in novel NN-compounds.
14
Vandeloise (1984) analyses French spatial prepositions in terms of functionality. He even regards the functional meaning as basic in order to derive locality from spatial functions
First, localized object and reference object are not fixed by the modifier or head position. In Tischlampe (table lamp) the head denotes the localized entity and the modifier denotes the reference object. In Lampentisch (lamp table) it is just the other way round. In the simplest case localized and reference object are identified just by their size: the smaller object is the localized object, the larger object is the reference object. There are, however, problematic cases where the size of both objects cannot easily be determined. For instance, one can imagine that a long bar is located on a table so that just a part of the bar has contact with the top of the table. One can refer to the bar with the compound Tischstange (table bar). The maximal axis of the bar is longer than the horizontal or transversal axis of the table, yet the size of the bar is perceived as smaller than the size of the table. Size involves extension into two dimensions (for places) or three dimensions (for physical objects). An object can be perceived as smaller than a different one although one of its three axes violates the condition of being smaller. The condition seems to be that the other two axes of the located object are substantially smaller than the corresponding axes of the reference object. Moreover, localizations in NN-compounds are computed with respect to canonical positions of the objects in question, since typical properties of objects are involved in NN-interpretations. For instance, if one interprets a compound like Flaschenstuhl (bottle chair) as a chair some bottles are located on, it is certainly not interpreted as a chair lying on one of its sides with bottles standing upside down on it. The use of canonical positions is another indication that prototypical object properties determine compound meanings.

Second, spatial functions in novel NN-compounds are completely concept-based if the head is not a noun used for spatial expressions (e.g. Spitze (top, point) or Seite (side)).
The located object is localized just in a part of the characteristic neighbourhood of the reference object. The task of local prepositions, to provide conceptualized PREP-regions for localizations, is shifted in NN-compound interpretation to the underlying concepts. The concept has to provide possible PREP-regions of objects. For instance, it must be known what possible IN-regions of sofas are, what possible ON-regions of sofas are and so on.

The third observation concerns functional reinterpretation of spatial relations. Conceptualized PREP-regions are the basis for definitions of spatial functions like support, containment and protection. These spatial functions provide the basis for localization interpretations of NN-compounds. For example, the interpretation of Tellerschrank (plate cupboard) as a cupboard containing plates rests on the spatial function of a cupboard being a container. The spatial function, in turn, rests on the IN-region of a cupboard, but this place does not directly determine the compound meaning.

Fourth, NN-compounds interpreted by locations support Wunderlich's assumption that objects have characteristic neighbourhoods that are used for localizations. Indeed, inherent 'areas' and perspectively determined 'areas' respectively lead to certain interpretations in NN-compounds. For example, Rathausklingel (town hall bell) might be interpreted either as a bell in a town hall or a bell at a town hall. The difference is founded in the use of different inherent regions of town halls for the localization of bells. It does not seem to
be plausible to me to say that this compound actually means 'the place of the bell is located with respect to particular places related to the town hall'. A static localization of one object with respect to another one is determined by their typical positions relative to each other. Let us consider Tischtasse (table cup) and Tischstuhl (table chair). Cups, chairs and tables have canonical positions, but only chairs and tables have an inherent horizontal axis (i.e. a left/right distinction) and therefore an inherent in-front-of/behind distinction. Furthermore, tables are prototypical supporters and chairs are supporters as well. Cups are containers. We interpret Tischtasse as a cup on a table on the basis of the support function of tables and the size of both objects (compare Tischhalle (table hall), where the container 'hall' is big enough to contain tables). Tischstuhl (table chair), on the other hand, seems to be strange. It can also be interpreted as a chair on a table, but an interpretation as a chair in front of a table seems to be more relevant, if at all. The basis for this interpretation is the typical position of both objects relative to each other. Such a position is given if both objects often appear together. Typical positions can coincide with positions based on spatial functions, as in Tischtasse, but this does not have to be the case. Compounds of the type Tischstuhl are rare. If they appear, they are mainly used for comparisons (e.g. Tischstuhl (table chair) vs. Wandstuhl (wall chair)). There seems to be a restriction on the generation of compounds: compounds do not denote two objects in a typical position relative to each other if the typical position is not identical with the primary function. So in Schranktasse (cupboard cup) and Regalbuch (shelf book) the container function of the modifier concept is used for the typical position of the head concept, but ??
Tellerbecher (plate mug) with the meaning 'mug beside the plate' is strange, although in normal situations mugs are located beside plates on a table. If an object has more than one spatially founded function, the functions can be ordered with respect to saliency. So although a cupboard is a container and has a support function, it is primarily a container. Besides support and containment, the third functional concept is adherence (or adhesion) [15]: adherence involves cancelling out gravity by fixing an object on a vertical surface. Compounds such as Flaschenetikett (bottle label), Wandplakat (wall poster) and Laternenzettel (streetlight notice) are interpretable as 'x fixed to y'. It is implicitly known that labels and posters are fixed to an object by an adhesive or nails. The kind of fixing depends on the properties of the substance the object is made of. Again the canonical position of objects is relevant for interpretation: Flaschenetikett is interpreted as 'label fixed to a bottle' because the bottle is standing in its canonical position. Finally, protection means that an object is in the line of vision of the observer of an entity. If the entity is an animal, protection may coincide with typical behaviour of the species. Prototypical containers and supporters are objects that have the purpose of containment or support. This means their primary function is a spatial function that is represented as lexical meaning.
[15] It is not indisputable that adhesion is an independently motivated spatial function. It can also be defined as a form of support.
Now I will state in the conceptual system what being a container, supporter, and vertical supporter means, and what the implications for NN-compound interpretations by locations are. Since typical properties are of interest, the definitions I will state are restricted to properties in canonical positions; otherwise the spatial functions of the objects might get lost. Size of objects is important for localizations. The located object must be smaller than the reference object; otherwise one gets counterintuitive localizations. For example, a compound such as Tassentisch (cup table) can be interpreted as 'table with cups on it' just because cups are smaller than tables. If size were ignored, the compound could be interpreted as 'table in a cup' since cups are containers. Therefore the conceptual condition x « y will be introduced with the meaning 'x is significantly smaller than y'. The definitions of container and supporter are as follows:
container(x): 3D-object(x)
  [y | provide-space(x,y)] ⇒ [z | hollow-space(y) ∨ inner-space(y), contained-in(z,y)]
This definition takes the predicates 'inner-space' and 'hollow-space' as primitives. They can be defined as well, however. So a hollow space is a space that is completely surrounded by sides and has either small openings or doors while inner spaces have at least one open side so that entities can be put into the inner space. Since this is not directly related to the definition of containers, I will not go into details. The definition of supporters:
supporter(x): 3D-object(x)
  [y | top-of(x,y)] ⇒ [top(y), horizontal(y), [z | in-contact-with(z,y)] ⇒ [3D-object(z), z « x]]
or
supporter(x): place(x), horizontal(x)
  [z | in-contact-with(z,x)] ⇒ [3D-object(z), z « x]
Adhesion at a vertical supporter requires contact by an adhesive:
vertical-supporter(x): 3D-object(x)
  [y | has-vertical-surface(x,y)] ⇒ [vertical-surface(y), [z | in-contact-with(z,y)] ⇒ [w | object(z), z « x, contact-by(z,y,w), adhesive(w)]]
or
vertical-supporter(x): place(x), vertical(x)
  [y | in-contact-with(y,x)] ⇒ [w | object(y), y « x, contact-by(y,x,w), adhesive(w)]
Analogous to the relations between two substances given above, one can now state conditions that have to be met by the concepts for different localization relations. The localizations are time-dependent. Spatial relations like 'in-contact-with' and 'contained-in', used in combination with Boolean operations for the semantics of local prepositions, are also taken as basic relations for localization relations in NN-compound semantics. Containment involves partial inclusion of the located object in the space provided by the reference object. This is given by the definition of 'container'. The rule for containment with respect to NN-compound interpretation is:
[x y | 3D-object(x), container(y), x « y] ⇒ [t | containment(t,y,x)]
Support requires a horizontal place the located object is in contact with:
[x y | 3D-object(x), supporter(y), x « y] ⇒ [t | support(t,y,x)]
The supporter can be a three-dimensional object or a horizontal place. Vertical support is given by contact of the localized object with a vertical surface. The definition of vertical supporter given above presupposes contact by means of fixing:
[x y | object(x), vertical-supporter(y), x « y] ⇒ [t | adhesion(t,y,x)]
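The three localization rules can be illustrated with a small executable sketch. It is an illustration only and not part of the analysis itself: the numeric `size` values, the `SIZE_FACTOR` threshold standing in for the condition «, and the toy concept entries are assumptions made for the example. The sketch also ignores the time argument t and the distinction between 3D-objects and objects.

```python
from dataclasses import dataclass

# Hypothetical threshold standing in for the condition x « y.
SIZE_FACTOR = 2.0


@dataclass(frozen=True)
class Concept:
    name: str
    size: float                         # assumed rough size in canonical position
    functions: frozenset = frozenset()  # spatial functions of the concept


# (function required of the reference object, relation concluded)
RULES = [
    ("container", "containment"),
    ("supporter", "support"),
    ("vertical-supporter", "adhesion"),
]


def much_smaller(x: Concept, y: Concept) -> bool:
    """Stand-in for x « y ('x is significantly smaller than y')."""
    return x.size * SIZE_FACTOR <= y.size


def localizations(n1: Concept, n2: Concept) -> list:
    """Return (relation, reference object, located object) triples.

    Both role assignments are tried, because located and reference
    object are not fixed by modifier or head position.
    """
    results = []
    for located, reference in ((n1, n2), (n2, n1)):
        if not much_smaller(located, reference):
            continue
        for function, relation in RULES:
            if function in reference.functions:
                results.append((relation, reference.name, located.name))
    return results


# Toy entries (sizes are invented for the example).
cup = Concept("Tasse", 1.0, frozenset({"container"}))
table = Concept("Tisch", 50.0, frozenset({"supporter"}))

print(localizations(cup, table))  # [('support', 'Tisch', 'Tasse')]
```

Because the size condition is checked before any spatial function, the container function of the cup never yields 'table in a cup', which is the effect the condition x « y is meant to have.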
These are quite general spatial and functional relations between two objects. Whenever the conditions in the premise are fulfilled by both concepts denoted by an NN-compound, the spatial or functional relation may be concluded. NN-compounds are used for expressing spatial constellations of two objects. In order to demonstrate the application of the rules, I will give some examples of NN-compounds. The rules will turn out to be the main basis for locations expressed in NN-compounds. Let us have a look at (1) Tassentisch (cup table), (2) Apfelsaftstuhl (apple juice chair), (3) Vasenfensterbank (vase window-sill), (4) Blumenmuseum (flower museum), (5) Tassenschrank (cup cupboard), (6) Museumsschrank (museum cupboard), (7) Teppichzimmer (carpet room), (8) Teppichtreppe (carpet stairs), (9) Zimmerteppich (room carpet) and finally (10) Laternenzettel (streetlight notice). In all cases, we will have a look at possible localization relations used for their interpretation. Other interpretations, even more salient or more natural ones, are not of interest now. For the sake of simplicity, we can also ignore the fact that some head nouns denote artefacts with the purpose of having a spatial function. In this case the spatial function is represented as lexical DRS and mapped onto conceptual knowledge. Located and reference object are independent of their position in these compounds. In (6), (9) and (10) the located object is denoted by the head noun and the reference object by the modifier; in all other cases it is the other way round. In (1), (2), (3) and (8) the compounds are interpreted by means of support. Tables, chairs, window-sills and stairs are objects that have a horizontal surface for the purpose of support. The objects denoted by the modifiers fit onto the surface so that support is applicable and the use of other localizations is blocked. The definition of a three-dimensional supporter requires a horizontal top which is the place for support [16].
In the case of prototypical supporters, their function is given by their lexical meanings, i.e. the lexical DRS. (4), (5), (6), (7) and (9) are examples of containment. They illustrate three characteristics. First, the conditions for containment given above must indeed be met in order to get containment as interpretation. For example, a cupboard provides a hollow space for containment and cups provide inner space. (5) is not interpreted as 'cupboard in a cup' because cups are perceived as significantly smaller than cupboards. Second, (4) and (6) demonstrate that conceptual shifts disappear if localization relations are used. For instance, flowers can be located just in the museum building, not in the institution. So conceptual shifts triggered by the role-assigning item or scripts are not possible if localization relations are used. I will pick up this observation and discuss it in more detail in chapter 5.6. The last characteristic concerns typicality of containment. In (6), (7) and (9) the spatial function of containment can be stated more precisely in order to get the typical place of the located object with respect to the reference object. For instance, Teppichzimmer (carpet room) is a room containing carpets probably on the floor and not on the wall. Carpets are most naturally located on floors in our 20th-century western culture, so Teppichhalle (carpet hall) is also interpreted as 'hall with carpets located in it on the floor'. Bilderzimmer (painting room), however, denotes rooms with paintings in them, presumably on the wall. Typicality thus also plays a role for spatial constellations. Hence whenever an object is typically located in a certain manner with respect to the reference object, this location relation must be available for NN-compounds. This holds mainly for containment. An object located in a container is located on its floor, on a wall or in the inner space without contact, depending on the typical spatial location of the located object in the reference object. This means that the following inference pattern must be added to the entailments for locations given above:
[x y z R | container(x), physical-object(y), typical-loc-function(z,y,R)] ⇒ [R(z,y), place(z)]
[16] In the case of (2), however, the top of a chair is on its back-rest but its seat is the place for localization. This calls for further investigation of spatial functions provided by object properties, which cannot be done here.
I will not go into the problem of whether typical locations are inferred or extensionally given. Finally, (10) is an example of vertical support. The concept for Laterne (streetlight) provides a vertical surface for adhesion. Its categorization as vertical supporter involves contact between the located and the reference object by means of adhesion. By way of illustration, the relevant properties of the underlying concepts of Zimmer (room) and Teppich (carpet) will be represented in order to demonstrate how Teppichzimmer (carpet room) and Zimmerteppich (room carpet) are interpretable by localizations:
Zimmer (room):
  container(x), part-of(x,y), has-part(x,z), has-part(x,v)
Teppich (carpet):
  supporter(a), 3D-object(a)
  [d | made-of(a,d)] ⇒ [wool(d) ∨ synthetic-material(d)]
Size determines which one of the objects is the reference object RO and which one is the located object LO: LO « RO. Additionally, it must be known that the typical location of carpets in rooms is on a certain part of them, the floor:
  [x y z | room(x), carpet(y), typical-loc-function(x,y,support), part-of(x,z)]
By means of the conditions for containment given above, Teppichzimmer (carpet room) is interpreted as 'rooms containing carpets'. Adding the inference pattern to the containment condition, one gets the interpretation 'rooms containing carpets on their floors'. Interchanging the constituents forms the compound Zimmerteppich (room carpet). Since the reference object and the located object are still the same, the spatial function of carpets to support cannot be applied to the compound. Therefore the same containment condition is applicable. So the interpretation by localization is 'carpet contained in a room'. Again the same inference may be triggered that the carpet is typically located on the floor. To sum up, spatial functions are used for NN-compound interpretation if the concepts exhibit certain sizes and belong to specific functional categories such as the container category or the supporter category. Furthermore, containment can be stated more precisely by typicality of location in the container.
5.4.3
The Made-Of, Part-Of and Has-Part Relations
While discussing meanings of NN-compounds with two mass nouns as constituents, I have already sketched the application of the made-of relation. I argued that 'made-of' is a conceptually founded relation between parts of objects and solids. We will have a close look at the conditions for applying this relation and 'part-of' and 'has-part' for NN-compound interpretation. Compounds such as Bronzelöwe (bronze lion), Holzhammer (wood hammer), Plastikauto (plastic car), Porzellantasse (china cup) or Eisenlaterne (iron streetlamp) all have the most salient interpretation of 'n2 made of n1'. There seems to be no problem with NN-compounds having head nouns denoting a concrete object concept and modifiers expressing solid substances. So one could state that 'made of' is the preferred interpretation of a configuration of the type [solid concrete-object], i.e. a map of physical objects into substances. However, we need to take a closer
look at these examples: a china cup might be a cup that is completely made of china since cups are typically made of one material. But iron streetlamps or wooden hammers are objects which have a proper part that is made of iron and wood, respectively: the bar and the stick. Things become more complicated if we have a close look at compounds like Bronzelöwe: we are not talking about lions made of bronze but about a statue that is made of bronze and has the shape of a lion. Finally, a plastic car is in its most natural interpretation not a complete car made of plastic but a car with a plastic bodywork. So 'made of' does not assign a solid substance to a concrete object, but to a non-decomposable concrete object that may be part of a concrete object or is a reproduction of it. Concrete objects are either made of a solid or they are an assembly of other physical objects that are either made of a solid substance or have proper parts and so on. 'Made of' may only be assigned to a concrete object that cannot be decomposed into proper parts. Now the problem is to determine which part of a physical object is meant in a made-of interpretation. First, typicality of being made of the substance in question plays a role in selecting the proper part of a concrete object. For instance, typical simple hammers have a wooden stick and an iron head but there are many other hammers, depending on the purpose they are developed for. My specialized knowledge of hammer types is restricted to further types with steel sticks and iron heads for carpenters and those with rubber heads and wooden sticks for adjusting tiles. The interpretation of NN-compounds with a modifier denoting a solid concept and hammer as head reflects this knowledge:
1. Holzhammer (wooden hammer): hammer with a wooden stick
2. Gummihammer (rubber hammer): hammer with a rubber head
3. Eisenhammer (iron hammer): hammer with an iron head/hammer for knocking against iron
4. Betonhammer (concrete hammer): hammer for knocking on concrete
Since the purpose of hammers is maximizing power for pushing nails into or knocking against pieces of stuff, this may be in conflict with a made-of interpretation. This can be seen for Eisenhammer. If a part of the object is not typically made of the solid, there is a switch to other possible interpretations. This can be seen in (4) where the purpose of the object leads to the interpretation. Other switches might be given by a localization of the head object with respect to a place provided by the modifier substance, as e.g. in Holzloch ('wood hole'). I will come back to switches later. Second, a shift from physical objects to reproductions of them in the form of statues or pictures is possible. Then of course typicality of the made-of relation between the statue and the substance is relevant. Therefore Bronzereiter (bronze rider), Marmorengel (marble angel) and the like are interpreted as 'statue of an x that is made of y'. Such a shift is possible for plastic car as well.
The shift from concrete objects to reproductions also allows a made-of function from reproductions of objects to powders such as sand or pastes such as snow. Examples are Sandburg (sand castle), Teigmännchen (little dough man), Gipsleuchter (plaster candelabra), Schneeblume (snow flower). Finally, I assume that the 'shell' of a physical object, i.e. the part determining the shape of the outer surface, is the primary object part for the made-of relation. For example, NN-compounds like Plastikcomputer (plastic computer), Holztelefon (wooden telephone), Lederschuh (leather shoe), or Plastikauto (plastic car) denote objects with a shell made of a certain material. Moreover, the visible part of the shell is important: a plastic computer might have a metal bottom, and a leather shoe may have a plastic sole. Since both concepts the NN-compounds refer to are interpreted with respect to their typical position (i.e. the canonical position if there is one), the (intrinsic) bottom is the non-visible part. To sum up, the application of the made-of relation is restricted by this knowledge:
1. Reproductions of objects such as statues are bits of solids or pastes
2. Only non-decomposable objects are made of solids or pastes, but not objects that are assemblies of other objects
3. If the object is an assembly, the domain of made-of is that part which
   (a) is typically made of this substance or
   (b) is the shell without the bottom
So the made-of relation is applicable only to pieces that are part of more complex physical objects. It is generally assumed that the natural-language term part of is a superterm for a whole class of conceptual relations, each having its own characteristic properties. For instance, different part-whole relations are discussed in Winston et al. (1987) and Iris et al. (1988). Winston et al. propose six kinds of conceptual part-whole relations:
1. A part-whole relation between complex objects and their components
2. A relation between collectives and their members
3. A relation between an area and a place
4. A relation between masses and portions
5. A relation between an activity and a phase
6. A relation between a substance and an object
The use of 'part-of' and 'has-part' in this analysis of conceptual structures is restricted to components of complex objects. Interestingly enough, the other relations are expressed in NN-compounds as well, except the relation between collectives and their members:
Collective - element: *Waldbaum (*forest tree), *Herdenantilope (*herd antelope), *Geflügelfasan (*poultry pheasant), *Straußrose (*bunch rose).
Area - place: Oasensee (oasis lake), Waldschonung (forest plantation area), Wüstenberg (desert mountain), Campuswiese (campus grass).
Mass - portion: Zuckerkörnchen (sugar grain), Wassertropfen (water drop), Kuchenstück (cake piece), Wurstscheibe (sausage slice).
Activity - phase: Einkaufsbezahlung (shopping payment), Marathonendspurt (marathon final spurt).
Collective-element concatenations are ruled out because they violate the trivial condition on NN-compounds that their meaning has to be a specialization of the meaning of the head noun. NN-compounds expressing a collective-element configuration have heads that denote proper parts of the modifier extension; no new information is added. On the other hand, if one interchanges the constituents, the compounds make sense if the collective noun denotes collections of different members. Thus, Antilopenherde (antelope herd), Rosenstrauß (bunch of roses) and Fasangeflügel (pheasant poultry) are correct because herds may be collections of different animals, bunches may be collections of any flowers one can think of, and poultry includes all kinds of edible birds. Often the collective noun is relational in order to assign the internal argument to the modifier. *Baumwald (tree forest) is ruled out since forests can only be collections of trees. The conceptual use of the part-of relation here is understood as the relation between objects and their components. It is a trivial, transitive relation between complex objects and their components. Compounds such as Autolampe (car lamp), Hammerstiel (hammer stick), Fensterschraube (window screw), Laternenbirne (streetlight bulb) all have the preferred interpretation of 'n2 part of n1'. Its inverse relation (has-part) raises some problems for NN-compounds though.
According to the specialization requirement for NN-compounds, the head noun extension is a generalization of the compound extension. A has-part relation between a head concept and a concept that represents a necessary part of the head concept does not specialize the head concept. For instance, *Blütenblume (*bloom flower), *Dachhaus (*roof house), *Gestellbrille (*frame glasses) are odd, but Schnurtelefon (lead telephone) and Sportsitzauto (sports seat car) are fine. 'Has-part' can be used for compounding only if the concept provides a disjunction of elements standing in the has-part relation:
x: [y | has-part(x,y)] ⇒ [c1(y) ∨ ... ∨ cn(y)]
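The disjunction condition on has-part compounds can be sketched as a simple test. This is a minimal illustration under stated assumptions: the knowledge base `HAS_PART`, its slot layout and the alternative 'Antenne' are invented for the example and are not taken from the analysis.

```python
# Hypothetical toy knowledge base: for each head concept, the sets of
# alternative concepts (c1 ... cn) that may fill one has-part slot.
HAS_PART = {
    "Telefon": [{"Schnur", "Antenne"}],  # invented disjunction of parts
    "Zimmer": [{"Fußboden"}],            # a single, necessary part
    "Haus": [{"Dach"}],                  # a single, necessary part
}


def has_part_reading(modifier: str, head: str, kb=HAS_PART) -> bool:
    """True if 'head that has modifier as a part' specializes the head.

    A has-part reading is licensed only when the modifier is one of
    several alternatives in the same has-part slot of the head concept.
    """
    for alternatives in kb.get(head, []):
        if modifier in alternatives and len(alternatives) > 1:
            return True
    return False


print(has_part_reading("Schnur", "Telefon"))   # True
print(has_part_reading("Fußboden", "Zimmer"))  # False
```

A slot with a single filler corresponds to a necessary part, so naming it in the modifier adds no information, which is why the test returns False in that case.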
If there is only one component standing in a specific has-part relation to the complex object, this cannot be expressed by an NN-compound because no specialization is expressed. For example, while discussing localizations, I gave the concept for Zimmer (room). The concept is mainly defined by means of the has-part relation holding between different concepts, but not by a disjunction of concepts. Therefore a compound such as *Fußbodenzimmer (*floor room) is odd although a has-part relation is explicitly given between both concepts. To sum up, we record these rules in the conceptual system responsible for the application of the made-of, part-of and has-part relations for NN-compound interpretation:
physical-object(x):
  [bit-of-solid(x), [y | made-of(x,y)] ⇒ [solid(y)]]
  ∨
  [assembly(x), [z | has-part(x,z)] ⇒ [physical-object(z), Card(z) > 2]]
reproduction(r): bit-of-solid(r) ∨ bit-of-paste(r)
  [q | reproduction-of(r,q)] ⇒ [physical-object(q)]
[s u v | physical-object(s), substance(v), part-of(u,s)]
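The selection of the part to which 'made of' applies can be sketched as follows. This is only an illustration of the restrictions summarized above, under assumptions: the dictionary `OBJECTS`, the part names and the material lists are a hypothetical encoding of the hammer, car and cup examples, and the shift to reproductions is not modelled.

```python
from typing import Optional

# Hypothetical toy knowledge: typical materials per part, plus an optional
# 'shell' part. An entry without parts and shell is non-decomposable.
OBJECTS = {
    "Tasse": {"parts": {}, "shell": None},
    "Hammer": {
        "parts": {"stick": {"Holz", "Stahl"}, "head": {"Eisen", "Gummi"}},
        "shell": None,
    },
    "Auto": {"parts": {}, "shell": "bodywork"},
}


def made_of_domain(substance: str, head: str, kb=OBJECTS) -> Optional[str]:
    """Return the part that 'made of substance' applies to, or None.

    None signals a switch to another interpretation (e.g. purpose).
    """
    entry = kb[head]
    if not entry["parts"] and entry["shell"] is None:
        return head  # non-decomposable object: the whole object
    for part, materials in entry["parts"].items():
        if substance in materials:
            return part  # part typically made of this substance
    return entry["shell"]  # otherwise the shell, if the object has one


print(made_of_domain("Holz", "Hammer"))   # stick
print(made_of_domain("Beton", "Hammer"))  # None
```

The None result for Betonhammer mirrors the switch to a purpose interpretation described for that compound.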
5.4.4
Object-Specific Relations
Up to now various conceptual conditions and their contribution to NN-compound meaning have been stated. I examined mixing-relations, spatial constellations of two objects and conditions for applying the relations made-of, part-of and has-part to NN-compound meaning. All these relations are rather general ones and they are not bound to specific object concepts. Conceptual knowledge is represented in the form of concept characterizations and definitions, respectively, and inferences on certain conditions. However, relations other than general ones such as location, made-of and has-part are possible as well. Each object class is related to other object classes by simple and complex relations. These relations do not constitute the purpose of the object, so they have no counterparts in lexical DRSs; they form the concept description by means of links to other concepts. Therefore the use of object-specific relations parallels the use of concept namings for substances. A two-place relation in the concept definition or concept characterization belonging to the head noun is used for NN-compound interpretation when the domain of its argument subsumes the domain of the concept that goes with the modifier. So suitable relations are all relations holding between two concepts. These relations may be complex, however. By way of demonstration, I will represent conceptual knowledge constituting the complete concept underlying Stadt (city):
city-community(x): community(x)
  [y | location-in(x,y)] ⇒ [urban-area(y)]
  [v | has-habitation(x,v)] ⇒ [habitation(v)]
  [w | has-size(x,w)] ⇒ [population(w), Card(w) > 100000]
city-community(r):
  [s | lives-in(r,s)] ⇒ [suburb(s)]
  [t | works-in(r,t)] ⇒ [central-area(t)]
side(a1):
  [a2 | side-of(a1,a2)] ⇒ [river(a2)]
recreation(m):
  [n | has-subdivision(m,n)] ⇒ [museum-collection(n)]
city-area(a): place(a)
  [b | consists-of(a,b)] ⇒ [central-area(b)]
  [c | consists-of(a,c)] ⇒ [suburb(c)]
  [d | consists-of(a,d)] ⇒ [periphery(d)]
  [e | has-part(a,e)] ⇒ [city-area-place(e)]
  [i | has-part(a,i)] ⇒ [building(i)]
  [f | location-in(a,f)] ⇒ [place-of-country(f)]
  [g | location-at(a,g)] ⇒ [side(g), [h | side-of(g,h)] ⇒ [place(…)]]
city-institution(x1): governmental(x1), city-community(x1)
  […] ⇒ [recreation(j)]
  […] ⇒ [welfare(k)]
  [z1 | governs(x1,z1)] ⇒ [city-community(z1)]
  [z2 | elected-by(x1,z2)] ⇒ [city-community(z2)]
The same procedure as for general relations holds for object-specific relations as well: if a concept stands in a certain relation to a different concept and the modifier concept is subsumed by this concept, this relation is a candidate for NN-compound interpretation. The relation may be simple or complex. Complex relations are composed of other relations. Several novel NN-compounds are interpretable by access to this knowledge of cities, e.g.:
Uferstadt (bank city): city area located at a bank. The concept 'city area' is related to sides by 'location-at' and 'bank' is a subconcept of 'side' in the knowledge base above.
Museumsstadt (museum city): city area with a museum building or city institution sponsoring a museum collection. Both concepts 'city area' and 'city institution' stand in specific relations to concepts of the complete concept 'museum'. Buildings are parts of city areas and 'museum building' is a subconcept of 'building'. City institutions are described as sponsoring recreation, and 'museum collection' stands to 'recreation' in the 'has-subdivision' relation.
Bayernstadt (Bavaria city): city area located in Bavaria. The knowledge base declares city areas as located in countries.
Pendlerstadt (commuter city): city area consisting of a suburban and a central area where commuters live and work, respectively. Two composed relations hold between city area and commuters.
These possible interpretations already imply that complex relations can be constructed by means of various procedures. All the procedures are subject to the same principle. Procedures and principle are stated at the end of chapter 5.4.
5.4.5
Conjunctive Compounds
Until now all interpretation strategies were developed for endocentric compounds. These are compounds denoting a proper subset of the head noun extension. However, there are exocentric compounds in German as well, the so-called conjunctive compounds or 'dvandvas'. These are NN-compounds interpretable by conjunction. For instance, Mördergeneral (murderer general) can be interpreted as 'x who is a general and a murderer'. Although such a compound is formed by the same morphological rule as every NN-compound, so that the head is the right-hand element, the interpretation seems to treat both constituents as equal. Interpreting dvandvas by means of conjunction seems to involve no relation coming from lexical representations or conceptual knowledge. So these compounds do not seem to fit into the model proposed. However, a closer look at these compounds is necessary. I will demonstrate that conjunctive compounds are not formed by a separate compounding operation but are subject to conceptual conditions as well [17]. The use of conjunction reflects compatibility of object functions. First, let us have a look at lexicalized dvandvas. These compounds are rather unusual. They arise as proper names (Schleswig-Holstein, Baden-Württemberg) or as appellatives (Mördergeneral (murderer general), Brauereigaststätte (brewery pub), Dichterkomponist (poet composer), Radiouhr (radio clock)). This pattern is still productive but it is highly restricted by our knowledge of object functions and roles of role nouns. Dvandvas require constituents denoting entities of the same ontological type. For example, Theatermuseum (theatre museum) and Dichterfreund (poet friend) are interpretable as conjunctive compounds because their constituents both denote buildings/institutions and persons, respectively. However, compounds such as Computertelefon (computer telephone) and Tassenteller (cup plate) cannot be interpreted as dvandvas although the constituent extensions are of the same type. Nothing is known about objects that may have both functions that the single concepts have. So it is possible to have a friend who is a poet, but until now there is nothing that is a phone and a computer at the same time, although advances in new technologies might change this state in the near future. Incompatibility of functions for interpretations by conjunction becomes more obvious if two nouns denoting objects with completely different purposes, such as Tassengabel (cup fork) or Hammerspiegel (hammer mirror), are combined. Regardless of whether these compounds have any meaning at all, they surely cannot be interpreted as dvandvas. The compounding rule for novel NN-compounds interpreted by conjunction is: If the complete concept of $N_1$ is $C_1 = \{c_1, \ldots, c_n\}$ and the complete concept of $N_2$ is $C_2 = \{d_1, \ldots, d_m\}$ and for all $c_i$,
[17] An approach that goes roughly in the same direction is Fanselow's (1981:116ff.) analysis of dvandvas. He analyses dvandvas formed by two proper names for countries (e.g. Schleswig-Holstein) by a part-of relation and not simply by conjunction. The former surely is a conceptually founded relation.
[y | r(x,y)] ⇒ [Q(y)]
are directly available simple relations for referents x and y if Q subsumes a predicate $P_j$ from the complete concept of $C_1$. The compound Pendlerstadt (commuter city) gets its interpretation by a directly available composed relation. There is a directly available simple relation 'consists-of' for the referents a and b between 'city area' and 'central area' and a directly available relation 'place-of-working' for the referents u1 and v1 between 'central area' and 'commuter'. By viewing conceptual DRSs as models, both u1 and b are elements of [central area]. The compounding rule for novel NN-compounds interpreted by a conceptual relation is as follows: If the complete concept denoted by $K_1$ consists of the set of concepts $\{R_1(a_1), \ldots, R_n(a_n)\}$ and the complete concept denoted by $K_2$ consists of the concepts $\{Q_1(b_1), \ldots, Q_m(b_m)\}$ and for some $i, j$ ($1 \le i \le n$; $1 \le j \le m$) a simple or complex relation $r(b_j, a_i)$ can be determined, the lexical DRS of the compound is:
[y1 y2 ... w R | lexical DRS-conditions of K1, lexical DRS-conditions of K2, y2 = w, R(y1,w)]
The algorithm for relation determination is as follows: Suppose the constituents of a novel NN-compound $[N_1 N_2]$ denote the complete concepts $C_1$ and $C_2$, respectively, in a knowledge base $K_c$ with $C_1 = \{K_{11} \to L_{11}, \ldots, K_{1n} \to L_{1n}\}$ with $R_i(a_i) \in Con_{K_{1i}}$ and $a_i \in$ …, and $C_2 = \{K_{21} \to L_{21}, \ldots, K_{2m} \to L_{2m}\}$ with $Q_j(b_j) \in Con_{K_{2j}}$ and $b_j \in$ …. For every $L_{2j}$ containing a DRS-condition $M_1 \Rightarrow M_2$ with $r(c,d) \in Con_{M_1}$ and $P(d) \in Con_{M_2}$, test whether $[P] \subseteq [R_i]$ holds for some $1 \le i \le n$. If it holds:
1. Take the directly available simple relation r.
2. Search for composed relations between $b_j$ and $a_i$; go stepwise to the superordinate concepts of $C_2$ and $C_1$ and search again for simple and composed relations until the immediately superordinate concepts of the top-concept are reached.
3. After searching, apply inferences meeting DRS-conditions of $C_1$ and $C_2$ in order to determine inferable relations.
This algorithm provides all concept-based relations for interpreting NN-compounds. A compound $[N_1 N_2]$ with the underlying complete concepts $C_1$ and $C_2$ can be interpreted by simple or composed relations $r_1, \ldots, r_m$, found or constructed by the algorithm above. The procedure explains the fact that conceptually founded relations allow no conceptual shifts for the constituents anymore. The relations hold between two specific concepts belonging to the complete concept. Hence no specialization can be required in contexts. I will come back to conceptual shifts below. Inferences for deriving the location relations are given, as are entailments on substance properties for getting various mixing-relations. The discussion of how conjunctive compounds get their interpretation shows that underlying inferences can be very complex. The richness of conceptual knowledge provides many relations possible in NN-compounds; this richness is also the reason why conceptual knowledge currently prevents a complete analysis.
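Steps 1 and 2 of the algorithm can be approximated by a breadth-first search over relation links. The following sketch rests on assumptions: the triple list `RELATIONS`, the subconcept table `ISA`, the helper names and the bound `max_len` are hypothetical, the knowledge base is a small fragment of the city example, and step 3 (applying inferences) is not modelled.

```python
from collections import deque

# Hypothetical fragment of the city knowledge base:
# (concept, relation, argument concept)
RELATIONS = [
    ("city-area", "consists-of", "central-area"),
    ("city-area", "consists-of", "suburb"),
    ("city-area", "location-at", "side"),
    ("central-area", "place-of-working", "commuter"),
]

# Hypothetical subconcept links (concept -> superordinate concept).
ISA = {"bank": "side"}


def ancestors(concept, isa=ISA):
    """Return the concept followed by its superordinate concepts."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = isa.get(concept)
    return chain


def find_relations(head, modifier, relations=RELATIONS, isa=ISA, max_len=3):
    """Return relation chains linking the head to the modifier concept.

    Simple relations (chains of length 1) are found before composed ones
    because the search proceeds breadth-first.
    """
    targets = set(ancestors(modifier, isa))
    queue = deque((start, []) for start in ancestors(head, isa))
    found = []
    while queue:
        concept, path = queue.popleft()
        if path and concept in targets:
            found.append(path)
            continue
        if len(path) == max_len:
            continue
        for source, relation, argument in relations:
            if source == concept:
                queue.append((argument, path + [relation]))
    return found


print(find_relations("city-area", "bank"))      # [['location-at']]
print(find_relations("city-area", "commuter"))  # [['consists-of', 'place-of-working']]
```

The first call corresponds to Uferstadt, where a simple relation is directly available through subsumption; the second corresponds to Pendlerstadt, where the interpretation rests on a composed relation.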
5.5 A Network of Compounding Rules
Now I want to summarize the compounding mechanisms explained above and fit them into a network in order to connect standard interpretations with unusual interpretations.
Ignoring for the moment the possibility of introducing a relation in discourse, there are three levels the relation may derive from: the grammatical level of thematic role assignment, the level of lexical representations and the level of conceptual structures. They are ordered with respect to the saliency of relations in compounds. So if θ-role assignment fails on the basis of selectional restrictions, relations that belong to the lexical representation of the head have to be chosen as preferred interpretations. If these relations fail on the basis of selectional restrictions, conceptually motivated relations have to be chosen as preferred interpretations. Entering a lower level is also possible in order to select alternative relations. That is, either failure or the search for alternative interpretations enables the move to a lower level. Thus one gets this arrangement:

    Theta-role assignment
        |  fail or search for alternatives
    relation from lexical DRS of the head noun
        |  fail or search for alternatives
    conceptually founded relations

The conceptually founded relations are ordered as well, since non-application of one conceptual relation leads to unusual interpretations. There is a distinction between specific conceptual relations of object classes and general conceptual relations, provided by general links between certain domains. At least the relations 'made-of', 'location', 'temporal duration' and 'has-part'/'part-of' belong to the latter. The former are relations between the object class in question and other specific classes.

To start with, I will have a close look at the upper structure of the conceptual system, i.e. the general relations between basic domains and their usage for NN-compound interpretation. 'Made-of' is a mapping from solids or pastes to non-decomposable physical objects. However, individual pieces of substances have an inner space and a surface as possible places for locations. Thus, if the conditions for the made-of relation fail, there may be a switch to an interpretation by a location relation. Depending on the size of the individual piece of substance and of the object, as well as the other properties I have described, location in the inner space of the substance or location on its surface is possible. Therefore, one arrives at the class of location relations. These are relations between a place and an object. This place may be a proper place or a place provided by a piece of substance or an object. If an object provides this place, it is conceptualized on the basis of spatial functions such as supporting, containment or protection. 'Part-of' and its inverse, 'has-part', are relations between complex objects and their components, whereby has-part is possible in NN-compounds only if more than one component may stand in this relation.
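The ordering of the three levels can be sketched as a simple fallback cascade. The following Python fragment is only an illustration under stated assumptions: the function name `interpret` and the dictionary of candidate relations are hypothetical, and the way each level actually finds its relations is not modelled. The candidate relations shown are the two readings of Marmormuseum discussed in this chapter.

```python
# Levels in decreasing order of saliency, as in the arrangement above.
LEVELS = [
    "theta-role assignment",
    "lexical DRS of the head noun",
    "conceptually founded relations",
]


def interpret(candidates, search_alternatives=False):
    """Return (level, relations) pairs, most salient level first.

    `candidates` maps a level name to the relations that succeeded there;
    a missing or empty entry counts as failure on that level.
    """
    result = []
    for level in LEVELS:
        relations = candidates.get(level, [])
        if not relations:
            continue  # failure: move to the next lower level
        result.append((level, relations))
        if not search_alternatives:
            break     # the most salient interpretation is enough
    return result


# Hypothetical input for Marmormuseum: a sortal head, so no theta-role reading.
marmormuseum = {
    "lexical DRS of the head noun": ["theme of exhibiting"],
    "conceptually founded relations": ["made-of"],
}

print(interpret(marmormuseum))
print(interpret(marmormuseum, search_alternatives=True))
```

The first call stops at the most salient level that succeeds; the second call models the search for alternatives and also descends to the conceptual level.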
Object-specific relations and relations from more general concepts are preferable relations for NN-compound interpretations if the modifier concept is subsumable by the relation's argument. The algorithm for searching simple and composed relations in a concept lattice first looks for relations between the concepts denoted by the nouns; relations between more general concepts are then found stepwise. If no object-specific relations or their generalizations are applicable, general relations such as made-of, part-of and has-part should be tested. If these relations fail, a location may be used, depending on the spatial functions the objects have. Thus one gets this figure for interpretation strategies from conceptual object-specific relations to general conceptual relations with respect to compound interpretation:

[Figure: interpretation strategies, beginning with relations between concepts forming the complete concepts of $C_1$ and $C_2$]

Thus, Marmormuseum might be understood in at least two ways: as a building made of marble or as the complete concept 'museum' exhibiting on marble. The former interpretation does not allow any contextual variation with respect to the constituents' extensions, while the latter is open to variability of the head's extension. The reason for this difference lies in the locus of the relation: conceptually founded relations seem to restrict both arguments to certain domains, while relations coming from lexical representations only restrict the meaning of the second item to specific readings.

Let us have a look at the second example, Museumsbuch (museum book). Both nouns are subject to contextual meaning variability. Buch can at least denote the physical object or the information mediated by that physical object. Its lexical representation is given above. Museumsbuch can have different meanings, many more than Marmormuseum. Some interpretations are:

1. book (information mediator) on museums (complete concept)
2. book (information mediator) written by museum (institution)
3. book (physical object) sold by museum (institution)
4. book (physical object) sold in museum (building)
5. book (physical object) located in museum (building)
6. book (complete concept) published by museum (institution)

Again, the use of a relation that belongs to the lexical representation of the head is always applicable, independent of the head's conceptual shift. For instance, it is possible to use the compound with the relation of having a theme:

1. Das Museumsbuch kostet 20 DM. (The 'museum book' (physical object) costs DM 20)
2. Das Museumsbuch ist sehr informativ. (The 'museum book' (book information) is very informative)
3. Das Museumsbuch wurde Margaretes Verhängnis. (The 'museum book' (complete concept) was Margaret's undoing)

As (3) indicates, rather general domains may be assigned, depending on the activated scenario. The proposition of (3) is too general to yield a precise statement of what is meant by book, although the meaning of the compound is fixed and there is a relation (undoing) between the compound and the proper name. DRS and conceptual representation are:
    Universe:   x  v  e  r  t
    Conditions: X(x, p)
                p = < {y, z, e1, e2}, {publ-comp-inst(z), publishing(e1), mediating(e2), theme(e2, y), agent(e1, z), y = v} >
                [w] X(w, q)
                q = < {e3, e4, u}, {exhibiting(e3), informing(e4), theme(e3, u), theme(e4, u)} >
                become(e), agent(e, x), theme(e, r), undoing(r), of(r, t), Margaret(t)
                w = v
Its conceptual representation:

[Figure: conceptual representation as a network of conditional DRSs. Recoverable conditions include object(a), info-carrier(a,y), serves-for(a,v), mediating(v), theme-of(a,w), publishing(w), distributed-by(a,u), sold-in(a,m), building(m), publ-comp-inst(j), information(a), written-by(a,c), person(c), available-by(a,s), selected-by(a,r), and the disjunction museum-institution(z) ∨ museum-building(z) ∨ museum-staff(z) ∨ museum-collection(z).]
There is no information whether the object 'book' is meant or the information carried by this physical object. However, (3) may be enlarged by further information in order to state the head concept more precisely:

• Wenn Margarete das Museumsbuch in die Tasche steckt, wird es ihr Verhängnis. (If Margaret puts the 'museum book' into the bag, it will be her undoing)

Given the same meaning for the compound, one can now determine the domain 'physical object' for the head noun on the basis of selectional restrictions. The point is not that the DR-structure differs from the DRS for (3), but that a specific concept can be determined for the head, triggered by the verb stecken (roughly: to put). One gets as DRS for (21)[20]:
    Universe:   x  v  m  e  n
    Conditions: X(x, p)
                p = < {y, z, e1, e2}, {publ-comp-inst(z), publishing(e1), mediating(e2), theme(e2, y), agent(e1, z), y = v} >
                [w] X(w, q)
                q = < {e3, e4, u}, {exhibiting(e3), informing(e4), theme(e3, u), theme(e4, u)} >
                w = v
                [in die Tasche steckt]
        =>
    Universe:   r  e'  t  k
    Conditions: undoing(r), become(e'), agent(e', k), theme(e', r), patient(e', t), t = m
                k = … ∨ k = …
This conceptual representation contains the concept for the physical object 'book' because only physical objects can be put into bags. Thus, a relation from the lexical representation of the head noun can always be applied if the modifier extension meets the selectional restrictions provided by the relation. The relation's selectional restrictions also trigger the conceptual shift of the modifier. Specializing the head noun extension is effected by selectional restrictions provided by the theta-role assigning lexical item or by script conditions.

20 Although the verb enables domain specification for the head noun of the compound, another specialization possibility occurs. It is not clear in this mini-discourse whether the pronoun es refers to the compound or to the event of putting. Therefore the corresponding discourse referent is not an argument of a condition that fixes a particular domain. The referent can be stated more precisely either as an event or as a physical object. This problem of ambiguity seems to support the approach of treating domain specifications as specializations of meaning during discourse processing. Predicates that are place holders for conditions from lexical DRSs are printed in italics.
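The way selectional restrictions specialize the head noun extension can be sketched in a few lines of Python. This is a deliberately small illustration, not part of the formal account: the tables `HEAD_DOMAINS` and `RESTRICTIONS` and the function `specify_domain` are hypothetical, and they encode only the readings of Buch and the predicates discussed in the Museumsbuch examples.

```python
# Domains available to the head noun Buch (hypothetical encoding).
HEAD_DOMAINS = {"Buch": {"physical object", "information"}}

# Domains a predicate accepts for its argument; None means no restriction.
RESTRICTIONS = {
    "stecken": {"physical object"},      # only physical objects go into bags
    "informativ sein": {"information"},
    "Verhängnis werden": None,           # too general to fix a domain
}


def specify_domain(head, predicate):
    """Return the head domain fixed by the predicate, if there is exactly one."""
    allowed = RESTRICTIONS.get(predicate)
    domains = HEAD_DOMAINS[head]
    if allowed is None:
        return "complete concept"        # nothing triggers a conceptual shift
    matching = domains & allowed
    if not matching:
        return None                      # the selectional restriction is violated
    if len(matching) == 1:
        return next(iter(matching))      # conceptual shift to a specific domain
    return "complete concept"            # still underdetermined


print(specify_domain("Buch", "stecken"))
print(specify_domain("Buch", "Verhängnis werden"))
```

With stecken the head is narrowed to the physical object, whereas the predicate of example (3) leaves the complete concept in place.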
I pointed out in chapter 5.1 that NN-compounds with relational heads do not allow conceptual shifts. Since I stated above that conceptual shifts are determined by discourse information and therefore restrict possible meanings of NN-compounds, I have to reverse this view: NN-compounds with relational head cannot be subject to conceptual shifts of the head. Conceptual shifts can also trigger the search for conceptually founded relations that are bound to the specific concepts determined by the shift. Saliency of a relation is determined by its locus in different knowledge sources. If meaning variability exists for the head and two-place relations are contained in its lexical representation, the two-place relations can be used for NN-compound interpretation. Therefore in discourse the most salient interpretation is always applicable, because it is part of the context-invariant representation. Conceptual relations are only applicable to specific extensions of the head that are determined by a conceptual shift. Hence conceptual shifts for novel NN-compounds are only triggered for the head noun by a lexical item or script conditions. Based on the concept(s) determined, conceptual relations connected to the concept(s) are available for compound interpretation as well as two-place relations of the head's lexical DRS or argument satisfaction.
5.7 Summary and Conclusion
A proposal was made for distinguishing between different kinds of compounding processes and their order with respect to applicability and saliency of interpretation. I pointed out that saliency of a relation in a novel NN-compound is determined by the level the relation derives from. The syntactic operation of argument satisfaction leads to the most salient interpretation of NN-compounds with a relational head. NN-compounds with a sortal head get their most salient interpretation from a relation belonging to the lexical meaning of the head noun. Conceptual relations are determined by object-specific properties and inferences on certain conditions. These relations are connected with each other so that non-applicability of one relation leads to alternative interpretation possibilities. The 'entrance' into the set of conceptual relations is given by the type of noun concatenation. For instance, a [solid object] combination leads to the made-of relation. To repeat, NN-compounds with a modifier denoting a concept that stands necessarily in a relation with the head concept are blocked, because NN-compounds must denote a specialization of the head noun. Utterance meanings of novel NN-compounds are determined by co-operation of lexical representations of the constituents with the actual relation and conceptual shifts. The exposition above explains what possible relations are, where they come from and why some are better relations for compound interpretation than others. I also pointed out when and how these possible relations are available after a conceptual shift of the head. There is only one possibility for applying relations independent of the shift: the relation of the compound is part of the head's lexical meaning.
Thus, the task of the next chapter will be to determine actual relation(s). We will see, however, that actual relations are not a proper subset of the set of possible relations. In a certain discourse, an NN-compound can get a meaning which would never have been assigned to it in isolation. But before I go into this topic, I will summarize the compounding rules stated in this chapter.

The compounding rule for novel NN-compounds with a relational head is: If the lexical DRS of the head noun $N_2$ is $K_{N_2} = \lambda y_2 \lambda y_1 \langle \{y_1, y_2, \ldots\}, \{\text{lexical DRS-conditions of } K_2\} \rangle$ and the lexical DRS of the modifying noun is $K_{N_1} = (\lambda x_2) \lambda x_1 \langle \{(x_2), x_1, \ldots\}, \{\text{lexical DRS-conditions of } K_1\} \rangle$ and $K_{N_2}$ refers to the complete concept $C_2 = \{Q_1(b_1, c_1), \ldots, Q_m(b_m, c_m)\}$ and $K_{N_1}$ refers to the complete concept $C_1 = \{P_1(a_1), \ldots, P_n(a_n)\}$ and there are selectional restrictions $A_i(c_i)$ ($1 \le i \le m$) and, for some $i, j$ ($1 \le j \le n$), subsumption($A_i$, $P_j$) holds, then the lexical DRS $K'$ of $[N_1\ N_2]$ is:
    $\lambda y_1$
    Universe:   $y_1 \; y_2 \; \ldots$
    Conditions: lexical DRS-conditions of $K_2$
                $[x_1] \; x_2 \; \ldots$ : lexical DRS-conditions of $K_1$
                $x_1 = y_2$
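The rule for relational heads can be sketched as follows. The Python fragment is an assumption-laden illustration: the `DRS` class, the `IS_A` hierarchy and the function `compound_relational` are hypothetical, and the lexical entries (a head 'driver' whose argument must be a vehicle, a modifier 'taxi') are invented for the example and do not come from the text. The sketch only shows the two ingredients of the rule, the subsumption test on the selectional restriction and the identification $x_1 = y_2$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DRS:
    universe: tuple     # discourse referents
    conditions: tuple   # DRS-conditions, written as plain strings


# Hypothetical is-a hierarchy: concept -> immediately superordinate concept.
IS_A = {"taxi": "vehicle", "vehicle": "physical object"}


def subsumption(general, specific):
    """True if `general` equals `specific` or lies above it in IS_A."""
    while specific is not None:
        if specific == general:
            return True
        specific = IS_A.get(specific)
    return False


def compound_relational(head, head_arg, restriction, modifier, mod_ref, mod_concepts):
    """Merge head and modifier DRS if the modifier satisfies the restriction."""
    if not any(subsumption(restriction, p) for p in mod_concepts):
        return None  # argument satisfaction fails on selectional restrictions
    return DRS(
        universe=head.universe + modifier.universe,
        conditions=head.conditions + modifier.conditions
        + (f"{mod_ref} = {head_arg}",),  # the modifier referent fills the argument
    )


# Invented lexical DRSs for illustration only.
head = DRS(("y1", "y2"), ("driver(y1)", "drives(y1, y2)"))
modifier = DRS(("x1",), ("taxi(x1)",))

print(compound_relational(head, "y2", "vehicle", modifier, "x1", ["taxi"]))
```

If the restriction is not met (for instance a hypothetical restriction 'person'), the function returns `None`, which corresponds to the move to a lower level in the network of compounding rules.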
The compounding rule for novel NN-compounds where a two-place relation is included in the lexical representation of the sortal head noun: If the lexical DRS of the head noun $N_2$ is $K_{N_2} = \lambda y \langle \{y\}, \{X(y, p),\ p = \langle \{z_1, \ldots, z_n\}, \{\text{conditions with: } r(y, z_i)\} \rangle\} \rangle$ and the lexical DRS of the modifier $N_1$ is $K_{N_1} = (\lambda x_2) \lambda x_1 \langle \{x_1, x_2\}, \{\text{lexical DRS-conditions of } K_1\} \rangle$ and $K_{N_2}$ refers to the complete concept $C_2 = \{Q_1(b_1), \ldots, Q_m(b_m)\}$ and $K_{N_1}$ refers to the complete concept $C_1 = \{P_1(a_1), \ldots, P_n(a_n)\}$ and, for some $Q_i(b_i)$, its concept description contains the denotation of $r(y, z_i)$, which requires a concept $C(c_i)$ for its second argument, and for some $i$ ($1 \le i \le n$), subsumption($C$, $P_i$) holds, then the lexical DRS $K'$ of $[N_1\ N_2]$ is:
    $\lambda y$
    Universe:   $y \; w \; \ldots$
    Conditions: $X(y, p)$
                $p = \langle \ldots, \{\text{conditions with: } z_i = w\} \rangle$
                lexical DRS-conditions of $K_1$
                $x_1 = w$
Conjunctive compounds must meet compatibility of their conceptual functions. Their compounding rule is: If the complete concept of $N_1$ is $C_1 = \{c_1, \ldots, c_n\}$ and the complete concept of $N_2$ is $C_2 = \{$