The Semantic Representation of Natural Language
Bloomsbury Studies in Theoretical Linguistics

Bloomsbury Studies in Theoretical Linguistics publishes work at the forefront of present-day developments in the field. The series is open to studies from all branches of theoretical linguistics and to the full range of theoretical frameworks. Titles in the series present original research that makes a new and significant contribution and are aimed primarily at scholars in the field, but are clear and accessible, making them useful also to students, to new researchers and to scholars in related disciplines.

Series Editor: Siobhan Chapman, Reader in English, University of Liverpool, UK.

Other titles in the series:
Agreement, Pronominal Clitics and Negation in Tamazight Berber, Hamid Ouali
Contact Linguistics and Corpora, Cedric Krummes
Deviational Syntactic Structures, Hans Götzsche
First Language Acquisition in Spanish, Gilda Socarras
Grammar of Spoken English Discourse, Gerard O'Grady
A Neural Network Model of Lexical Organisation, Michael Fortescue
The Syntax and Semantics of Discourse Markers, Miriam Urgelles-Coll
The Syntax of Mauritian Creole, Anand Syea
The Semantic Representation of Natural Language
Michael Levison, Greg Lessard, Craig Thomas and Matthew Donald
LONDON • NEW DELHI • NEW YORK • SYDNEY
Bloomsbury Academic An imprint of Bloomsbury Publishing Plc 50 Bedford Square London WC1B 3DP UK
175 Fifth Avenue New York NY 10010 USA
www.bloomsbury.com First published 2013 © Michael Levison, Greg Lessard, Craig Thomas and Matthew Donald, 2013 All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage or retrieval system, without prior permission in writing from the publishers. The authors have asserted their right under the Copyright, Designs and Patents Act, 1988, to be identified as Authors of this work. No responsibility for loss caused to any individual or organization acting on or refraining from action as a result of the material in this publication can be accepted by Bloomsbury Academic or the author. British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library. EISBN: 978-1-4411-0902-6 Library of Congress Cataloging-in-Publication Data Levison, Michael. The semantic representation of natural language / by Michael Levison . . . [et al.]. p. cm. -- (Bloomsbury studies in theoretical linguistics) Includes bibliographical references and index. ISBN 978-1-4411-6253-3 -- ISBN 978-1-4411-0902-6 (pdf) -- ISBN 978-1-4411-9073-4 (ebook) 1. Semantics--Data processing. 2. Natural language processing (Computer science) 3. Knowledge representation (Information theory) 4. Computational linguistics. I. Title. P325.5.D38.L457 2012 006.3′ 5--dc23 2012030495 Typeset by Newgen Imaging Systems Pvt Ltd, Chennai, India
To Ann, Barbara, Shona and Angela
Contents

List of Figures
List of Tables
Preface
Typographical Conventions
1 Introduction
   1.1 What we are trying to do
   1.2 The jewel in the crown
   1.3 How to read this book
2 Basic Concepts
   2.1 Semasiological and onomasiological perspectives
   2.2 Meaning and reference
   2.3 Describing or creating reality
   2.4 The functions of language
   2.5 Semantic units and semantic relations
   2.6 Language, knowledge and perspective
   2.7 Anthropomorphism, minimalism and practicality
   2.8 Desiderata
3 Previous Approaches
   3.1 Lexical semantics
   3.2 Conceptual structures
   3.3 Lexical relations and inheritance networks
   3.4 The generative lexicon
   3.5 Case grammar
   3.6 Conceptual Dependency
   3.7 Semantic networks
   3.8 Systemic grammar
   3.9 Truth-functional perspectives
   3.10 Computational tools for semantic analysis
   3.11 Models of text structure
   3.12 Narrative structure and narrative prose generation
   3.13 Knowledge representation and ontologies
   3.14 Models of reasoning: ACT* and ACT-R
   3.15 In sum
4 Semantic Expressions: Introduction
   4.1 Background
   4.2 Some caveats
   4.3 Basic expressions
   4.4 Semantic types
   4.5 Semantic functions
   4.6 The constant UNSPEC
   4.7 Adjustments
   4.8 Qualifiers
   4.9 Relative qualifiers
   4.10 Restriction versus description
   4.11 Lists
   4.12 Circumstances
   4.13 Modifying completions
   4.14 Adjustments versus functions
   4.15 Modifying completions versus modifying actions
   4.16 Combining completions
   4.17 Representing semantics, not syntax
5 Formal Issues
   5.1 Introduction
   5.2 Properties of SEs
   5.3 Semantic tree
   5.4 The semantic lexicon
6 Semantic Expressions: Basic Features
   6.1 Introduction
   6.2 Generalized quantifiers
   6.3 Count and mass entities
   6.4 Quantifier granularity
   6.5 Negatives
   6.6 Only
   6.7 Numbers
   6.8 On the granularity of snow
   6.9 Nobody and everybody
   6.10 Sets and set-difference
   6.11 Interrogatives
   6.12 Tense, aspect and modal verbs
   6.13 Sequencing
   6.14 Timestamps
   6.15 Specific and non-specific entities
7 Advanced Features
   7.1 Co-referential relations
   7.2 Entity constants
   7.3 Other constants
   7.4 Scope of definition
   7.5 Lists
   7.6 Connectives
   7.7 Associativity and commutativity
   7.8 Propagation of adjustments
   7.9 Other programming features
   7.10 Functional programming
   7.11 Duelling quantifiers
   7.12 Two-dimensional quantifiers
   7.13 Arrays
   7.14 Football fans
   7.15 only Revisited
   7.16 Mass entities
   7.17 Speakers, listeners and speech
   7.18 Vale, Caesar
8 Applications: Capture
   8.1 What to represent?
   8.2 Adventure
   8.3 A guided tour
   8.4 Instruction manual or recipe book
   8.5 Topoi
9 Three Little Pigs
   9.1 Preliminaries
   9.2 The story
   9.3 Alternative segment for bad_encounter
   9.4 Length issues
10 Applications: Creation
   10.1 A hole in three
   10.2 A Proppian fairy tale
   10.3 Variant stories
   10.4 Romeo and Juliet
   10.5 In sum
Bibliography
Index
List of Figures

1.1 The intersection of language, logic and computation
1.2 The place of a semantic representation in linguistic processing
3.1 A system network for the choice of English pronouns
3.2 The relationship between nucleus and satellites in Rhetorical Structure Theory
5.1 A semantic expression as tree
5.2 A more complex semantic expression
List of Tables

2.1 Jakobson's six functions of language
3.1 Componential analysis of several motorized vehicles
4.1 Some typical entity relations
6.1 Some quantity terms in English and French
6.2 Some examples of use of quantity expressions
6.3 Some modal expressions and their meanings
6.4 Events over time: the lasagna case
6.5 Relations between two durative activities
6.6 Relations between a durative and a punctual activity
6.7 Relations between two punctual events
7.1 Elements of The House that Jack Built
7.2 Array for veg(man)
7.3 A two-dimensional array
7.4 Some possible array types
7.5 Stabbing senators
7.6 Generalization of block arrays
7.7 Arsenal and Chelsea supporters array
7.8 Array for 'Brutus is a vegetarian'
7.9 Array for 'Only Brutus is a vegetarian'
7.10 Array for 'Only Brutus stabbed Caesar'
7.11 Array for 'Brutus stabbed only Caesar'
7.12 Array for Brutus[only]
7.13 Array for Brutus only stabber, Caesar only victim
7.14 Array for only Brutus as stabber of only Caesar
7.15 Array for only Brutus as stabber and only Caesar as victim
10.1 Cast of characters for a folktale
10.2 Semantic expressions for the sample folktale
Preface

One of the primary roles of a university is to bring together scholars with different areas of expertise and different outlooks and opinions, allowing the creation of knowledge that they would be unable to derive individually. The research reported in this monograph provides an example of that.

Michael Levison, born in the last decade BC (Before Computers), obtained an undergraduate degree in Mathematics, before moving to Birkbeck College in the University of London, England in 1958, to join the department headed by A. D. Booth, pioneer of computing and of computer applications to linguistic and literary problems. He subsequently moved to the School of Computing at Queen's University in Canada. Greg Lessard, born in the first decade CE (of the Computer Era), is a professor in French Studies, especially lexicology and morphology, who has participated in a variety of projects in the 'digital humanities', including work on topoi, 18th Century book titles, Stendhal and orality in Canadian French.

They met (in 1986, so that at the time of writing, their Silver Anniversary has just passed) in the context of language learning. The intent was to design a computer program which would generate drill exercises in grammar for students learning various languages. The idea had been explored in small programs written earlier for two professors in Spanish and Italian: J. K. McDonald and D. Bastianutti. The programs, known collectively as Q'Vinci, were fairly simple. Their syntaxes were flat, their vocabularies very limited, and along with the necessary morphology, they were built directly into the programs themselves. In a meeting which brought together specialists from a variety of areas, a more ambitious project was defined, involving arbitrary languages, each defined by a syntax of context-free and transformational rules, a comprehensive lexicon and a formal morphology. The resulting implementation was, and is, a natural-language generation system, named VINCI in honour of the earlier work.
The ongoing development of VINCI continues to this day, and many side paths, as well as applications far beyond that originally envisaged, have been explored. In these, we have been assisted by many of our students, in Computing, French Studies, Linguistics and so on.

Several of the applications of VINCI referred to above, including language-learning, suggested the need for a practical formal representation of natural language meaning. It was perhaps our colleague David Lamb, in the 1990s, who first drew our attention to this, by asking whether VINCI could be used within one of the textual adventure games (Adventure, Zork, . . .) to generate the descriptions of "caves" in whatever language a user desired. In principle, of course, the answer is Yes. The meanings of the stored descriptions could be represented in a formal notation, and the sentences generated as required. And it was also preferable that the language exercise examples generated by VINCI should form meaningful utterances, rather than a random collection of grammatically correct words. And so, our concepts were born.

Craig Thomas worked for the project as a Summer research assistant during some of his time as an undergraduate in Computing. He went on to complete his Master's degree on the topic "A Prosodic Transcription Mechanism for Natural Language Generation" (Thomas 2002), which resulted in the production of spoken French as VINCI output. After a period away from university, he returned to obtain a PhD degree with a thesis on "The Algorithmic Expansion of Stories" (Thomas 2010), in which he followed up some of the applications proposed in Chapter 10 of this monograph.

Matthew Donald joined our group as a Master's student in Fall 2004, after completing his undergraduate degree at the University of Waterloo in Ontario, Canada. His research produced a systematic study of the representation of several linguistic features, and his dissertation: "A Metalinguistic Framework for Specifying Generative Semantics" (Donald 2006) is the origin of some of the examples in later chapters of this book.

It is sometimes believed, particularly in the humanities, that a lab is a sterile place filled with test tubes and arcane instruments. In fact, as anyone who has worked in a lab can attest, it is first and foremost a social place where ideas are played with and people interact. Over the years of our project, we have had the pleasure of interacting with a wide range of students, graduate and undergraduate, and with colleagues -- some of them from outside computing or linguistics -- who
have listened, usually patiently, to our various attempts to explore this area. We thank them all here.

The production of a book like this requires the contribution of many people. We would like to single out here for particular thanks the series editor, Siobhan Chapman, Gurdeep Mattu and Laura Murray of Bloomsbury, and the production team for their efficiency and care in the preparation of this volume. We also thank Anthony Brohan, a linguistics student at Queen's, for his help in preparation of the index. We would also be remiss if we did not thank the Social Sciences and Humanities Research Council of Canada for their continued support over many years. Without this support, we would not have been able to construct our lab or involve the students it has been a pleasure to work with.

Finally, a word to others in the field of computational semantics. We suspect that in some respects our work will seem unsurprisingly familiar -- yet another functional representation of meaning -- but in others it may seem distinctly odd -- starting from meaning rather than form, lacking a truth-functional perspective, not discussing inference. We beg their indulgence and ask only that they consider whether anything of what we propose can provide a useful complement to current work in the field.

Michael Levison
Greg Lessard
Craig Thomas
Matthew Donald
Typographical Conventions

In this document, we distinguish the following sorts of elements:

• Strings in a natural language are represented either within double quotes, as in: The sentence "Brutus killed Caesar" contains three words. or in regular font within numbered examples.

• Meanings are contained within single quotes, following the usual linguistic convention, as in: The traits 'human' and 'capable of self-locomotion' are both found in this word.

• Semantic Expressions are represented in typewriter font, as in: The meaning of this sentence may be captured by the expression stab(Brutus, Caesar).

• Terms are represented in boldface font, as in: The model we are using is based on principles of functional programming.

• Emphasized items are represented in italic font, as in: The transformation of mass to count entities is particularly complex. or occasionally in single quotes, as in: This is 'occasionally' the case.
1 Introduction
1.1 What we are trying to do

Since the work presented here falls to some extent outside the usual frameworks, it is perhaps important to begin by showing how it relates to the dominant tendencies in the fields of logic, computation and language, and the research areas which span them, as shown in Figure 1.1. The arrows which link the ovals in the figure illustrate the three research areas we have in mind. Thus, the combination of computation and logic underlies the broad fields of theorem proving, based on the computational representation of logical formulae and manipulations based on these, as well as attempts to represent human activities and decision-making in a computational framework. The manipulation of natural language is not a primary consideration in such cases, although language may be touched upon. The combination of logic and language, on the other hand, has as its goal the representation of the meaning of linguistic phenomena, usually but not always single sentences, in some logical formalism, typically with a goal of permitting the calculation of truth values and logical implications. So, for example, Montague semantics (Dowty 1981) and its derivatives attempt to provide a calculus of the truth values of some particular set of linguistic devices such as quantification, tense and so on, as we will see in more detail in Chapter 3. Finally, the combination of computation and linguistics has as its primary goal the use of computational frameworks and devices to analyse or to synthesize

See, for example, Robinson (2001).
A good example of this can be found in the ACT* and later ACT-R frameworks. For an earlier discussion, see Anderson (1983) and for a more recent overview of the formalism, see http://act-r.psy.cmu.edu/.
[Figure 1.1 The intersection of language, logic and computation: three overlapping ovals labelled Logic, Language and Computation.]
various sets of linguistic phenomena. Some of these frameworks make only limited reference to meaning (think of speech synthesis or dictation software) but others, such as parsers, attempt to decompose linguistic utterances into their component parts, sometimes descending into the representation of meaning.

Our work is situated in this third area. As a result, although we will draw upon discoveries made in other areas such as the combination of logic and language, our primary goal is not to arrive at a logical formalism as such. Similarly, we will occasionally take account of computational representations of logical relations, but we are not attempting to arrive at a framework whose primary goal is to demonstrate the truth of utterances. In fact, we will argue that a vast number of meaning phenomena make no reference to truth yet still require representation. Our approach also differs from most approaches to the representation of meaning in choosing an onomasiological perspective. In other words, rather than concerning ourselves with the question "What does the utterance X mean?", we ask rather "How can we represent the meaning Y?" Of course, to be accessible to other humans, a meaning must be carried by some formal signal at some point, but we treat this problem as secondary. More precisely, the object of this work is to present one particular means of representing meaning. In some respects, our work sits at the edge of natural language generation (NLG), in that we presuppose a chain between some meaning and the set of choices which eventually lead to production of some string in some natural language. However, we differ from most work in NLG, often characterized by a focus on specific sub-domains, by aiming at a general representation of meaning.

In the ensuing text, we will use the acronym NLG to refer to both natural language generation and natural language generator.
Although we have done considerable work in NLG,⁴ we will not be concerned here with showing in detail how some semantic representation is instantiated in some language. We also believe that the representation of meaning must be scalable, applicable to utterances of any length ranging from simple sentences to complex texts. For this to be possible, a semantic representation must, we will argue, draw upon a number of principles from the world of programming, principles which have made possible the construction of programs of some considerable complexity. Finally, we believe that, in order for a semantic representation to be available for use beyond the realm of specialists, it should be as simple and intuitive as possible. Our goal is to arrive at a formalism which could be used by a non-linguist, a non-computer programmer or a non-logician. We believe that in this way, we can achieve a certain democratization of work in these areas by drawing in others like literary specialists, psychologists, sociologists and those interested in representing some aspect of meaning.
1.2 The jewel in the crown

One consequence of the approach we are proposing is to place meaning at the centre of the study of language, like the jewel in a crown. Figure 1.2 illustrates this by means of an idealized environment for the major computational applications involving natural languages. From right to left, L1 and L2 are typical natural languages which form the basis for texts in these languages. The meaning attributed to these texts will be affected by the vagaries of the particular language. It may be imprecise or even ambiguous, and can often be understood only in conjunction with some knowledge shared between interlocutors.⁵ NLU1 and NLU2 are natural language understanders, that is, pieces of software which take text in a natural language as input, and analyse it to extract its meaning. In our proposed model, this will be expressed as a collection of semantic expressions (hereafter, SEs).⁶ There will presumably be an NLU for each natural language, though they will doubtless share many features.

⁴ For details, see www.cs.queensu.ca/CompLing.
⁵ An example of this complex interplay in the context of the interpretation of a short passage may be found in Hobbs et al. (1993).
⁶ We use the term semantic expression in a general sense here to refer to an expression in a semantic formalism which represents a segment of meaning. Later on (from Chapter 4 onwards) we will use it exclusively in reference to our own formalism.
[Figure 1.2 The place of a semantic representation in linguistic processing. The diagram links human ideas, a requirements outline, NLU 1/NLU 2 and NLG 1/NLG 2, texts in languages L1 and L2, a knowledge base and various processes around a central pool of semantic expressions (SEs).]
The converse of an NLU is a natural language generator (NLG). This takes some meaning, that is, a set of SEs, and realizes them in a natural language. There may be an NLG for each natural language, or a single NLG with a lexicon and grammar specific to each language. The desired output of an NLG is typically something close to that which a native speaker of the language might produce.⁷ We do not differentiate here between text in a written form and text in an oral form. We assume that an NLU may have a speech recognizer as its front-end to permit spoken input, while an NLG may be linked to a speech generator to produce oral output. Both are beyond the scope of the current discussion. We might also wish to include a 'style' parameter, telling the NLG what variety of output is desired in a whole sequence of productions: normative, literary, well-educated, hemingway-esque, popular, iambic pentameters and so on. See for example DiMarco and Hirst (1993). Or, perhaps stylistic variation will vary in a more fine-grained way, within specific utterances, and must be captured in the semantic expressions themselves. For fine-grained stylistic variation, we

⁷ At a conference of computational linguists a few years ago, a participant characterized this task as "beyond the limits of human endeavour", even "impossible". Such a view, however, ignores the fact that some seven billion computers, each no bigger than a melon, are programmed to carry out this very task. We call them "human brains", and they allow humans to speak to and understand others on a daily basis. In the current state of human knowledge, we may have only a limited understanding of their working, and do not know if the process they use is the most effective. But in fifty, or a hundred, or five hundred, years? In the words of the Chinese philosopher Lao-tzu, suitably instantiated in English: "A journey of a thousand miles begins with a single step."
might want to introduce a controller, which is given a list of SEs, each with a style parameter. The controller sends a sequence of SEs to the appropriate NLG, accepts the output, then sends the next batch, etc.

A major application of computational linguistics is the machine translation (MT) of natural language. This may be represented as simply the combination of an NLU and an NLG. The meaning of a text in the source language is captured and rendered into a semantic representation by an NLU. The meaning is instantiated in the target language by an NLG.⁸

At an academic conference, a participant challenged one of the authors with the statement "You are merely proposing an interlingua, an idea which was discredited 40 years ago."⁹ As its name suggests, an interlingua is a language which serves as an intermediary in translating from one natural language to another. The idea arose out of a numerical observation which is still true today: that if there are n languages, n(n-1) MT programs are required in order to translate any language to any other. Thus 50 languages require 2,450 translation programs. If, however, we design and use an intermediate language, this can be reduced to 2n programs, 100 in our example. In effect, half of these play the role of NLUs, translating from each natural language to the interlingua; half, the converse role of NLGs.

In the 1960s, the key question was what kind of language the interlingua should be. One argument claimed that it would have to include the (syntactic) features of every natural language. This, of course, defeats its purpose. The number of MT programs might be reduced to 2n, but each of the NLG-type programs in such a model must handle n times the complexity of a single-language MT program, thereby negating the savings! Another argument pointed out that one of the natural languages might itself serve as the interlingua, reducing the number of programs to 2n-2. It is an interesting thought that we might translate French to Italian by first translating it to Chinese!

But, we may ask, aside from the numerical observation mentioned above, should an interlingua play a role in MT? Our belief is that a semantic intermediary is the only way in which MT can be fully successful.¹⁰ In fact, this may be the

⁸ This is a very general description of the process. For a detailed description of an actual machine translation system, see Rosetta (1994).
⁹ This is not strictly accurate. We are proposing the form which a formal semantic representation should take. Our examples in later chapters illustrate what we have in mind; they are not themselves intended to be part of any specific instance of an interlingua.
¹⁰ Yngve (1964) made a similar point almost half a century ago when he wrote: "Work in mechanical translation has come up against what we will call the semantic barrier ... we will only have adequate mechanical translations when the machine can 'understand' what it is translating ..."
most cogent justification for a formal representation of meaning. To demonstrate this, consider the following: translation may be considered on three levels: the translation of words or phrases, the translation of syntax, the translation of meaning. The progenitors of MT had in mind nothing more than the first. In effect, the computer was to play the role of a dictionary. It was soon appreciated that this was inadequate, and attention turned to translation at a syntactic level, spurred on by the work of Chomsky on the formalization of syntax (Chomsky 1956). In the years which followed, when research on MT was highly funded and at its zenith, almost all efforts were concentrated on syntax-based translation. Doubtless, this was in the interests of practicality – of achieving results within a reasonably short period. The fact is, however, that it ignores the approach adopted by every expert human translator.

Typical expert human translators are expected to be familiar with the subject area of the document to be translated. They read the source document once or twice to learn the gist of the meaning; then they reread it several times at finer and finer granularities until they have gained a thorough understanding. When they believe they have a good comprehension of the meaning, they express the meaning in the target language, usually paying attention to capturing the units of meaning found in the source language and representing them in the target language (Vinay and Darbelnet 1958). At the moment when they have captured this understanding, the meaning of the text is stored within their brains. It is therefore presumably represented by the state of, or connections between, some large set of neurons. This state may encapsulate a wide range of entities. If it includes segments of text, these will surely be not the originals but an abstraction in the nature of the semantic expressions discussed later. Though this state surely varies from translator to translator, it is in effect a personal interlingua used by that particular translator. To deny the value of such a concept, therefore, ignores all human experience in this field.
The birth of MT in its computer context can be traced back to meetings between Andrew Donald Booth and Warren Weaver in 1946 (Locke and Booth 1955). The purpose of the meetings was to look at ways of persuading funding agencies to support the design and development of electronic digital computers. An obvious approach was to seek potentially valuable applications, one such being the translation of natural language.
One of the current authors (Levison) was a doctoral student under Booth between 1958 and 1962. Some of the details here come from private discussions.
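The program-count arithmetic above, and the view of translation as an NLU composed with an NLG, can be made concrete in a few lines of Python. This is a minimal sketch only; the function names (direct_mt_programs, interlingua_programs, nlu, nlg, translate) are illustrative inventions for this passage, not part of VINCI or of any system discussed in this book.

    def direct_mt_programs(n: int) -> int:
        # Direct MT: one program per ordered pair of distinct languages.
        return n * (n - 1)

    def interlingua_programs(n: int) -> int:
        # Interlingua MT: one NLU and one NLG per language.
        return 2 * n

    print(direct_mt_programs(50))    # 2450 programs, as in the text
    print(interlingua_programs(50))  # 100 programs

    # Translation as composition: the source text is analysed into
    # semantic expressions (SEs), which are then realized in the target
    # language. 'nlu' and 'nlg' here are hypothetical callables.
    def translate(text, nlu, nlg):
        semantic_expressions = nlu(text)
        return nlg(semantic_expressions)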
And then there is Google. Google Translate uses what it refers to as a statistical machine translation system. This bypasses involvement with semantics by reliance on large glossaries of translated phrases or bits of text. As Jaap van der Meer, Director of the TAUS Data Association, puts it:

The more data, the better the automatic translation. Data in this context means translation memories and glossaries. Translators using the Translation Toolkit 'share' their translations with Google. If 100,000 translators start using the service, Google will be harvesting 50 billion words of good quality translation data per year to help Google improve their automatic translation engines. (Van der Meer 2009)
Given the current state of formal semantic systems, this strategy may be inevitable, but in effect it emphasizes brute force over meaning, and poses the question: can huge glossaries alone ever lead to accurate translation? We raise here two issues. Google, as its name suggests, revels in high numbers, and we tend to be impressed by the mention of billions of words. In the context of human language, however, 50 billion words of text, or 50 billion phrases or longer segments, are far from high. The lemmatized form of Roger Mitton's Computer Usable Version of the Oxford Advanced Learner's Dictionary (Mitton 1986) created by the authors in connection with their research in NLG contains about 45,200 lemmas, including 27,500 nouns, 6,000 verbs and 3,000 adverbs. Collocations of a noun and a verb therefore number 165,000,000; those of two nouns, more than 750,000,000. Many English nouns and verbs have multiple senses, while more than a few words double as both. Thus, in the absence of further context or knowledge of the meaning of the surrounding text, even glossaries of these sizes will fail to render unambiguous translations of many two-word collocations. Adding even an adverb to the former group raises the number of these three-word combinations almost to 500 billion.

The name Google is derived from Googol, the number 10 raised to the power 100. See Kasner and Correa (1940).
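These counts follow directly from the lemma figures just cited. A quick sketch in Python checks them (the final product anticipates the four-term film-projector example discussed next; everything else restates numbers from the text):

    # Lemma counts from the lemmatized OALD word list cited above.
    nouns, verbs, adverbs = 27_500, 6_000, 3_000

    noun_verb = nouns * verbs          # 165,000,000 noun-verb collocations
    noun_noun = nouns * nouns          # 756,250,000: more than 750,000,000
    three_word = noun_verb * adverbs   # 495,000,000,000: almost 500 billion

    # Three nouns and a verb, as in the projector sentence examined next:
    four_terms = nouns ** 3 * verbs    # about 125 quadrillion
    print(f"{four_terms:.2e}")         # 1.25e+17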
To illustrate the explosive growth of these numbers, consider the sentence: "The projector chewed up the chase scene." English-speakers familiar with the predigital cinema will quickly realize that the terms "projector", "scene" and "chase scene" are linked semantically by their relationship to movie films, and that the collocation of "projector" and "chew up" is a metaphor for a machine (the projector) destroying something which passes through it, namely the physical film. These humans will have no difficulty interpreting the sentence. What about a glossary-based translation? There are four key terms, three nouns and a (compounded) verb, none of which can be replaced by a parameter representing an arbitrary noun if the meaning is to be determined. Based on the numbers above, the combinations of these four components total about 125 quadrillion (10 raised to the power 15). If even a tiny fraction of these yield meaningful utterances, a glossary listing them would far exceed the capacity of any current or imaginable computer.

But this is not the more serious of the two issues arising from the brute force approach. In the coming chapters, we will show examples in which subtle changes in the choice or positioning of words can trigger major differences in meaning. And these subtle changes occur in everyday speech and in literary and technical texts alike. Will 'harvesting' 50 billion, or even 500 billion, words of good quality translation per year allow a translation algorithm to make these crucial distinctions? In short, can we ever trust the accuracy of these translations in critical situations?

We suggest to the reader two tests: one practical, one a test of faith. The practical test is for you to invite Google Translate to render the various examples from later chapters of this book in a language in which you are fully fluent. You may then judge the degree to which the translated text accurately reflects the subtleties in the original meanings.

The test of faith is one which the authors hope you will never have to face. Imagine yourself on an important tour through a region of China remote from Beijing when you experience some heart symptoms. There is a well-equipped hospital nearby which boasts an excellent cardiac surgeon. She examines you and diagnoses a rare heart condition. She tells you that this can be corrected by surgery, and that, while it is not immediately life-threatening, you would be ill-advised to continue your journey until the condition has been put right. As she has said, the condition is rare, and she has never performed this operation. But there are several papers by first-rate English-speaking surgeons on the Web, which she understands to contain very precise details. There is one matter, however, on
As an illustration, on 28 September 2010, Google Translate rendered this paragraph into French as follows. L'épreuve pratique est pour vous inviter à rendre Google Translate les différents exemples de chapitres de ce livre dans une langue dans lequel vous êtes tout à fait couramment. Vous pourrez alors juger de la mesure dans laquelle le texte traduit reflète fidèlement les subtilités de l'original significations. Francophone readers will note the various errors of lexical choice and syntax to be found in the translation.
which she would like your opinion. Although her conversational English is adequate, she is not sufficiently fluent to capture the nuances in a text. She can email the relevant texts to a very good medical translator in Beijing, and she can expect a reply in about 7–10 days. But she knows that your journey is urgent, and she has heard good reports about Google Translate! Your wishes?
Incidentally, this discussion points up a dilemma for MT research and development: should the major effort be concentrated on near-term results which may prove eventually to be a dead end, or on an ultimate solution to the problem? Today's decision may have been preempted by the lack of an adequate formal representation of meaning. As for tomorrow, the words of Lao-tzu echo in our ears. We hope that our work may offer a small contribution to his single step.
But we have strayed from our crown, and now return to it. The base of the crown is a collection of knowledge. Many operations involving natural language rely on access to existing knowledge for their success.
Humans typically express knowledge in the form of text or symbols, as pictures or diagrams or as recordings of sound (and here we refer to the screech of an owl or a symphony of Beethoven, rather than to spoken text). These are also gathered into various kinds of databases, where each unit can be seen as a unit of meaning. In this monograph, we are concerned only with knowledge represented by text and symbols and by corresponding databases. Knowledge expressed as text can obviously be represented by the same semantic formalism we discuss in the coming chapters. This, in fact, also applies to the content of databases whose elemental components are textual or symbolic in nature.
Thus, in our ideal world, we may view knowledge as a vast collection of semantic expressions. This collection, of course, is not monolithic. In any processing operation or human interaction, we assume a base, which we may call global or world knowledge, common to the participants. Among humans, this will differ according to their age, background and educational level, but any writer/speaker will presumably draw upon it in creating a text for some audience. To this base, there may be added subject-dependent, or local, segments, which may in turn have basic and supplementary components, the whole collection taking on a hierarchical form. Nor are these segments necessarily static. In the course of
processing a text, new knowledge may be acquired, which must be added to the existing stock.¹⁴

A complex web of such knowledge bases is illustrated by a detective novel. As a reader starts the novel, there is an assumed base of world knowledge and a pool of local knowledge known in its entirety only to the author. In addition, each character in the novel is imagined to have a subset of this pool forming a personal local knowledge base. At the outset, the reader's local knowledge base is empty.¹⁵ As the plot proceeds, the characters interact, a murder takes place and the detective questions the others. This results in the characters, and simultaneously the reader, gaining pieces of knowledge. Ultimately, the detective and the reader accumulate sufficient knowledge to be able to deduce the murderer's identity. Even more bases, recording, for example, what X thinks that Y knows, may also prove necessary.

There are a range of processes which might be applied to knowledge bases and to collections of SEs in general. Some of these, which may be involved in particular applications, are outlined here. Different SEs may represent what humans see as related or similar meanings. We might then define transformations which convert one to another. For example, if

(1.1) sell(John, Mary, book)

means that John sells Mary a book, then we may create a transformation which converts this to:

(1.2) buy(Mary, John, book)

with the meaning that Mary buys a book from John. Properly formulated, it might be assumed that the same transformation can be applied to any pair of double-object actions which are the inverses of one another.¹⁶ From the first of these expressions, we can also deduce 'new' knowledge:

¹⁴ The discussion of world knowledge in the use of language is a vast field in itself. See, for example, Peeters (2000).
¹⁵ Except perhaps for general knowledge of the structure of a typical detective novel, but importantly, the reader will not at the beginning of the novel know who done it.
¹⁶ Determining this is, of course, an empirical exercise.
(1.3) own(Mary, book)

with the obvious interpretation. This is a simple example, but transformations can be written to replace complex expressions with several simpler ones, and others to combine simple expressions into more complex ones. More importantly, processes might be studied which combine expressions to create knowledge not directly present in the knowledge base.¹⁷

Queries can also be expressed in a semantic formalism. Thus:

(1.4) buy(Mary, John, QUERY)

might ask what Mary is buying from John. In this example, a simple pattern-matching search would reveal the answer. But what if the buy-expression is not present in the knowledge base, while the sell-expression is? Then the search operation must involve processes of the kind just mentioned, which try to derive the desired information. More generally, we might expect the creation of complex search processes which attempt to combine expressions in the knowledge base to provide an answer to a query. In an ideal knowledge retrieval system, a user would ask the all-knowing knowledge base a query in some natural language, and its meaning would be extracted by an NLU. The result itself would be a sequence of semantic expressions, which would in turn be rendered into some language by an NLG.¹⁸

Finally, the left-hand side of our crown represents processes in which SEs are created directly, either by human agency or by a program. Perhaps a reporter will make use of a library of parametrized expressions in a limited field, which can then be rendered in several languages; or a program may elaborate upon an outlined plot to create a more substantial story. We will not discuss these applications here, but will come back to them in our final chapter.
¹⁷ This is the goal of research on inference. See Chapter 3 for more details.
¹⁸ In fact, the querying of databases of knowledge is also a vast field in itself. Queries expressed in natural language represent a large and complex subset of this field. For an introduction, see, for example, Ramakrishnan and Gehrke (2003).
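The transformations and queries just described are easy to picture in code. In the following minimal sketch, SEs are modelled as plain Python tuples and matching is naive; this toy representation and its helper functions are ours for illustration only, and are not the formalism introduced in Chapter 4:

    # Toy model: an SE is a tuple (function, arg1, arg2, ...).
    QUERY = "QUERY"

    def invert(se):
        # Transformation between inverse double-object actions.
        inverses = {"sell": "buy", "buy": "sell"}  # assumed inverse pairs
        fn, agent, other, obj = se
        return (inverses[fn], other, agent, obj)

    def deduce_own(se):
        # From buy(X, Y, Z), deduce own(X, Z).
        fn, buyer, _seller, obj = se
        return ("own", buyer, obj)

    def answer(query, kb):
        # Simple pattern matching: QUERY matches any argument.
        for se in kb:
            if len(se) == len(query) and all(
                q in (QUERY, s) for q, s in zip(query, se)
            ):
                yield tuple(s for q, s in zip(query, se) if q == QUERY)

    kb = [("sell", "John", "Mary", "book")]  # expression (1.1)
    kb.append(invert(kb[0]))                 # (1.2) buy(Mary, John, book)
    kb.append(deduce_own(kb[1]))             # (1.3) own(Mary, book)
    print(list(answer(("buy", "Mary", "John", QUERY), kb)))  # [('book',)]

If only the sell-expression were present, a search process would first apply invert before matching, deriving the answer rather than merely retrieving it.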
1.3 How to read this book

Our intention is that this monograph should be accessible to a broad range of specialists, including linguists and computational linguists, whose backgrounds may vary from one extreme, where the reader has a comprehensive understanding of linguistics but a lesser knowledge of computing, to the opposite. As the work progresses, we sometimes find ourselves wanting to elaborate on certain issues in two different circumstances. One is to cover extra matters which may interest readers at one end of the spectrum, but be of limited interest to those at the other; the second is to clarify points which may be obscure to some readers, but obvious to others. Both would intrude on the flow of the presentation. To preserve continuity, we use footnotes where the intrusions are short, but place longer instances in endnotes at the end of the chapter. There is usually nothing to be lost by ignoring these.

As Ogden and Richards (1923) noted years ago, the meaning of 'meaning' is exceedingly complex. While it is unquestionable that much progress has been made in this area by linguists and logicians, we are sometimes struck by the absence of other voices, including those of literary specialists and some early, now often unread, linguists. The absence is in part explainable by the relatively unformalized language they have used, but we believe that it would be dangerous to ignore the issues they have raised. Accordingly, in Chapter 2, we present several basic concepts related to meaning from these areas.

The explosion of work on the representation of meaning by logicians, linguists and computer scientists also cannot be ignored. Chapter 3 presents, in broad brush strokes, the panoply of this work, showing in particular the elements we have drawn upon in developing our model. Chapters 4 through 7 present the core of our model, beginning with a basic overview in Chapter 4, some formal issues and specifications in Chapter 5, and discussion of the representation of a range of semantic phenomena in Chapters 6 and 7, including quantification, negation, tense, coreference, repetition and relatives. Finally, Chapters 8–10 illustrate the model in some detail by means of examples ranging from instruction manuals, to literary texts and textual adventure games, to short tales and human-created narratives.

The reader interested in a basic overview should begin with Chapter 4. The more adventurous reader might begin with Chapter 9 to see an extended example of the formalism.
Of course, a formalism is of value only to the extent that it is used. We have presented the model here in a variety of contexts to specialists and non-specialists. In our experience, while using it is not simple (after all, meaning is complex), even non-experts can, in a relatively short time, make use of the model.
2 Basic Concepts
As we noted earlier, the field of semantics is vast and complex. The analysis of meaning has occupied humans for centuries and works on the subject number in the thousands. Given this, yet another study must, at the very least, situate itself in the welter of competing voices. We will attempt to do this in two steps. In this chapter, we will draw attention to several theoretical distinctions which underlie most semantic models and yet which, in our opinion, have perhaps not received the attention they deserve. In the next chapter we will provide an overview of a representative set of important work in the field of semantic analysis. Between the two chapters, our hope is that the reader will be able to construct a mental map of the domain within which the remainder of the volume will be located.
2.1 Semasiological and onomasiological perspectives

A fundamental but rarely mentioned distinction in semantics concerns the difference between the semasiological and the onomasiological perspectives. The more common of the two, the semasiological perspective, assumes the mapping from a set of linguistic forms in some language (f1, f2, f3, . . . , fn) to a set of linguistic meanings (m1, m2, m3, . . . , mp). So, for example, in English, the string of letters "shoe" maps to a set of meanings which include 'item of footwear', 'surface of a brake which presses against the brake drum' and so on. A single form may map to more than one meaning, as the preceding example illustrates. Conversely, it is also possible for two forms to map onto a single

It should be clear though that what we are not providing here and in the next chapter is an abridged introduction to semantics. A number of such introductions already exist. See, for example, Cruse (2004).
For a fuller discussion of this distinction, see, for example, Baldinger (1964).
meaning, as in "dollar" and "buck", "dollar" and "piasse", and "pound" and "quid", in Canadian English, Canadian French and British English respectively. It is also possible to compare languages in the semasiological perspective. Thus, the form "chat" in French maps to a set of meanings quite similar (but not identical) to the form "cat" in English. Comparison of two languages in the semasiological perspective is also used to bring to light lexical gaps, where one language has a form for some meaning which another language does not have, with the result that the second language uses some circumlocution, borrows the form from the first language, or performs some other operation. Thus, the form "sibling" in English has no single-lexical-item equivalent in French; the closest equivalent is "frère ou soeur". Similarly, English has borrowed from French the form "menu" to describe the list of foods available at a restaurant.

The onomasiological perspective, on the other hand, maps from a set of language-independent meanings (m1, m2, m3, . . . , mp) to a set of forms (f1, f2, f3, . . . , fn) in one or more languages. So, for example, the meaning 'bound volume containing printed pages designed to be read' corresponds to the forms "livre", "bouquin", "volume", etc. in French and "book", "volume", "tome", etc. in English. Semantic analysis from the onomasiological perspective has been done in areas such as linguistic geography, where the various designations of some set of concepts are surveyed across different regions (see, e.g. Gilliéron 1902), and in the study of terminology and terminography, where the sets of terms designating a set of concepts are analysed (see, e.g. Cabré 1999). Almost all work on onomasiology has been carried out at the lexical level (as in the question: what is the word that designates the meaning 'X'?), but in fact nothing precludes the onomasiological approach being applied at other levels as well (as in the question: what are the possible sentences/texts which carry the sequence of meanings 'X Y Z'?). To some extent, this broader approach underlies work in Natural Language Generation, but with relatively little theoretical motivation to date.

Of course, as the previous examples illustrate, the onomasiological approach requires that some means of expressing meaning be found. In the examples above, we have used a string of English words to represent the meaning, conventionally placing them in single quotes. However, we might as easily have used French ('volume relié contenant des pages imprimées et destiné à la lecture') and nothing of substance would have changed. This is because the strings used to represent the meaning form a metalanguage rather than an object language, where
the latter represents the language being described and the former the language used to do the describing. The metalanguage/object language distinction is well known to linguists and logicians. Computational linguists, and in particular specialists in natural language generation, add to it the third concept of realization (Reiter and Dale 2000) to refer to the computational instantiation of some semantic content in some natural language. In so doing, they demonstrate that it is possible to separate the process of metalinguistic description from that of the set of operations required to identify and output a set of linguistic forms on the basis of that description. In fact, these two steps are not only operationally distinct; they are conceptually distinct as well. In other words, it is possible to discuss meaning without reference to the forms which may instantiate it in some language or in some utterance. This is what we are attempting to do here.
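The two perspectives can be pictured as inverse mappings. A minimal sketch in Python, with informal glosses standing in for meanings (the data here is a toy illustration drawn from the examples above, not a proposed lexicon):

    # Semasiological: from a form in one language to its meanings.
    form_to_meanings = {
        "shoe": {"item of footwear",
                 "brake surface pressing against the drum"},
        "buck": {"unit of currency (informal)"},
    }

    # Onomasiological: from a language-independent meaning to forms
    # in one or more languages.
    meaning_to_forms = {
        "bound volume containing printed pages designed to be read": {
            "fr": {"livre", "bouquin", "volume"},
            "en": {"book", "volume", "tome"},
        },
        # A lexical gap: no single-lexical-item form in French.
        "child of the same parents": {
            "en": {"sibling"},
            "fr": set(),  # circumlocution: "frère ou soeur"
        },
    }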
2.2 Meaning and reference

A second fundamental distinction involves meaning and reference. As Ogden and Richards (1923) noted, it is possible in a system of communication to distinguish the symbol, the thought or reference and the referent. The first of these is the formal device which signals some content, for example the string of letters "cat" in English, while the second is the mental construct carried by this formal device, in other words, the meaning 'cat'. The referent is the thing in the world designated by this symbol-thought unit. For Ogden and Richards, the symbol-thought relation is direct (causal in their terms), as is the relation between the thought and the referent. The relation between the symbol and the referent on the other hand is indirect, mediated by the previous two relations.

This distinction has significant consequences for the choice of a semantic model and divides the set of models into those which deal with reference in the 'world out there' (such as truth-functional models) and those which restrict themselves to the form-meaning dyad. In the chapters that follow, we will argue that a semantic representation which limits itself to the truth-functional perspective fails to capture an important class of meaning phenomena. Such a contention is not a new one. As Pavel (1986) argues, fictional worlds pervade our everyday reality and render notions such as truth porous in the extreme. So, to quote one

The Ogden-Richards triangle has had many subsequent avatars in the history of linguistics. So, for Stephen Ullman (1962), the triangle is designated as name, sense and thing.
of his examples, a sentence like "John is just like Hamlet: neither can make a decision in due time" shows the interpenetration of 'fictional' and actual worlds.⁴ This interpenetration of actual and fictional worlds may be seen also in simple instances of children's make-believe and more complex ones like literary texts.

⁴ The use of real places as settings for novels provides another instance of this interpenetration.
2.3 Describing or creating reality

One often hears that the role of language is to describe reality. The assumption appears to be that the 'world out there' is primary and that language is a device which enables us to represent it. Statements can be qualified as 'true' when there is congruence between the statement and 'the world out there', and 'false' when this is not the case. In fact, even a few minutes observing authentic linguistic behaviour will show that language is often used not to describe some pre-existing reality but rather to create some possible reality. Consider the following conversation between A and B.

(2.1) A: So how's it going?
B: Terrible. The boss hates me. He chewed me out again today.
A: I hate when he does that! Don't you just want to put him in his place?
B: [voice changes] You rotten bastard, you can take your job and stuff it!
A: [laughs] Try that next time.
B: [laughs ruefully]
The fourth utterance is not addressed at A, even though only A and B are present. It represents rather a simulation of a retort to the boss being played out for humorous purposes. In other words, it does not describe a reality: it creates one. Humans accept this process. In fact, they spend rather large amounts of money each year to support its maintenance, through the purchase of novels, theatre and movie tickets. The question is then what the limits are on the realities which may be created in this fashion. Must they correspond to the limits of our everyday world? Of course not. The utterance "Two and two make five" has over three million hits in Google. One can easily tell the story of a man who lives forever, a woman who is composed only of flame, a time which runs backwards, or stops. A particularly striking example is provided by the Borges story El Milagro Secreto
(The Secret Miracle) (Borges 1944) where a condemned man has time stop in order to complete a play in his head.

Must texts be internally consistent at the semantic level? This is a more complex question which hinges on what we mean by consistent. Consider time. It is possible to imagine a novel which begins at the end and then explains the events which led to this conclusion. Here, the order of events has not changed, but the order of their recounting has. In other words, we tend to assume that chronology applies. Note however that science fiction which presents time travel breaks down this barrier.

What about truth? Clearly, texts may present different perspectives on the same event. In some cases, some of these perspectives may be presented as true when in fact they are not. Thus, in a detective novel, some character may deny being the murderer when in fact he or she is. Characters may have only partial knowledge of the true state of affairs. This may also be (at least apparently) true of the narrator in some cases. Thus, the novel Jacques le fataliste et son maître by Denis Diderot (1970) opens with the following passage:

(2.2) Comment s'étaient-ils rencontrés? Par hasard, comme tout le monde. Comment s'appelaient-ils? Que vous importe? D'où venaient-ils? Du lieu le plus prochain. Où allaient-ils? Est-ce que l'on sait où l'on va?⁵

Can we assume that the basic logical laws of identity, excluded middle and non-contradiction apply?⁶ In most cases, it would seem so, but there are exceptions. A celebrated short story by the Argentinian novelist Julio Cortázar, Continuidad de los parques (Cortázar 1966), opens with a description of an unnamed man in a green armchair overlooking a park filled with oaks. He is reading a novel in which two lovers are meeting to plan the murder of the woman's inconvenient husband. At the end of their discussion, the male lover sets off through the woods with a knife, approaches an estate, crosses a park filled with oaks and creeps up behind his victim, a man sitting reading a novel in a green armchair . . . The dramatic effect of this text is only possible if we admit the possibility, however strange, that the man reading is also the victim, even if this requires that he exists in two realities, that of the novel and that of his everyday life.

⁵ In English: How did they meet? By chance, like everyone else. What were their names? What's that to you? Where did they come from? Nearby. Where were they going? Does anyone know where he is going?
⁶ The law of identity states that A = A, the law of non-contradiction that not (P and (not P)) and the law of excluded middle that P or (not P).
Can we at least assume that texts have a meaning? Probably, but we should not assume that it is necessarily singular. Poetry provides the clearest example of this. Consider the poem ‘Le vaisseau d’or’ by the Canadian poet Émile Nelligan (Nelligan 1952): (2.3) C’était un grand Vaisseau taillé dans l’or massif: Ses mâts touchaient l’azur, sur des mers inconnues; La Cyprine d’amour, cheveux épars, chairs nues, S’étalait à sa proue, au soleil excessif. ... Ce fut un Vaisseau d’Or, dont les ancs diaphanes Révélaient des trésors que les marins profanes, Dégoût, Haine et Névrose, entre eux ont disputés.⁷ e vessel in this case is both the poet and a ship. And Aphrodite is at the same time a gure on the bow, the goddess, and the contents of the poet’s imagination. Interpretation of such a rich text involves the interweaving of various meanings. In sum, it seems clear that, beyond a limited set of relatively simple texts, the bridge between the ‘world out there’ and a particular text, if it exists at all, is, at the very least, complex.
⁷ This may be rendered approximately and less poetically in English as: There was a massive Vessel carved in solid gold. Its masts touched the blue skies of unknown seas. Aphrodite, tousle-haired and naked, was spread across the prow in the excessive sun. . . . It was a Golden Vessel whose diaphanous flanks revealed treasures that the profane sailors, Disgust, Hate and Madness, fought over.

2.4 The functions of language

Jakobson (1960) proposed a functional view of language which involved six elements: the sender, who emits a linguistic signal, the receiver, who interprets it, the context in which communication occurs, the message itself, the channel of communication being used and the code in which the message is expressed. Associated with each element is a function, as Table 2.1 illustrates.

Table 2.1 Jakobson's six functions of language

Sender    Emotive function
Receiver  Conative function
Channel   Phatic function
Context   Referential function
Message   Poetic function
Code      Metalinguistic function

Particular linguistic devices carry one or more of these functions. Thus, a raised intonation or an exclamation provides information on the mental or emotional state of the sender of a message (emotive function), a vocative (e.g. "You there,
come here!”) carries a conative function focusing on the receiver, an utterance describing some element of the world carries a referential function, devices such as nods of the head, grunts of agreement, etc. ensure that the channel of communication remains open (phatic function), explanations of meaning (e.g. “By unacceptable, I mean unlikely to be agreed to by the members of the committee”) carry the metalinguistic function. e poetic function is somewhat more complex. Jakobson uses the term to describe cases where the focus is put on the message for its own sake, in other words where the speci c form of the message becomes the object of attention. So, for example, rhymes, puns, riddles and such require the hearer to pay attention to the speci c choice of forms used by the speaker. Two of Jakobson’s functions, the metalinguistic and the poetic, pose particular challenges to the onomasiological perspective. In all the other cases, one can assume a set of semantic speci cations which are independent of the formal devices which can instantiate them. us, if a speaker wishes to send the message that he or she is angry, it is not necessary to specify in advance the signal that will be used to send the message. It may be phonetic (intonation, stress), lexical (swearing) or syntactic (exclamative sentences). However, the metalinguistic function requires the speci cation of the formal item whose meaning is to be explicated. Similarly, the poetic function presupposes the manipulation of a set of formal textual items in the utterance itself, as in the case of rhymes. We will return to this later. While most discussions of Jakobson’s model have focused on the diversity of functions which language can support, little attention has been paid to the important corollary of that statement: the same semantic mechanisms underlie all these various functions. In other words, language is a unitary machine which may be put to different uses.
This runs counter to the modular view of language, based on a semasiological perspective, which holds that linguistic phenomena may be divided into 'layers'. For example, the French linguist, Martinet (1965), proposed that, unlike all other semiotic systems, language is doubly articulated. A small number of purely formal elements (phonemes or letters) may be combined into some thousands of unitary elements which carry both form and meaning (morphemes), and these may in turn be combined into an infinite number of sentences.

In fact, the spectrum is even richer than Martinet proposed. Morphemes may be combined into more or less complex words, words exist in collocations and grammatical phrases, sentences may be combined into conversations on the oral level or paragraphs in writing, and paragraphs form larger units ranging from chapters, to short stories, to novels and beyond. Faced with this diversity, linguists and others have tended to parcellize their analyses. Phonologists tend not to concern themselves particularly with syntax, and most linguists spend little time analysing phenomena beyond the sentence. At the larger granularities, literary analysis takes over for the most part and there is little clear or formal mapping between narrative units such as plot and lower-level formal devices. In fact, it is argued in a number of linguistic frameworks that the various formal systems of language are either invisible to each other, or at the least that their interaction occurs only in a circumscribed fashion, at a limited set of interfaces.⁸

⁸ For an early statement of the modularity hypothesis, see Fodor (1983). For a recent challenge, see Gibbs and Van Orden (2010).

From the onomasiological (i.e. semantic) perspective, the question arises whether a similar layering exists or whether, alternatively, the semantics of language is 'all of a piece' at all levels from the sub-lexical to the textual. The notion of compositionality (Szabó 2008) embodies this perspective to some extent in assuming that more complex semantic units are built up upon simpler ones. But to push the question farther, we can ask whether the same semantic unit can manifest itself in different formal units. Let us consider this in more detail by taking a simple concept like 'kill' as in "Brutus killed Caesar" and let us assume for the moment that its core meaning may be expressed informally as 'X causes Y to cease to be living'. This semantic unit may exist at the sub-lexical level: the verb "assassinate" contains as one of its components the notion of killing, along with additional features such as 'for reasons of politics or power'. The same meaning 'kill' may also exist at the lexical level, in the verb "kill", at the clausal level, in
phrases such as "leave X dead", "bump X off" and so on, at the sentence level in cases such as "Brutus' knife pierced Caesar's heart and he fell to the ground, lifeless", and even at the paragraph level, as the following example illustrates:

(2.4) Slowly, Brutus crept up behind his unsuspecting victim, his knife drawn and hatred in his eyes. Before Caesar could react, the blow was struck. Cursing as he twisted the blade in his hapless victim, Brutus muttered: "Your salad days are over."

Theoretically, even longer texts are possible. The degree of detail may vary, but the core meaning in each of these examples is 'X kills Y'.

One conclusion which may be drawn from these examples is that, unlike the formal devices of language which are of different kinds at each level, meaning would appear to be self-similar whatever the formal device which manifests it. Although we will not explore the comparison here, it would appear that in some respects meaning presents the characteristics typically associated with fractal phenomena (Mandelbrot 1977). A number of textual manipulations play upon this fact. Thus, the notion of the abstract of a scientific paper hinges on the idea that it represents the essential content of the whole paper, while the summaries of operas or plays allow those unwilling to grapple with the detail of either to nevertheless understand the whole. If meaning does have this characteristic, this raises the question: how should we capture this self-similarity across different formal spans?
2.5 Semantic units and semantic relations

Over the years, semantic descriptions, at least at the lexical level, have tended to be based on one of two theoretical perspectives. Some take as the basis of semantic description the existence of basic units of meaning. Others are based on the notion of semantic relations between meaning-bearing forms, as in, for example, synonymy between words. Whereas the meaning unit model is essentially onomasiological in nature (since meaning is seen as primary), the relational model is essentially semasiological (since forms are seen as primary and meaning relations are expressed between forms).⁹

⁹ The epistemological status of semantic units and relations is an open question. Some researchers treat basic semantic units as psychologically real, that is, inherent in the architecture of human cognition (see, e.g. Wierzbicka 1996). Similarly, some linguists treat semantic relations as psychologically real. For example, if one presents an English-speaking test subject with the stimulus "big" and asks for the first word which comes to mind, it is likely that this will be "little" or "small". This is taken as evidence for the mental existence of the relation of antonymy.
Both models have weaknesses. Thus, the semantic unit model leaves implicit the network of relations which link semantic units. For example, in a system which assumes the semantic unit 'human' and the semantic unit 'mobile', the nature of the relation between the two tends not to be spelled out in great detail in many models. On the other hand, relational models tend to have the shortcoming that the lexical units which they link are sometimes not analysed in detail.¹⁰

Lexicographical definitions present a third alternative which, at least informally, brings together both semantic units and semantic relations. Consider the definition of "cat" provided by the Compact Oxford English Dictionary: 'a small domesticated carnivorous mammal with soft fur, a short snout, and retractile claws' (Soanes and Hawker 2005). Here, a sequence of English words represents the semantic constituents of the overall meaning. At the same time, these constituents are linked by a clear set of relations. For example, 'small', 'domesticated' and 'carnivorous' modify 'mammal', the superclass of 'cat'. In addition, within the entire dictionary, each of these sub-units is defined in relation to others. Thus, in the same dictionary, 'mammal' is defined as 'a warm-blooded vertebrate animal that has hair or fur, secretes milk, and (typically) bears live young'. In other words, a dictionary represents a partially formalized network of relations among semantic units, both within definitions and across definitions. We will see in later chapters that our notion of a semantic lexicon also attempts to capture this dual function of lexicographical definitions.
¹⁰ This is made more complex by the existence of stylistic factors. Thus, even though "dollar" and "buck" are semantically equivalent (as are "pound" and "quid"), they are not stylistically equivalent, since the second of the two terms is used in familiar language only.

2.6 Language, knowledge and perspective

As was noted in the first chapter, the use of language presupposes a complex interplay between previous knowledge, that provided explicitly by a text, and that which may be deduced from a text. As an illustration, consider the
following sentences:

(2.5) She turned away from the window. The lamp cast a golden glow over her still-dripping coat and beyond to her muddy high heels.

After reading these two sentences, what do we know? Some things are explicit: the passage describes a movement ("turned away") of a female ("she") as well as an object ("lamp") probably alight ("cast a golden glow") and some articles of clothing ("coat", "high heels") which appear to belong to the person known as "she" ("her coat") and which have apparently recently been exposed to rain or some source of moisture ("still-dripping", "muddy"). Other things are implicit: the "she" is probably an adult woman (because of the high heels); she has been looking outside (because one usually looks out windows rather than in, and because the next sentence describes a lamp which is usually found indoors; to see the lamp after turning away from the window, it would be necessary to be indoors); it is probably evening or night or at least cloudy (because the lamp is apparently on); it has been or still is raining (because this is what usually causes coats and shoes to become wet or muddy); "she" has recently been outdoors (because the coat and shoes are still wet and muddy and this is usually the result of being outdoors).

Much of this implicit information goes beyond the basic logical interplay of presuppositions (that which must be true in order for some stretch of text to be true) and implications (that which must be true if a stretch of text is true). In particular, it brings into play our expectations of how the world works.

The short passage above also illustrates the centrality of the notions of point of view and theory of mind. While it may be read as a somewhat clinical description of a series of events (a woman turns, a lamp illuminates some objects), in fact many readers will see the passage 'through' the woman's eyes, as if they were experiencing it. Others will feel a particular mood and have an expectation of the sort of text which might follow. The notions of point of view and perspective are absent in many semantic analyses, often because the examples studied tend to be decontextualized utterances.

We make no claim to be presenting anything new here. Many researchers have grappled with this issue of interpretation of a text, including semanticists, literary specialists and logicians. Our goal is simply to show how any semantic representation must include not only the meaning of the text itself but also its interplay with background knowledge.
However, some recent linguistic frameworks have begun to grapple with the issue. For example, Cognitive Linguistics and its offshoot, Cognitive Semantics, are premised on the notion that language is embodied and built upon a number of basic cognitive frames such as physical embodiment and the distinction between foreground and background. (See, e.g. Langacker 1987; and Talmy 2000.) In addition, a considerable amount of non-computational work has now been done on issues around discourse. We will not review this vast literature in detail here but will content ourselves with mentioning several promising perspectives.

First, it is now clear that any representation of semantic phenomena must take into account what speakers and hearers know in a particular situation, what is generally known, how this will change over the course of a text or a conversation, and that this affects the choices made in any communicative situation. This includes the interface between language use and context in its various senses (as delimited by Fetzer, ed., 2007, for example). There is also general agreement that, as pointed out by Jaszczolt (2005), for example, the calculation of meaning happens one level up and incorporates pragmatic as well as linguistic factors. On the other hand, as we will see in the next chapter, we would differ from Jaszczolt and others in that we do not necessarily assume a truth-functional perspective. We believe that a large number of linguistic phenomena fall outside such a framework, including works of fiction, conversations in which a speaker assumes multiple roles, and poetry, which by its nature admits of multiple meanings within the same linguistic framework.

Another significant area turns around issues of reference and framing (see, e.g. Keren 2011). From the onomasiological perspective, referential ambiguity is absent – the speaker in principle knows what is being talked about – although intentional ambiguity is possible. In addition, variable semantic precision is assumed, elastic as required by context, as we will see later.
2.7 Anthropomorphism, minimalism and practicality

In common with many tasks both physical and intellectual, that of representing meaning faces tensions between two pairs of opposing requirements: the anthropomorphic and the practical on the one hand, and the practical and the minimalist on the other.
The anthropomorphic requirement holds that the answer to a problem, be it mechanical or linguistic, must follow human physical or mental behaviour. The minimalist desire is that the different features used in the solution should be kept to a minimum. The practical eschews both, holding that the only criterion for success is the solution's practicality. Let us illustrate these tensions with several examples.

Consider first the anthropomorphic requirement. Pinker (2000) describes two contrasting approaches to forming the past tenses of English verbs. One makes use of a combination of rules and exceptions to obtain the result; the other combines a neural net with a period of training. Both sides of the debate claim theirs to be the path followed by the human brain and present this as an advantage of their model. So, in the past-tense debate, it is claimed that the rules and the exceptions are executed in separate parts of the brain, with the latter winning the race to produce answers where they are applicable and thus explaining the observed behaviour of human speakers.¹¹

¹¹ It should perhaps be noted that in the context of a computing algorithm, a combination of rules and exceptions is merely a rule, even if the exceptions form 100 per cent of the cases (in other words, if the rule simply lists all cases and their answers).

While few physical processes are expected to be constrained by the anthropomorphic requirement – no one would claim that the wheel is 'unacceptable' since it does not mimic human locomotion – the expectation of anthropomorphism is present, sometimes tacitly, in a number of semantic models. In what follows, we make no claims of psychological plausibility for our semantic representation.

The minimalist requirement (not to be confused with Chomsky's Minimalist Program (Chomsky 1995)) can be illustrated by an example from simple arithmetic. Before the advent of calculators and computers, humans learned a process to add together two positive integers expressed in decimal notation. Theoretically, this algorithm uses features far more complicated than are necessary: addition of two decimal digits, carry-over and so on. In fact, all that is actually needed is the ability to increase and decrease any integer by 1, and to recognize the number 0. To add p to q, we need simply decrease p by 1, increase q by 1, and repeat this pair of operations until p becomes 0. It is doubtful, of course, that any human (or indeed, computer) would perform addition this way in practice!

However, the minimalist requirement is implicit in a number of semantic models. For example, componential analysis, which we will discuss briefly in the next chapter, is predicated on the notion that if we could identify the minimal units of meaning and discover their interactions, we would thereby attain understanding of semantic phenomena. More importantly, it seeks to establish the absolute minimal set of such units and to avoid redundancy in their use within the representation.¹² In the coming chapters, we will make no attempt to propose a minimalist model; our goal, rather, will be to propose one which will lend itself to being used by a variety of users of varying skills and areas of expertise.

¹² A minimalist view of theorem-proving is illustrated by an example from the mathematical field of Number Theory. The well-known Prime Number Theorem asserts that the number of primes not exceeding n is asymptotic to n/log n. Proofs of this produced independently by Hadamard (1896) and de la Vallée Poussin (1896), though correct, were deemed unsatisfying by many number theorists, on the grounds that they made use of results from integral calculus, outside the number theory field. Subsequently, 'elementary' proofs were published by Selberg (1949) and Erdös (1949) which were longer, but did not exceed the minimalist requirement.

¹³ Theoretically, of course, a truly minimal proof in logic or mathematics involves a tiny set of axioms, which are combined and transformed by repeated use of rules of deduction. For this to be feasible in practice, the axioms are woven into simple theorems, which in turn are built into more and more complex results.

The practicality requirement may be illustrated by the structure of modern-day computer algorithms. Simple operations provided by typical computer hardware (themselves the product of lower-level physical processes) are assembled into procedures which carry out larger tasks. These are in turn combined into bigger and bigger segments of a program. Conversely, the task of a programmer is to refine a problem step by step into finer operations until these reach the level which the hardware implements. This ability to view a computer program at many different hierarchical levels allows human beings to create and alter programs millions of lines in length, without having to grasp the entire program at each point. We will attempt to apply this perspective in our analyses.
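The minimalist-versus-practical contrast in the addition example above can be made concrete in a few lines of code. The following sketch is purely illustrative (the function name is our own invention): it uses only the three minimalist primitives, while the assertion compares it against the 'practical' built-in adder.

    def minimal_add(p, q):
        """Add two non-negative integers using only the minimalist primitives:
        recognize 0, decrease by 1, increase by 1."""
        while p != 0:      # recognize the number 0
            p = p - 1      # decrease p by 1
            q = q + 1      # increase q by 1
        return q

    # The 'practical' solution simply delegates to the hardware's adder.
    assert minimal_add(3, 4) == 3 + 4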
2.8 Desiderata

In light of the phenomena we have considered in this chapter, what should we require of a semantic formalism? We suggest that it must have at least the following qualities: it should adopt an onomasiological perspective, it should provide appropriately broad coverage and fine-grained resolution, and it should be characterized by its practicality. Let us consider each of these in turn.
2.8.1 Onomasiological perspective

As we have noted above, most linguistic models adopt a semasiological perspective, assuming that linguistic forms are the starting point and that the task
of researchers is to represent the meanings associated with those forms. This propensity extends from the theoretical perspective to applications, where, for example, much more attention has been devoted to parsing than to generation. We have argued in the first part of this chapter, using evidence of various sorts, that the attempt to capture meaning represents not just a salutary discipline, but in fact is essential if we are to escape the bounds of linguistic realization. This raises, then, the question: what would be the consequence of adopting a purely onomasiological framework, with meaning at its core and linguistic realization secondary, both logically and temporally?
2.8.2 Broad coverage

By coverage we mean the range of features which a formalization is capable of representing. We may view it, in fact, as a two-dimensional field. In one dimension, we see semantic phenomena running from the smallest to the largest: the meanings of morphemes and words, sentences and paragraphs, and finally, complete texts. In the other dimension, we see at each level a broad range of features. Near the lower end of the scale, the semantic formalism must capture the usual phenomena of tense and aspect, quantification, time sequences, causality, teleology, co-reference and so on; but further upwards, it must deal not only with statements true in some possible world, but opinions, impressions and generalities, questions, exclamations, and a host of others, including non-sequiturs and contradictions.

As we will see in the next chapter, most linguistic models have restricted themselves to subsets of coverage, dealing only with short texts, often no longer than a sentence, and only with a limited range of phenomena. Clearly, this is partly an issue of feasibility: capturing all aspects of language is a mammoth task. That being said, it is still appropriate to ask the question: could a single formalism be created which would allow for the representation of meaning at all levels, from basic semantic units and relations to sentence-level and text-level meanings, and in so doing deal with the full gamut of communicative goals?¹⁴

¹⁴ Blackburn and Bos (2005, p. xiv) make a similar point: We believe that the tools and techniques of computational semantics are going to play an increasingly important role in the development of semantics. . . . Modern formal semantics is still a paper-and-pencil enterprise: semanticists typically examine in detail a topic that interests them (for example, the semantics of tense, or aspect, or focus, or generalized quantifiers), abstract away from other semantic phenomena, and analyse the chosen phenomenon in detail. This "work narrow, but deep" methodology has undeniably served semanticists well, and has led to important insights about many semantic phenomena. Nonetheless, we don't believe that it can unveil all that needs to be known about natural language semantics, and we don't think it is at all suitable for research that straddles the (fuzzy and permeable) border between semantics and pragmatics (the study of how language is actually used). Rather, we believe that in the coming years it will become increasingly important to study the interaction of various semantic (and pragmatic) phenomena . . . In our view, computational modelling is required.

2.8.3 Resolution

We use the term resolution in its photographic sense to designate the ability of a semantic formalism to capture nuances, that is, slight, sometimes subtle, differences in intended meaning, which are common in human expression. Mathematical precision is of less significance in this respect. Consider the utterances "Canadians love hockey" or "All Canadians love hockey". These are not precise truths, nor are they intended to be. They cannot reasonably be represented semantically by a predicate such as:

(2.6) ∀x(Canadian(x) ⇒ love_hockey(x))

Similarly, "Most Canadians follow hockey"¹⁵ and "A few Canadians play cricket" are not intended by the speaker to mean that, if C is the number of Canadians and n the number who follow or who play, then n > C/2 and 3702 < n < 12749, respectively.

¹⁵ For a detailed discussion of the semantics of "most", see Ariel (2004). We will return to a discussion of quantification in Chapter 6.

As another simple example of nuance, consider the two English sentences:

(2.7) (a) The drug has few side effects.
(b) The drug has a few side effects.

These sentences, which might be used in connection with prescription drugs, appear similar, but carry very different implications. To most English readers, the tenor of the first suggests that the patient need not be concerned with side effects – a negative implication. The second, in contrast, presents a positive implication: while there are not many side effects, there are some which the patient needs to be aware of.

The manner in which the two implications – positive and negative – are marked varies from language to language. In French, for example, the difference is carried by two distinct lexical items, "peu" versus "quelques" in the case of
count nouns:

(2.8) (a) Ce médicament a peu d'effets secondaires. [few effects]
(b) Ce médicament a quelques effets secondaires. [some effects]

and by the presence or absence of a determiner in the case of mass nouns: "peu" versus "un peu":

(2.9) (a) peu de vin [little wine]
(b) un peu de vin [a little wine]

Regardless of the natural language, however, the semantic difference is real and important. Note however that the degree of resolution required will vary from case to case, just as the required number of pixels in a photograph will vary depending on the use to which it is put. There will be circumstances where a lesser degree of precision is needed, and others where great precision is required. So, for example, a master baker will have a great number of semantic distinctions (and terms) at his or her disposal to name various products, while the common consumer may be familiar only with the semantic and terminological distinction between rolls, cakes and cookies. This leads to a third question: what is required in order for a semantic representation to be appropriately nuanced – that is, nuanced to the degree required for use in various contexts?
2.8.4 Practicality

For a formalism to be applied to real applications, equally important is the issue of practicality. We showed earlier that practicality must be distinguished from anthropomorphism and minimalism. But what do we mean by the term itself? To be practical, a formalism must be user-friendly. If anyone is to use it, it must be sufficiently clear to be understood, and created, by a wide range of users, including logicians, linguists and creative writers, without necessarily having detailed technical knowledge.

A second aspect of practicality lies in the previously discussed notion of coverage, in both of its two dimensions. To us it appears essential that a single formalism should not only represent a wide range of features, but should do so over the
entire linguistic spectrum, from morphemes to complete texts. In other words, the formalization must be scalable, but in such a way that it is still manageable. In most research on semantic analysis, examples typically involve one or two sentences, each with just a few words. What happens if one tries to scale up the formalisms to express the meaning of a few paragraphs, let alone a complete text, is never addressed. In rough terms, ignoring outliers, a typical novel might contain 250 pages, with perhaps 25 sentences per page – some 6,250 sentences averaging 20 words each. It is hard to imagine the meaning of such a novel being expressed by any of the current semantic formalisms.

And yet, such a model exists. There are computer programs which are thousands, sometimes hundreds of thousands, of lines long. To be sure, many are poor, error-prone and difficult for humans to read and understand. But there are many others which are expressed in a clear, elegant and easy-to-read style, written by skilled craftspeople, one might say artists, following established principles. This then raises a fourth question: what would be the result if principles of programming were applied to the specification of natural language semantics?

In the next chapter, we will pass in review a range of earlier works on semantic analysis in light of these four questions.
3
Previous Approaches
In the previous chapter, we provided a broad overview of some of the fundamental theoretical distinctions which have provided the underpinning for much of the work in semantics over the past several decades, and we concluded by suggesting that a semantic formalism should possess the traits of an onomasiological perspective, appropriately broad coverage and fine granularity, and practicality, including user-friendliness and scalability. In this chapter, we will survey a range of existing semantic theories in light of these desiderata. Our goal will be to isolate those which provide valuable responses to one or more of the desiderata. We will also consider the computational implementation, potential or actual, of the various models. The order of presentation will be from the finest granularity (lexical and sublexical units) to more complex units, including syntax, truth-functional perspectives, textual structures, ontologies and reasoning systems.
3.1 Lexical semantics

3.1.1 Lexicography

Historically, the fundamental first step in semantic analysis took the form of dictionaries.¹ No doubt for reasons of practicality, trade and exchange in particular, the earliest lexicographical works tended to be bilingual dictionaries in which one language is described using words from another (Hartmann, 1986).

¹ It should also be borne in mind that this chapter makes no attempt to provide an exhaustive survey of work on semantic modelling – such an approach would take many volumes. As well, our goal is to provide the flavour of each of a number of approaches; we make no claim to have captured the intricacies of any of them.
Monolingual dictionaries extend this model by using words of the same language as both object language and metalanguage. Of course, as a result, they are inherently circular. It is easy to find lexicographical shortcuts which bring the circularity even more sharply to light, such as synonymic definitions ("puss" defined as "cat") or morpho-semantic definitions, based on use of word-formation devices ("catlike" defined as "in the manner of a cat"). Of course, the metalanguage of human-readable dictionaries tends not to be formalized, although it is coherent. It was this coherence which led a number of researchers to seek a means of transforming dictionary definitions into formal representations (see, e.g. Ide and Véronis 1995; Ide et al. 2000). The initiative foundered because of the huge variation found across definitions, but the principle is sound.

From the perspective of the four desiderata, lexicographical data is usually semasiological, based on headwords, but can be construed as representing meaning units alone. Dictionaries provide broad coverage, at least at the lexical level, although they fall short in terms of providing a rich syntactic description and provide essentially no textual model. They do have a high degree of resolution at the lexical level. Finally, although they have proved their worth as human-readable objects, they will not in themselves provide a full solution to our semantic quest.
3.1.2 Componential analysis

Partly in reaction to the perceived shortcomings of traditional lexicographic models, componential analysis was introduced in the second half of the twentieth century. Based on earlier work on phonology done within the Prague School (Vachek, 1967), componential analysis is based on a set of semantic traits and a bi-valued scale (+/−) describing the necessary presence or absence of a trait. Consider for example the three words "car", "taxi" and "bus" as they might be represented in a componential analysis (Table 3.1).
Table 3.1 Componential analysis of several motorized vehicles

Lexical unit   'Vehicle'   'Motorized'   'Fixed trajectory'   'Paid fare'
Car            +           +             −                    −
Taxi           +           +             −                    +
Bus            +           +             +                    +
The shared traits 'vehicle' and 'motorized' show that all three words belong to a common class (as opposed, for example, to "bicycle" which is a vehicle which is not motorized, and "table" which is neither a vehicle nor motorized), while the different values of + and − for the traits 'fixed trajectory' and 'paid fare' differentiate among the three words. Thus, "bus" and "taxi" share the trait 'paid fare' which "car" does not necessarily have, while the differing values for 'fixed trajectory' differentiate "bus" and "taxi".

A major difficulty of componential analysis lies in establishing the set of traits to be used. Clearly, while some general traits like 'human', 'mobile', 'three-dimensional' are of great generality, others are not. At some point, the addition of new traits is of increasingly diminished utility in terms of the number of lexical items distinguished by the trait. Jackendoff (1990) provides the example of "duck" and "goose" to illustrate the problem, arguing that distinctions of this fineness should be captured by means of some other level of representation such as a 3D model. In other words, beyond the general traits which they share, such as 'bird' and perhaps 'aquatic', it is not necessary to create new traits for only these cases. In addition, it is easy to see that the requirement that a trait be either necessarily present or absent may sometimes be relaxed or circumvented. So, for example, a child's car may not be motorized, a car may be used in a context where a fare is paid and a taxi may follow a fixed trajectory while some buses may not. This is captured, for example, by the concept of the virtuème (Pottier, 1992).

In the years since the introduction of componential analysis, several linguists have attempted to extend and enhance the original model. For example, Melčuk et al. (1995) present a semantic representation system in some detail, using a predefined set of primitives and a functional notation where, for example, Magn(vent) represents the notion of strong wind. Melčuk's functions deserve careful study, at least at the lexical level. Sentence-level phenomena are dealt with in other publications by Melčuk but text-level phenomena are absent. Accessibility of Melčuk's work is also limited by the fact that much of it is written in French.

Work by Anna Wierzbicka, for example Wierzbicka (1996), makes a similar attempt to define a set of semantic primitives claimed to be common across both individuals and languages.²

² The model of semantic expressions we will propose in subsequent chapters, although having some similarities with Wierzbicka's system, does not make the same claims of psychological reality.

Expressed in terms of basic units like 'I', 'you',
‘do’, ‘happen’, ‘where’, ‘someone’, ‘part of ’, and so on, it lends itself well to an onomasiological perspective. e formalism has been tested against a wide range of languages, but it has been applied primarily at the lexical and syntactic levels. In sum, while componential models are more formalized than lexicographical data, few provide either broad coverage or ne resolution. In addition, few are sufficiently developed to permit their practicality, scalability or user-friendliness to be evaluated to any extent.
3.2 Conceptual structures

Proposed by Jackendoff (1983, 1990), the model of conceptual structures is based on an encyclopaedic molecular meaning theory built from feature structures that express basic concepts. Jackendoff suggests that every human has an innate set of conceptual primitives that can be combined with learned information using conceptual formation rules to provide a rich description of possible meanings. Each major content-bearing syntactic structure in an expression has a mapping to a well-defined conceptual structure. The overall meaning of a sentence is carried by combinations of various structures obtained from the syntax of the expression. A conceptual structure may have one or more conceptual constituents, each with a specific type. The set of types is defined as Thing, Event, State, Action, Place, Path, Property and Amount.

This model may be used to capture a range of semantic phenomena. Thus, the conceptual structure:
(3.1) [Thing BRUTUS]
represents a Thing BRUTUS that is understood to be some individual, for example, Marcus Junius Brutus.³ The constituent BRUTUS contains the set of properties that define Marcus Junius Brutus. It is possible to pick out a property associated with a constituent and explicitly make mention of it for syntactic realization:
(3.2) [Thing BRUTUS [Property MALE]]
³ Here and elsewhere, we have adopted the main traits of the author's formalism, but applied it to examples based on the characters Brutus and Caesar.
In this case, restrictive modification is used to indicate that BRUTUS is a MALE as in the sentence "Brutus is a man". The Property MALE becomes fused with the conceptual constituent Thing BRUTUS. Compare this with the sentence "Brutus is an honourable man" represented below:
(3.3) [State BE_Ident ([Thing BRUTUS [Property MAN]], [Place AT_Ident ([Property HONOURABLE])])]
Here honour is represented as a transient property. If read literally, the representation shown above reads as a State containing the function BE which relates the Thing BRUTUS as being AT the Property HONOURABLE. Representing this property as a Place suggests that an object may move from that location to someplace else, thereby capturing the transient nature of HONOURABLE. Jackendoff (1983) suggests that for simplicity subscripts may be dropped once the reader is familiar with the types of each constituent and function-argument structure. Using this, the above example may be rewritten as:
(3.4) [BE_Ident ([BRUTUS [MAN]], [AT_Ident ([HONOURABLE])])]
Note however, that the resulting form, while simpler, is now ambiguous. Consider now a more complex linguistic example, that of the sentence "Brutus stabbed Caesar":
(3.5) [CAUSE ([BRUTUS], [GO ([Thing ], [Path TO ([Place IN ([CAESAR])])])]) [Manner PUNCTURE]]
The conceptual structure representation here literally reads that the Thing BRUTUS caused an unspecified Thing to be placed inside of CAESAR. The addition of the restrictive modifier PUNCTURE is used to indicate the Manner in which the action is performed. A more precise version of the sentence is "Brutus stabbed Caesar with a knife", which may be represented as:
(3.6) [CAUSE ([BRUTUS], [GO ([Thing KNIFE], [Path TO ([Place IN ([CAESAR])])])]) [Manner PUNCTURE]]
Here, the previously empty Thing constituent is filled by the Thing KNIFE. However, there are several ways that this English sentence might be represented using conceptual structures. If there exists a primitive conceptual function STAB that relates two Things, we could obtain the structure:
(3.7) [STAB ([BRUTUS], [CAESAR]) [WITH ([KNIFE])]]
Additional circumstances beyond the basic action could also be fused to the main Event constituent. Thus a complex sentence such as "Brutus stabbed Caesar with a knife at noon on the steps" could be represented as:
(3.8) [STAB ([BRUTUS], [CAESAR]) [WITH ([KNIFE])] [AT ([NOON])] [ON ([STEPS])]]
Let us turn now to anaphora. In anaphoric expressions, pronouns or other referring expressions are used to refer to a previously mentioned object. (See, e.g. McCoy and Strube, 1999, for an overview of the computational and semantic issues involved.) Consider the sentence "Juliet stabbed herself". Here the reflexive "herself" points back to the object "Juliet".

(3.9) [STAB ([JULIET]α, [α])]

The alpha in this representation is used to indicate binding between conceptual constituents (Jackendoff, 1990) and provides a way of referring back to a previously introduced conceptual constituent.

Jackendoff's model also provides a means of capturing causality. Consider the instance where we wish to relate the fact that the action of Brutus stabbing Caesar resulted in his death. The English sentence "Brutus stabbed Caesar to death" conveys this information. In Jackendoff's formalism, this may be expressed by:
(3.10) [CAUSE ([Event STAB ([BRUTUS], [CAESAR]α)], [State BE ([α], [DEAD])])]
The expression here indicates that the act of stabbing Caesar is what resulted in Caesar's death. Jackendoff's discussions do not explicitly cover a treatment of tense, although there are hints as to how this might be accomplished. We have already seen the AT function used to represent moments in time, or temporal locations. By the same logic, the representation
(3.11) [DIE ([CAESAR]) [AT_Temp ([FUTURE])]]
may be proposed to show that the event in question occurs at some future point in temporal space.

Let us turn now to quantifiers. Consider for example the sentence "every conspirator stabs Caesar". This statement asserts that all the individuals that are "conspirators" are involved in the stabbing of Caesar. This may be represented as:
(3.12) [STAB ([Type CONSPIRATOR [Amount ALL]], [CAESAR])]
Here, instead of referring to a specific CONSPIRATOR Token, this argument refers to the Type of things that are a CONSPIRATOR. In this way, we can view this particular structure as referring abstractly to the set of CONSPIRATORS that exist. An additional Amount is specified which indicates that ALL of the things considered to be CONSPIRATORS are involved in this argument place. Jackendoff extends this model to deal with more complex instances of quantification such as "every conspirator stabs some senator" where it is unclear whether there is a single senator that all the conspirators stab, or whether every conspirator is stabbing a different senator. To deal with this, he proposes a formalism such as:
(3.13) [STAB ([Type CONSPIRATOR [Amount ALL]], [SENATOR])]
Here the sense of a singular senator being stabbed is achieved by leaving the SENATOR in the second argument place of the STAB function as a Token value rather than a Type. On the other hand, if the desired meaning is that every conspirator stabbed some senator, the addition of the Type specification produces the change, as in:

(3.14) [STAB ([Type CONSPIRATOR [Amount ALL]], [Type SENATOR])]
In this case, the amount of SENATOR is left unspecified, but since we are referring to a set of things, rather than a specific Token, the reading we are left with is that every conspirator stabbed something that was a senator, not necessarily the same SENATOR.

As the reader will no doubt have gathered from the detail provided here in the presentation of Jackendoff's model, it has much to recommend it. It can be construed as onomasiological, even if this was not necessarily Jackendoff's original intent. Its conceptual machinery provides the means of capturing a relatively wide coverage, at least in terms of semantic phenomena, if not in terms of varying sizes of texts, and its notation captures aspects of resolution which are important to any mature semantic description, including typing, tense and semantic granularity.

Despite these advantages, we are not aware of any computational implementations of Jackendoff's model. This may be due to the fact that his discussion has focused on linguistic phenomena at the expense of applications, or possibly to the somewhat unformalized nature of his terminology. In addition, although graphically pretty, his formalism would be hard, in our opinion, to extend to texts of any length. In later chapters, we will see how some of Jackendoff's proposals may be implemented in a user- and computer-friendly manner.
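To suggest what such an implementation might look like, conceptual structures of the kind shown in (3.7) map naturally onto nested data structures. The sketch below is our own hypothetical encoding, not Jackendoff's notation:

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class Constituent:
        """A conceptual constituent: a typed head with arguments and modifiers."""
        ctype: Optional[str]  # e.g. 'Event', 'Thing'; None when the type is elided
        head: str             # e.g. 'STAB', 'BRUTUS'
        args: List['Constituent'] = field(default_factory=list)
        mods: List['Constituent'] = field(default_factory=list)

    # (3.7): [STAB ([BRUTUS], [CAESAR]) [WITH ([KNIFE])]]
    stab = Constituent('Event', 'STAB',
                       args=[Constituent('Thing', 'BRUTUS'),
                             Constituent('Thing', 'CAESAR')],
                       mods=[Constituent(None, 'WITH',
                                         args=[Constituent('Thing', 'KNIFE')])])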
3.3 Lexical relations and inheritance networks

As we noted in the last chapter, semantic analysis may be applied not only to the composition of lexical units, but also to the relations between them. So, for example, "cat" in the sense of 'feline' is a hyponym of "mammal" and a hyperonym
of “Siamese cat”.⁴ is relation of hyponymy forms part of a much larger class which includes synonymy, antonymy (and its subclasses) and so on.⁵ In all these cases, we are dealing with relations not among lexical units in all their meanings, but rather with relations among speci c meanings of lexical units. us, “cat” is a hyponym of “mammal” in the sense noted above, but not in the sense of ‘catamaran’. To put this another way, semantic relations are more accurately construed from the onomasiological than from the semasiological perspective. In fact, it is possible to conceive of a lexicon as a rich network of relations among meaning units. Of course, in most instances, these meaning units are represented by the lexical items which carry them (in other words, from a semasiological perspective). A network of semantic relations may be traversed. us, for example, in WordNet (see below), if one begins with “queen” in the sense of ‘female cat’ and recursively seeks its hyperonyms, this gives the series “domestic cat”, “cat”, “feline”, “carnivore”, “placental”, “mammal”, “vertebrate”, “chordate”, “animal”, “organism”, “living thing”, “whole”, “object”, “physical entity”, “entity”. Higher items in the series are more general, while lower items are more speci c: that is, they add additional semantic traits. Seen from another perspective, items lower in the hierarchy also contain a superset of the traits found in higher levels. us, a “mammal” has all of the traits found in the class “vertebrate”, but adds additional more speci c traits not necessarily present in vertebrates. Unfortunately, this inheritance of traits can give rise to difficulties, as we will see below. One means of capturing semantic relations of the sort is the type hierarchy, where objects are arranged based upon their types (Sowa, 1987). ese types of networks are also known as inheritance networks, since subtypes may inherit properties from their parent nodes (see Lehmann 1992; Daelemans et al. 1992). Inheritance networks typically use IS–A relationships for types (see Sowa 1987; Marinov and Zheliazkova 2005). One difficulty with inheritance networks lies in the the cancellation of inherited traits.⁶ us, while “bird” has the general trait ‘capable of ying’, this is not true for some speci c birds like penguins and ostriches. More generally, as Rosch ⁴ Most linguists use the term hyperonym, while many computer scientists use the term hypernym. We prefer the former. ⁵ See, for example, Cruse (1986) for an extended discussion of semantic relations. ⁶ See, for example, Brachman (1985).
has demonstrated (Rosch, 1975), some words are more typical of a class than others. So, for example, when she tested subjects by asking them to rank different objects as more or less typical of the class 'furniture', words like "chair" consistently ranked higher than words like "clock" or "refrigerator".

Another problem arises in the capturing of extensional relationships (Johnson-Laird et al., 1984). Imagine, for example, that there are three people, A, B and C seated together. Consider the sentences "A is on B's right" and "B is on C's right". The inference "A is on C's right" requires knowledge about the real world. If the participants are seated along a rectangular table, this inference is correct. However, if they are seated around a circular table, the statement "A is on C's right" may no longer be true.

Additionally, as we have noted earlier, the existence of multiple meanings for a single lexical entry has as a consequence that each entry participates in a potentially very large number of distinct IS–A relationships (see also Busa et al. 2001). The resulting combinatorial explosion causes both theoretical and practical difficulties.

A number of computational projects are based on the notion of semantic relations. We briefly describe three of them here. The WordNet Project (Miller et al., 1990) is a large lexical database of English. Words are divided into their various senses which are themselves linked by semantic relations. Thus, in WordNet, the noun "run" has as direct hyponyms "fun run", "marathon", "obstacle race", "steeplechase", "track event", as hyperonym "race", and as sister terms "automobile race", "bicycle race", "boat race", etc. As can be seen from these examples, its approach is semasiological.⁷

⁷ The WordNet database has been used in a great number of projects, in a number of languages. The most current information on the project can be found at http://wordnet.princeton.edu/.

The FrameNet Project (Ruppenhofer et al., 2006) has as its goal to capture the semantic frames which underlie the English lexicon. In contrast to WordNet, its approach is onomasiological, in that the analysis proceeds frame by frame rather than word by word. Frames are linked by a series of relations including inheritance (so, for example, the revenge frame inherits values from the rewards_and_punishments frame), using (the speed frame uses the motion frame), subframes (the criminal_process frame has as subframes arrest, arraignment, trial, and sentencing) and perspectives (hiring and get_a_job provide perspectives on employment_start).
Frames themselves, which might be characterized as perspectives on events, vary greatly in their degree of abstraction, as the following examples illustrate: activity_ongoing, addiction, becoming_a_member, becoming_dry. The FrameNet authors note that they have eschewed analysis of most artefacts and natural kinds, leaving that task to WordNet. As a result their ontology is not complete. Although FrameNet is limited to English, a related project, the Saarbrücken Lexical Semantics Acquisition Project (SALSA), has as its goal to provide a large, frame-based lexicon for German (Burchardt et al., 2009) using the frames proposed by the FrameNet Project.

The third example we will discuss is the PropBank project (Palmer et al., 2005). Its approach is fundamentally empirical: a layer of predicate-argument structure is added to each verb in the Penn Treebank with a view to providing better domain-independent language understanding systems. For each verb in PropBank, a frameset is defined for each of its possible instances. Thus, the frameset kick.01 contains the arguments Arg0 which refers to the kicker, Arg1 which refers to the thing kicked and Arg2 which refers to the instrument, in this case, by default, the foot. According to the authors, the use of numbered arguments allows for the mapping onto a variety of semantic representations including theta-roles and lexical-conceptual structure. As Merlo and Van der Plas (2009) argue on the basis of their analysis of PropBank and VerbNet, there is benefit in combining resources for syntactico-semantic analysis. The PropBank authors take this to heart, providing a website which allows for mapping between PropBank, VerbNet, FrameNet and WordNet.

Although most representations of lexico-semantic relations are, as the term would suggest, semasiological, some, like FrameNet, are not, and some could be adapted to an onomasiological perspective. Some, like WordNet, offer, at the lexical level, admirably broad coverage and fine resolution, but since the metalanguage used is relatively primitive, the result is a network whose nodes are numerous but whose arcs are not well-labelled. From the practical perspective, there exist numerous applications of WordNet, but none that we are aware of have as their goal pure semantic specification and none deal with texts in a useful fashion. Despite this, it is also clear that WordNet specifications would admit of a functional representation in something like the formalism discussed in Chapter 1, as in hyponym(cat, mammal).
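The hyperonym chain for "queen" given earlier can in fact be reproduced programmatically; the following sketch uses the NLTK interface to WordNet. The gloss-matching heuristic for picking out the 'female cat' sense is our own expedient, since sense inventories vary across WordNet versions:

    from nltk.corpus import wordnet as wn  # assumes the WordNet data is installed

    # Select the 'female cat' sense of "queen" by inspecting the glosses,
    # rather than hard-coding a sense number.
    queen = next(s for s in wn.synsets('queen', pos=wn.NOUN)
                 if 'cat' in s.definition())
    chain = [queen]
    while chain[-1].hypernyms():
        chain.append(chain[-1].hypernyms()[0])  # follow the first hyperonym upwards
    print(' -> '.join(s.name() for s in chain))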
3.4 The generative lexicon

Developed by Pustejovsky (1991, 1995) in response to what he saw as descriptive inadequacies in other semantic theories based on the enumeration of senses, the Generative Lexicon model seeks to capture, as its name suggests, the inherently creative aspect of lexical combinations. As an illustration, consider the sentences: "John baked a potato" and "John baked a cake". In the former, the use of the verb "bake" reflects a change of state, while in the latter it reflects a creation event. Pustejovsky claims that the shift of meaning of "bake" in the two sentences does not come from two different lexical entries, but from the two different combinations of what he calls the qualia structure. In the first case, "bake" will combine with information in "potato" which says that potatoes are naturally occurring objects that may be heated up. In the second case, "bake" will combine with information in "cake" which says that cakes are derived artefacts that can only be created through some creation process. In this way, the choice of noun is responsible for determining how "bake" is interpreted.

To achieve this level of representation, Pustejovsky (1995) proposes four levels of qualia: constitutive, formal, telic and agentive. In the case of a term like "knife", the formal quale identifies an individual item from the set of tools, the telic quale provides information relating to the purpose of the entity, in this case, to cut something, the constitutive quale captures the material composition of the entity, and the agentive quale identifies how an item is brought about.

In Pustejovsky's model, qualia structure is one of four levels that describe the semantics of a given utterance, the others being argument structure, event structure and inheritance structure. Consider the verb "kill", which, in one of its senses, takes two arguments, both of which must designate physical objects. These argument specifications act as constraints on the types of objects that can be passed to the various functions. At the level of the event structure, it is noted that "kill" starts with a process and results in a state of being. In order to capture this information, Pustejovsky defines a notation for event ordering, and provides an attribute within the event structure to express this information. The notation developed by Pustejovsky (1995) allows for many types of event structures.

To see how this allows for multiple meanings to be captured depending on context, consider the statement "Brutus used the knife". Since the telic attribute within the qualia structure provides the purpose of a knife (to cut), the definition of the verb "use" can consult this information to fill in the details of its meaning in this context.
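The mechanism just described can be suggested in a few lines. In the hypothetical sketch below (our illustration, not Pustejovsky's notation), interpreting "use" consults the telic quale of its object; the field values are invented for the example:

    # A hypothetical qualia structure for "knife"; values are illustrative only.
    knife = {
        'formal':       'tool',               # what kind of thing it is
        'constitutive': ['blade', 'handle'],  # what it is made of
        'telic':        'cut',                # what it is for
        'agentive':     'manufacture',        # how it comes into being
    }

    def interpret_use(obj_qualia):
        """'use X' defaults to the activity named by X's telic quale."""
        return obj_qualia['telic']

    print(interpret_use(knife))  # cut: "Brutus used the knife" = Brutus cut with it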
There have been several criticisms of the Generative Lexicon. For example, according to Fodor and Lepore (1998), Pustejovsky's framework cannot explain how an item may be put to a use that does not reflect its telic role. For example, one may use a screwdriver to pound a nail into a wall. Fodor and Lepore summarize the heart of this issue with the question: "what happens if a verb makes a demand on an argument that the lexical entry of the argument doesn't satisfy?"

Other criticisms focus on the ontological characteristics of the lexicon itself. For example, Fodor and Lepore ask why the lexical entries for the two senses of the word "bake" should be combined into a single entry at all. More importantly, what is the methodology used to determine their relatedness? While surely they share some properties, one is a creation process and the other reflects a change of state. Fodor and Lepore conclude that the problem of distinguishing word sense is instead recast as a problem regarding the enumeration of all possible processes that occur in natural language.

It is also important to note that the approach proposed by Pustejovsky is essentially focused on the interpretation of utterances rather than their generation. That being said, the Generative Lexicon model begins to capture the rich interplay of levels required to model meaning, from the sublexical to the argument level. The SIMPLE project (Busa et al., 2001) is based on Pustejovsky's model.
3.5 Case grammar

In several of the models just discussed, it is clear that the representation of lexical meaning has necessary links to syntactic phenomena. Characterization of these links has occupied linguists for a number of decades. We provide several examples of this here, but it should be noted that this list is in no way exhaustive. One of the earliest attempts at characterizing the semantic content of a sentence was developed by Fillmore (1968), who recognized that a semantic type of role labelling was needed, since surface structure grammatical function labels do not provide a clear semantic account of the participants involved in a given proposition. For example, consider the sentence “Brutus stabbed Caesar”. Here the subject of the verb is “Brutus”, and the direct object of the verb is “Caesar”. However, the terms subject and object used to label the grammatical function of syntactic constituents provide no semantic information. As Fillmore noted, if the sentence is transformed to the passive, the grammatical function of each constituent may change. For example, in the case of “Caesar was stabbed by
Brutus”, the subject of the verb is now “Caesar” and the object of the verb is now “Brutus”. However, even here the underlying proposition and its parameters remain the same. In both cases, the proposition ‘stab’ requires an entity that will perform the stabbing, and an entity that will be the victim of the action. Fillmore’s goal was to create a label system to account for these semantic roles and allow for a mapping between the semantic participants and the surface structure grammatical functions. Fillmore proposed six original semantic roles: The agentive indicates an animate actor who carries out a particular action. The dative indicates an animate object that is affected by the action. The instrumental indicates the use of a non-animate object that is the cause of the action. The factitive indicates the creation or resulting object of the action. The locative indicates the location of the action, and the objective is a general ‘catch-all’ case that designates an object involved in the action. Over time, additional cases have emerged as new semantic roles are either discovered or further explored. The combination of cases with verbs in a sentence leads to what Fillmore calls a case frame whose purpose is to dictate the lexical selection of verbs and nouns based upon the features and semantic roles available to be filled. In order to choose appropriate nouns and verbs during the generative process, the lexicon is consulted. For verbs, each lexical entry includes a set of frame features that indicate the case for each of its formal parameters. For example, the verb “stab” accepts two formal parameters, and may have a frame feature that looks like the following:

(3.15) +[ ___ D + A]
This states that the verb requires dative and agentive nouns. In the case of nouns, each lexical entry has an associated set of semantic features that are used to indicate various properties of the word. For example, the noun “Brutus” may have the feature [+animate], which would make it a valid selection for any case that requires an animate object (e.g. the dative or agentive cases). Thus, one possible selection of nouns and verbs for the expansion of V + D + A would result in the sentence “Brutus stabbed Caesar”, represented by:

(3.16) stab + Brutus_D + Caesar_A
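The selection mechanics can be sketched as follows. The coding is our own, not Fillmore's notation; the per-case feature requirements are assumptions made for the illustration, with Brutus assigned to the agentive and Caesar to the dative.

    # A minimal sketch (ours) of lexical selection driven by frame
    # features and semantic features.

    VERBS = {"stab": ["D", "A"]}           # frame feature: +[ ___ D + A ]
    NOUNS = {"Brutus": {"animate"}, "Caesar": {"animate"}, "rock": set()}
    CASE_REQUIREMENTS = {"D": {"animate"}, "A": {"animate"}}

    def fills(noun: str, case: str) -> bool:
        """A noun may fill a case slot if it has the required features."""
        return CASE_REQUIREMENTS[case] <= NOUNS[noun]

    def expand(verb: str, fillers: dict) -> str:
        """Build a case-marked representation in the style of (3.16)."""
        for case in VERBS[verb]:
            if not fills(fillers[case], case):
                raise ValueError(f"{fillers[case]} cannot fill case {case}")
        return " + ".join([verb] + [f"{fillers[c]}_{c}" for c in VERBS[verb]])

    print(expand("stab", {"D": "Caesar", "A": "Brutus"}))
    # stab + Caesar_D + Brutus_A
    # expand("stab", {"D": "rock", "A": "Brutus"}) would be rejected.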
One of the major drawbacks of case grammar lies in the relatively coarse granularity of the cases available (Levin and Rappaport, 1986). Problems also arise from the fact that the same lexical item may participate in a range of case frames and that case frames may vary across related lexical items. Thus, Winograd (1983) points out that in the case of “Cinderella broke the glass” and “Cinderella polished the glass” both “break” and “polish” have the same frame features, but that “break” may be reformulated into an ergative (“The glass broke”), while “polish” cannot (*“The glass polished”). Case grammar alone cannot account for this difference, a problem of resolution. At the same time, case grammar has had a significant impact on NLG, a testament to its practicality. Systems that use some variant of the formalism include: ERMA (Clippinger, 1975), PLANE (Waltz and Goodman, 1977), GIST (Swartout, 1982) and GENNY (Maybury, 1989).
3.6 Conceptual Dependency

Created by Schank (1972), Conceptual Dependency (CD) is a form of representation based on a set of primitives which carry the underlying meaning of a word or set of words. In Schank’s model, there are three different types of conceptual primitives: nominals, actions and modifiers. Nominals are concepts that can be understood by themselves, and usually relate to nouns. Schank called these objects picture producers (PP), since he noticed that they tend “to produce a picture of that real world item in the mind of the hearer.” As their title suggests, conceptual actions (ACT) express what a PP is doing, and usually relate to verbs. Conceptual modifiers relate to properties, and rely on the existence of a PP or ACT to be correctly interpreted. Schank identified two types of modifiers: picture aiders (PA) and action aiders (AA). Additional primitives are also available for locations (LOC) and for times (T) (Schank, 1975). Utterances in a natural language are formed by mapping a set of concepts onto a set of syntactic structures, a process which, according to Schank, humans perform when they wish to communicate. Formally, meaning in CD is carried by concepts and their relationships to one another in a conceptual dependency network or C-diagram (Schank, 1972). A dependency consists of a governor and a dependent. Governors are PPs and
ACTs, while dependents are PAs and AAs. For example, consider the sentence “Caesar is dead”.⁸

(3.17) Caesar ⇔ HEALTH(−10)

Here, Caesar and HEALTH are members of a two-way attributive dependency which shows that the PP Caesar is dependent on the PA HEALTH in a predication. The dependence relationship is two-way since both Caesar and HEALTH must be present for the attributive predication to exist. Schank (1975) describes the state HEALTH by a numerical scale, which ranges from −10 (death) to +10 (perfect health).⁹ Other states represented using numerical scales include FEAR, ANGER, MENTAL STATE, PHYSICAL STATE, etc. Even more complex states can be expressed by combining many of these primitive states. The formation of valid C-diagrams is governed by a fixed set of conceptual syntax rules (Schank, 1975). For a C-diagram involving more complex rules, consider the sentence “Caesar walked home from the senate”, shown below:

(3.18)
    Caesar <=p=> PTRANS <--o-- Caesar <--D--+-- to:   Caesar --POSS--> home
                                            +-- from: senate
In this example, several primitive concepts are combined to provide the final C-diagram. The concepts ‘home’ and ‘senate’ are involved in an ACT PTRANS that has a two-part dependency between objects designated ‘from’ and ‘to’. This portion of the structure expresses the directive case, represented by the D on the diagram (Schank, 1972). The ACT PTRANS represents the physical change of location that an object undergoes. It is one of the 11 primitive ACTs defined by Schank (1975), which include: SPEAK, PROPEL, MOVE, INGEST, EXPEL, GRASP, ATTEND, MTRANS, MBUILD and ATRANS. Many of these are fairly intuitive, but several are not. Thus, ATRANS is the act of changing an abstract relationship, ATTEND is the act of sensing a stimulus, MTRANS is the act of transferring information and MBUILD is the act of creating a new thought out of others.

⁸ As in earlier cases, we illustrate the formalism using examples we have ourselves created based on the characters of Brutus and Caesar.
⁹ We will borrow this model of using a numeric scale to represent a scalar dimension in Chapter 5.
The arrow connecting the objects ‘Caesar’ and ‘home’ is qualified with POSS, indicating that one object possesses the other. In this case, we are referring to “Caesar’s home”. The symbol connecting ‘Caesar’ and ‘go’ is similar to the dependency shown in the previous example, but in this instance is used to show an action predication dependency, namely that ‘Caesar’ is performing the ACT ‘go’. The ‘p’ over the symbol between ‘Caesar’ and PTRANS is a temporal marker used to indicate that the action occurred at an unspecified point in the past. CD theory is not without critics. For example, Dunlop (1990) takes issue with Schank’s requirement that ACTs always require the instrumental case. As Dunlop observes, since the instrumental case itself requires fully formed concepts to carry meaning, the description of a sentence like “Brutus stabbed Caesar with a knife” is problematic: an instrumental case is needed to describe the fact that it was Brutus’ arm that moved the knife towards Caesar, which in turn requires an instrumental case to capture the fact that Brutus’ arm muscles caused the arm to move, which in turn requires an instrumental case to describe the fact that impulses from his brain caused the muscles in Brutus’ arm to actually move, and so on. As Dunlop states, “if the performance of any given one ACT presupposes an infinity of ACTs, no ACTs will be possible at all”. While interesting theoretically, this quibble illustrates the tension between practicality and resolution. For many purposes, it may in fact be sufficient to use the coarse representation proposed by Schank. What is needed is the ability to transition to a finer resolution as required. Despite its weaknesses, Schank’s model possesses the advantage of presenting an onomasiological perspective. In addition, it is built upon a formally defined set of operators whose interaction is specified in detail. It has also demonstrated its practicality: several computational systems make use of CD, including BABEL (Goldman, 1975), TALE-SPIN (Meehan, 1977), Kaa (Mauldin, 1984) and ViRbot (Savage et al., 2009).
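Before moving on, it may help to see a C-diagram rendered as data. The sketch below is our own encoding of (3.18); labels such as "tense" and "directive" are our names for the p marker and the D case, not Schank's notation.

    # A rough sketch (ours) of the C-diagram in (3.18) as a nested record.

    c_diagram = {
        "actor": "Caesar",
        "act": "PTRANS",                 # physical transfer of location
        "tense": "p",                    # past: an unspecified prior time
        "object": "Caesar",              # Caesar moves himself
        "directive": {                   # the two-part D dependency
            "to":   {"head": "home", "poss": "Caesar"},  # Caesar POSS-> home
            "from": {"head": "senate"},
        },
    }

    def gloss(d: dict) -> str:
        """Produce a crude English gloss of the diagram, for inspection."""
        to, frm = d["directive"]["to"], d["directive"]["from"]
        return (f"{d['actor']} {d['act']}ed {d['object']} from the "
                f"{frm['head']} to {to['poss']}'s {to['head']}")

    print(gloss(c_diagram))
    # Caesar PTRANSed Caesar from the senate to Caesar's home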
3.7 Semantic networks

Semantic networks have existed since the late nineteenth century, one of the earliest forms being the existential graphs developed by C. S. Peirce (Lehmann, 1992). As noted by Sowa (1987), the particulars regarding the style of each graph may vary, but the underlying principles remain the same: nodes depict semantic individuals or concepts, and a series of arcs are drawn between them to represent
relationships. The relational graph is the simplest type of network, so named since the arcs that connect conceptual nodes are understood to represent specific relationships. Usually, the relationships used to link semantic concepts together are semantic roles (see Lehmann 1992). For example, the graph below uses a verb-centred relational graph to represent the sentence “Brutus stabs Caesar viciously”.

(3.19)
    BRUTUS --Agent-- STAB --Experiencer-- CAESAR
                       |
                     Manner
                       |
                    VICIOUS

A different type of semantic network known as a propositional network was developed in order to deal with expressive problems in verb-centred relational graphs (Sowa, 1987). Propositional networks allow nodes to represent entire propositions, allowing for the expression of sentences that require fully formed expressions to become nested arguments to other propositions.¹⁰ For example, the figure below contains a propositional network for the sentence “strangely, Brutus sang”, which cannot be expressed with simple verb-centred relational graphs:

(3.20)
    STRANGE --Manner-- [ BRUTUS --Agent-- SING ]
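Both styles of network reduce naturally to data structures. The encoding below is ours: arcs become triples, and in the propositional network a node (here P1) stands for an entire proposition.

    # A small sketch (ours) of the graphs in (3.19) and (3.20) as triples;
    # arc direction is shown left-to-right, but nothing hinges on it here.

    # (3.19): a verb-centred relational graph
    relational = [
        ("STAB", "Agent", "BRUTUS"),
        ("STAB", "Experiencer", "CAESAR"),
        ("STAB", "Manner", "VICIOUS"),
    ]

    # (3.20): the node P1 names the whole proposition "Brutus sang",
    # so it can itself be an argument of the Manner arc.
    propositions = {"P1": [("SING", "Agent", "BRUTUS")]}
    propositional = [("STRANGE", "Manner", "P1")]

    def arcs_from(node, graph):
        """All outgoing arcs of a node: the basic traversal step."""
        return [(rel, tgt) for src, rel, tgt in graph if src == node]

    print(arcs_from("STAB", relational))
    # [('Agent', 'BRUTUS'), ('Experiencer', 'CAESAR'), ('Manner', 'VICIOUS')]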
Although the examples cited here are semasiological, semantic networks lend themselves well to an onomasiological perspective. They have shown their practicality by the various computational implementations which make use of them, including: ACT (Anderson and Kline, 1977), KL-ONE (Brachman and Schmolze, 1985), ANALOG (Ali and Shapiro, 1993) and Kalos (Cline and Nutter, 1994). Coverage has been primarily limited to sentences and short texts to date and degree of resolution has been variable. One noteworthy recent example of the semantic network paradigm is presented in Helbig (2006), which describes the MultiNet system, designed to function as “a common backbone for all aspects of natural language processing” in the context of “intelligent information and communication systems and for natural language interfaces to the Internet”.

¹⁰ We will make use of this distinction in Chapter 4.
It is not possible to do justice to the complexity of the MultiNet system in the space available here. We will content ourselves with pointing out several salient points. First, many of the “global requirements” spelled out by Helbig conform to the desiderata set out in Chapter 2. Among them, to use his terminology (Helbig, 2006, p. 4), are:

• Universality – independence of any particular natural language or field of discourse.
• Cognitive adequacy – the capacity to capture the full extent of each concept.
• Interoperability – capable of being used in various systems.
• Homogeneity – the ability to represent the meanings of words, sentences or texts with the same means used for describing reasoning processes.
• Communicability – such that elements of the system are “intuitively intelligible.”
• Practicability – technically tractable and scalable.
• Automability – the ability to function independently of human control.
• Completeness – the ability to represent all meanings.
• Optimal granularity – the ability to specify the degree of precision required in each case.
• Consistency – the capacity to ensure that distinct items of knowledge are both globally and locally consistent.
• Multidimensionality – the ability to distinguish between and also represent immanent and situational knowledge, intensional and extensional aspects, and quality and quantity.
• Local interpretability – the capacity to interpret items of knowledge in themselves, independently of the entire context of knowledge.
As might be expected given its roots in the semantic network paradigm, the MultiNet model provides an especially rich set of specifications for semantic relations, as well as space and time, modality and negation, quantification, conditions and causality and semantic traits. That being said, its approach is essentially semasiological in that it assumes natural language utterances as the starting point and some semantic representation as the product. The model is illustrated by application to a large number of simple and complex sentences in German and English. Although the volume touches on relations with text representation models such as Rhetorical Structure Theory (see below), it does not deal with extended texts.
3.8 Systemic grammar

Systemic Grammar was originally developed by Halliday as a way of describing the social and functional roles of language in context (see, e.g. Halliday 1985). Central to the model is the use of functional descriptions at various levels to describe the meaning-based choices that occur in production of a text. So, for example, the thematic dimension turns on issues of information (what a text is about) and its status as new or old. Halliday uses the terms theme to refer to what a span of text is about and rheme to designate what is said about it. Another dimension, that of mood, characterizes the role of the text in the relation between speaker and hearer, and includes values of declarative, interrogative, imperative. The dimensions postulated by Halliday may be applied at a variety of textual levels, ranging from the lexical to the textual. Central to the model is the notion that the same dimensions may be found at all levels, including the textual, although their instantiation will vary. So, for example, clauses may be linked by the conjunctive function having the values of elaboration (e.g. “in other words”, “by the way”), extension (“and”, “but”) and enhancement (“likewise”, “therefore”). Realization of utterances using a systemic grammar is typically based on a series of system networks. So, for example, Figure 3.1 (reproduced from
[Figure 3.1 appears here: a system network in which a disjoint choice among Question, Personal and Demonstrative leads, for Personal, to parallel systems of Case (Subjective, Objective, Reflexive, Possessive, Possessive-Determiner), Person (First, Second, Third) and Number (Singular, Plural); Third-person singular opens a Gender choice (Feminine, Masculine, Neuter), Demonstrative a choice of Near or Far, and Animate is a further feature.]

Figure 3.1 A system network for the choice of English pronouns.

The Systemic Grammar model has undergone a number of revisions and developments over the years. Halliday and Matthiessen (2004) provide a more recent version.
Winograd 1983, p. 293) represents the choices involved when selecting an English pronoun. A disjoint set of choices exists between ‘Question’, ‘Personal’ or ‘Demonstrative’, indicated by the vertical line. This means that only one feature may be selected. If ‘Personal’ is chosen, then parallel sets of choices exist for ‘Case’, ‘Person’ and ‘Number’. Parallel choices are indicated by curly braces, and mean that every choice must be selected. If Person: Third and Number: Singular are chosen, then a disjoint choice for Gender must be made. If this choice is ‘Feminine’ then the pronoun “she” is used. The systemic grammar model has the advantage of supporting an onomasiological perspective, since, at least in principle, it begins by specifying semantic choices prior to the determination of formal mechanisms to carry them. It is also applicable at a variety of levels. And finally, it is demonstrably implementable: many NLG systems make use of it, including NIGEL (Mann and Matthiessen, 1985), SLANG (Patten, 1988), GENESYS (Fawcett, 1990), HORACE (Cross, 1992), IMAGENE (Vander Linden et al., 1992), WAG (O’Donnell, 1995), Prétexte (Gagnon and Lapalme, 1996), SURGE (Elhadad and Robin, 1996), KPML (Bateman, 1997) and ILEX (O’Donnell et al., 2001).
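The traversal logic just described can be checked mechanically. The sketch below is ours and covers only a fragment of Figure 3.1; the dictionaries and the realization table are assumptions made for the illustration.

    # A toy validator (ours) for a fragment of the pronoun network.

    SYSTEMS = {                   # disjoint systems: pick exactly one term
        "type":   ["Question", "Personal", "Demonstrative"],
        "Case":   ["Subjective", "Objective", "Reflexive"],
        "Person": ["First", "Second", "Third"],
        "Number": ["Singular", "Plural"],
        "Gender": ["Feminine", "Masculine", "Neuter"],
    }
    PARALLEL = ["Case", "Person", "Number"]  # all entered under 'Personal'

    def well_formed(choices: set) -> bool:
        """Exactly one term must be chosen in every system entered."""
        entered = ["type"]
        if "Personal" in choices:
            entered += PARALLEL
            if {"Third", "Singular"} <= choices:   # Gender opens here
                entered.append("Gender")
        return all(len(choices & set(SYSTEMS[s])) == 1 for s in entered)

    REALIZATIONS = {
        frozenset({"Personal", "Subjective", "Third",
                   "Singular", "Feminine"}): "she",
    }

    choice = {"Personal", "Subjective", "Third", "Singular", "Feminine"}
    print(well_formed(choice), REALIZATIONS[frozenset(choice)])
    # True she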
3.9 Truth-functional perspectives

All of the models discussed to date fall under the meaning-form side of the Ogden-Richards triangle, in the sense that they do not provide for representation of states of affairs in any world. Thus, to take one example, there is no sense in which one of Jackendoff’s models may be said to be true or false. There does exist, however, a vast literature in which notions of truth and possible worlds are central. In Chapter 2, we showed that meaning situations extend beyond the boundaries of truth. At the same time, it is important to recognize that reference to some world underlies a great many semantic issues ranging from anaphora, to tense, to causality. Given that, we will now spend some time exploring two typical truth-functional models: first-order predicate calculus (FOPC) and intensional logic (IL). The FOPC is a reference theoretic formalism that is based upon a model M that consists of the ordered pair (A, F) where A is a domain of individuals (the set of entities e) and F is a function that assigns values to constant terms (names and

As was the case earlier, the examples below make use of the logical models we will be examining, but apply them to data we have created.
predicates). Predicates are functions that take zero or more terms as arguments and return truth values. Additionally, variable assignments g exist that assign values to variable terms. The truth value of a particular expression is always obtained relative to the model M and the variable assignments g. FOPC has been used extensively as a formalism for capturing the semantics of natural languages, despite the fact that only a very loose methodology exists for translating the meaning of a natural language expression into its first order logical equivalent, or vice versa. The attraction of this particular formalism stems no doubt from the extensive set of computational tools for various inferencing and automated reasoning tasks which make use of it (Blackburn and Bos, 2003, 2005). IL is a reference theoretic formalism proposed by Montague (1974b) to extend the FOPC. In it, each syntactic formation rule is coupled with a well-defined semantic interpretation. Montague’s model embodies the notion of compositionality: complex formalisms are based systematically on simpler ones and every syntactic formation rule has a well-defined semantic interpretation (Partee, 1972; Montague, 1974c). IL assumes a model M that consists of the quintuple (A, W, T,

obtain :: (entity, entity) -> completion,
obtain(X, material) =
  slist[over_and_done]
  (
    purveyor :: entity,
    purveyor = qual(man, possess(EREL, batch[plur](material))),
    -- a man who has batches of material,
    meet(X, purveyor),
    -- X met a purveyor of the desired material ...
    say
    (
      X,
      purveyor,
      in_order_that
      (
        build(speaker, house),
        give[imperative](listener, speaker, batch[plur](material))
      )
    ),
    -- and asked the latter to give him some material
    --     to build a house.
    give(purveyor, X, batch[some](material))
    -- the purveyor obliged, hoping to save X's bacon!
  ),
-- bad_encounter
--
-- The function describes an incident in which an intended
-- victim builds a house of stuff and has a bad encounter
-- with a villain; tm is the number of attempts the villain
-- makes to destroy the house. The attempt succeeds with
-- fatal results unless the house is made of brick.
bad_encounter :: (entity, entity, entity, integer) -> completion,
bad_encounter(villain, int_victim, stuff, tm) =
  slist[over_and_done]
  (
    stuffhouse :: entity,
    stuffhouse = made_of(house, stuff),
    obtain(int_victim, stuff),
    build(int_victim, stuffhouse),
    -- The int_victim obtained stuff
    --     and built a house of stuff.
    come(villain, stuffhouse),
    knock_on(villain, of(door, stuffhouse)),
    -- A villain came along, knocked at the door, ...
    say
    (
      villain,
      int_victim,
      slist
      (
        rpt(2, vocative(int_victim)),
        allow[imperative](listener, speaker, enter(speaker, stuffhouse))
      )
    ),
    -- and said: "int_victim, int_victim, let me come in."

    -- Note that a typical value for int_victim will be pig2.
    -- Rendering this as "little pig" rather than "second
    -- little pig" is a matter for the NLG to decide, akin to
    -- its replacing an entity by a pronoun.
    cause
    (
      swear_by[over_and_done]
      (
        int_victim,
        villain,
        of(beard, speaker),
        allow[neg](speaker, listener, enter(listener, stuffhouse))
      ),
      -- The intended victim answered:
      --     "By the hair of my chinny chin chin,
      --     I will not let you in."
      say[over_and_done]
      (
        villain,
        int_victim,
        in_order_that
        (
          destroy[yet_to_come](speaker, stuffhouse),
          seq[yet_to_come]
          (
            blow_on(speaker, stuffhouse, huffily),
            blow_on(speaker, stuffhouse, puffily)
          )
        )
      )
      -- The villain said: "Then I'll huff
      -- and I'll puff and I'll blow your house in."
    ),
    -- In essence, the 'cause' function here captures the
    -- meaning of the English word "Then". Because the
    -- intended victim refuses entry, the villain announces
    -- his evil intention. Actually, we have simplified the
    -- rather complex relationship between the completions
    -- here. The refusal not only causes the villain's
    -- threat, but also the attempted destruction.
    -- Furthermore, if the attempt succeeds, it causes the
    -- actual destruction, etc.; if it fails, this is in
    -- spite of the attempt. Later in the chapter, we will
    -- present an alternative version of this part of
    -- 'bad_encounter' showing this structure in a clear
    -- form.
    rpt
    (
      tm,
      seq[over_and_done]
      (
        blow_on(villain, stuffhouse, huffily),
        blow_on(villain, stuffhouse, puffily)
      )
    ),
    -- So he huffed and he puffed several times, and ...
    if (stuff != brick)
    -- a conditional expression; if stuff is not equal to
    -- brick, the then-clause is evaluated; otherwise, the
    -- else-clause.
    then
      seq[over_and_done]
      (
        destroy(villain, stuffhouse),
        -- the villain destroyed the house,
        eat(villain, int_victim)
        -- and he ate up the intended victim.
      )
    else
      destroy[able.over_and_done.neg](villain, stuffhouse)
      -- the villain was unable to destroy the house
  ),
-- subterfuge
--
-- The function tells of an attempt at subterfuge. The
-- villain tells the hero that he knows a place where there
-- is a source of a desirable object. During a discussion,
-- he reveals that this source can be found at a certain
-- site. He proposes to meet the hero at a certain time to
-- take him there. The hero agrees to go.
subterfuge :: (entity, entity, entity, entity, entity, integer) -> completion,
subterfuge(villain, hero, desirable_obj, source, site, time) =
  slist[over_and_done]
  (
    pl, src :: entity,
    src = qual(source, contain(EREL, desirable_obj)),
    -- a source which contains a desirable object
    pl = qual(place, contain(EREL, src)),
    -- a place where there is such a source
    -- The subsidiary constant src makes pl easier for a
    -- human to read.
    say
    (
      villain,
      hero,
      know(speaker, pl)
    ),
    -- The villain tells the hero
    --     that he knows such a place.
    say
    (
      hero,
      villain,
      qual_is(pl, placeat(QUERY))
    ),
    -- The hero asks where it is. (Earlier, functions such
    -- as placeat were defined as yielding circumstances.
    -- Here the circumstance is coerced to a qualifier.)
    say
    (
      villain,
      hero,
      seq
      (
        qual_is(pl, placeat(site)),
        propose
        (
          speaker,
          visit[yet_to_come]
            (elist(speaker, listener), site, timeat(hour(time)))
        )
      )
    ),
    -- The villain tells him that it is at a certain site, and
    -- proposes that they will go there at a certain time.
    consent(hero)
    -- The hero consents.
  ),
-- early_bird
--
-- The function tells us of the hero getting up at a certain
-- time, going to a site and buying an item. Being a moral
-- animal, he/she prefers purchase to theft.

early_bird :: (entity, entity, entity, integer) -> completion,
early_bird(hero, item, site, time) =
  slist[over_and_done]
  (
    arise(hero, timeat(hour(time))),
    visit(hero, site, UNSPEC),
    buy(hero, item)
  ),
-- wiliness
--
-- At this point we might define a function combining the
-- two previous ones; say:
--
-- wiliness :: (entity, entity, entity, entity, entity,
--              entity, integer) -> completion,
-- wiliness(villain, hero, tmpt_obj, bght_obj, source, site, time) =
--   slist
--   (
--     subterfuge(villain, hero, tmpt_obj,
--                source, site, time),
--     early_bird(hero, bght_obj, site, time - 100)
--   )
-- Separate parameters are provided for the object used to
-- tempt the hero and the one the hero actually buys,
-- because in our third application, he buys an object which
-- the villain never suggested. The generality is also
-- appropriate for the first two cases. The villain might
-- suggest the purchase of, say, turnips, while the hero
-- decides to buy a 20kg box.
-- Note that these functions can be thought of as topoi,
-- built from basic functions and simpler topoi combined
-- into higher topoi.
-- The Story
--
-- Having defined some functions, we now come to the story
-- itself. The story begins by introducing the three little
-- pigs and their mother, an old sow.

threepigs :: [entity],
threepigs = qual(pig, qlist(number(3), little)),
-- three little pigs

-- This is the matter raised in the previous chapter.
-- Nothing here connects this with pig1, pig2 and pig3.
-- Defining this as elist(pig1, pig2, pig3) would do so, but
-- then an NLG would have no reason to express this as
-- "three little pigs", which is the title of the tale.
exist(rqual[descr](oldsow, possess[over_and_done](REL, threepigs))),
-- There was an old sow, who had three little pigs.

cause
(
  cause
  (
    possess[over_and_done, durative, neg](oldsow, money),
    feed[able.neg, over_and_done, durative]
      (oldsow, threepigs)
  ),
  command[over_and_done]
  (
    oldsow,
    threepigs,
    in_order_that
    (
      seek(threepigs, fortune),
      enter(threepigs, outside_world)
    )
  )
),
-- Because the old sow had no money, she could not feed the
-- little pigs; and because of this, she told them to go out
-- into the world to seek their fortune.

bad_encounter(W, pig1, straw, 1),
-- pig1 built a house of straw and was eaten by the wolf.

bad_encounter(W, pig2, furze, 2),
-- pig2 built a house of furze and was eaten by the wolf.

bad_encounter(W, pig3, brick, 3),
-- pig3 built a house of brick and survived an encounter
-- with the wolf.

-- Now, having failed to reach pig3 by force, the wolf tried
-- subterfuge, hoping to tempt him out with the promise of food.

subterfuge(W, pig3, turnip[plur], field, of(farm, Smith), 0600),
-- The wolf invited pig3 to accompany him to a field at
-- Smith's farm to get some turnips, leaving at six o'clock.
early_bird(pig3, turnip[plur], of(farm, Smith), 0500),
-- The little pig got up at five, went to Smith's farm and
-- bought some turnips.

return(pig3, of(house, pig3)),
-- The pig returned home.

arrive(W, timeat(hour(0600))),

say(W, pig3, seq(vocative(qual(pig, little)),
                 qual_is[qu](listener, ready))),
-- The wolf arrived at six and asked if the little pig
-- was ready.

say
(
  pig3,
  W,
  seq
  (
    visit[over_and_done](speaker, of(farm, Smith), UNSPEC),
    in_order_that(boil(speaker, soup),
                  buy[over_and_done](speaker, turnip[plur]))
  )
),
-- pig3 replied: "I visited Smith's farm, and bought some
-- turnips to make soup."

qual_is[over_and_done](W, angry[very]),
-- The wolf was very angry.

-- Nonetheless ...
subterfuge(W, pig3, apple[plur], apple_tree, Merry_garden, 0500),
-- The wolf invited pig3 to go with him to an apple tree at
-- Merry_garden to get some apples, leaving at five o'clock.

early_bird(pig3, apple[plur], Merry_garden, 0400),
-- The little pig got up at four o'clock, and went to
-- Merry's U-pick garden and bought some apples.

slist
-- the sublist allows us to define a local constant.
(
  pomme :: entity,
  pomme = apple,
  -- in this section, we need to distinguish between a
  -- general apple and a specific one thrown by the pig.
  seq[while]
  (
    descend[durative](pig3, apple_tree),
    arrive(W, placeat(apple_tree))
  ),
  -- As the pig was coming down from the
  --     tree, the wolf arrived.
  say
  (
    W,
    pig3,
    qual_is[qu](apple[plur, spec], nice)
  ),
  -- The wolf said to the little pig:
  --     "Are the apples nice?"
  say
  (
    pig3,
    W,
    seq
    (
      qual_is(apple[plur, spec], nice[very]),
      throw_to[yet_to_come](speaker, listener, pomme)
    )
  ),
  -- "Very nice," replied pig3. "I will
  --     throw one down to you."
  throw(pig3, pomme, far[very]),
  --     and he threw it very far.
  seq[while]
  (
    retrieve[durative](W, pomme),
    seq
    (
      descend(pig3, apple_tree),
      run(pig3, towards(of(house, pig3)))
    )
  )
  -- while the wolf was picking it up, the little pig
  -- climbed down the apple tree and ran home.
),

-- Not being one to give up easily,
--     the wolf came again the next day ...
slist
(
  C, R :: entity,
  C = used_for(churn, milk),
  -- a milk churn
  R = qual(thing, qlist(large, round)),
  -- a great round thing
  subterfuge(W, pig3, UNSPEC, fair, Shanklin, 1500),
  -- and invited pig3 to go with him to a fair at
  -- Shanklin, leaving at three in the afternoon.
  -- (You would think that, by now, the wolf would
  -- guess what was going to happen.)
  early_bird(pig3, C, Shanklin, 1400),
  -- The little pig set out at 2 o'clock to go to the
  -- fair, where he bought a milk-churn (though why
  -- he wanted one we can't imagine, unless he had
  -- already read this story).
  seq[before]
  (
    depart(pig3),
    see
    (
      pig3,
      rqual[descr]
      (
        W,
        ascend[over_and_done, durative]
          (REL, hill[spec])
      )
    )
  ),
  -- When he was leaving, he saw the wolf coming up
  -- the hill.
  in_order_that(hide[over_and_done](pig3),
                enter[over_and_done](pig3, C)),
  -- So he got into the churn to hide,
  roll(C, down(hill[spec])),
  -- and it rolled down the hill.
  cause(frighten[over_and_done](C, W),
        run[over_and_done](W, towards(of(home, W)))),
  -- The churn frightened the wolf, so he ran home.
  come(W, of(house, pig3)),
  -- The wolf went to the little pig's house,
  say[reported]
  (
    W,
    pig3,
    frighten[over_and_done]
    (
      qual(R, roll(EREL, down(hill[spec]))),
      speaker
    )
  ),
  -- and told him he had been frightened by a great
  -- round thing that rolled down the hill.
  laugh(pig3),
  say[reported]
  (
    pig3,
    W,
    seq[over_and_done, and]
    (
      C :: entity,
      C = used_for(churn, milk),
      -- a milk churn

      -- Why have we defined a local C different from the
      -- earlier one? Because, although *we* have heard of
      -- this churn before, W has *not*. So, in the locale
      -- of the speech, it would be inappropriate to speak
      -- of 'the churn'.
      qual_is(R, C),
      qual_is[over_and_done](speaker, placein(C))
    )
  )
  -- The pig laughed, and revealed that the great round
  -- thing was a milk churn and that he had been inside
  -- it.
),

-- To cut a long story short, the wolf, having failed to get
-- at pig3 by house demolition and subterfuge, decided upon
-- forced entry via the chimney.

-- It was at this moment that the little pig decided to make
-- his turnip soup, and hung a cauldron of water in the
-- fireplace, making a blazing fire. Unhappily for the wolf,
-- he fell into the cauldron, turning the vegetable soup
-- into wolf stew; unhappily for the pig, he was a
-- vegetarian, diet_includes[neg](pig3, meat), so his supper
-- was ruined.
-- Later, the little pig married Shrek's sister-in-law
-- (Steig, 1990). They had many piglets, for whom they built
-- houses of brick, and they all lived happily ever after.
)
-- This closes the original slist, terminating the complete
-- expression.
The constants labelled Dramatis Personae, such as W and pig3, are essentially parameters of the story as a whole. We can trivially perform role-reversal by redefining pig1, pig2 and pig3 as wolf cubs and W as a wild boar. We might also define constants to parameterize the turnips and Smith’s farm, allowing the wolf to invite pig3 to go out for a veggieburger at McDonalds. (It would, of course, have lacked sensitivity for him to suggest a ham sandwich.) Doubtless the functions might be used to construct other similar stories.
9.3 Alternative segment for bad_encounter

As we promised earlier, we now give an alternative version for the last section of the function bad_encounter, displaying a more thorough relationship between several of the completions. We begin by defining several completion-constants and one parameterized function, whose use makes the ‘working’ part of the code much easier to read. An expert computer programmer would be inclined towards greater parameterization of these components, but we avoid this in the interests of simplicity. Note, by the way, that one of the parameters is an adjustment.

(9.3)
slist
(
  -- completion-constants and function
  refusal, threat, destruct_try,
  try_succeeds, try_fails :: completion,
  blowing :: (entity, adjustment) -> completion,

  refusal =
    swear_by[over_and_done]
    (
      int_victim,
      villain,
      of(beard, speaker),
      allow[neg](speaker, listener, enter(listener, stuffhouse))
    ),
  -- The intended victim answered:
  --     "By the hair of my chinny chin chin,
  --     I will not let you in."

  blowing(blower, timing) =
    seq
    (
      blow_on[timing](blower, stuffhouse, huffy),
      blow_on[timing](blower, stuffhouse, puffy)
    ),
  -- The blower huffs and puffs
  --     (adjusted for timing).

  threat =
    say[over_and_done]
    (
      villain,
      int_victim,
      in_order_that
      (
        destroy[yet_to_come](speaker, stuffhouse),
        blowing(speaker, yet_to_come)
      )
    ),
  -- The villain said: "Then I'll huff and
  --     I'll puff and I'll blow your house in."
  destruct_try =
    rpt
    (
      tm,
      blowing(villain, over_and_done)
    ),
  -- So the villain huffed and puffed several times.

  try_succeeds =
    seq[over_and_done]
    (
      destroy(villain, stuffhouse),
      -- the villain destroyed the house,
      eat(villain, int_victim)
      -- and he ate up the intended victim.
    ),

  try_fails =
    destroy[able.over_and_done.neg]
      (villain, stuffhouse),
  -- the villain was unable to destroy the house

  -- the "working" completion itself:
  cause
  (
    refusal,
    if (stuff != brick)
    -- a conditional expression; if stuff does not
    -- equal brick, the then-clause is evaluated;
    -- otherwise, the else-clause.
    then
      cause
      (
        destruct_try,
        try_succeeds
      )
    else
      although
      (
        destruct_try,
        try_fails
      )
  )
)
9.4 Length issues

Incidentally, the Three Little Pigs example may give the reader the impression that our semantic representation is substantially longer than the English version of the story. This, however, is illusory. The apparent expansion is caused partly by the commentary, partly by the layout. In actual fact, the several versions of Joseph Jacobs’s original occupy between 100 and 130 lines in a typical page format and font, totaling around 5000 characters. With the commentary removed and the indented lines flushed left, the semantic expression has about double the number of lines but with much the same number of characters as the natural language form.
10
Applications: Creation
In the previous chapters, we have considered applications in which text in a natural language is expressed in the form of semantic expressions. In this final chapter, we will turn to the converse: applications where the semantic expressions are created directly, perhaps by a human author, perhaps as the output of a computer program. We will begin with a simple example where the resultant text is expected to follow the expressions very closely, and will progress towards cases where a program is asked to elaborate on an outline or to produce a report building upon information found in a knowledge base. Presumably this is done with a view to the creation of text in some natural language by an appropriate NLG. We have said that it is not our purpose here to discuss the natural language generation process itself. We will, however, present short examples in English and French, which we generated using our VINCI NLG for a Proppian fairy tale.
10.1 A hole in three

Sadly, none of the authors plays golf. Yet we will choose this as a prototypical example of an area where the SEs required by a writer mostly belong to a rather restricted set. Golf reporters should not take this as criticism. There are, after all, very few ways to say that a master golfer hit a driver from the tee to the centre of the fairway, lofted an iron shot to within one foot of the cup, and putted for a birdie three. (The exception, of course, is the “colour commentator”, who can be relied upon to note that a certain player has never lost a tournament in June when scoring a par four at the fifteenth hole on the second day
while the sun was shining; but even a colour commentator can be replaced by a computer!) The function repertoire might contain functions exemplified by:

(10.1) golf_stroke(Mickelson, hole(5), driver, yards(310), tee, trees, veer_left)

indicating that a certain player used a driver to hit a ball 310 yards with unfortunate effect, or:

(10.2) hole_length(5), bunkers_at(5)

which might consult a database to determine that hole 5 is 478 yards in length, with bunkers at 270 yards to the right of the fairway and at 290 yards to the left. In the ideal world of Chapter 1, a unilingual writer might rapidly produce a report in several languages, built largely from functions like these. The subject area is one of many where reports might be constructed quite simply from a restricted lexicon of functions.
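Below is a minimal sketch of how such a repertoire might be rendered, assuming a tiny invented course database and a single English template per function; a multilingual system would simply hold one template set per target language.

    # A sketch (ours) of rendering a restricted repertoire of report
    # functions; the course data and wording are invented.

    COURSE = {5: {"length": 478,
                  "bunkers": [(270, "right"), (290, "left")]}}

    def golf_stroke(player, hole, club, yards, start, end, effect):
        return (f"{player} hit a {club} {yards} yards from the {start} "
                f"into the {end}, {effect}.")

    def hole_length(hole):
        h = COURSE[hole]
        bunkers = " and ".join(f"{y} yards to the {side}"
                               for y, side in h["bunkers"])
        return (f"Hole {hole} is {h['length']} yards long, "
                f"with bunkers at {bunkers}.")

    print(golf_stroke("Mickelson", 5, "driver", 310,
                      "tee", "trees", "veering to the left"))
    print(hole_length(5))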
10.2 A Proppian fairy tale

We consider next the creation of a simple fairy tale based on an outline composed from Propp’s narrative functions (Propp 1968). We begin with a cast of six entities, as shown in Table 10.1.
Table 10.1 Cast of characters for a folktale

Entity               Example
pompous twit         a king or rich merchant
victim or heroine    the twit’s daughter
villain              a sorcerer, a witch or an ogre
hero                 a prince or a brave woodcutter
good fairy           the victim’s fairy godmother
magic object         a sword or a silver goblet
This section is adapted from Levison and Lessard (2004).
This assumes that there is a local semantic lexicon containing potential characters, artifacts, and so on. A fragment of the lexicon might take the form:

(10.3) Midas|entity|human, male, rich, cowardly, vain, occup(˜, king),
           father(˜, elist(Marie, Madeleine)), home(castle, ˜), . . .
       Marie|entity|human, female, beautiful, kind, occup(˜, princess),
           daughter(˜, Midas), home(castle, ˜), . . .
       Merlin|entity|human, male, evil, occup(˜, sorcerer),
           home(forest, ˜), . . .
       Lionheart|entity|human, male, brave, strong, handsome, good,
           occup(˜, prince), home(mansion, ˜), . . .
       Axel|entity|human, male, poor, brave, handsome,
           occup(˜, woodcutter), home(cottage, ˜), . . .
       Wanda|entity|supernatural, female, good,
           fairy_godmother(˜, Marie), . . .

The entries are divided into fields by the symbol | and include rudimentary encyclopedic information. The first field is the function name of the entry, or in other words, the headword, while the second gives its semantic type. For entities, the third field contains a list of SEs: qualifiers, entities and relations, which provide information about the headword. The qualifiers may be attributes of the headword, including both essential and accidental qualities, and as we noted in Chapter 5, their identifiers may be grouped into attribute types. So an attribute type wealth might contain values rich, well_off, comfortable and poor. Thus, assuming the qualifiers have mnemonic names, Midas has the prime qualities for a pompous twit of richness and vanity, while either Lionheart or Axel might serve well as a hero. The entities in field 3 may be names of sets to which the headword belongs.

This distinction goes back to Aristotle. For a recent overview, see, for example, Robertson (2011).
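Reading such entries is mechanical. The sketch below is ours and assumes only the format shown in (10.3): fields separated by |, with the third field split at top-level commas.

    # A sketch (ours) of parsing a lexicon entry of the form in (10.3).

    ENTRY = ("Axel|entity|human, male, poor, brave, handsome, "
             "occup(~, woodcutter), home(cottage, ~)")

    def parse_entry(line: str):
        headword, sem_type, rest = line.split("|")
        # split field 3 on commas that are not inside parentheses
        items, depth, buf = [], 0, ""
        for ch in rest:
            depth += (ch == "(") - (ch == ")")
            if ch == "," and depth == 0:
                items.append(buf.strip())
                buf = ""
            else:
                buf += ch
        items.append(buf.strip())
        return headword, sem_type, items

    print(parse_entry(ENTRY))
    # ('Axel', 'entity', ['human', 'male', 'poor', 'brave', 'handsome',
    #                     'occup(~, woodcutter)', 'home(cottage, ~)'])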
The third category of items in field 3, while resembling functions, are actually relations. In each case, the symbol ~ simply abbreviates the headword. So the relation:

(10.4) daughter(~, Midas)

in the entry for Marie indicates that Marie has the relation daughter to Midas. The relation:

(10.5) father(~, elist(Marie, Madeleine))

in the entry for Midas shows that Midas has the relation father to both Marie and Madeleine. This might just as easily have been expressed as:

(10.6) daughter(Marie, ~), daughter(Madeleine, ~)

In fact, except perhaps for some efficiency consideration, there is no reason to include it at all, since it replicates the information in the daughters’ entries; it might even be constructed from the latter pair by transformation. To make use of the contents of the semantic lexicon, we must introduce some functions like:

(10.7) (a) select_entity(X, qual(X, qlist(human, male, rich, vain)))
       (b) find_answer(qual(hero, wealth[wh]))

The former takes two parameters, the first being the name of an entity-constant, the second being some semantic expression which qualifies (i.e. restricts) the entity. The value returned is a headword fitting the restriction. Thus, as we have suggested, ‘Midas’ is a potential value for the example shown. Since there may be many headwords which fit the bill, a choice might have to be made, perhaps at random.

Technically speaking, both functions and relations are sets of pairs. The function commonly written f(x) is properly defined as a set of pairs (x₁, f(x₁)), (x₂, f(x₂)), . . . where every value in the set from which the xᵢ’s are drawn, called the domain, occurs in exactly one pair. The set of f(xᵢ)’s is called the range. Put another way, there is a unique value of the range corresponding to every value of the domain. (There is no requirement that the domain should be countably infinite, so it need not be possible to enumerate the set of pairs as we have started to do above.) A relation consists of a similar set of pairs, except that each xᵢ may occur in zero or more pairs. So, in the relation father, Midas may have several corresponding values, as shown, or none.
The latter takes as parameter a semantic expression representing a question, and obtains the answer from the lexicon. In Chapter 6, we reminded readers that expressions such as:

(10.8) qual_is(hero, wealth[wh])

represent the meaning of the question:

(10.9) What is the wealth-status of the hero?

not its answer. By contrast, the function find_answer searches the lexicon to find the answer. So, if Axel has been selected as hero, the value for the sample question would be poor. Some special constant, such as NO_VALUE, can be returned if the search cannot locate an answer. Using these features, we might define constants for the cast specified earlier:

(10.10) twit, victim, villain, hero, goodfairy, magicobj :: entity,
        twit = select_entity(X, qual(X, qlist(human, male, rich, vain))),
        victim = select_entity(Y, daughter(Y, twit)),
        villain = select_entity(Z, qual(Z, qlist(human, evil))),
        hero = select_entity(W, qual(W, qlist(human, male, brave, handsome))),⁴
        goodfairy = select_entity(V, fairy_godmother(V, victim)),
        magicobj = select_entity(U, qual(U, qlist(physical_obj, magic)))

We take the plot of the fairy tale to be a sequence of SEs derived from Propp’s narrative functions, as shown in Table 10.2. To create a more interesting fairy story, we may want to embellish these SEs automatically, based on the information in the semantic lexicon or in other knowledge bases.

⁴ Of course, in these days of gender equality, the hero might well be female, the victim male and the twit female.
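Both functions can be sketched over a parsed lexicon. The coding below is ours; the qualifier sets abbreviate the entries in (10.3), and wealth is the attribute type named in the text.

    # A sketch (ours) of select_entity and find_answer over a toy lexicon.

    import random

    LEXICON = {
        "Midas":     {"human", "male", "rich", "cowardly", "vain"},
        "Lionheart": {"human", "male", "brave", "strong",
                      "handsome", "good"},
        "Axel":      {"human", "male", "poor", "brave", "handsome"},
    }
    WEALTH = {"rich", "well_off", "comfortable", "poor"}  # attribute type

    def select_entity(required: set) -> str:
        """Return some headword carrying all the required qualifiers."""
        candidates = [h for h, q in LEXICON.items() if required <= q]
        return random.choice(candidates)   # a random choice among equals

    def find_answer(headword: str, attribute_type: set) -> str:
        """Answer a question such as (10.9) by consulting the lexicon."""
        values = LEXICON[headword] & attribute_type
        return values.pop() if values else "NO_VALUE"

    hero = select_entity({"human", "male", "brave", "handsome"})
    print(hero, find_answer(hero, WEALTH))
    # e.g. 'Axel poor', or 'Lionheart NO_VALUE' (no wealth is recorded)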
Table 10.2 Semantic expressions for the sample folktale

Semantic expression               Possible instantiation
exists(twit)                      Once upon a time there was a twit.
describe(twit)                    He was rich, vain and none too brave (or he
                                  might have rescued his daughter himself!).
exists(victim)                    The twit had a daughter, the victim.
describe(victim)                  She was beautiful and kind (possible
                                  qualifiers obtained from the lexicon).
admonish(twit, victim, deed)      The twit warned the victim about walking in
                                  the forest (deed might be a semantic
                                  expression).
disobey(victim)                   Unfortunately, the victim was bored and
                                  disobedient.
act(victim, deed)                 She went for a walk in the forest.
exists(villain)                   In the forest there lived a villain (home
                                  obtained from the lexicon).
describe(villain)                 He was strong and evil.
kidnap(villain, victim)           The villain came upon the victim and
                                  kidnapped her.
exists(hero)                      In the same area, there lived a hero.
seekhelp(twit, hero)              The twit contacted the hero and sought his
                                  help.
seek(hero, goodfairy)             The hero went to find the goodfairy.
give(goodfairy, hero, magicobj)   The goodfairy provided the hero with a
                                  magicobj.
seek(hero, villain)               The hero set out to search for the villain.
kill(hero, villain, magicobj)     He killed the villain with the help of the
                                  magicobj.
rescue(hero, victim)              The hero rescued the victim, . . .
marry(hero, victim)               married her, . . .
livehappily(hero, victim)         and they lived happily ever after.
There are several ways in which this might be done. One is to define the functions used here to represent a whole sequence of subexpressions, some involving knowledge from the lexicon. So, describe might select one or two qualifiers from the lexicon to produce a more comprehensive description. If the lexicon included more complex information, we might expect to see a sequence of sentences such as:

(10.11) Once upon a time there was a king called Midas. Everything he touched turned to gold.
or,

(10.12) Nearby there lived a prince called Braveheart, who was renowned for killing monsters.

A closely similar approach to elaboration is to process the SEs into a longer sequence, again based on lexicon or knowledge bases. We will catch a glimpse of this in the subsequent examples. A third approach is to have the natural language generator itself carry out the elaboration. This, of course, means that the resulting output is language-dependent, in contrast to our announced intentions. We did, however, use this approach in our early experiments, having our NLG, VINCI, create the above story both in English:

(10.13) Once upon a time there was a king called Midas who lived in a castle. He was rich and vain. The king had a daughter, a princess named Marie, who was beautiful. The king warned Marie not to go out of the castle. The princess disobeyed the king. She left the castle. A sorcerer called Merlin lived in the woods. He was evil. The sorcerer kidnapped the princess. Nearby there lived a woodcutter who was named Axel. The king sought the help of the woodcutter. The woodcutter went to look for the fairy godmother. The fairy godmother passed Axel a magic sword. Axel searched for the sorcerer. The woodcutter killed the sorcerer with the magic sword. The woodcutter rescued the princess. The woodcutter and the princess got married and lived happily ever after.
and in French:

(10.14) Il était une fois un roi qui s’appelait Midas et qui vivait dans un beau château. Il était riche et vain. Le roi avait une fille, une princesse qui s’appelait Marie et qui était belle. Le roi interdit à Marie de quitter le château. La princesse désobéit au roi. Elle quitta le château. Dans la forêt il y avait un sorcier qui s’appelait Merloc. Il était méchant. Le sorcier enleva la princesse. Aux alentours vivait un prince qui s’appelait Coeur de Lion et qui était beau. Le roi demanda l’aide du prince. Le prince chercha la bonne fée. La bonne fée donna une épée magique au prince. Le prince chercha le sorcier. Coeur de Lion utilisa l’épée magique pour tuer le sorcier. Le prince libéra la princesse. Le prince épousa la princesse et ils eurent beaucoup d’enfants.
These are not translations of one another, but separate generations. The close similarities result from the use of restricted syntaxes and very small lexicons. Note that the lexicon or knowledge base required for this Proppian tale is static, fixed throughout the creation of the story. In more complex examples, a dynamic knowledge base will be required to capture ever-changing information as the story proceeds.
10.3 Variant stories

In Chapter 8, discussing a passage from the novel Manon Lescaut, we defined and refined a function eliminate:

(10.15) eliminate(stranger, Lescaut)

as a topos to represent the text:

(10.16) It’s Lescaut, he said, and fired a shot; he’ll dine with the angels tonight. Then he disappeared. Lescaut fell, with no sign of life.

The actual purpose of this piece is to eliminate Lescaut while attaching guilt to his sister Manon, whose demands led them to be there. This is one of a series of misadventures which she causes. The stranger has no other role in the novel, and the function (and topos) might better have been specified as:

(10.17) -- alt_eliminate
--
-- The function describes the elimination of character L
-- while assigning guilt to character M.

alt_eliminate :: (entity, entity) -> completion,
alt_eliminate(L, M) =
  slist[over_and_done]
  (
    S :: entity,
    S = stranger,
    M persuades L to visit a specific location,
    S appears,
    S recognizes L,
    S kills L,
    S leaves,
    L dies
  )

(Here again, we use English refinements for ease of reading.) Alternative refinements might equally have led to:

(10.18) slist[over_and_done]
        (
          M expresses the strong need for a walk,
          M and L go for a walk along the cliffs,
          L trips on a rock,
          L falls over the edge,
          L dies
        )

or:

(10.19) slist[over_and_done]
        (
          M says that M must have a drink,
          M and L enter an inn,
          L orders a Dubonnet with lemon,
          L chokes on a lemon pit,
          L dies
        )

where the internal expressions of the slists might themselves be elaborated in a variety of ways. Suppose we exploit this concept to the fullest degree. A story consists of a number of plot-lines, which are functions at a very high level. These functions themselves may have alternative refinements into simpler functions, and these may have alternative refinements into functions which are simpler still. There is no need
for the simpler functions to grow exponentially in numbers. We can imagine that some of them will be shared by the ones at a higher level; indeed, the propensity for sharing can be expected to increase, the deeper we get down the tree. Now let us suppose that lists of functions, along with their alternative refinements, are stored in a database. Starting from a set of plot-lines, we can choose randomly from their refinements, then recursively from their refinements, and so on. The stage is set for an explosion of variant stories. Thus a recursive algorithm for creating Harlequin Romances is born!
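The algorithm just described fits in a dozen lines. The sketch below is ours; the refinement database is invented, loosely following the eliminate examples above.

    # A sketch (ours) of recursive plot expansion: each function is
    # replaced by a randomly chosen refinement until only base-level
    # completions remain.

    import random

    REFINEMENTS = {
        "eliminate(L, M)": [
            ["lure(M, L, location)", "shoot(stranger, L)", "die(L)"],
            ["walk(M, L, cliffs)", "trip(L, rock)", "fall(L)", "die(L)"],
            ["drink(M, L, inn)", "choke(L, lemon_pit)", "die(L)"],
        ],
        "lure(M, L, location)": [
            ["say(M, L, need(M, walk))", "go(M, L, location)"],
        ],
    }

    def expand(fn: str) -> list:
        """Recursively replace a function by one of its refinements."""
        if fn not in REFINEMENTS:          # a base-level completion
            return [fn]
        refinement = random.choice(REFINEMENTS[fn])
        return [step for part in refinement for step in expand(part)]

    print(expand("eliminate(L, M)"))
    # e.g. ['say(M, L, need(M, walk))', 'go(M, L, location)',
    #       'shoot(stranger, L)', 'die(L)']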
10.4 Romeo and Juliet

This is not to suggest that a satisfying story can be constructed in a hierarchical fashion, with each function being refined independently of the others. Returning to the plot of Romeo and Juliet, whose plot-lines form the function RJ of Chapter 8, we note that the feud between families is played out throughout the play. Also, the secret meeting requires a method of communication: a priest carrying a message, or perhaps a cell-phone; while this, in turn, needs a means of disruption to prevent a crucial message being passed: a plague or an electrical storm. The latter may be introduced elsewhere, and in an apparently incidental manner (“It was a dark and stormy night.”) In effect, the plot involves several threads which proceed concurrently, and the refinements of one thread may be interwoven with, or dependent on, others. If the choice of means of communication is a cell-phone, the plague is probably not very useful. Once a choice has been made among alternatives, a dynamic database must be updated, restricting choices elsewhere in the story.⁵ And the heightened suspense of a murder mystery requires the delicate build-up, piece by piece, of the various component threads, rather than their complete revelation one at a time.
⁵ Interestingly, this contrasts with the principle, more honoured in the breach than the observance, of building complex computer programs: that each refinement should be developed independently of the others. This is both because different programmers may work on different areas, and because it is important to be able to modify parts of a program without affecting others.
We can picture the expansion of the plot, from the plot-lines down to small groups of completions, corresponding perhaps to paragraphs in a text, as a massive tree structure of the form displayed in Chapter 5. The skilled task is to construct a story thread which passes through the tree, visiting each paragraph thread, but in an order which piques the interest of a reader without violating obvious rules of chronology or having characters (except the murderer!) know some detail that can’t have been revealed to them. But this is a topic of current research, and we will report on it at a future time.
10.5 In sum

So now we approach the end of our journey. At the outset, we committed ourselves to presenting a formalism capable of capturing natural language meaning, both at the gross level, of plot-lines and topoi, and at the fine level, of linguistic subtleties, and to do so in a practical and user-friendly manner. We have demonstrated our approach, and tested it against a variety of examples and applications. We believe we have fulfilled our commitment. The challenge is now to the literary, linguistic and computational communities. Perhaps, after more than 60 years of ignoring the practical use of semantic formalism, it is time to produce the tools to bring to fruition the crown which we presented in the beginning.
Bibliography
Ali, S. S. and Shapiro, S. C.: 1993, Natural language processing using a propositional semantic network with structured variables, Minds and Machines 3(4), pp. 421–451.
Anderson, J. R.: 1983, The Architecture of Cognition, Harvard University Press, Cambridge, MA.
—— 1993, Rules of the Mind, Lawrence Erlbaum Associates, Hillsdale, New Jersey.
Anderson, J. R. and Kline, P.: 1977, Design of a production system for cognitive modelling, Proceedings of the Workshop on Pattern-Directed Inference Systems, pp. 60–65.
Ariel, M.: 2004, Most, Language 80, 658–706.
Asher, N. and Lascarides, A.: 2003, Logics of Conversation, Cambridge University Press, Cambridge.
Baldinger, K.: 1964, Sémasiologie et onomasiologie, Revue de linguistique romane 28, 249–272.
Bateman, J. A.: 1997, Enabling technology for multilingual natural language generation: the KPML development environment, Journal of Natural Language Engineering 3(1), 15–55.
Bateman, J. A., Kamps, T., Kleinz, J. and Reichenberger, K.: 2001, Constructive text, diagram and layout generation for information presentation: the DArtbio system, Computational Linguistics 23(7), 409–449.
Battistella, E. L.: 1996, The Logic of Markedness, Oxford University Press, New York.
Bird, R.: 1998, Introduction to Functional Programming Using Haskell, 2nd edn, Prentice-Hall, London.
Blackburn, P. and Bos, J.: 2003, Computational semantics, Theoria 18(46), 27–45.
Blackburn, P. and Bos, J.: 2005, Representation and Inference for Natural Language: A First Course in Computational Semantics, CSLI Publications, Stanford.
Borges, J. L.: 1944, Ficciones, Sur, Buenos Aires.
Brachman, R. J.: 1985, I lied about the trees, or, defaults and definitions in knowledge representation systems, AI Magazine 6(3), 80–95.
Brachman, R. J. and Schmolze, J. G.: 1985, An overview of the KL-ONE knowledge representation system, Cognitive Science 9, 171–216.
Bringsjord, S. and Ferrucci, D.: 1999, Artificial Intelligence and Literary Creativity: Inside the Mind of BRUTUS, a Storytelling Machine, Lawrence Erlbaum Associates, New Jersey.
Bibliography
237
Burchardt, A., Erk, K., Frank, A., Kowalski, A., Pado, S. and Pinka, M.: 2009, FrameNet for the semantic analysis of German: Annotation, representation and automation, in H. C. Boas (ed.), Multilingual FrameNets in Computational Lexicography: Methods and Applications, Mouton de Gruyter, pp. 209–244. Burger, A.: 1962, Essai d’analyse d’un système de valeurs, Cahiers Ferdinand de Saussure 19, 67–76. Busa, F., Calzolari, N., Lenci, A. and Pustejovsky, J.: 2001, Building a semantic lexicon: Structuring and generating concepts, in H. Bunt, R. Muskens and E. ijsse (eds), Computing Meaning, Vol. 77 of Studies in Linguistics and Philosophy, Kluwer Academic Publishers, Dordrecht, e Netherlands, pp. 29–51. Cabré, M. T.: 1999, Terminology: eory, Methods and Applications, John Benjamins, Amsterdam. Callaway, C. and Lester, J.: 2001, Evaluating the effects of natural language generation techniques on reader satisfaction, Proceedings of the Twenty-ird Annual Conference of the Cognitive Science Society, pp. 164–169. Callaway, C. and Lester, J.: 2002, Narrative prose generation, Artificial Intelligence 139(2), 213–252. Chomsky, N.: 1956, ree models for the description of language, IRE Transactions on Information eory 2, 113–124. Chomsky, N.: 1995, e Minimalist Program, MIT Press, Cambridge, MA. Cline, B. E. and Nutter, J. T.: 1994, Kalos – a system for natural language generation with revision, AAAI ‘94: Proceedings of the Twelh National Conference on Artificial Intelligence, pp. 767–772. Clippinger, J. H.: 1975, Speaking with many tongues: some problems in modeling speakers of actual discourse, TINLAP ‘75: Proceedings of the 1975 Workshop on eoretical Issues in Natural Language Processing, Association for Computational Linguistics, Morristown, NJ, pp. 68–73. Comrie, B.: 1976, Aspect: An Introduction to the Study of Verbal Aspect and Related Problems, Cambridge University Press, Cambridge. Comrie, B.: 1985, Tense, Cambridge University Press, Cambridge. Cortázar, J.: 1966, Final del juego: cuentes, Editorial Sudamericana, Buenos Aires. Coventry, K. R., Cangelosi, A., Newstead, S. N. and Bugmann, D.: 2010, Talking about quantities in space: Vague quantifiers, context and similarity, Language and Cognition 2(2), 221–241. Cross, M.: 1992, Choice in Text: A Systemic Approach to Computer Modelling of Variant Text Production, PhD thesis, Macquarie University. Cruse, D.: 1986, Lexical Semantics, Cambridge University Press, Cambridge, MA. —— 2004, Meaning in Language: An Introduction to Semantics and Pragmatics, Oxford University Press, Oxford. Daelemans, W., Smedt, K. D. and Gazdar, G.: 1992, Inheritance in natural language processing, Computational Linguistics 18(2), 205–218. de la Vallée Poussin, C. J.: 1896, Recherches analytiques sur la théorie des nombres premiers, Annales de la Société Scientifique de Bruxelles 20, 183–256.
238
Bibliography
de Rosis, F., Grasso, F. and Berry, D. C.: 1999, Refining instructional text generation aer evaluation, Artificial Intelligence in Medicine 17(1), 1–36. de Rosnay, É. F., Lessard, G., Sinclair, S., Rouget, F., Vernet, M., Zawisza, É., Blumet, L. and Graham, A.: 2006, À la recherche des topoï romanesques: le project TopoSCan, in D. Maher (ed.), Tempus in Fabula: Topoï de la temporalité narrative dans la fiction d’Ancien Régime, Presses de l’Université Laval, pp. 21–32. de Tencin, C. G. d.: 1735, Mémoires du comte de Comminge, Néaulme, Paris. Diderot, D.: 1970, Jacques le fataliste et son maître, 1797 edn, Garnier-Flammarion, Paris. DiMarco, C. and Hirst, G.: 1993, A computational theory of goal-directed style in syntax, Computational Linguistics 19(3), 451–499. Donald, M.: 2006, A Metalinguistic Framework for Specifying Generative Semantics, Master’s thesis, Queen’s University. Dowty, D. R.: 1981, Introduction to Montague Semantics, Vol. 11 of Synthese Language Library, D. Reidel Publishing Company, Dordrecht, Holland. Dunlop, C. E.: 1990, Conceptual dependency as the language of thought, Synthese 82(2), 274–296. Elhadad, M. and Robin, J.: 1996, An overview of SURGE: a reusable comprehensive syntactic realization component, Technical Report. Endriss, C. and Klabunde, R.: 2000, Planning word-order dependent focus assignments, Proceedings of the First International Conference on Natural Language Generation (INLG’2000), Mizpe Ramon, Israel. Erdös, P.: 1949, On a new method in elementary number theory which leads to an elementary proof of the prime number theorem, Proceedings of the National Academy of Sciences 35, 374–384. Fawcett, R. P.: 1990, e computer generation of speech with discoursally and semantically motivated intonation, Proceedings of the Fih International Workshop on Natural Language Generation, Dawson, PA, pp. 164–173a. Fetzer, A. (ed.): 2007, Context and Appropriateness: Micro meets macro, Vol. 162, J. Benjamins Pub. Co., Amsterdam. Fillmore, C. J.: 1968, e case for case, in E. Bach and R. T. Harms (eds), Universals in Linguistic eory, Holt, Rinehart and Winston, New York, NY, pp. 1–90. Flaubert, G.: 1862, Salammbô, Michel Lévy, Paris. Fodor, J. A.: 1983, Modularity of Mind: An Essay on Faculty Psychology, MIT Press, Cambridge, MA. Fodor, J. A. and Lepore, E.: 1998, e emptiness of the lexicon: Reflections on James Pustejovsky’s “e Generative Lexicon,” Linguistic Inquiry 29(2), 269–288. Gagnon, M. and Lapalme, G.: 1996, Prétexte: a generator for the expression of temporal information, in G. Adorni and Zock, M. (eds), Trends in Natural Language Generation, An Artificial Intelligence Perspective, Lecture Notes in Artificial Intelligence 1036, Springer-Verlag, pp. 238–259. Gal, A.: 1991, Prolog for Natural Language Processing, Wiley, New York. Gazdar, G. and Mellish, C.: 1989a, Natural Language Processing in LISP, Addison-Wesley, Reading, MA. —— 1989b, Natural Language Processing in Prolog, Addison-Wesley, Reading, MA.
Bibliography
239
Gibbs, R. and Van Orden, G. C.: 2010, Adaptive cognition without massive modularity, Language and Cognition 2(2), 149–176. Gillieron, J. L.: 1902, Atlas linguistique de la France, E. Champion, Paris. Goldman, N. M.: 1975, Sentence paraphrasing from a conceptual base, Communications of the ACM 18(2), 96–106. Gulla, J. A.: 1996, A general explanation component for conceptual modelling in case environments, ACM Transactions on Information Systems 14(3), 297–329. Hadamard, J.: 1896, Sur la distribution des zéros de la fonction zeta(s) et ses conséquences arithmétiques, Bulletin de la Société Mathématique de la France 24, 199–220. Halliday, M.: 1985, An Introduction to Functional Grammar, Arnold, London. Halliday, M. and Mathiesen, C.: 2004, An Introduction to Functional Grammar, 3rd edn, Hodder Education, London. Hartmann, R. (ed.): 1986, e History of Lexicography, John Benjamins. Helbig, H.: 2006, Knowledge Representation and the Semantics of Natural Language, Springer, Berlin. Hobbs, J. R., Stickel, M. E., Appelt, D. E. and Martin, P.: 1993, Interpretation as abduction, Artificial Intelligence 63, 69–142. Hornby, A. S., Cowie, A. P. and Lewis, J. W.: 1974, Oxford Advanced Learner’s Dictionary of Current English, Oxford University Press, London. Ide, N., Kilgariff, A. and Romary, L.: 2000, A formal model of dictionary structure and content, Proceedings of Euralex 2000, Stuttgart. Ide, N. and Véronis, J.: 1995, Knowledge extraction from machine-readable dictionaries: An evaluation, in P. Steffens (ed.), Machine Translation and the Lexicon, Lecture Notes in Artificial Intelligence 898, Springer-Verlag. Jackendoff, R.: 1983, Semantics and Cognition, e MIT Press, Cambridge, MA. —— 1990, Semantic Structures, e MIT Press, Cambridge, MA. Jacobs, J.: 1890, English Fairy Tales, Oxford University Press, Oxford. Jakobson, R.: 1960, Closing statements: Linguistics and poetics, in T. A. Sebeok (ed.), Style in Language, MIT Press, Cambridge, MA, pp. 350–377. Jaszczolt, K: 2005, Default Semantics: Foundations of a Compositional eory of Acts of Communication, Oxford University Press, Oxford, UK. Jerz, D. G.: 2007, Somewhere nearby is Colossal Cave: Examining Will Crowther’s original “Adventure” in code and in Kentucky, Digital Humanities Quarterly 1(2). Johnson-Laird, P. N., Herrmann, D. J. and Chaffin, R.: 1984, Only connections: A critique of semantic networks, Psychological Bulletin 96(2), 292–315. Kamp, H.: 1981, A theory of truth and semantic representation, in J. Groenendijk, T. Janssen and M. Stokhof (eds), Formal Methods in the Study of Language, Vol. 135 of Mathematical Centre Tracts, Mathematisch Centrum, pp. 277–322. —— 1988, Discourse representation theory: What it is and where it ought to go, Natural Language at the Computer, Vol. 320 of Lecture Notes in Computer Science, Springer, Berlin, pp. 84–111. Kamp, H. and Reyle, U.: 1993, From Discourse to Logic: Introduction to Modeltheoretic Semantics of Natural Language, Formal Logic and Discourse Representation
240
Bibliography
eory, Vol. 42 of Studies in Linguistics and Philosophy, Kluwer Academic Publishers, Dordrecht, e Netherlands. Kasner, E. and Correa, E.: 1940, Mathematics and the Imagination, Simon and Schuster, New York. Kearns, K.: 2000, Semantics, Palgrave Martin, Hampshire, England. Keren, G.: 2011, Perspectives on Framing (Society for Judgment and Decision Making Series), Psychology Press, New York. Klarner, M.: 2004, Hybrid NLG in a Generic Dialog System, in A. Belz, R. Evans and P. Piwek (eds), Natural Language Generation (NLG): ird international Conference (INLG 2004), Vol. 3123 of Lecture Notes in Artificial Intelligence, Springer, pp. 205–211. Kleene, S. C.: 1952, Introduction to Metamathematics, Walters-Noordhoff and NorthHolland. Kohl, D., Plainfossé, A. and Gardent, C.: 1990, e general architecture of generation in ACORD, COLING, pp. 388–390. Langacker, R.: 1987, Foundations of Cognitive Grammar, Vol. 1–2, Stanford University Press, Stanford. Lehmann, F.: 1992, Semantic networks, Computers & Mathematics with Applications 23(2–5), 1–50. Lessard, G., Sinclair, S., Vernet, M., Rouget, F., Zawisza, E., Fromet de Rosnay, L.-É. and Blumet, É.: 2004, Pour une recherche semi-automatisée des topoï narratifs, in P. Enjalbert and M. Gaio (eds), Approches sémantiques du document électronique, Europia, Paris, pp. 113–130. Levin, B.: 1993, English Verb Classes and Alternations: A Preliminary Investigation, University of Chicago Press, Chicago. Levin, B. and Rappaport, M.: 1986, e formation of adjectival passives, Linguistic Inquiry 17(4), 623–661. Levison, M. and Lessard, G.: 2004, Generated narratives for computer-aided language teaching, in L. Lemnitzer, D. Meurers and E. Hinrichs (eds), COLING 2004 eLearning for Computational Linguistics and Computational Linguistics for eLearning, COLING, Geneva, Switzerland, pp. 26–31. —— 2005, Generating complex verb phrases: an extension of Burger’s French verbal morphology, in C. Cosme, C. Gouverneur, F. Meunier and M. Paquot (eds), Proceedings of Phraseology 2005 Conference, Centre for English Corpus Linguistics, Louvain-la-Neuve, pp. 235–242. Levison, M., Lessard, G., Gottesman, B. and Stringer, M.: 2002, Semantic expressions: An experiment, Technical report, School of Computing, Queen’s University. Locke, W. and Booth, A.: 1955, Machine Translation of Languages: Fourteen Essays, MIT Press, Cambridge, MA. Mandelbrot, B.: 1977, Fractals: Form, Chance and Dimension, W.H. Freeman, San Francisco. Mann, W. C.: 1984, Discourse structures for text generation, ACL-22: Proceedings of the 10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Morristown, NJ, USA, pp. 367–375.
Bibliography
241
Mann, W. C. and Matthiessen, C. M.: 1985, Demonstration of the Nigel text generation computer program, in J. D. Benson and W. S. Greaves (eds), Systemic Perspectives on Discourse, Vol. 1, Ablex, Norwood, NJ, pp. 50–83. Mann, W. C. and ompson, S. A.: 1988, Rhetorical Structure eory: Toward a functional theory of text organization, Text 8(3), 243–281. Marinov, M. and Zheliazkova, I.: 2005, An interactive tool based on priority semantic networks, Knowledge-Based Systems 18(2–3), 71–77. Martinet, A.: 1965, La linguistique synchronique: études et recherches, Presses universitaires de France, Paris. Mauldin, M. L.: 1984, Semantic rule based text generation, Proceedings of the 10th International Conference on Computational Linguistics, Association for Computational Linguistics, Morristown, NJ, pp. 376–380. Maybury, M. T.: 1989, GENNY: A knowledge-based text generation system, Information Processing and Management 25(2), 137–150. McCoy, K. F. and Strube, M.: 1999, Generating anaphoric expressions: Pronoun or definite description?, in D. Cristea, N. Ide and D. Marcu (eds), e Relation of Discourse/Dialogue Structure and Reference, Association for Computational Linguistics, New Brunswick, New Jersey, pp. 63–71. Meehan, J. R.: 1977, TALE-SPIN, an interactive program that writes stories, Proceedings of the Fih International Joint Conference on Artificial Intelligence, Cambridge, Massachusetts, pp. 91–98. —— 1981, TALE-SPIN, in R. C. Schank and C. K. Riesbeck (eds), Inside Computer Understanding, e Artificial Intelligence Series, Lawrence Erlbaum Associates, Hillsdale, NJ. Melčuk, I., Clas, A. and Polguère, A.: 1995, Introduction à la lexicologie explicative et combinatoire, Duculot, Louvain-la-Neuve. Merlo, P. and Van der Plas, L.: 2009, Abstraction and generalisation in semantic role labels: PropBank, VerbNet or both?, Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP of the AFNLP, Suntec, Singapore, pp. 288–296. Miller, G. A., Beckwith, R., Fellbaum, C., Gross, D. and Miller, K. J.: 1990, Introduction to WordNet: An on-line lexical database, International Journal of Lexicography 3(4), 235–244. Mitton, R.: 1986, A description of the files cuvoald.dat and cuv2.dat, the machine usable form of the Oxford Advanced Learner’s Dictionary. Technical Report, Oxford Text Archive, Oxford. Montague, R.: 1974a, English as a formal language, Formal Philosophy: Selected Papers of Richard Montague, Yale University Press, New Haven, pp. 188–221. —— 1974b, Formal Philosophy: Selected Papers of Richard Montague, Yale University Press, New Haven. —— 1974c, e Proper Treatment of Quantification in Ordinary English, in R. omason (ed.), Formal Philosophy: Selected Papers of Richard Montague, Yale University Press, pp. 247–270.
242
Bibliography
Moore, J. D. and Swartout, W. R.: 1989, A reactive approach to explanation, Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, Vol. 2, pp. 1504–1510. Moss, L. S. and Tiede, H.-J.: 2006, Applications of modal logic in linguistics, in P. Blackburn, J. F. van Benthem and F. Wolter (eds), Handbook of Modal Logic, Elsevier Science. Nelligan, É.: 1952, Oeuvres complètes, texte établi et annoté par Luc Lacourcière, Fides, Montréal. O’Donnell, M.: 1995, Sentence generation using the Systemic Workbench, Proceedings of the Fih European Workshop on Natural Language Generation, Leiden, e Netherlands, pp. 235–238. O’Donnell, M., Mellish, C. and Oberlander, J.: 2001, ILEX: An architecture for a dynamic hypertext generation system, Natural Language Engineering 7(3), 225–250. Ogden, C. K. and Richards, I.: 1923, e Meaning of Meaning: A Study of the Influence of Language upon ought and of the Science of Symbolism, Harcourt, New York. Palmer, M., Gildea, D. and Kingsbury, P.: 2005, e Proposition Bank: A corpus annotated with semantic roles, Computational Linguistics 31(1), 71–106. Partee, B.: 2011, Montague grammar, in J. van Bentham and A. ter Meulen (eds), Handbook of Logic and Language, Elsevier, pp. 3–94. Partee, B. H.: 1972, Some transformational extensions of Montague grammar, in B. H. Partee (ed.), Montague Grammar, Academic Press, pp. 51–76. Patten, T.: 1988, Systemic Text Generation as Problem Solving, Studies in Natural Language Processing, Cambridge University Press, New York, NY. Pavel, T.: 1986, Fictional Worlds, Harvard University Press, Cambridge, MA. Pearsall, J. and Trumble, B.: 2002, e Oxford English Reference Dictionary, 2nd edn, Oxford University Press, Oxford, England. Peeters, B.: 2000, e Lexicon-Encyclopedia Interface, Elsevier, Amsterdam. Peters, Stanley and Westerståhl: 2006, Quantifiers in Language and Logic, Clarendon Press, Oxford. Pinker, S.: 2000, Words and Rules: the Ingredients of Language, Perennial, New York. Pottier, B.: 1992, Sémantique générale, Presses universitaires de France, Paris. Prévost, abbé: 1742, L’Histoire du chevalier Des Grieux et de Manon Lescaut, Arkstée et Merkus, Amsterdam-Leipzig. Propp, V.: 1968, Morphology of the Folktale, 2nd edn, University of Texas Press, Austin, TX. Pustejovsky, J.: 1991, e generative lexicon, Computational Linguistics 17(4), 409–441. —— 1995, e Generative Lexicon, e MIT Press, Cambridge, Massachusetts. Queneau, R.: 1947, Exercices de style, Gallimard, Paris. Ramakrishnan, R. and Gehrke, J.: 2003, Database Management Systems, 3rd edn, McGraw-Hill, New York. Raskin, V.: 1985, Semantic Mechanisms of Humor, Reidel, Dordrecht.
Bibliography
243
Reichenbach, H.: 1947, e tenses of verbs, in S. Davis and B. S. Gillon (eds), e Elements of Symbolic Logic, MacMillan, pp. 526–534. Reiter, E. and Dale, R.: 2000, Building Natural Language Generation Systems, Cambridge University Press, Cambridge. Robertson, T.: 2011, Essential vs. accidental properties, in E. N. Zalta (ed.), e Stanford Encyclopedia of Philosophy. is is an e-resource found at http://plato.stanford.edu. Robinson, Alan and Voronkov, A.: 2001, Handbook of Automated Reasoning, NorthHolland Publishing Company, Amsterdam. Rosch, E.: 1975, Cognitive representations of semantic categories, Journal of Experimental Psychology: General 104(3), 192–233. Rosetta, M. T.: 1994, Compositional Translation, Kluwer Academic Publishers, Dordrecht. Ruppenhofer, J., Ellsworth, M., Petruck, M. R. L., Johnson, C. R. and Scheffczyk, J.: 2006, Framenet ii: Extended theory and practice, http://framenet.icsi.berkeley.edu/book/ book.html. Savage, J., Minami, Y., Negrete, M., de la Cruz, R., Matamoros, M., Ayala, F., Figueroa, I., Dorantes, F. and Sanabra, L.: 2009, Team description paper for robocup@home 2009, Technical Report, PUMAS-Mexico. Schank, R. and Abelson, R.: 1977, Scripts, Plans, Goals and Understanding: An Inquiry into Human Knowledge Structures, Lawerence Erlbaum Associates, Hillsdale, NJ. Schank, R. C.: 1972, Conceptual dependency: A theory of natural language understanding, Cognitive Psychology 3(4), 532–631. —— 1975, Conceptual dependency theory, Conceptual Information Processing, Vol. 3 of Fundamental Studies in Computer Science, North-Holland Publishing Company, Amsterdam, e Netherlands, pp. 22–82. —— 1991, Tell Me a Story: A New Look at Real and Artificial Memory, Charles Scribner, New York. Selberg, A.: 1949, An elementary proof of the prime number theorem, Annals of Mathematics 50, 305–313. Soanes, C. and Hawker, S.: 2005, Compact Oxford English Dictionary of Current English, 3rd edn, Oxford University Press, Oxford. Sowa, J.: 1987, Semantic networks, in S. C. Shapiro, D. Eckroth and G. A. Vallasi (eds), Encyclopedia of Artificial Intelligence, Vol. 2, John Wiley & Sons, New York, NY, pp. 1011–1024. Swartout, W. R.: 1982, GIST English generator, AAAI, pp. 404–409. Szabó, Z. G.: 2008, Compositionality, in E. N. Zalta (ed.), e Stanford Encyclopedia of Philosophy. is is an e-resource found at http://plato.stanford.edu. Talmy, L.: 2000, Toward a Cognitive Semantics, MIT Press, Cambridge, MA. omas, C.: 2002, A Prosodic Transcription Mechanism for Natural Language Generation, PhD thesis, Queen’s University. —— 2010, e Algorithmic Expansion of Stories, PhD thesis, Queen’s University.
244
Bibliography
Truss, L.: 2003, Eats, Shoots & Leaves: e Zero Tolerance Approach to Punctuation, Profile, London. Turing, A.: 1936, On computable numbers, with an application to the Entscheidungsproblem, Proceedings, London Mathematical Society, pp. 230–265. Ullmann, S.: 1962, Semantics: An Introduction to the Science of Meaning, Blackwell, Oxford. Vachek, J.: 1967, A Prague School Reader in Linguistics, Indiana University Press, Bloomington. Van der Meer, J.: 2009, Google translation toolkit, TAUS. http://www.translationautomation.com/technology/google-translation-toolkit.html Van Eijck, J. and Unger, C.: 2010, Computational Semantics with Functional Programming, Cambridge University Press, Cambridge. Vander Linden, K., Cumming, S. and Martin, J.: 1992, Using system networks to build rhetorical structures, in R. Dale, E. Hovy, D. Rösner and O. Stock (eds), Aspects of Automated Natural Language Generation; Proceedings of the 6th International Workshop on Natural Language Generation, Springer-Verlag. Vinay, J. and Darbelnet, J.: 1958, Stylistique comparée du français et de l’anglais : Méthode de traduction, Didier, Paris. Waltz, D. and Goodman, B.: 1977, Planes: a data base question-answering system, ACM SIGART Bulletin (61), p. 24. Wierzbicka, A.: 1996, Semantics: Primes and Universals, Oxford University Press, New York, NY. Wiles, A.: 1995, Modular elliptic curves and Fermat’s Last eorem, Annals of Mathematics (141), 443–551. Winograd, T.: 1983, Language as a Cognitive Process: Syntax, Addison-Wesley Longman Publishing Co., Inc., Boston, MA. Yngve, V.: 1964, Implications of mechanical translation research, Proceedings of the American Philosophical Society, 108(4), 275–281.
Index
- operator 131
. operator 84, 108
~ operator 227
a_bit_of 123
ACT-R 74
action aider 46
actions 79, 122
  modification of 83, 97–8
actual parameters 152, 166
adjustments 83, 95–7, 107–9, 117
  bivalent 154
  compound 108
  default 155
  propagation of 157
  univalent 157
alist 90, 153, 160
all 126
although 99
ambiguity 78, 101, 104
  deliberate 78
anaphora 37, 52, 70, 146–9
and 155
array 169–71
  block 170, 175
  one-dimensional 167
  two-dimensional 167
aspect 136
  accomplished 136
  backgrounded 136
  conjectured 136
associated 86
associativity 156
attribute 107–9, 227
  compound 109, 111
boolean 128
c_circ 93
case frame 45
case grammar 44–6
causality 99, 143, 188–9
cause 99, 103, 151
Chomsky grammars 106
chronological
  order 138, 155
  sequence 99, 144
Church-Turing thesis 105
circ 93
circumstances 79, 90–3, 122
  constant 151
  modifying actions 90
  modifying circumstances 93
  modifying completions 90, 93
  modifying other circumstances 91
  modifying qualifiers 91, 93
clist 90, 153
co-referential relation 146
coercion 84, 93, 112
cognitive linguistics 25
cognitive semantics 25
commutativity 156
comp_circ 93
completions 79, 122
  combining 98–100
  converting to circumstances 100
  modifying 83, 93–5, 97–8
  qualifying an entity 86
componential analysis 33
composition 204
compositionality 21
conceptual dependency 46–8
conceptual structures 35–52
cond 98
connective 154, 156
constant 149
  action 151
  circumstance 151
  entity 148, 161
  qualifier 150
context-free grammar 119
control mechanism 197
count entity 122, 130
coverage 28–30
CYC 73
dateat 92
demonstrative 145
determiner 128
  definite 131, 147
  indefinite 131, 145
Discourse Representation Theory 65–7
donkey anaphora 147–9
dot-operator see operator
double articulation 21
dramatis personae 221
durative 137
durative activity 140
dynamic database 195, 197, 234
e_rel 86
effectively computable function 105
elaboration 187, 204, 231, 233
elist 90, 153
emphasis 125
encyclopedic information 116, 227
encyclopedic knowledge 116, 188
entity 79
  attaching qualifier to 84
  count- 176
  mass- 176
  non-specific 145
  qualifying 82
  qualifying by completion 86
  specific 145
EREL 161
event_durative 142
event_punctual 142
EVERY 130
extension operator 55
favour 202
Fermat's Last Theorem 75
first order predicate calculus 52–64
fixed type 78
fixed-valency rule 110, 156
fn_circ 94
formal parameters 152, 166
fractal phenomena 22
frame 41
FrameNet 41
function 53, 112
  application 109, 204
  definition 114–16
    formal 114
    informal 114
  name 107–9
  parameter 80, 111
  returning function 94
  type 108
  value 108
function library 226
functional programming 165
functional programming language 76, 104
functional system 191
functions
  0-valent 115
  adjusted 109
  lexical 205
  list 153, 159
  narrative 226, 229
  pattern 116
  quantization 177
general recursive function 105
generative lexicon 43–4
give 101
global 153
granularity 4, 21, 123, 129, 202
guide book 198
half_of 123
happening_now 108
Haskell programming language 76, 104
headword 227
hierarchy 9, 27, 76, 191
identifier 107
identity 117
identity operator 58, 59
imperative 179
imperative programming language 104–6
imperatives 179
implication 117
implicit type change 85
in_order_that 99
incremental poems 161
inheritance networks 39
instruction manual 199
intended meaning 186–7
intension operator 56, 63
intensional logic 52–64
interactive fiction 191
interlingua 5
interrogatives 132
IS-A relationship 40
kind_of 122
KL-ONE 71
knowledge base 10
  dynamic 190, 232
  global 9
  local 10, 190
  static 190
knowledge representation 71–3
knowledge retrieval 11
lambda operator 55
lambda-conversion 55
language definer 77, 120
language function
  conative 19
  emotive 19
  metalinguistic 19
  phatic 19
  poetic 19
  referential 19
language independence 78
leaf 110
length issues 224
lexical creativity 117
lexical function 205
lexical semantic relations 39
  antonymy 40
  hyponymy 39
  synonymy 40
lexical semantics 32–4
lexicography 32–3
lexicon size 117
list 90, 153
  constructor 90, 153
  function 159
  head of 159
  tail of 159
listener 177, 181
literal meaning 186–7
literary allusion 197
local 153
local constant 119
local semantic lexicon 227
logical conjunction 154
logical disjunction 154
lovers' paradox 165, 167, 169, 177
machine translation 5
  statistical 7
mass 122
mass entity 122, 176
metalanguage 15
metaphor 114
modal combination 136
modal verb 137
modality 136–8
most_of 123
multi-type constant 81
must 138
narrative functions 226, 229
narrative prose generation 69–70
narrative structure 69–70
natural language 3, 225
  generation see under NLG
  generator see under NLG
  realization 16
  understander see under NLU
neg 83, 124
negation 124, 155
negatives 124
nested speech 182
NLG 2–5, 7, 11, 15–16, 46, 52, 64, 90, 101, 103, 108, 112, 115, 118, 119, 145, 149, 151, 165, 172, 178, 183, 190–5, 207, 209, 214, 225, 231
NLU 3–5, 11, 149, 151
NONE 130
nucleus 68
number 128
object language 15
of 86
omission 187, 191
only 127–8, 172–6
operator 83, 108, 131, 154
ord 129
over_and_done 83, 95
overloaded operator 131
OWL 71–2
paragraph 235
param 126
parameter 78
  actual 152
  formal 152
parameter list 109
parameter type 79
pattern matching 159
pattern-style function 79, 116
perspective
  anthropomorphic 26
  minimalist 26
  onomasiological 14–16, 75
  semasiological 14–16, 27
  truth-functional 52–64
picture aider 46
picture producer 46
placenear 92
placeon 92
placewithin 92
plot 202
plot-lines 233–5
possessive 145
practicality 27, 30–1
pronoun 146
  anaphoric 146
  cataphoric 146
propositional network 49
Proppian fairy tale 116, 226–32
punctual activity 140
puns 197
q_circ 93
qlist 90, 153
qu 133–4
qual 82, 93, 103
qual_is 85, 103, 150
qualia 43
  agentive 43
  constitutive 43
  formal 43
  telic 43
qualifiers 79, 84–6, 122
  constant 150
  descriptive 89–90
  modifying an entity 82
  relative 86–9, 160
  restrictive 89–90
quantification
  existential 61
  universal 61
quantifier 124
  generalized 62, 121–2
  negation 124
  number 128
  two-dimensional 168–70
  universal 61
quantifier notation, restricted 61
quantization function 177
queries 11
QUERY 132–3
question 229
real 128
recipe book 199
recursive expressions 90
reference 16
referent 16
refinement 202–4, 233–4
REL 87, 89
relation 228
relationships 189–91
repetition 158
reported 180
resolution 29–30
restr 103
result type 79, 80
Rhetorical Structure Theory 67–9
Roman_num 128
rpt 158
rqual 87, 88
RST schema 68
satellite 68
say 180
scalability 30–1, 205
scope 152–3
scripts 69
self-similarity 22
semantic
  expression 9, 11, 76, 227
    properties of 107–10
    syntax of 109–10
    validity of 112–14
  lexicon 79, 110, 112–19, 180, 205, 227, 229
  network 48–9
  primitives 34–5
  processes 10–11
  relations 22–3
  role
    agentive 45
    dative 45
    factitive 45
    instrumental 45
    locative 45
    objective 45
  trait 33
  tree 110–12
  type 35, 79, 227
  units 22–3
seq 99, 138, 151
sequencing 138
set 128, 130–2
  difference 130–2
slist 90, 153
some_of 123
speaker 177, 181
spec 103, 145
speech 180–5
staba 79, 80
stabr 80
stepwise refinement 202
story thread 235
string 128
stylistic variation 4
subarray 170
SUMO 72–3
symbol 16
syntax tree 119
systemic grammar 51–2
tale_view 144
temporal pattern diagram 139, 141
tense 108, 136
text structure 65
theory of mind 24
threads 234
time 59, 136
  current 60
  event 60
  reference 60
time 92
timeat 92
timebefore 92
timestamp 100, 141
topos (pl. topoi) 69, 199–204, 232, 235
towards 91
transformations 10–11, 117
translation 6
tree
  semantic 112
  syntactic 112
Turing machine 105
type 53, 79
  hierarchy 40
  specifications 80
  variable 95
type-consistency rule 110
type-signature 80, 85, 113
  schema 95, 113
unknown 131
UNSPEC 81–3
until 92
user-function 109, 151
using 97
valency 78, 79
validity checking 113
variable type 95
VINCI 108
vocative 179, 180
vtd 79, 80
vtdr 80, 96, 111
well-formedness 112
wh 134–5
when 92–3
when_alt 100
wish 83–4, 96
WordNet 40, 41
yet_to_come 83