English Pages 298 [296] Year 1988
Categorial Investigations Logical and Linguistic Aspects of the Lambek Calculus
Groningen-Amsterdam Studies in Semantics (GRASS) This series of books on the semantics of natural language contains collections of original research on selected topics as well as monographs in this area. Contributions from linguists, philosophers, logicians, computer-scientists and cognitive psychologists are brought together to promote interdisciplinary and international research. Editors Alice ter Meulen Martin Stokhof
Editorial Board Renate Bartsch University of Amsterdam Johan van Benthem University of Amsterdam Henk Verkuyl University of Utrecht
Other books in this series:
1. Alice G.B. ter Meulen (ed.) Studies in Modeltheoretic Semantics
2. Jeroen Groenendijk, Theo M.V. Janssen and Martin Stokhof (eds.) Truth, Interpretation and Information
3. Fred Landman and Frank Veltman (eds.) Varieties of Formal Semantics
4. Johan van Benthem and Alice ter Meulen (eds.) Generalized Quantifiers in Natural Languages
5. Vincenzo Lo Cascio and Co Vet (eds.) Temporal Structure in Sentence and Discourse
6. Fred Landman Towards a Theory of Information
7. Jeroen Groenendijk, Dick de Jongh, Martin Stokhof (eds.) Foundations of Pragmatics and Lexical Semantics
8. Jeroen Groenendijk and Martin Stokhof (eds.) Studies in Discourse Representation Theory
All communications to the editors can be sent to: Department of Philosophy, University of Amsterdam, Grimburgwal 10, 1012 GA Amsterdam, The Netherlands, or Department of Linguistics, GN 40, University of Washington, Seattle, Washington 98195, U.S.A.
Michael Moortgat
Categorial Investigations Logical and Linguistic Aspects of the Lambek Calculus
1988
FORIS PUBLICATIONS Dordrecht - Holland/Providence RI - U.S.A.
Published by: Foris Publications Holland P.O. Box 509 3300 AM Dordrecht, The Netherlands Distributor for the U.S.A. and Canada: Foris Publications USA, Inc. P.O. Box 5904 Providence RI 02903 U.S.A.
CIP-DATA
ISBN 90 6765 387 X (Paper) © 1988 Foris Publications - Dordrecht No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission from the copyright owner. Printed in The Netherlands by ICG Printing, Dordrecht.
To Elisabeth
CONTENTS
Preface

Chapter 1. Generalized Categorial Grammar: the Lambek Calculus
1.0 Introduction
1.1 The categorial engine
1.1.1 Category structure
1.1.2 Reduction laws
1.2 Categorial parsing as implicational deduction
1.2.1 The Lambek-Gentzen calculus
1.2.2 Cuts and decidability
1.2.3 Lambda semantics for Gentzen proofs
1.3 The categorial hierarchy
1.3.1 Options for categorial calculi
1.3.2 Polymorphism and unification

PART ONE LINGUISTIC ASPECTS OF THE LAMBEK CALCULUS

Chapter 2. Associativity and Restructuring
2.0 Introduction
2.1 Structural completeness
2.2 Gentzen proofs: prosodic interpretation
2.3 Intonational versus morphosyntactic phrasing
2.4 Morphological bracketing paradoxes
2.5 Cliticization: English auxiliary and genitive 's

Chapter 3. Between L and LP: Discontinuous Dependencies
3.0 Strategies for extending L
3.1 LP additions to the L axiom base
3.1.1 Permutation duals: Lifting, Composition, Substitution
3.1.2 Collapse into LP
3.2 Discontinuity at the lexical level
3.2.1 Complement inheritance
3.2.2 Verb-raising, Dutch versus German
3.2.3 Division versus Composition
3.3 Non-concatenative connectives in the sequent calculus
3.3.1 Extraction/Infixation
3.3.2 Extraction Introduction
      Illustration: unbounded dependencies
3.3.3 Infixation Elimination
      Illustration: verb-projection raising
3.3.4 Extraction, Infixation: partial logic

PART TWO CATEGORIAL PARSING AS GENTZEN DEDUCTION

Outline of Part Two

Chapter 4. The Lambek-Gentzen Calculus With Resolution
4.1 The Lambek-Gentzen sequent calculus as Horn clause logic
      Resolution
      Illustration
4.2 An interpreter for the Lambek-Gentzen system
      Gentzen proof trees
      Full interpreter for the sequent calculus
      Resolution, unifying substitutions: examples
      Semantic interpretation for Gentzen proofs
4.3 Reducing search complexity
      Deduction trees versus proof trees
      Pruning the search space: count-invariance
      Premise selection: degree
      Conclusion

Chapter 5. Polymorphism in the Lambek-Gentzen Calculus
5.0 Introduction
5.1 Complete and incomplete search strategies
      Logical infinity of L
5.2 Bottom-up proofs in L+{Cut}
      Extending L with Cut
      Partial execution of the Elimination rules
5.3 The Lemma Database: System M
      Monotonicity
      Recursive axiomatization: system M
      M semantics: partial execution
      Associativity: atomic boundary cases
      Conclusion
5.4 Examples
      Incremental left-associative processing
      Non-determinism and structural ambiguity
      Non-constituent coordination
      Conclusion
5.5 Flexible semantics for L proofs
      Argument lifting: L versus LP
      The quantification calculus H
      Semantics for Gentzen proofs: L + H
      Examples
      Type-shifting, scope ambiguity, non-constituent conjunction
      Conclusions

Appendix
References
Index of Names
Subject Index
PREFACE
This book grew out of my interest in different facets of categorial grammar, which covers a period of five years. The structure of the book shows traces of its derivational history. Chapters 3 and 2 are based in part on Moortgat (1988a) and (1988c), which were originally written in 1985 and 1986. I thank D. Reidel Publishing Company for the permission to use a section from my (1988a). In the original papers the emphasis is on linguistic analysis; in the present context, morphosyntactic phenomena are adduced to illustrate the consequences of the Lambek approach on global grammatical architecture. For more extensive argumentation, and analysis of numerous additional phenomena, the reader can consult the original papers. It is a pleasure to express my gratitude to all those who helped me, first of all to Flip Droste and Johan van Benthem. Flip Droste introduced me to linguistics and brought me into contact with Montague's writings, so he bears a heavy responsibility for the course I have taken. Johan van Benthem's inspiring work incited me to study the Lambek systems, and the logical perspective on categorial derivability. His guidance during the preparation of this book, his untiring comments and stimulating suggestions have determined the final shape to a large extent, and the book would have gained a lot in terms of soundness and completeness if I had incorporated more of his advice. Acknowledging indebtedness to a large group of people poses interesting problems of presentation. I will stay close to the format of the Homeric Catalogue of Ships, and keep them geographically ordered. I migrated to Holland because the intellectual climate here struck me as ideal for the study of linguistics in general and formal semantics in particular. I got convinced of this on the train, while reading Thomason (1974), when a casual fellow-traveler invited me to briefly explain (between Haarlem and Amsterdam) the gist of Montague's Formal Philosophy. But why Leiden? 
It is well known that Einstein, when he was looking for a job, preferred Princeton to Leiden, after having been informed that at the latter University, the transition between life and death is almost unnoticeable. Thanks to Teun Hoekstra and Harry van der Hulst I found out that from a linguistic point of view it can be a very lively place. Over the years, we collaborated on many enterprises and I greatly benefited from their stimulating company. Besides that, they are invaluable informants on disciplines not covered in this book. My interest in flexible categorial grammar stems originally from the categorial developments emanating, at that time, from Groningen:
the work of Gosse Bouma, Jan van Eijck, Jack Hoeksema, Elias Thijsse, Ron van Zonneveld, and Frans Zwarts has deeply influenced my thinking. Alice ter Meulen and Theo Janssen encouraged me to investigate the consequences of compositionality for the study of morphology and the lexicon-syntax interface; a concern for compositionality, in empirically interesting forms, is the cantus firmus underlying the following chapters. The participants of the Amsterdam colloquium 'Montague grammar and related topics' provided me with valuable feedback on embryonic forms of the material discussed here, and brought me into contact with many 'related topics' from which I learned a lot. The Dutch Lexicological Institute (INL) provided optimal facilities for writing this book. But the grand time scale of the Institute (its main project, the Dutch historical dictionary, is measured in centuries, and classifies its editors by generations) is not without dangers, when one is engaged in a quickly developing field of research: it is of vital importance to get occasional forceful external stimuli, in order to write down things that would no doubt be better articulated if I had thought them over for an extra five years. While working on these investigations I twice had the chance of visiting the States on the occasion of the LSA Linguistic Institute, at UCLA in 1983 and at Stanford in 1987, thanks to the financial support of the INL, the Dutch Organization for the Advancement of Pure Research (ZWO, now NWO), and CELEX. This book has benefited in many ways from the teaching of and/or stimulating conversations with people I met during these visits, especially Emmon Bach, Bob Carpenter, Gerald Gazdar, Abel Gerschenfeld, Ed Keenan, Ewan Klein, Glynn Morrill, Remo Pareschi, Barbara Partee, Carl Pollard, Tom Roeper, Ivan Sag, Sue Schmerling, Stuart Shieber, Therese Torris, Hans Uszkoreit, Susan Warwick, Kent Wittenburg and Mary Wood. 
I am particularly indebted to Philip Miller, Dick Oehrle, Mark Steedman and Anna Szabolcsi for valuable comments and constructive criticism of earlier versions of various chapters which led to many improvements in style and content. Equally important were the categorial conferences held in Tucson in 1985 and in Amsterdam in 1987: I thank the organizers for giving me the opportunity to present the germs of Part One and Two, respectively. In 1987 I had the occasion to treat parts of this book as a guest lecturer at the Universities of Tilburg and Leiden. I thank Jan van Eijck and Harry van der Hulst for making this possible. The first chapter is the result of the courses I taught, and owes a great deal to the alertness of the people who attended them. During the last two years, a number of ideas from Part Two have been implemented for the morphosyntactic analysis of the INL text corpus, a project executed in cooperation with the Nijmegen Centre for Lexical Information (CELEX). Implementation of immaculate
theoretical ideas could be a horrible shock — thanks to Dirk Heylen and Ton van der Wouden it was a challenging experience which in turn generated many fruitful ideas. I thank them also for expertly shielding me from every-day practical problems while I was absorbed in the last stages of writing this book. Part Two got its final shape while I was collaborating on a categorial parsing project with the ITI group of TNO (Brigit van Berkel, Erik-Jan van der Linden and Adriaan van Paassen) and owes a lot to their enthusiasm and penetrating feedback. Colin Ewen helped me at a critical point when proportional spacing threatened to turn the many derivation trees into utter chaos. Real life support is at least as important as the academic encouragement above. I thank my parents for stimulating me from the start to pursue these exotic studies. My children, Joachim and Judith, although they find my year-long preoccupation with such a simple thing as fractions a bit amusing, discovered many unexpected graphical possibilities of Gentzen proofs. Without the continual support of Elisabeth I wouldn't have started this book (and could never have finished it): it is dedicated to her. Leiden, August 1988
CHAPTER 1 GENERALIZED CATEGORIAL GRAMMAR: THE LAMBEK CALCULUS
1.0. INTRODUCTION
Generalized Categorial Grammar belongs to the broad family of current linguistic theories that instantiate the program of modeltheoretically interpreted surface syntax laid out in Montague's Universal Grammar (Chapter 7 in Thomason 1974). Here are some key ideas that set categorial research apart from a number of close relatives such as Generalized Phrase Structure Grammar (GPSG, Gazdar, Klein, Pullum & Sag 1985), Head-Driven Phrase Structure Grammar (HPSG, Pollard & Sag 1988) or standard Montague Grammar of the PTQ variety.
• Lexicalism. Surface-oriented theories of grammar show a common tendency of shifting the explanatory burden from the syntactic component to the lexicon. For example, by developing a richer notion of category structure, GPSG eliminates the transformational component of classical generative grammar. Categorial Grammar takes this move towards lexicalism a step further, and eliminates the phrase structure component itself. Syntactic information is projected entirely from the category structures assigned to the lexical items. In its most pure form, Categorial Grammar identifies the lexicon as the only locus for language-specific stipulation. The syntax is a free algebra: a universal combinatorics driven by the complex category structures.
• Function-argument structure. The specific categorial contribution to the theory of categories is that incomplete expressions are modelled as functors, syntactically and semantically. The basic dependencies between expressions that determine phenomena such as government, control and agreement are defined on the function-argument hierarchy, rather than on structural configurations.
• Flexible constituency. Classical Categorial Grammar, like Phrase Structure Grammar, assigns a unique constituent structure to a non-ambiguous expression.
The generalized categorial theories replace this rigid notion of constituency by a flexible one by introducing a richer inventory of combinatory operations in the form of a type-shifting calculus. A non-ambiguous expression is associated with a set of equivalent derivations. Generalized Boolean coordination serves as an experimental technique that brings the hidden alternative derivational subconstituents to the surface.
• Compositionality. The relation between the syntactic algebra and the semantic algebra is a homomorphism, i.e. a structure-preserving mapping, with a semantic operation corresponding to each syntactic operation. Classical Categorial Grammar embodies a strong form of compositionality based on the correspondence between the central syntactic reduction rule and functional application in the semantics. The generalized categorial systems extend the strong form of compositionality to the calculus of type-change, thus realizing the program of type-driven interpretation.
In this book, we study a standard system among the flexible categorial theories currently under investigation: the type calculus of Lambek (1958, 1988). The present chapter is intended as a general introduction to the Lambek calculus, and as an exposition of the research questions that will be addressed in the remaining chapters. The Lambek calculus replaces the set of categorial reduction laws that have been proposed in the literature (Application, Composition, Lifting, etc.) by a general notion of derivability, with respect to which the reduction laws have the status of theorems. Derivability is characterized in the form of a sequent axiomatization, which reduces categorial derivations to logical deductions on the basis of the proof procedure originally developed by Gentzen in his work on the intuitionistic propositional calculus. We present this parsing-as-deduction perspective in Section 1.2, and introduce the linguistic motivation for a flexible categorial approach in Section 1.1. Section 1.3 then localizes the Lambek calculus within the broader perspective of the categorial hierarchy, contrasting its properties with those of a number of weaker and stronger calculi.
Since the text of this introductory chapter is dense with definitions and examples, references have been collected as much as possible into paragraphs of suggested readings for the main sections that will help the reader to further explore the discussed material on the basis of easily accessible seminal works, and to identify lines of research that are not dealt with here.
1.1. THE CATEGORIAL ENGINE
1.1.1. CATEGORY STRUCTURE
Categorial grammar projects the information usually encoded in phrase structure trees onto the internal structure of the categories assigned to the lexical items, thus eliminating the need for an explicit phrase structure component. In order to carry out this elimination, the category system provides an infinite supply of possible category objects, recursively construed out of two small finite sets, a set of basic categories (atoms), and a set of category-forming connectives. The categories are also called (syntactic) types. Where necessary, we will take care to distinguish syntactic types (categories), i.e. elements from the syntactic algebra, from semantic types, their counterparts in the semantic algebra.
Definition 1.1.1 Let BASCAT be a finite set of atomic categories and CONN a finite set of category-forming connectives. Then CAT is the inductive closure of BASCAT under CONN, i.e. the smallest set such that
(i) BASCAT is a subset of CAT, and
(ii) if X,Y are members of CAT and | is a member of CONN, then (X | Y) is a member of CAT.
Example 1.1.1 Let BASCAT be {S, N, NP, AP, PP} and CONN {/, •, \} (right-division, product and left-division, respectively). The following are members of CAT: NP, (NP\S), (S/NP), ((NP•AP)\S), (S/(AP\(NP\S))), ...
We limit our attention in this study to directional category systems. Categorial types are interpreted as sets of expressions, more specifically, as subsets of the set S obtained by closing the set of lexical items under concatenation. Basic categories identify a small number of such subsets of expressions. In order to single out arbitrary subsets of S, we have at our disposal the three type-forming connectives, inductively characterizing the infinite set of possible types. The interpretation of complex categories is fixed by the following definitions for the type-forming connectives.
Definition 1.1.2 Interpretation of the type-forming connectives
A•B = {xy ∈ S | x ∈ A & y ∈ B} [Def•]
C/B = {x ∈ S | ∀y ∈ B, xy ∈ C} [Def/]
A\C = {y ∈ S | ∀x ∈ A, xy ∈ C} [Def\]
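The set-theoretic reading of the connectives can be tried out on a toy fragment. The following is a minimal sketch, not part of the original text: it assumes a tuple encoding of categories, a three-word lexicon, and a finite, length-bounded stand-in for the set S (all hypothetical choices made for illustration). Because the universe is truncated, the computed sets only approximate the definitions for longer expressions.

```python
from itertools import product as cartesian

# Assumed encoding: a basic category is a str; a complex category is a tuple
# (connective, left, right) in written order, e.g. ("\\", "NP", "S") is NP\S.

def universe(lexicon, max_len):
    """All non-empty word sequences up to max_len: a finite stand-in for S."""
    return {seq for n in range(1, max_len + 1)
            for seq in cartesian(lexicon, repeat=n)}

def interpret(cat, base, univ):
    """Denotation of a category as a set of word tuples (Definition 1.1.2)."""
    if isinstance(cat, str):
        return base[cat]
    conn, left, right = cat
    a = interpret(left, base, univ)
    b = interpret(right, base, univ)
    if conn == "*":    # A•B: concatenations xy with x in A and y in B
        return {x + y for x in a for y in b} & univ
    if conn == "/":    # C/B (left = C, right = B): xy in C for every y in B
        return {x for x in univ if all(x + y in a for y in b)}
    if conn == "\\":   # A\C (left = A, right = C): xy in C for every x in A
        return {y for y in univ if all(x + y in b for x in a)}
    raise ValueError(f"unknown connective: {conn}")

# Toy fragment (hypothetical): two noun phrases and one intransitive verb.
univ = universe(["john", "mary", "sleeps"], 2)
base = {"NP": {("john",), ("mary",)},
        "S": {("john", "sleeps"), ("mary", "sleeps")}}

vp = interpret(("\\", "NP", "S"), base, univ)
print(vp)  # {('sleeps',)}
```

In this fragment NP•(NP\S) denotes exactly the sentences in S, and S/(NP\S) denotes the same set as NP.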
A complex category (X | Y) consists of three immediate subcomponents: the subtypes X and Y, which are themselves categories, and the connective. The product connective is the concatenation operator, i.e. an expression belongs to a product category (X•Y) if it is the concatenation of an expression of category X and an expression of category Y, in that order. The division connectives form functor categories. A functor category (X|Y) is associated with an incomplete expression: it will form an expression of category X in combination with an expression of category Y, or an expression of category Y in combination with an expression of category X. The orientation of the connective indicates the mode of combination of a functor category: in the case of left-division the argument of the functor has to be found to the left; in the case of right-division, the functor looks for its argument to the right. Observe that the notation for functor categories is symmetrical: the argument always appears under the fraction sign. The following definitions fix the intended interpretation for the domain (argument) and range (value) subcomponents of functor categories.
Definition 1.1.3 Domain, Range
A functor X/Y combines with an argument Y to the right to form an X. A functor Y\X combines with an argument Y to the left to form an X.
dom(X/Y) = dom(Y\X) = Y; ran(X/Y) = ran(Y\X) = X.
The above interpretation of the type-forming connectives immediately yields the fundamental laws of the arithmetics governing the combination of categories, as shown in Lambek (1988). Concatenation is an associative operation. But unlike numerical multiplication, categorial concatenation is non-commutative. Therefore, the product operator has a left-inverse and a right-inverse, the operations of left-division and right-division. The axioms below state these fundamental laws with respect to the basic relationship between types seen as sets of expressions: the inclusion relation '⊆'.
We will fully explore the consequences of these laws in Section 1.2.
Definition 1.1.4 Concatenation and its inverses
A•(B•C) ⊆ (A•B)•C and (A•B)•C ⊆ A•(B•C) [Axiom 1]
A•B ⊆ C if and only if A ⊆ C/B [Axiom 2]
A•B ⊆ C if and only if B ⊆ A\C [Axiom 3]
Figure 1.1.1 illustrates how the recursively structured category objects mirror phrase structure information, and encode the language-specific properties of lexical items as to directionality requirements. Consider first some VP expansions for verb-final Dutch embedded clauses. An intransitive verb constitutes a complete VP. A
transitive verb is an incomplete expression that will project a VP if it finds a direct object to its left: hence the category assignment NP\VP. Similarly for other arguments, such as the PP argument of liggen 'to lie' which leads to the category assignment PP\VP for this verb. A verb such as leggen 'to put' has two arguments in the VP: it first combines with a PP complement, then with an NP object to form a VP. There is an alternative way of encoding the combinatory properties of the verb, using a product argument: one can equally categorize leggen as a verb looking to the left for the concatenation of an NP expression and a PP expression to form a complete VP; these alternative categorizations would correspond to binary branching versus flat VP structure in a phrase structure model. In the following section we will see that these different assignments are equivalent in a precise way. For the S expansion rule, we are faced with a choice again. We could say that VP combines with an NP expression to the left to form an expression of type S, and substitute (NP\S) for VP in the assignments seen before. This view in fact assigns the verb the role of head of S: S expressions are projected from verbs. But we could also say that the subject NP will give an S, when combined with a verb phrase (NP\S) to the right. Again, we will explore this double perspective on categories in detail in the next section.

PSG rules          CG types
VP -> V1           slapen 'sleep'  =>  VP
VP -> NP V2        eten 'eat'      =>  NP\VP
VP -> PP V3        liggen 'lie'    =>  PP\VP
VP -> NP PP V4     leggen 'put'    =>  PP\(NP\VP), (NP•PP)\VP
VP -> NP AP V5     maken 'make'    =>  AP\(NP\VP), (NP•AP)\VP
S  -> NP VP        VP => NP\S, NP => S/(NP\S)
Figure 1.1.1
The discussion of Figure 1.1.1 suggests that the category objects allow for a strictly local characterization of some further notions that are defined on tree configurations in phrase-structure approaches, namely the notions head and modifier, and the related structuralist concepts of exocentricity and endocentricity. Functors come in two varieties, depending on whether the domain equals the range category or not. A functor X/Y (Y\X) projects a result expression X in combination with an argument Y, i.e. the distribution of the functor and the resultant combination is different. We call such functors the head of an exocentric construction with the argument category Y as the complement. A functor X/X (X\X) combines with an X argument to give a result of the same type X, i.e. with the same distributional properties as the argument expression. These functors, then, are modifiers of a head X in an endocentric construction. Note that it follows that modifiers are optional and that they can be stacked, contrary to heads (a nice blue coat versus *John sleeps swims).
CATEGORY GRAPHS
A graph representation for category structures which makes the distinction between the subcomponents explicit can take the form of an attribute-value graph. In this representation the arcs are labeled with attributes (features), and the nodes with values for these features. The arc labels annotate the category structure with an explicit interpretation of the parts. To represent the structure introduced so far, we could use the features CONNECTIVE, BASCAT, DOMAIN, RANGE, FIRST and LAST. The values for the features CONNECTIVE and BASCAT are atomic, taken from the set of category forming connectives and basic categories respectively. DOMAIN, RANGE (for functor categories) and FIRST, LAST (for product categories) have themselves category structures as values. Below are some examples. The type of Example 1.1.2, as we just saw, would be assigned to a verb like leggen (put); the type of Example 1.1.3 contains expressions consisting of the concatenation of a noun phrase and a preposition (e.g. John in). Observe that expressions like the latter, that would be condemned to non-constituent status in a phrase-structure approach, are equal-rights citizens of our categorial universe.
Example 1.1.2 (PP\(NP\(NP\S)))
CONNECTIVE: \
DOMAIN
    BASCAT: PP
RANGE
    CONNECTIVE: \
    DOMAIN
        BASCAT: NP
    RANGE
        CONNECTIVE: \
        DOMAIN
            BASCAT: NP
        RANGE
            BASCAT: S

Example 1.1.3 (NP • (PP/NP))
CONNECTIVE: •
FIRST
    BASCAT: NP
LAST
    CONNECTIVE: /
    DOMAIN
        BASCAT: NP
    RANGE
        BASCAT: PP
GENERAL PROPERTIES OF TYPES
For future reference, a number of general properties of types that will play a role in the sequel are collectively defined below. We do not elaborate on the definitions in the present section, since these notions will be fully discussed and exemplified in their proper context later on. The reader who is anxious to see the category system put to work can immediately turn to the next section, and revert to this paragraph for elucidation when the need arises. The definitions have a common recursive structure, mirroring the recursive structure of complex categories. The base case defines the relevant property for the basic categories. The recursive clauses perform induction on the subcomponents of a complex type.
Definition 1.1.5 Subtypes
subtype(X) = {X}, if X ∈ BASCAT
subtype(X/Y) = {(X/Y)} ∪ subtype(X) ∪ subtype(Y)
subtype(Y\X) = {(Y\X)} ∪ subtype(X) ∪ subtype(Y)
subtype(X•Y) = {(X•Y)} ∪ subtype(X) ∪ subtype(Y)
Example subtype((AP\(NP\S))) = {(AP\(NP\S)), AP, (NP\S), NP, S}
Definition 1.1.6 Degree
d(X) = 0, if X ∈ BASCAT
d(X/Y) = 1 + d(X) + d(Y)
d(Y\X) = 1 + d(X) + d(Y)
d(X•Y) = 1 + d(X) + d(Y)
Example d((NP\S)/(NP\S)) = 3
The degree (or complexity) of a type is the number of type-forming connectives in it. The notion of degree can be generalized to sequences of types and derivations. The degree of a sequence of types is the sum of the degrees of the elements in that sequence: d([X1,...,Xn]) = d(X1) + ... + d(Xn). Similarly, the degree of a derivation T => X is the sum of the degree of the input sequence T and that of the resulting type, i.e. d(T => X) = d(T) + d(X). The concepts of subtype and degree can be easily visualized on the basis of a graph representation for complex categories, where branching nodes are labeled with the connectives, and terminal nodes with the basic categories. Figure 1.1.2 gives the graphs for the category S/(AP\(NP\S)) and (NP\S)/(NP\S). The subtypes correspond to the subtrees in the graph. The degree equals the number of branching nodes.
[Figure 1.1.2: category graphs for S/(AP\(NP\S)) and (NP\S)/(NP\S)]
The above definitions do not discriminate between the domain and range components of functor categories. The notions of order and category count treat domain and range asymmetrically. For these definitions, the domain subtype is the negative component of a category, and the range subtype the positive component.
Definition 1.1.7 Order
order(X) = 0, if X ∈ BASCAT
order(X/Y) = max(order(X), order(Y) + 1)
order(Y\X) = max(order(X), order(Y) + 1)
order(X•Y) = max(order(X), order(Y))
Example order(S/(AP\(NP\S))) = 2; order(AP\(NP\S)) = 1
The order of a functor category is 1 plus the order of the domain subtype, if the order of the domain subtype is greater than or equal to that of the range subtype, otherwise it is the order of the range subtype. For product categories, where the distinction between positive and negative component does not apply, the order is the maximal order of the immediate subtypes.
Definition 1.1.8 Count
count(X,X) = 1, if X ∈ BASCAT
count(X,Y) = 0, if X,Y ∈ BASCAT, X ≠ Y
count(X,Z/Y) = count(X,Z) - count(X,Y)
count(X,Y\Z) = count(X,Z) - count(X,Y)
count(X,Y•Z) = count(X,Y) + count(X,Z)
Example
count(NP,(NP\S)) = -1; count(S,(NP\S)) = 1
count(NP,(S/(NP\S))) = 1; count(S,(S/(NP\S))) = 0
The count function tallies positive (range) minus negative (domain) occurrences of a basic category X in an arbitrary category, basic or complex. Generalized to sequences of categories, the X-count of a sequence is the sum of the X-counts of the elements in the sequence, i.e. count(X,[Y1,...,Yn]) = count(X,Y1) + ... + count(X,Yn).
BACKGROUND
The category system, as presented, is a simplification in many respects. First, in order to deal with phenomena of inflectional morphology, the unanalysed basic categories must be further decomposed into feature-value sets, as is standardly done in feature theories. Work by Bach (1983a,b, and papers cited there) has shown how the feature decomposition of basic categories can be added to the recursive build-up of complex categories, which is essential to characterize their combinatorial potential. This work also demonstrates how morphological phenomena of agreement and government can be related to the function-argument structure. In this study, we will largely ignore feature decomposition, and limit ourselves to combinatorial issues. Second, the directionality of functors can be
derived from general ordering conventions rather than stipulated in each category, as demonstrated in Flynn (1983). Third, as in most feature theories, we would like to associate categories with partial information whenever fully specified information is unavailable. Partiality of information is a central feature of unification-based versions of categorial grammar. We return to partiality of categorial information and unification in Section 3.2, and in the second part of this book. For the origins of the analytic machinery of Definitions 1.1.5 to 1.1.8, Buszkowski (1988) and Van Benthem (1988a) are comprehensive sources. Category graphs, as a theory-neutral representation of structured category objects, are discussed in Gazdar et al. (1987).
1.1.2. REDUCTION LAWS
In the previous section we became acquainted with the objects that populate the categorial universe: the recursively structured types. Let us investigate now what we can do with these objects, i.e. how we can combine the types into larger constituents on the basis of a system of reduction rules. We will focus on flexible categorial systems. The flexible reduction systems generalize the basic cancellation operation of orthodox categorial grammars, i.e. the Application rule which combines a functor with domain Y and range X with a suitable argument of category Y to give an X, with corresponding functional application in the semantics. The catalogue R1-R6 below constitutes a representative set of generalized reduction laws that have been proposed in the literature. The aim of the present section is to introduce the intuitive motivation behind the flexible notion of categorial derivability. To that end, we first present the reduction laws and their interpretation informally as primitives, i.e. as axiom schemes. In Section 1.2, the notions of reduction and derivability will be studied from a more fundamental perspective. Let us introduce some notational conventions. Since we are dealing with directional systems, the reduction schemes come in symmetric doubles for right- and left-division. The intended interpretation of type combinations and type transitions is given in tandem with their syntax, as compositionality requires. We write Type1: Sem1, Type2: Sem2 => Type3: Sem3 if a category Type1 with interpretation Sem1 together with a category Type2 with interpretation Sem2 reduces to a category Type3 with interpretation Sem3; similarly for the unary type transitions Type1: Sem1 => Type2: Sem2. In order to interpret the lambda
terms that serve as denotational recipes, we must establish the mapping between syntactic and semantic types. We will provisionally assume here that each syntactic category is associated with a unique semantic type, given by the following induction. For the set of basic categories, we fix type(S) = t, type(NP) = ((e,t),t), type(N) = type(AP) = type(PP) = (e,t), where t is the type of truth values, e the type of entities and (a,b) the type of functions from a-type objects to b-type objects. For functor categories X/Y (or Y\X) the semantic type is (type(Y),type(X)). (Notice that the syntactic atoms AP and PP are interpreted as predicates, i.e. as adjectival or prepositional complements: in their role of modifiers, adjectival or prepositional phrases will appear with a modifier type X/X, e.g. N/N for attributive adjectives. In this introductory chapter, we will concentrate on the preservation of thematic structure under flexible reduction assumptions. We abstract from quantifier scope phenomena which will be dealt with in Chapter 5, where the relationship between the syntactic and the semantic algebra is studied in greater detail.)

Definition 1.1.9 Reduction laws

R1 Application
  X/Y: f, Y: a ⇒ X: f(a)
  Y: a, Y\X: f ⇒ X: f(a)
R2 Composition
  X/Y: f, Y/Z: g ⇒ X/Z: λv.f(g(v))
  Z\Y: g, Y\X: f ⇒ Z\X: λv.f(g(v))
R3 Associativity
  (Z\X)/Y: f ⇒ Z\(X/Y): λv1λv2.f(v2)(v1)
  Z\(X/Y): f ⇒ (Z\X)/Y: λv1λv2.f(v2)(v1)
R4 Lifting
  X: a ⇒ Y/(X\Y): λv.v(a)
  X: a ⇒ (Y/X)\Y: λv.v(a)
R5 Division (main functor)
  X/Y: f ⇒ (X/Z)/(Y/Z): λv1λv2.f(v1(v2))
  Y\X: f ⇒ (Z\Y)\(Z\X): λv1λv2.f(v1(v2))
R6 Division (subordinate functor)
  X/Y: f ⇒ (Z/X)\(Z/Y): λv1λv2.v1(f(v2))
  Y\X: f ⇒ (Y\Z)/(X\Z): λv1λv2.v1(f(v2))

DISCUSSION: APPLICATION
In our discussion of the category system, we observed that the information encoded in phrase structure trees can be projected onto the internal structure of the categories themselves. As a result of this move, the classical categorial reduction system can replace a set of stipulative rewriting rules by one general reduction scheme R1
that reflects the interpretation of fractional categories as incomplete expressions or functors. The reduction scheme R1 is neither language-specific nor construction-specific: it embodies the ideal of lexicon-driven syntax. The relation between the syntactic algebra and the semantic algebra is strictly compositional: the syntactic cancellation scheme R1 corresponds to the application of the semantic value of the functor category to the semantic value of its argument. Below is an Application derivation for the sentence John loves Mary. The example is theoretically quite innocent, so let us use this opportunity to introduce some further notational conventions. Instead of the familiar but typographically impractical tree format, we will display categorial derivation trees in an indented list format. Indentation corresponds to branching in the tree, and top-down to left-to-right order. Non-terminal nodes are annotated with the reduction law used to derive them. The symbol ∋ introduces terminal nodes and their lexical type assignment.
Example 1.1.4 Notational conventions.

Conventional tree format:

          S
        /   \
      NP     NP\S
      |     /     \
    John (NP\S)/NP  NP
            |       |
          loves    Mary

Indented list format:

[R1] S
  NP ∋ John
  [R1] (NP\S)
    ((NP\S)/NP) ∋ loves
    NP ∋ Mary

Semantic interpretation:

[R1] loves(mary)(john)
  john
  [R1] loves(mary)
    loves
    mary
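The Application derivation above can be traced mechanically. The following is only a minimal sketch: the encoding of categories as Python tuples, the helper names `rslash`, `lslash` and `apply_r1`, and the use of strings as stand-ins for semantic values are all assumptions made for illustration and are not part of the original presentation.

```python
# Categories: atoms are strings; functors are tuples (hypothetical encoding).
# ("/", X, Y) encodes X/Y (seeks a Y to its right);
# ("\\", Y, X) encodes Y\X (seeks a Y to its left).

def rslash(x, y):
    return ("/", x, y)

def lslash(y, x):
    return ("\\", y, x)

def apply_r1(left, right):
    """R1 on two (category, meaning) pairs; None if no cancellation applies."""
    (lcat, lsem), (rcat, rsem) = left, right
    # X/Y: f, Y: a => X: f(a)
    if isinstance(lcat, tuple) and lcat[0] == "/" and lcat[2] == rcat:
        return (lcat[1], lsem(rsem))
    # Y: a, Y\X: f => X: f(a)
    if isinstance(rcat, tuple) and rcat[0] == "\\" and rcat[1] == lcat:
        return (rcat[2], rsem(lsem))
    return None

# Toy lexicon; meanings are rendered as strings for readability.
john = ("NP", "john")
mary = ("NP", "mary")
loves = (rslash(lslash("NP", "S"), "NP"),
         lambda obj: lambda subj: f"loves({obj})({subj})")

vp = apply_r1(loves, mary)   # category NP\S
s = apply_r1(john, vp)       # category S
print(s)
```

The two calls mirror the two [R1] nodes of the indented list: the verb first consumes its object to the right, and the resulting NP\S then consumes the subject to its left.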
THE RELATIVIZATION OF CONSTITUENCY
With Application as the only reduction scheme, a non-ambiguous expression is associated with a unique rigid constituent structure analysis. Flexible categorial systems are based on the insight that as soon as we project the structural information onto the internal structure of the categories, the rigid notion of constituency can be relativized. The quasi-arithmetical nature of the functor categories allows for different modes of combination that will associate a non-ambiguous expression with a number of simultaneous alternative derivational constituent analyses that are semantically equivalent. For an analogy based on the correspondence between Application and fraction multiplication, think of the alternative bracketings of a product term that will all give the same result: (2/3 · 3/5) · 5/7 = 2/3 · (3/5 · 5/7). The relativized constituency concept has been motivated on the basis of a variety of phenomena. We will focus here on two central forms of support. First, the reduction laws R1-R6 make it possible to give arbitrary expressions a uniformly left-branching analysis, construing the semantic interpretation in an incremental way as the expression is processed. That is, the theory of grammar underlying R1-R6 can at the same time be considered as a model for the left-to-right processing and incremental interpretation of expressions. Secondly, the relativized constituency concept offers a unified account for the pervasive phenomenon of Boolean coordination by reducing various forms of 'non-constituent' coordination to simple coordination of constituents in the extended sense.
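The fraction analogy can be checked by direct computation. The snippet below is merely an illustrative aside; the variable names are ours.

```python
from fractions import Fraction

a, b, c = Fraction(2, 3), Fraction(3, 5), Fraction(5, 7)

left_bracketing = (a * b) * c     # (2/3 · 3/5) · 5/7
right_bracketing = a * (b * c)    # 2/3 · (3/5 · 5/7)

# Both bracketings cancel the middle terms and leave 2/7.
print(left_bracketing, right_bracketing)
```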
ASSOCIATIVITY: LEFT-BRANCHING ANALYSIS
The Application analysis for John loves Mary is strongly equivalent to the conventional phrase-structure representation for a sequence subject-transitive verb-direct object, with the transitive verb and the direct object grouped into a VP constituent. Suppose now that we are not so much interested in constituent structure, as commonly understood, but rather in the notion of derivability, that is, in the question: Given a sequence of input types, what type(s) can be derived from the input sequence? It will be clear that the result type S would also be derivable if the transitive verb had been assigned the type NP\(S/NP) instead of (NP\S)/NP. The unary rule R3 captures this type-shifting perspective in general terms. R3 states that functors with two arguments Y,Z and value X which first combine with their Y argument to the right and then with their Z argument to the left, stand in a one-to-one relationship with functors of two arguments Y,Z that first combine to the left with
their Z argument and then to the right with the Y argument to give a value X. The intended semantics for this type transition makes the semantic values of the arguments land up in the place in the function-argument structure where they belong given the semantic value of the original functor. With {R1,R3} as the reduction system, an alternative left-branching S derivation can be given for the expression John loves Mary, as shown in Example 1.1.5. We trace the semantic interpretation here in full just for this example. In the remainder of this study, we will silently perform lambda reductions whenever possible.

R3 Associativity
  (Z\X)/Y: f ⇒ Z\(X/Y): λv1λv2.f(v2)(v1)
  Z\(X/Y): f ⇒ (Z\X)/Y: λv1λv2.f(v2)(v1)

Example 1.1.5 Left-associative analysis for John loves Mary

[R1] S
  [R1] (S/NP)
    NP ∋ John
    [R3] (NP\(S/NP))
      ((NP\S)/NP) ∋ loves
  NP ∋ Mary

Semantic interpretation:

[R1] S: (λy.loves(y)(john))(mary) ≡ loves(mary)(john)
  [R1] (S/NP): (λxλy.loves(y)(x))(john) ≡ λy.loves(y)(john)
    NP: john
    [R3] (NP\(S/NP)): λxλy.loves(y)(x)
      ((NP\S)/NP): loves
  NP: mary
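The semantic recipe of R3 can be tried out with curried functions. This is a sketch under assumptions: the function name `assoc` is ours, and semantic values are rendered as strings rather than model-theoretic objects.

```python
def assoc(f):
    """R3: shift (Z\\X)/Y into Z\\(X/Y); the meaning is λv1 λv2. f(v2)(v1)."""
    return lambda v1: lambda v2: f(v2)(v1)

# Lexical meaning of 'loves' in type (NP\S)/NP: object first, then subject.
loves = lambda obj: lambda subj: f"loves({obj})({subj})"

shifted = assoc(loves)                     # type NP\(S/NP): subject first
john_loves = shifted("john")               # type S/NP
left_branching = john_loves("mary")        # the {R1,R3} derivation
right_branching = loves("mary")("john")    # the pure Application derivation
print(left_branching)
```

Both derivations deliver the same term, which is the sense in which the type shift leaves the thematic structure intact.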
NON-CONSTITUENT COORDINATION: {R1,R3}
As indicated above, the left-branching analysis can be taken to model the left-to-right processing of the sentence, with an incremental construction of the desired semantics. Suppose, however, one wants to keep the theory of grammar and the processing theory strictly separated. Then in simple phrases, there seems to be little need for alternative derivations such as the left-branching one above with the unorthodox derivational subconstituent John loves, i.e. a combination of subject and transitive verb of category S/NP. Crucial
grammatical motivation for this flexible notion of constituent structure comes from Boolean coordination. Given the assumption that only like categories can be conjoined, conjoinability has always been the standard test for constituency. As soon as we broaden our horizon to include coordination phenomena, the unorthodox subconstituents of an {R1,R3} derivation turn up as conjuncts. Assume for the moment that we have at our disposal an infinite supply of possible types for the Boolean connectives and, or, but, with a common structure (X\X)/X, i.e. the connectives conjoin two phrases of equal category. Semantically, we assume that the Boolean coordinators and and or are interpreted respectively as the generalized meet (intersection) and join (union) operations in the semantic domains corresponding to the conjoined syntactic expressions, i.e. we adopt the generalized theory of Boolean coordination of Partee and Rooth (1983). For semantic objects P,Q of the truth-value type t, MEET(P)(Q) and JOIN(P)(Q) are interpreted as (P & Q) and (P v Q) respectively, i.e. as conjunction and disjunction in the domain of truth values. For semantic objects A,B denoting functions from X-type objects to Y-type objects, MEET(A)(B) is interpreted as λv.MEET(A(v))(B(v)), where v is a variable of type X; analogously for JOIN(A)(B). With the relativized constituency concept, we can now reduce the type of non-constituent conjunction of Example 1.1.6 to simple constituent coordination of the constituents John loves and Bill hates. Observe that the semantic interpretation obtained at the S/NP node for the conjunction John loves but Bill hates, namely

  (S/NP): MEET(λy.hates(y)(bill))(λx.loves(x)(john)),

reduces to an abstraction over a shared variable in the conjuncts, by the generalized semantic recipes for the Boolean coordinators:

  MEET(λy.hates(y)(bill))(λx.loves(x)(john))
  ≡ λv.MEET((λy.hates(y)(bill))(v))((λx.loves(x)(john))(v))
  ≡ λv.MEET(hates(v)(bill))(loves(v)(john)),

so that the semantic interpretation for the direct object NP Mary gets distributed over the conjuncts in the lambda reduction accompanying the final step in the derivation:

  (λv.MEET(hates(v)(bill))(loves(v)(john)))(mary)
  ≡ MEET(hates(mary)(bill))(loves(mary)(john)).
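The pointwise definition of generalized MEET translates directly into a recursive function. What follows is a sketch only: the toy model (the sets `LOVES` and `HATES`, including the invented individual sue) is an assumption introduced to have truth values to compute with, and Python booleans stand in for the type t.

```python
def meet(p):
    """Generalized conjunction, curried as MEET(P)(Q)."""
    def with_q(q):
        if isinstance(p, bool) and isinstance(q, bool):
            return p and q                    # base case: truth values
        return lambda v: meet(p(v))(q(v))     # functions: distribute pointwise
    return with_q

# Toy model (assumed for illustration): pairs (subject, object).
LOVES = {("john", "mary")}
HATES = {("bill", "mary"), ("bill", "sue")}
loves = lambda obj: lambda subj: (subj, obj) in LOVES
hates = lambda obj: lambda subj: (subj, obj) in HATES

john_loves = lambda x: loves(x)("john")   # S/NP
bill_hates = lambda y: hates(y)("bill")   # S/NP

conj = meet(bill_hates)(john_loves)       # S/NP: John loves but Bill hates
print(conj("mary"), conj("sue"))
```

Applying `conj` to an object distributes that object over both conjuncts, exactly as in the final lambda reduction above.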
Example 1.1.6 John loves but Bill hates Mary

[R1] S
  [R1] (S/NP)
    [R1] (S/NP)
      NP ∋ John
      [R3] (NP\(S/NP))
        ((NP\S)/NP) ∋ loves
    [R1] ((S/NP)\(S/NP))
      (((S/NP)\(S/NP))/(S/NP)) ∋ but
      [R1] (S/NP)
        NP ∋ Bill
        [R3] (NP\(S/NP))
          ((NP\S)/NP) ∋ hates
  NP ∋ Mary

Semantic interpretation:

[R1] S: MEET(hates(mary)(bill))(loves(mary)(john))
  [R1] (S/NP): MEET(λy.hates(y)(bill))(λx.loves(x)(john))
    [R1] (S/NP): λx.loves(x)(john)
      NP: john
      [R3] (NP\(S/NP)): λyλx.loves(x)(y)
        ((NP\S)/NP): loves
    [R1] ((S/NP)\(S/NP)): MEET(λy.hates(y)(bill))
      (((S/NP)\(S/NP))/(S/NP)): MEET
      [R1] (S/NP): λy.hates(y)(bill)
        NP: bill
        [R3] (NP\(S/NP)): λxλy.hates(y)(x)
          ((NP\S)/NP): hates
  NP: mary

LIFTING + COMPOSITION
Let us look now at some motivation to extend the combinatory apparatus to {R1,R3} U {R2,R4} by adding the closely related rules of Composition and Lifting. Here is an argument originally due to Lambek (1958). Suppose we want to account for the different distributional properties of subject (he, she) versus object (him, her) pronouns by lexically assigning them a different category. Since a subject pronoun combines with a VP ( = NP\S) to form a sentence, we assign it the second-order type S/(NP\S), thus excluding a VP likes he. Likewise, an object pronoun can be assigned the type (S/NP)\S, i.e. it is a functor looking to the left for the
combination of a subject and a transitive verb. The Associativity rule R3 makes this combination possible, as we just saw. Again, the second-order object pronoun type makes the string him loves underivable. (We do not want to go into the semantics of pronominal elements here, so we restrict ourselves to syntactic derivations. The discussion in the following paragraph will illustrate the semantic effect of Composition and Lifting.)

Example 1.1.7 he loves Mary versus Mary loves him

[R1] S
  (S/(NP\S)) ∋ he
  [R1] (NP\S)
    ((NP\S)/NP) ∋ loves
    NP ∋ Mary

[R1] S
  [R1] (S/NP)
    NP ∋ Mary
    [R3] (NP\(S/NP))
      ((NP\S)/NP) ∋ loves
  ((S/NP)\S) ∋ him
However, to derive the sentence he loves him, the combinatory apparatus {R1,R3} does not suffice: there is no way of combining the second-order subject type with the transitive verb. In order to derive this sentence, we appeal to the Composition rule R2.

R2 Composition
  X/Y: f, Y/Z: g ⇒ X/Z: λv.f(g(v))
  Z\Y: g, Y\X: f ⇒ Z\X: λv.f(g(v))

Composition (or partial combination) combines two functor expressions, a main functor X/Y (Y\X) and a subordinate functor Y/Z (Z\Y). The range of the subordinate functor, i.e. the type Y, is the domain of the main functor. Composition cancels this equal middle term, and yields a functor that is still looking for the argument of the subordinate functor (Z) to derive the range of the main functor (X). The intended interpretation of this mode of combination, as the name suggests, is the composition of the semantic value of the main functor and the subordinate functor. Composition derives the expression he loves in type S/NP by cancelling the middle term (NP\S) of the categories for he (S/(NP\S)) and loves ((NP\S)/NP).

Example 1.1.8 he loves him

[R1] S
  [R2] (S/NP)
    (S/(NP\S)) ∋ he
    ((NP\S)/NP) ∋ loves
  ((S/NP)\S) ∋ him
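The cancellation of the middle term can be made concrete at the level of types alone. The sketch below assumes a tuple encoding of categories and the helper names `application` and `composition`; these are illustrative choices, not notation from the text.

```python
def rslash(x, y): return ("/", x, y)      # X/Y
def lslash(y, x): return ("\\", y, x)     # Y\X

def is_r(c): return isinstance(c, tuple) and c[0] == "/"
def is_l(c): return isinstance(c, tuple) and c[0] == "\\"

def application(a, b):
    if is_r(a) and a[2] == b:                  # X/Y, Y => X
        return a[1]
    if is_l(b) and b[1] == a:                  # Y, Y\X => X
        return b[2]
    return None

def composition(a, b):
    if is_r(a) and is_r(b) and a[2] == b[1]:   # X/Y, Y/Z => X/Z
        return rslash(a[1], b[2])
    if is_l(a) and is_l(b) and a[2] == b[1]:   # Z\Y, Y\X => Z\X
        return lslash(a[1], b[2])
    return None

VP = lslash("NP", "S")
he = rslash("S", VP)                    # S/(NP\S)
loves = rslash(VP, "NP")                # (NP\S)/NP
him = lslash(rslash("S", "NP"), "S")    # (S/NP)\S

he_loves = composition(he, loves)       # cancels the middle term NP\S
sentence = application(he_loves, him)
print(he_loves, sentence)
```

Application alone returns `None` for the pair he, loves, which is the failure that motivates R2 in the text.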
The higher-order lexical type assignment to subject and object pronouns rules out ungrammatical strings like him loves he. But as a result of this policy the pronouns and the ordinary NP's that can appear in subject and object position now have a different type. Still pronouns and 0-order NP expressions like John can be conjoined, and conjunction requires the types of the conjuncts to be equal. So we want the second-order pronoun types to be accessible from the 0-order NP type, by a general type-shifting mechanism. This is the effect of the Lifting rule R4.

R4 Lifting
  X: a ⇒ Y/(X\Y): λv.v(a)
  X: a ⇒ (Y/X)\Y: λv.v(a)

Lifting turns an argument X into a functor that is looking for a functor with the original type X as argument and with range Y to yield an expression of type Y. R4, in other words, reverses the function-argument relation, a reversal that is undone in the interpretation of this type-shift. Consider what happens when we substitute for the subject and object in the sentence John loves Mary the conjoined phrases he or John and him or Mary. The Lifting rule R4 can turn the argument type NP into the required higher-order type that we assigned lexically to the pronouns he and him, reducing this construction to coordination of equal (second-order) types.

Example 1.1.9 he or John (loves Mary)

[R1] (S/(NP\S))
  (S/(NP\S)) ∋ he
  [R1] ((S/(NP\S))\(S/(NP\S)))
    (((S/(NP\S))\(S/(NP\S)))/(S/(NP\S))) ∋ or
    [R4] (S/(NP\S))
      NP ∋ John

Example 1.1.10 (John loves) him or Mary

[R1] ((S/NP)\S)
  ((S/NP)\S) ∋ him
  [R1] (((S/NP)\S)\((S/NP)\S))
    ((((S/NP)\S)\((S/NP)\S))/((S/NP)\S)) ∋ or
    [R4] ((S/NP)\S)
      NP ∋ Mary
LIFTING: ARGUMENTS AS FUNCTORS
For another field where the added forces of Lifting and Composition can solve coordination problems, we turn to a verb-final language, Dutch. In Dutch subordinate clauses, the verb is preceded by a list of argument phrases, i.e. a sequence of atomic categories. The combinatory rules {R1,R2} join a functor and its argument (in the case of Application) or two functor expressions (in the case of Composition); they offer no way to combine two atomic categories. Without the Lifting rule, the following right-branching derivation is the only possible analysis path for the Dutch sentence omdat John gek is. In general, it would seem to be impossible to extend the incremental left-to-right processing approach to a verb-final language.

Example 1.1.11

omdat     John   gek   is
because   John   mad   is
S/S       NP     AP    AP\(NP\S)
'because John is mad'

[R1] S
  (S/S) ∋ omdat
  [R1] S
    NP ∋ John
    [R1] (NP\S)
      AP ∋ gek
      (AP\(NP\S)) ∋ is

Semantic interpretation:

[R1] omdat(is(gek)(john))
  omdat
  [R1] is(gek)(john)
    john
    [R1] is(gek)
      gek
      is
Lifting, however, turns arguments into functors by reversing the functor-argument relation, as demonstrated above. As has been shown in Steedman (1985), we can lift each of the arguments that precede the verb in its sentence-final position to the appropriate higher order type, and thus provide an alternative strictly left-branching analysis for our example sentence, with Composition combining the higher-order lifted argument expressions.

Example 1.1.12 omdat John gek is, left-branching derivation

[R1] S
  [R2] (S/(AP\(NP\S)))
    [R2] (S/(NP\S))
      (S/S) ∋ omdat
      [R4] (S/(NP\S))
        NP ∋ John
    [R4] ((NP\S)/(AP\(NP\S)))
      AP ∋ gek
  (AP\(NP\S)) ∋ is

Semantic interpretation:

[R1] omdat(is(gek)(john))
  [R2] λv.omdat(v(gek)(john))
    [R2] λz.omdat(z(john))
      omdat
      [R4] λx.x(john)
        john
    [R4] λy.y(gek)
      gek
  is
NON-CONSTITUENT COORDINATION
The left-branching analysis path is again motivated on the basis of non-constituent coordination, where the subphrases of this analysis yield the required conjunct type.

Example 1.1.13

(omdat)     John   gek   en    Mary   dom     is
(because)   John   mad   and   Mary   silly   is
'because J. is mad and M. silly'

[R1] S
  [R1] (S/(AP\(NP\S)))
    [R2] (S/(AP\(NP\S)))
      [R4] (S/(NP\S))
        NP ∋ John
      [R4] ((NP\S)/(AP\(NP\S)))
        AP ∋ gek
    [R1] ((S/(AP\(NP\S)))\(S/(AP\(NP\S))))
      (((S/(AP\(NP\S)))\(S/(AP\(NP\S))))/(S/(AP\(NP\S)))) ∋ en
      [R2] (S/(AP\(NP\S)))
        [R4] (S/(NP\S))
          NP ∋ Mary
        [R4] ((NP\S)/(AP\(NP\S)))
          AP ∋ dom
  (AP\(NP\S)) ∋ is

Semantic interpretation:

[R1] S: MEET(is(dom)(mary))(is(gek)(john))
  [R1] (S/(AP\(NP\S))): MEET(λw.w(dom)(mary))(λz.z(gek)(john))
    [R2] (S/(AP\(NP\S))): λz.z(gek)(john)
      [R4] (S/(NP\S)): λx.x(john)
        NP: john
      [R4] ((NP\S)/(AP\(NP\S))): λy.y(gek)
        AP: gek
    [R1] ((S/(AP\(NP\S)))\(S/(AP\(NP\S)))): MEET(λw.w(dom)(mary))
      (((S/(AP\(NP\S)))\(S/(AP\(NP\S))))/(S/(AP\(NP\S)))): MEET
      [R2] (S/(AP\(NP\S))): λw.w(dom)(mary)
        [R4] (S/(NP\S)): λu.u(mary)
          NP: mary
        [R4] ((NP\S)/(AP\(NP\S))): λv.v(dom)
          AP: dom
  (AP\(NP\S)): is
UNARY TYPE-TRANSITIONS VERSUS BINARY REDUCTION
We introduced R2, R3 and R4 on the basis of constructions that would be underivable but for the new combinatory or type-shifting rule. However, the joint potential of {R1,R2,R3,R4} then confronted us with a central property of flexible categorial systems: a (non-ambiguous) grammatical string is associated not with a unique constituent structure, but with a set of semantically equivalent derivations. The earlier left-branching {R1,R3} derivation for John loves Mary, which was motivated on the basis of non-constituent coordination like John loves, but Bill hates Mary, used R3 as a unary type-shift to feed the binary reduction rule R1. But the required subject-transitive verb combination can also be derived on the basis of {R2,R4}, by lifting the subject NP to S/(NP\S) and then composing the higher order subject with the transitive verb type (NP\S)/NP. In this case Lifting is the unary type-transition that feeds binary Composition. See the alternative {R2,R4} derivation for non-constituent conjunction below. As the reader can check, the semantic interpretation for the {R2,R4} derivation is identical to the interpretation associated with the alternative {R1,R3} analysis of Example 1.1.6:

  S: MEET(hates(mary)(bill))(loves(mary)(john)).
Example 1.1.14 Non-constituent coordination: {R2,R4}
John loves but Bill hates Mary

[R1] S
  [R1] (S/NP)
    [R2] (S/NP)
      [R4] (S/(NP\S))
        NP ∋ John
      ((NP\S)/NP) ∋ loves
    [R1] ((S/NP)\(S/NP))
      (((S/NP)\(S/NP))/(S/NP)) ∋ but
      [R2] (S/NP)
        [R4] (S/(NP\S))
          NP ∋ Bill
        ((NP\S)/NP) ∋ hates
  NP ∋ Mary

Semantic interpretation:

[R1] S: MEET(hates(mary)(bill))(loves(mary)(john))
  [R1] (S/NP): MEET(λy.hates(y)(bill))(λx.loves(x)(john))
    [R2] (S/NP): λx.loves(x)(john)
      [R4] (S/(NP\S)): λz.z(john)
        NP: john
      ((NP\S)/NP): loves
    [R1] ((S/NP)\(S/NP)): MEET(λy.hates(y)(bill))
      (((S/NP)\(S/NP))/(S/NP)): MEET
      [R2] (S/NP): λy.hates(y)(bill)
        [R4] (S/(NP\S)): λv.v(bill)
          NP: bill
        ((NP\S)/NP): hates
  NP: mary
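That the {R2,R4} route and the {R1,R3} route assign the same meaning to the subject-verb combination can be verified on the semantic recipes themselves. The sketch below uses assumed helper names (`assoc`, `lift`, `compose`) and string-valued meanings; it illustrates the equivalence for one lexical item and is not a general proof.

```python
def assoc(f):        # R3: λv1 λv2. f(v2)(v1)
    return lambda v1: lambda v2: f(v2)(v1)

def lift(a):         # R4: λv. v(a)
    return lambda v: v(a)

def compose(f, g):   # R2: λv. f(g(v))
    return lambda v: f(g(v))

# 'loves' in type (NP\S)/NP: object first, then subject.
loves = lambda obj: lambda subj: f"loves({obj})({subj})"

via_r1_r3 = assoc(loves)("john")            # Associativity, then Application
via_r2_r4 = compose(lift("john"), loves)    # Lifting, then Composition

print(via_r1_r3("mary"), via_r2_r4("mary"))
```

Both S/NP meanings behave as λx.loves(x)(john), so either can serve as a conjunct in the coordination of Example 1.1.14.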
COMPOSITION, DIVISION1 AND DIVISION2
This trade-off between unary and binary rules can be further illustrated by investigating the relation between Composition and the two Division rules R5 and R6. Consider the adjective phrase related to Mary. A pure Application analysis assigns this phrase a right-branching derivation. A left-branching alternative is available as soon as we have a Composition rule at our disposal. By cancelling the middle term PP, functional composition can combine AP/PP (related) and PP/NP (to) into AP/NP. Conjoinability of the expression related to with expressions of equal type motivates this alternative analysis (e.g. acquainted with or related to).
Example 1.1.15 Application

[R1] AP
  (AP/PP) ∋ related
  [R1] PP
    (PP/NP) ∋ to
    NP ∋ Mary

Example 1.1.16 Composition

[R1] AP
  [R2] (AP/NP)
    (AP/PP) ∋ related
    (PP/NP) ∋ to
  NP ∋ Mary
Example 1.1.17 Coordination evidence

[R1] (AP/NP)
  [R2] (AP/NP)
    (AP/PP) ∋ acquainted
    (PP/NP) ∋ with
  [R1] ((AP/NP)\(AP/NP))
    (((AP/NP)\(AP/NP))/(AP/NP)) ∋ or
    [R2] (AP/NP)
      (AP/PP) ∋ related
      (PP/NP) ∋ to
The Composition rule R2, just like Application, is a binary rule, combining two types into a new compound expression. Suppose now we want to keep the combinatory apparatus as simple as possible and to restrict it essentially to the Application rule R1. This means that we have to rely on unary type transitions in cases where functional application blocks: the type-shifting rules make it possible for a category to adapt to its context, and to assume an appropriate shifted type so that an Application reduction goes through. Above we saw an example of this trade-off between unary and binary rules when we compared different options to combine a subject with an adjacent transitive verb: we could either shift the transitive verb category by Associativity so that it could combine with the subject by Application, or we could lift the subject type and combine the higher-order subject type with the transitive verb by Composition. Composition itself can be eliminated in favour of a unary type transition followed by simple Application. There are two options here, depending on whether the type-shift is applied to the main functor or to the subordinate functor. R5 divides range and domain of the main functor category AP/PP by the domain of the subordinate functor, NP; the resulting higher-order type (AP/NP)/(PP/NP) can combine with the subordinate functor PP/NP by Application. Alternatively, R6 operates on the subordinate functor and shifts it into a higher-order functor that can consume the main functor by functional application. R6 is the inverse of the division rule R5: the range of the main functor (AP) is divided by range and domain of the subordinate functor. The two options are illustrated below.
R5 Division (main functor)
  X/Y: f ⇒ (X/Z)/(Y/Z): λv1λv2.f(v1(v2))
  Y\X: f ⇒ (Z\Y)\(Z\X): λv1λv2.f(v1(v2))
R6 Division (subordinate functor)
  X/Y: f ⇒ (Z/X)\(Z/Y): λv1λv2.v1(f(v2))
  Y\X: f ⇒ (Y\Z)/(X\Z): λv1λv2.v1(f(v2))

Example 1.1.18 Division (main functor)

[R1] AP: related(to(mary))
  [R1] (AP/NP): λy.related(to(y))
    [R5] ((AP/NP)/(PP/NP)): λxλy.related(x(y))
      (AP/PP): related
    (PP/NP): to
  NP: mary

Example 1.1.19 Division (subordinate functor)

[R1] AP: related(to(mary))
  [R1] (AP/NP): λy.related(to(y))
    (AP/PP): related
    [R6] ((AP/PP)\(AP/NP)): λxλy.x(to(y))
      (PP/NP): to
  NP: mary
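The two Division recipes can be compared with plain Composition in the same way. In this sketch the names `div_main` and `div_sub` are our own labels for R5 and R6, and meanings are again built as strings.

```python
def div_main(f):   # R5: λv1 λv2. f(v1(v2))
    return lambda v1: lambda v2: f(v1(v2))

def div_sub(f):    # R6: λv1 λv2. v1(f(v2))
    return lambda v1: lambda v2: v1(f(v2))

related = lambda pp: f"related({pp})"   # AP/PP
to = lambda np: f"to({np})"             # PP/NP

via_r5 = div_main(related)(to)      # (AP/NP)/(PP/NP) applied to PP/NP
via_r6 = div_sub(to)(related)       # (AP/PP)\(AP/NP) applied to AP/PP
via_r2 = lambda v: related(to(v))   # Composition, for comparison

print(via_r5("mary"), via_r6("mary"), via_r2("mary"))
```

In each case the AP/NP meaning is λy.related(to(y)); only the division of labour between the unary shift and the binary step differs.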
UNARY TYPE TRANSITION + BINARY REDUCTION BY APPLICATION
With Division instead of Composition, an incremental left-branching analysis for the sentence below can be obtained with Application as the only rule to combine two expressions together. In all cases where types do not fit the requirements of the Application rule scheme, unary type-shifting laws can change the initial lexical type assignments into the required derived types. We will return to this issue when we discuss recursive axiomatizations of categorial calculi in Section 1.3.1.
Example 1.1.20 Application + unary type transitions
John is related to Mary

[R1] S
  [R1] (S/NP)
    [R5] ((S/NP)/(PP/NP))
      [R1] (S/PP)
        [R5] ((S/PP)/(AP/PP))
          [R1] (S/AP)
            NP ∋ John
            [R3] (NP\(S/AP))
              ((NP\S)/AP) ∋ is
        (AP/PP) ∋ related
    (PP/NP) ∋ to
  NP ∋ Mary

Semantic interpretation:

[R1] S: is(related(to(mary)))(john)
  [R1] (S/NP): λx2.is(related(to(x2)))(john)
    [R5] ((S/NP)/(PP/NP)): λy2λx2.is(related(y2(x2)))(john)
      [R1] (S/PP): λx1.is(related(x1))(john)
        [R5] ((S/PP)/(AP/PP)): λy1λx1.is(y1(x1))(john)
          [R1] (S/AP): λy.is(y)(john)
            NP: john
            [R3] (NP\(S/AP)): λxλy.is(y)(x)
              ((NP\S)/AP): is
        (AP/PP): related
    (PP/NP): to
  NP: mary
CONCLUSIONS
The discussion of flexible reduction systems raises a number of important questions. Categorial grammar eliminates the phrase structure component by projecting the information that is encoded in phrase structure onto the richly structured categorial types. The modelling of categories as functors in turn suggests a more flexible, quasi-arithmetical combinatorics for these functors instead of the original Application system that is in the end as rigid as phrase structure rules. However, it seems now that instead of a stipulative set of phrase structure rules, we end up with an ever growing set of categorial reduction laws that is no less stipulative as long as these laws are introduced as primitives. Moreover, as the discussion
of multiple representations suggests, the laws introduced above are not independent, but show a high degree of mutual interdefinability. In the following section, we will investigate the notion of derivability from a more abstract perspective, and uncover the deeper logic behind the categorial connectives so that we have an answer to the questions: What is the connection between R1 to R6, and what makes these, rather than other conceivable operations, into valid laws?
BACKGROUND
The arguments for flexible categorial reduction systems on which this section is based can already be found in the classical papers Lambek (1958), Bar-Hillel et al. (1960), Cohen (1967), Geach (1972). Since then, the reduction laws R1-R6 have been the basis for detailed linguistic analysis of phenomena in syntax, morphology and semantics. A representative collection of papers is Oehrle, Bach & Wheeler (1988). The work on generalized Boolean semantics of Gazdar (1980), Keenan & Faltz (1985) and Partee & Rooth (1983) brought to light the crucial relevance of coordination as a motivation for the flexible notion of constituent structure. Zwarts (1986) and Dowty (1988) are careful investigations of constituent-structure argumentation in the light of flexible categorial systems.
1.2. CATEGORIAL PARSING AS IMPLICATIONAL DEDUCTION
1.2.1. THE LAMBEK-GENTZEN CALCULUS
In this section, the unsatisfactory situation arising from considering the reduction laws as primitive axioms will be resolved by reinterpreting the categorial reduction system as a calculus analogous to the implicational fragment of propositional logic. The type calculus presented in Lambek (1958, 1988) is the standard representative for this line of research, and it will form the subject of our investigations in the rest of this book. The propositional calculus has for each logical connective two inference rules regulating its behaviour: an Elimination rule to remove the connective from the conclusion, given a number of premises, and an Introduction rule that tells you how from a set of premises a conclusion with the connective in question can be derived. For the implication connective the Elimination rule is Modus Ponens: from p, p→q one can infer q, where the conclusion q no longer contains this occurrence of the connective; the Introduction rule is Conditionalization: if a set of premises S validates a conclusion q, then from S-{p} one can conclude p→q. The basic insight of Lambek's work rests on the realization that the Application rule R1 is the categorial analogue of Modus Ponens, and that a full logic of the categorial connectives can be obtained by adding an Introduction rule analogous to Conditionalization. As soon as this step is taken, the reduction laws that were introduced as primitives before get the status of theorems, i.e. valid inferences of the logic of the categorial connectives. The Lambek calculus can be implemented in a number of equivalent ways. We will present the sequent calculus, which Lambek adapted from Gentzen's intuitionistic system LJ (see Chapter 3 in Szabo 1969). The sequent perspective is a particularly lucid basis for the discussion of the central notions of derivability and decidability. Let us characterize derivability as a relation holding of sequents of types, where a sequent is defined as follows.
Definition 1.2.1 A sequent is a pair (G,D) of finite (possibly empty) sequences G = [A1,...,Am], D = [B1,...,Bn] of types. For categorial L-sequents, we require G to be non-empty, and n = 1. For the sequent (G,D) we write G ⇒ D. The sequence G is called the antecedent, D the succedent. For simplicity, a sequence [A1,...,Am] is written simply as A1,...,Am.

A sequent A1,...,An ⇒ B qualifies as valid if the type B is derivable from the sequence A1,...,An. In order to give content to the derivability relation we define what sequents are valid by definition (the axioms), and we provide a set of inference rules for the categorial connectives, to tell us what sequents can be validly inferred from given premise sequents. As indicated above, the inference rules come in pairs: the 'left' rules regulate the behaviour of the connectives in the antecedent of a sequent (cf. Elimination), the 'right' rules in the succedent (cf. Introduction).

Definition 1.2.2 The axioms of L are sequents of the form X ⇒ X.
Definition 1.2.3 System L: Lambek's (1958) Gentzen sequent calculus for the associative directional categorial system. Type-forming connectives: /, \, •. X,Y,Z are types, P,T,Q,U,V sequences of types, P,T,Q non-empty. Inference rules of L:

[/:right]  T ⇒ X/Y  if  T,Y ⇒ X
[/:left]   U,X/Y,T,V ⇒ Z  if  T ⇒ Y and U,X,V ⇒ Z
[\:right]  T ⇒ Y\X  if  Y,T ⇒ X
[\:left]   U,T,Y\X,V ⇒ Z  if  T ⇒ Y and U,X,V ⇒ Z
[•:left]   U,X•Y,V ⇒ Z  if  U,X,Y,V ⇒ Z
[•:right]  P,Q ⇒ X•Y  if  P ⇒ X and Q ⇒ Y

In order to prove the validity of a sequent A1,...,An ⇒ B, we try to show that this sequent follows from the axioms of L by a finite number of backward applications of the inference rules for the connectives. The proof of a sequent can be naturally represented as a tree, the nodes of which are labeled by sequents. The construction of the proof tree proceeds bottom-up, starting from the theorem of which we want to establish the validity, and working upward towards axiom sequents. We have presented the inference rules of the calculus L in the format 'Conclusion if Premise' or 'Conclusion if Premise1 and Premise2' to stress the fact that the Gentzen proof procedure traces the inference steps in backward fashion. From a problem-solving perspective (cf. Kowalski 1979), the proof procedure amounts to a top-down control strategy: we start with the goal sequent, and break up this goal into smaller subgoals by means of the inference rules until axiom leaves are reached. In a proof-tree, each inference corresponds to a local subtree (i.e. a tree of depth 1)
where the root node is labeled with the conclusion and the leaf nodes with the premise sequents that licence the conclusion. See Figure 1.2.1 for the graph representation of the inference rules. To facilitate the interpretation, the root nodes have been annotated with the name of the inference rule that leads to their descendants ('L' for 'left' inferences, 'R' for 'right' rules).

  T ⇒ Y     U,X,V ⇒ Z               T,Y ⇒ X
  -------------------- [/L]         ---------- [/R]
     U,X/Y,T,V ⇒ Z                   T ⇒ X/Y

  T ⇒ Y     U,X,V ⇒ Z               Y,T ⇒ X
  -------------------- [\L]         ---------- [\R]
     U,T,Y\X,V ⇒ Z                   T ⇒ Y\X

    U,X,Y,V ⇒ Z                     P ⇒ X     Q ⇒ Y
  ---------------- [•L]             ---------------- [•R]
    U,X•Y,V ⇒ Z                       P,Q ⇒ X•Y

Figure 1.2.1
GENTZEN PROOFS: COMPOSITION, DIVISION
As an example of the bottom-up Gentzen proof procedure for categorial sequents, let us unfold the proof tree demonstrating the validity of R2, the Composition rule. The nodes of the proof tree are labeled with sequents; the root of the tree with the theorem X/Y,Y/Z ⇒ X/Z. This sequent does not match the axiom scheme; therefore, we select a connective that can be removed by one of the inference rules of Definition 1.2.3. Removal of the right-division connective '/' from the succedent type X/Z yields the following subproof. (We have highlighted the active type, i.e. the type of which the main connective is removed, in the present case X/Z.)

  X/Y,Y/Z,Z => X
  -------------- [/R]
  X/Y,Y/Z => X/Z

The new top-sequent X/Y,Y/Z,Z => X still does not match the axiom case; so we select one of the remaining connectives, this time in the antecedent, and remove it with the relevant Elimination inference. Observe that there is a choice here: either X/Y or Y/Z could be selected as the target for further inferences, leading to alternative proofs for the goal theorem. With Y/Z as the active type, the partial expansion of the proof tree now assumes the following form.

  Z => Z    X/Y,Y => X
  -------------------- [/L]
  X/Y,Y/Z,Z => X
  -------------- [/R]
  X/Y,Y/Z => X/Z

Notice that the [/L] inference causes the proof tree to branch: two subproofs have to be successfully terminated to validate a conclusion with the connective '/' in the antecedent. The left leaf Z => Z, an instance of the axiom scheme, already represents a finished subproof. The remaining connective in the right premise X/Y,Y => X can be removed by another application of the [/L] inference rule, yielding the fully expanded proof tree of Figure 1.2.2, where all the leaf nodes are labeled with sequents matching the axiom scheme.

            Y => Y    X => X
            ---------------- [/L]
  Z => Z    X/Y,Y => X
  -------------------- [/L]
  X/Y,Y/Z,Z => X
  -------------- [/R]
  X/Y,Y/Z => X/Z

Figure 1.2.2
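The backward proof search just illustrated can be sketched in a few lines of Python. This is a minimal sketch for the product-free rules only, not the author's implementation: the tuple encoding of types and the names `over`, `under` and `prove` are assumptions made for illustration.

```python
from functools import lru_cache

# Assumed encoding: a basic type is a string; X/Y is ("/", X, Y); Y\X is ("\\", Y, X).
def over(x, y):
    """X/Y: a functor looking for Y to its right, yielding X."""
    return ("/", x, y)

def under(y, x):
    """Y\\X: a functor looking for Y to its left, yielding X."""
    return ("\\", y, x)

@lru_cache(maxsize=None)
def prove(antecedent, goal):
    """Search backward for a cut-free proof of antecedent => goal."""
    if not antecedent:                      # L has no empty antecedents
        return False
    if antecedent == (goal,):               # axiom X => X
        return True
    if isinstance(goal, tuple):             # right rules: strip the succedent connective
        op, a, b = goal
        if op == "/" and prove(antecedent + (b,), a):     # [/R]  T,Y => X
            return True
        if op == "\\" and prove((a,) + antecedent, b):    # [\R]  Y,T => X
            return True
    n = len(antecedent)
    for i, t in enumerate(antecedent):      # left rules: pick an active functor
        if not isinstance(t, tuple):
            continue
        op, a, b = t
        if op == "/":                       # t = X/Y with X = a, Y = b; T lies to the right
            for j in range(i + 2, n + 1):
                if prove(antecedent[i + 1:j], b) and \
                   prove(antecedent[:i] + (a,) + antecedent[j:], goal):
                    return True
        else:                               # t = Y\X with Y = a, X = b; T lies to the left
            for j in range(i):
                if prove(antecedent[j:i], a) and \
                   prove(antecedent[:j] + (b,) + antecedent[i + 1:], goal):
                    return True
    return False

# Composition (R2) and an order-violating non-theorem.
print(prove((over("X", "Y"), over("Y", "Z")), over("X", "Z")))  # True
print(prove(("Y", over("X", "Y")), "X"))                        # False
```

Every recursive call works on a sequent with one connective fewer, so the search terminates; the cache only avoids re-proving identical subgoals.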
We observed in the previous section that there is a close connection between the binary Composition rule R2 and the unary type transitions R5 and R6, the two variants of Division. The proof-theoretic perspective presented here can elucidate this connection. Observe that from the end-sequent of the above proof, two further inferences can be derived, by applying Conditionalization to the antecedent types X/Y or Y/Z. These inferences correspond to the subtrees of Figure 1.2.3. By substituting these subtrees for the root of the proof tree in Figure 1.2.2, we obtain Gentzen proofs for R5 (Division, main functor) and R6 (Division, secondary functor) respectively. The proof of the validity of the other reduction laws of Section 1.2. (R1, R3, R4) is left as an exercise.

  X/Y,Y/Z => X/Z                     X/Y,Y/Z => X/Z
  ------------------ [/R]            ------------------ [\R]
  X/Y => (X/Z)/(Y/Z)                 Y/Z => (X/Y)\(X/Z)
  R5: Division, Main functor         R6: Division, Subordinate functor

Figure 1.2.3

1.2.2. CUTS AND DECIDABILITY
The top-down Gentzen proof procedure for L is presented in Lambek (1958) to establish the decidability of the calculus: for arbitrary sequents A1,...,An => B, the procedure is guaranteed to terminate in a finite number of steps with an answer to the question whether the sequent is L-valid or not. How do we know that the backward application of the inference rules for the categorial connectives will eventually lead to a proof tree with axiom leaves in case of a valid theorem, and to failure in case of an invalid sequent? To answer this question, we take a closer look at the complexity properties of the elementary steps in the proof procedure, i.e. the inference rules of Figure 1.2.1. Recall from Section 1 that we defined the complexity degree of a sequent as the total number of type-forming connectives in it (Definition 1.1.6). Let us compare the complexity degree of the conclusion of the L inference rules with the complexity degree of the premises that licence this conclusion. For each inference rule, the degree of the premises is strictly smaller than that of the conclusion, since the conclusion contains a connective in either the antecedent or the succedent that is absent from the premises. In this way, establishing the validity of the premise sequents is in a well-defined respect a simpler goal than establishing the validity of the conclusion. We are dealing with
finite sequents, i.e. every sequent contains a finite number of connectives. Each inference (applied backward) removes one connective, and there is a finite number of possibilities to apply the inference rules. The total number of attempts at constructing a proof tree constitutes the Gentzen search space. Systematically traversing this search space, we will either encounter a proof tree, i.e. a tree of which all the leaves are labeled with axiom sequents, thus establishing the validity of the sequent at the root, or our systematic search for a proof fails, thus showing that the sequent we want to prove is invalid. As to the finite dimensions of the Gentzen search space, note that the maximal depth of a proof tree is equal to the degree of the end-sequent, i.e. the number of connectives in the theorem at its root (in the case of Figure 1.2.2, degree = 3). Note also that the proof procedure has the categorial pendant of Gentzen's subformula property: every type occurring in a premise sequent is a subtype of some type occurring in the lower sequent. The subformula property means that if a sequent of L can be proved at all, it can be proved on the basis of its subtypes only: in proving a theorem of L, we never have to make a detour. The Gentzen proof strategy ('from end-sequent to axioms') is essential to establish the decidability of L: it yields an algorithm that will blindly construct a proof if in fact a proof exists. For practical purposes, however, one will equip the categorial calculus with an extra rule of inference, the so-called Cut rule, as defined below. The Cut rule, unlike the rules of inference we have seen so far, is not a logical rule to handle a categorial connective, but a structural rule allowing for extra flexibility in manipulating subproofs on the basis of the transitivity of the derivability relation =>. Assume we have a database of already established results, i.e. valid theorems of the calculus L such as R1 to R6. The proof of new theorems will in many cases contain subproofs corresponding to theorems we already know to be valid. But the cut-free proof procedure cannot use this knowledge, and will blindly redo the proofs for these embedded theorems. In the system L + {Cut}, we can use the established theorems as lemmas in the proof of new theorems, i.e. L + {Cut} allows for a goal-directed search strategy, starting from the axioms, and working towards the desired conclusion, using auxiliary lemmas along the way. From the problem-solving perspective, then, the Cut rule corresponds to a bottom-up inference step.
Definition 1.2.4 Cut Inference for L:
[Cut] U,T,V => Y if T => X and U,X,V => Y
The type X, which is present in the premises of the Cut inference and absent from the conclusion, is called the cut formula. The degree of a Cut inference is the sum of the degrees of the parameters of the inference, i.e. d(T) + d(U) + d(V) + d(X) + d(Y).

  T => X    U,X,V => Y
  -------------------- [Cut]
  U,T,V => Y

We saw above that the cut-free Gentzen proof procedure guarantees a terminating algorithm because at every step in the construction of the proof tree, the complexity degree systematically decreases: each rule of inference removes a connective from the conclusion, so that the degree of the premise sequents is always strictly smaller than that of the conclusion. With a Cut rule added to the set of inference rules, this situation changes. The premises of this inference contain a cut formula X which disappears from the conclusion. Suppose the cut formula has been derived by complexity-increasing inferences (the 'right' rules); in these cases, the degree of the right premise in the Cut inference will be greater than that of the conclusion. In other words, once we allow ourselves the goal-directed proofs of L + {Cut}, it is no longer clear that decidability carries over. The fundamental theorem in this respect is a categorial version of Gentzen's Cut Elimination Theorem, which guarantees that by means of the Cut rule no theorems can be derived that could not also be derived with a cut-free proof.
Theorem 1.2.1 Cut Elimination Theorem for L (Lambek 1958): L + {Cut} is equivalent to L, i.e. the set of theorems of L is not increased by adding the Cut rule.
Lambek's version of the Cut Elimination Theorem takes the form of a constructive proof, i.e. an actual algorithm is given to transform every proof that makes use of the Cut rule into a cut-free proof. The base case of the algorithm is represented by Cut inferences where one of the premises is an instance of the axiom scheme. In these cases, the conclusion coincides with the other premise, and the Cut inference can be pruned away. The recursive cases of the cut-elimination algorithm work by reduction on the degree of the Cut inference, defined as the sum of the degrees of the cut parameters.
It can be shown that any Cut inference of which the conclusion has been proved without Cut can be transformed into one or two new Cut inferences of smaller degree. Since
no inference can have a negative degree, these proof transformations will ultimately converge on the base case, where the Cut inference can be pruned away. For an illustration of the cut-elimination algorithm, compare Figure 1.2.4 and Figure 1.2.6, two proofs for the L-validity of argument lowering. Figure 1.2.4 is a proof which makes use of a Cut inference. As premises of the Cut inference, we rely on the validity of the type lifting NP => S/(NP\S) and the cancellation of the lifted NP type against the argument type of the higher-order VP type, by simple functional application. The cut formula in this inference is the lifted NP type S/(NP\S).

  NP => NP    S => S
  ------------------ [\L]
  NP,NP\S => S                 S/(NP\S) => S/(NP\S)    S => S
  ------------------ [/R]      ------------------------------ [\L]
  NP => S/(NP\S)               S/(NP\S),(S/(NP\S))\S => S
  --------------------------------------------------------- [Cut]
  NP,(S/(NP\S))\S => S
  -------------------- [\R]
  (S/(NP\S))\S => NP\S

  T = [NP]               d(T) = 0
  U = []                 d(U) = 0
  V = [(S/(NP\S))\S]     d(V) = 3
  X = S/(NP\S)           d(X) = 2
  Y = S                  d(Y) = 0

Figure 1.2.4
Note that the degree of the right premise, with the cut formula in the antecedent (degree=5), is greater than the degree of the conclusion of the Cut inference (degree=3), where the cut formula is no longer present. The degree of the Cut inference as a whole is the sum of the degrees of the parameters U,T,V,X,Y; in this case degree=5. In order to transform this proof into a cut-free proof, we concentrate on the right premise U,X,V => Y, which in the case at hand does not introduce the main connective of the cut formula
X = S/(NP\S). By inspection of the inference rules for the connectives, we know that the sequent U,X,V => Y must have been derived from a premise U',X,V' => Y', also containing the cut formula, but with a degree smaller than the degree of U,X,V => Y. In our example, the inference in question is [\L], with U' = [], V' = [], and Y' = S/(NP\S). A Cut inference that satisfies these conditions can be replaced by a new Cut of the form

  T => X    U',X,V' => Y'
  ----------------------- [Cut]
  U',T,V' => Y'

The degree of this inference is smaller than that of the original Cut (4 < 5). The inference that originally led from U',X,V' => Y' to U,X,V => Y is then applied below the new Cut, leading from U',T,V' => Y' to the original conclusion U,T,V => Y. Figure 1.2.5 shows the result of the replacement.

  NP => NP    S => S
  ------------------ [\L]
  NP,NP\S => S
  ------------------ [/R]
  NP => S/(NP\S)        S/(NP\S) => S/(NP\S)
  ------------------------------------------ [Cut]
  NP => S/(NP\S)                                  S => S
  ------------------------------------------------------ [\L]
  NP,(S/(NP\S))\S => S
  -------------------- [\R]
  (S/(NP\S))\S => NP\S

  T = [NP]           d(T) = 0
  U = []             d(U) = 0
  V = []             d(V) = 0
  X = S/(NP\S)       d(X) = 2
  Y = S/(NP\S)       d(Y) = 2

Figure 1.2.5
The replacement of the Cut inference of Figure 1.2.4 by that of Figure 1.2.5 has brought us to the base case of the cut-elimination algorithm. In the Cut inference of Figure 1.2.5, the right premise is an instance of the axiom scheme, so that the conclusion coincides with the other premise. This Cut inference does not contribute any information to the proof and can be pruned away. Figure 1.2.6 is the result of substituting the subtree rooted at the left premise of the Cut inference for the sequent at the conclusion. The resulting proof tree is the cut-free proof for argument lowering that would be obtained from the bottom-up proof procedure.

  NP => NP    S => S
  ------------------ [\L]
  NP,NP\S => S
  ------------------ [/R]
  NP => S/(NP\S)        S => S
  ---------------------------- [\L]
  NP,(S/(NP\S))\S => S
  -------------------- [\R]
  (S/(NP\S))\S => NP\S

Figure 1.2.6

The case where one of the Cut premises does not introduce the main connective of the cut formula is one of the major cases in the cut-elimination algorithm. For the other case, where the premises do introduce the main connective of the cut formula, we refer to Lambek's original paper. The equivalence of L and L + {Cut}, as stated in the Cut Elimination Theorem, is based on an extensional interpretation of the calculi, where a calculus is identified with the set of provable theorems. But as we saw, the intensional properties of L and L + {Cut}, as alternative problem-solving strategies, are quite different. We return to the respective merits of the top-down and bottom-up proof strategies in detail in Chapters 4 and 5. Observe that in our illustration the cut-free proof is actually simpler than the proof with cut. In order to appreciate the time-saving effects of the Cut rule, we will have to carefully distinguish the complexity of proof trees (as measured by the number of nodes) from the complexity of
the Gentzen search space (as measured by the total number of alternative attempts at constructing a proof tree). Judicious use of Cut inferences allows one to reduce the complexity of the search problem by finding a shorter path leading to a proof.
1.2.3. LAMBDA SEMANTICS FOR GENTZEN PROOFS
So far, we have concentrated on the syntactic and proof-theoretic aspects of derivations in the flexible categorial system L. For the pure Application calculus, the connection between syntax and semantics rested on the correspondence between the basic syntactic rule of cancellation R1 and functional application of the semantic value of the functor to the semantic value of its argument. Van Benthem (1986a) shows that the desired strong link between syntax and semantics carries over to the flexible categorial calculus L. Whereas the Elimination rules ([/L],[\L]) correspond to functional application, the new inference rules Introduction and Cut have equally basic semantic pendants. The Introduction rules ([/R],[\R]) correspond to lambda abstraction over a variable for the subtype introduced by Conditionalization, and the Cut inference to the semantic operation of substitution. (The correspondence can be extended to product categories, but we ignore these here.) When we make this correspondence explicit, it turns out that the semantics for valid type transitions can effectively be read off from the Gentzen proofs. The calculus L is, in this respect, a very strong version of the program of type-driven translation: the semantics for the flexible type transitions is not stipulated rule by rule, but automatically obtained from the same procedure that proves the validity of this transition. We will work out the construction of the semantic recipes accompanying type change on the basis of the Gentzen proof procedure of the previous section. The notion of a categorial sequent is extended to accommodate the semantics. Instead of types, a sequent will now consist of pairs 'Type: Semantics', where the semantic value is a lambda term in a type-theoretic language with the standard model-theoretic interpretation. 
Restricting our attention to the product-free part of L, the inference rules now get the following form (for left division, and analogously for '/'):
Definition 1.2.5 Elimination: functional application
[\L] U,T,Y\X:f,V => Z if T => Y:a and U,X:f(a),V => Z
Definition 1.2.6 Introduction: lambda abstraction
[\R] T => Y\X: λv.Term if Y:v, T => X:Term
Definition 1.2.7 Cut: substitution
[Cut] U,T,V => Z if T => X:Term and U,X:Term,V => Z
The [\L] rule eliminates the connective of a functor Y\X with semantics f by proving two subgoals: to the left of the functor, a sequence T can be identified that derives type Y with semantic value a; and U,X,V derives Z, where the semantic value of X is the result of functional application of f to a. The [\R] rule derives a sequent T => Y\X, where the semantic value of Y\X is the result of abstracting over a variable corresponding to type Y in the semantic value Term which one obtains by proving that the sequence Y,T derives X. The semantic recipe for a Cut inference is obtained by computing the semantic value for the cut formula X on the basis of the sequent T => X and substituting this for the antecedent occurrence of X in the second premise. As an example of the construction of the semantics of a type transition, Figure 1.2.7 shows how the intended meaning for the Composition rule R2 can be read off straightforwardly from the Gentzen proof we used to demonstrate the syntactic validity.

                Y:g(v) => Y:g(v)    X:f(g(v)) => X:f(g(v))
                ------------------------------------------ [/L]
  Z:v => Z:v    X/Y:f, Y:g(v) => X:f(g(v))
  ---------------------------------------- [/L]
  X/Y:f, Y/Z:g, Z:v => X:f(g(v))
  ------------------------------ [/R]
  X/Y:f, Y/Z:g => X/Z:λv.f(g(v))

Figure 1.2.7
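Reading the semantics off a proof can be sketched by letting a backward prover carry 'Type: Semantics' pairs. This is a hedged illustration restricted to the product-free rules: terms are plain strings, `derive_term` is a hypothetical name, and the function returns only the term of the first proof it finds, although a sequent may have several proofs.

```python
import itertools

# Assumed encoding: a basic type is a string; X/Y is ("/", X, Y); Y\X is ("\\", Y, X).
def derive_term(antecedent, goal, fresh=None):
    """antecedent: tuple of (type, term) pairs; returns a term for goal, or None."""
    if fresh is None:
        fresh = (f"v{i}" for i in itertools.count(1))   # supply of fresh variables
    if not antecedent:
        return None
    if len(antecedent) == 1 and antecedent[0][0] == goal:
        return antecedent[0][1]                          # axiom: pass the term along
    if isinstance(goal, tuple):                          # Introduction: lambda abstraction
        op, a, b = goal
        v = next(fresh)
        if op == "/":
            body = derive_term(antecedent + ((b, v),), a, fresh)
        else:
            body = derive_term(((a, v),) + antecedent, b, fresh)
        if body is not None:
            return f"λ{v}.{body}"
    for i, (t, fun) in enumerate(antecedent):            # Elimination: application
        if isinstance(t, str):
            continue
        op, a, b = t
        if op == "/":                                    # t = X/Y, argument to the right
            spans = [(i + 1, j, i, j) for j in range(i + 2, len(antecedent) + 1)]
            arg_type, res_type = b, a
        else:                                            # t = Y\X, argument to the left
            spans = [(j, i, j, i + 1) for j in range(i)]
            arg_type, res_type = a, b
        for lo, hi, cut_from, cut_to in spans:
            arg = derive_term(antecedent[lo:hi], arg_type, fresh)
            if arg is None:
                continue
            rest = antecedent[:cut_from] + ((res_type, f"{fun}({arg})"),) + antecedent[cut_to:]
            result = derive_term(rest, goal, fresh)
            if result is not None:
                return result
    return None

x_over_y, y_over_z, x_over_z = ("/", "X", "Y"), ("/", "Y", "Z"), ("/", "X", "Z")
print(derive_term(((x_over_y, "f"), (y_over_z, "g")), x_over_z))  # λv1.f(g(v1))
```

The printed term is the composition recipe of Figure 1.2.7, up to the name of the bound variable.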
GENTZEN PROOFS: FINITE RANGE OF READINGS
The careful reader will have observed that for an L-sequent A1,...,An => B there will in general be a number of different proofs. The number of cut-free proofs is finite, and will depend on the different orders in which the connectives can be removed; the number of proofs in L + {Cut} is infinite, since the Cut inference might introduce a detour in the proof that would be removed by the Cut elimination algorithm. The different proofs may correspond to different meanings. The following theorem reassures us that the number of distinct readings for a given result type is finite.
Theorem 1.2.2 (Van Benthem 1986a): Modulo logical equivalence the number of distinct readings for an L-sequent A1,...,An => B is finite.
BACKGROUND
The source paper Lambek (1958) is reprinted in Buszkowski et al. (1988). Apart from Lambek's own recapitulation in Oehrle, Bach & Wheeler (1988), this collection contains two other contributions, Buszkowski (1988) and Van Benthem (1988a), that should be consulted for further references. The Lambek calculus, in its original formulation, is a calculus of syntactic types. Van Benthem presents the Lambek system as a type-shifting calculus for semantic types, and works with a non-directional version since the notion of directionality makes no sense for the types qua semantic objects. The type-driven interpretation procedure carries over to L under the standard Montague view of a homomorphic mapping from syntactic to semantic types. See Klein & Sag (1985) for the general background of type-driven translation. The mapping between the algebra of proofs and the semantic algebra, as captured in Definitions 1.2.5 to 1.2.7, is based on the proof-theoretic idea of 'formulas-as-types', cf. Hindley and Seldin (1986). We return to the division of labour between the syntactic and the semantic algebra in Chapter 5.
1.3. THE CATEGORIAL HIERARCHY
1.3.1. OPTIONS FOR CATEGORIAL CALCULI
We are in a position now to offer a guided tour through the categorial landscape by comparing the properties of the system L with those of a number of weaker and stronger calculi. The weaker systems under consideration are AB and F, the classical Application system of Ajdukiewicz and Bar-Hillel, and Cohen's Free Categorial Grammars respectively. Stronger than Lambek's Directional Associative Calculus L are the systems LP and LPC, LPE (together LPCE) studied by Van Benthem, which are based on structural rule extensions of L, as will be discussed below. The systems form an inclusion hierarchy

  AB ⊂ F ⊂ L ⊂ LP ⊂ LPC, LPE ⊂ LPCE

in the sense that all theorems of the calculi lower on the hierarchy are also theorems of the calculi higher on the scale, but not vice versa. Notice that the systems selected here only represent a set of well-studied calculi; when we discuss other systems in the next chapters, they do not necessarily fit in the linear ordering of the calculi discussed here. It is easy to characterize AB and F with respect to L on the basis of the reduction laws R1-R5 (one of the Division variants suffices). For L the laws R1-R5 constitute a representative set of theorems, among the infinite number of derivable theorems. The characteristic theorem of L, i.e. the theorem which marks the transition from L to weaker systems, is the Division rule R5. The calculus F is obtained by dropping the Division rule R5, i.e. the reduction system of F consists of the finite set of reduction schemes {R1,R2,R3,R4}. F - {R2,R3,R4} of course yields the system AB with Application R1 as the only reduction rule.
Figure 1.3.1 (diagram of the categorial hierarchy; surviving labels: LP + Contraction, LP + Expansion)
Let us review some characteristic properties that distinguish the calculus L first from weaker systems in the categorial hierarchy, then from stronger systems.
L AND WEAKER SYSTEMS
Associativity
Whereas AB assigns a unique derivation to a non-ambiguous expression, F and L will associate the same expression with multiple derivations. L is stronger than F in this respect, in that the concatenation operator in L is fully associative: X•(Y•Z) <=> (X•Y)•Z. As a result of Associativity, if a concatenation of types A1,...,An derives B in L, it will derive B for any bracketing of A1,...,An. The validity of the Division rule R5 is an immediate consequence of the Associativity of L; full associativity is lost in systems where R5 does not hold. Nevertheless, the system F inherits a great deal of the flexibility of L; in fact, all the derivations used in Section 1.1.2. to demonstrate the flexible notion of constituent structure can be obtained on the basis of F. Example 1.3.1 below is an illustration of a theorem that cannot be derived in F, although it is L-valid. This example requires recursive application of R5 to the type C\D in order to obtain the required type for the main functor to combine with its argument by Application; the simple Composition rule R2 (an axiom of F) is not applicable. Notice that the sequent of Example 1.3.1 would be derivable under a recursive extension of the Composition rule R2 (cf. Steedman 1984, Moortgat 1984): instead of cancelling the domain subtype of the main functor C\D immediately against the range subtype of the subordinate functor, we recursively descend into the range of the subordinate functor until the required matching middle term is reached. As the following chapters will show, a wide range of natural language phenomena supports the L-valid generalized form of composition rather than the non-recursive F instantiation: the near-associativity of F does not seem to correspond to linguistically relevant distinctions.
Example 1.3.1 (A\(B\C)),(C\D) => (A\(B\D))

  [R5]  (C\D) => ((B\C)\(B\D))
  [R5]  ((B\C)\(B\D)) => ((A\(B\C))\(A\(B\D)))
  [R1]  (A\(B\C)),((A\(B\C))\(A\(B\D))) => (A\(B\D))
Non-finite axiomatizability
Comparing L with its weaker relatives AB and F we noticed that the combinatory possibilities of the latter can be captured in the form of a finite number of cancellation schemes with the status of axioms, {R1,R2,R3,R4} in the case of F and just {R1} in the case of AB. The full calculus L does not share this property of finite axiomatizability.
Theorem 1.3.1 (Zielonka 1981): No finite categorial calculus (i.e. extension of the basic Application rule R1 in the form of a finite number of cancellation schemes) is equivalent to (the product-free part of) L.
Zielonka's result does not imply that it is impossible to give a finite design for L, however. But it implies that in order to faithfully reflect the combinatory possibilities of L, the axiomatization must be based on recursive generalizations of the basic reduction laws, rather than on arbitrary extensions of the cancellation schemes R1-R6. The recursive generalizations do justice to the non-finite axiomatizability result in that they represent an infinite family of cancellation schemes. The discussion of Division in the previous paragraph already suggested this. The Division rule R5 is the characteristic theorem of L which distinguishes it from the weaker F. But if we want to derive a theorem on the basis of R5 which is not derivable on the basis of Composition, we have to apply R5 recursively, as we just demonstrated. Figure 1.3.2 reproduces Zielonka's own recursive axiomatization of L, which has Lifting (R4) and Division (R5) as the basic axioms of type-change, Application as the only binary reduction rule, and two derived inference rules, Z1 and Z2, allowing for recursion of the unary type transitions on the domain and on the range components of complex categories. Where Application fails for initial type assignments, the recursively generalized unary type transitions of A2 and A3 can feed the Cut rule, until the types have been shifted to match the Application scheme.
We have seen an illustration of a Zielonka-style derivation in Example 1.1.20. In the discussion of Lambek parsing in Chapter 5 we will present a recursive axiomatization of a subsystem of L which is closely related to the Zielonka axiomatization.

  [A1]  X => X
  [A2]  X/Y => (X/Z)/(Y/Z)          Y\X => (Z\Y)\(Z\X)
  [A3]  X => Y/(X\Y)                X => (Y/X)\Y
  [R1]  X/Y,Y => X                  Y,Y\X => X
  [Z1]  X/Z => Y/Z if X => Y        Z\X => Z\Y if X => Y
  [Z2]  Z/Y => Z/X if X => Y        Y\Z => X\Z if X => Y

  [A1] Identity
  [A2] Division
  [A3] Lifting
  [R1] Reduction: base case (application)
  [Z1] Reduction: recursion on the range
  [Z2] Reduction: recursion on the domain
  + Cut (cf. Definition 1.2.4)

Figure 1.3.2

Count invariance
A crucial difference between the Gentzen proof system for implicational logic and its categorial relative L is their respective counting ability. In the logical case, the antecedent is a set of premises; in the categorial case of L, it is an ordered list of type occurrences. In order to prove a sequent A1,...,An => B one can use the antecedent types once and only once. Double use of antecedent types is excluded (so that from NP,NP\(NP\S) one cannot derive S), and one cannot have unused antecedent types (excluding a derivation NP,NP,(NP\S) => S, for example). The fact that L is an occurrence logic leads to an important invariance property related to the category count. Recall from Definition 1.1.6 the count function from Van Benthem (1986a), which counts the positive minus the negative occurrences of a basic type in an arbitrary type. The count function is an invariant over L derivations in the following sense:
Theorem 1.3.2 (Van Benthem 1986a): If a sequent A1,...,An => B is derivable in the calculus L, then for basic categories X, the X-count of the antecedent A1,...,An equals the X-count of the succedent B.
For axiom sequents X => X the theorem holds immediately, and for the inference rules of the L calculus, the conclusion preserves the count values of the premise sequents, as the reader can immediately verify. The count invariant has practical proof-theoretic relevance because it allows one to evaluate nodes in the Gentzen search space before actually exploring them, as will be discussed in Chapter 4.
STRONGER THAN L: BETWEEN L AND LPCE
Despite its great flexibility, the calculus L has a number of limitations that make it necessary to look for stronger systems. The first limitation has to do with the fact that the directional calculus L is order-preserving: the Associativity property allows one to rebracket a string in any conceivable way, but it does not allow for permutations. The second limitation of L is related to the property of count-invariance, which prohibits double use of antecedent types or unused types. In the present paragraph we briefly introduce the formal properties of calculi that remove these two limitations. The systems LP and LPC, LPE, LPCE can be derived from L by moving closer to the full constructive implicational logic by adding a set of structural rules, besides the logical rules for the connectives.
Definition 1.3.1 Structural rules for L (Van Benthem 1987a)
Permutation: T => X if permute(T,P) and P => X
Contraction: U,X => Y if U,X,X => Y
Expansion: U,X,X => Y if U,X => Y
LP is obtained from L by adding a structural rule of Permutation. If a sequent A1,...,An => B is derivable in the directional system L, any permutation of the antecedent A1,...,An will also derive B in LP, i.e. LP is the permutation closure of L. It will be obvious that for the purposes of LP one division connective with its Introduction and Elimination inference rule would be sufficient: the structural rule of Permutation allows one to change the order of the antecedent types. In the calculus LP the concatenation operation is not only associative, but also commutative, i.e. X•Y <=> Y•X. LP loses the order-preservation property of L, but it keeps count-invariance. Whereas L was characterized in Figure 1.3.1 as a logic of ordered lists of type occurrences, LP treats the antecedent as a multiset (or bag) of types.
Count-invariance is lost as soon as we move beyond LP by the further addition of structural rules of Contraction and/or Expansion, which allow for multiple use of antecedent types (LPC) and for unused types (LPE). In LPCE, as in the full constructive implicational logic, the antecedent is a set of types: if the antecedent U,X
derives Y, enlarging the antecedent set to U,X,X will preserve derivability. Natural language phenomena that require some of the expressive power of structural rule extensions of L are easy to find. Without attempting linguistic analyses here, we just mention some illustrations. For the structural rule of Permutation, one can think of so-called stylistic movement phenomena, i.e. reordering in directional systems that does not form part of the core grammar as such. Heavy NP Shift would be an example in English. On the assumption that a verb like consider is assigned the type ((NP\S)/AP)/NP, which directly accepts the standard ordering he considers NP incompetent, the shifted ordering of the end-sequent in Example 1.3.2 is the result of a Permutation inference, which has exchanged the order of the heavy NP object and the light AP complement.
Example 1.3.2 Permutation: Heavy NP Shift. Compare:
  he considers Bill incompetent
  he considers incompetent any candidate who hasn't ...

  NP,((NP\S)/AP)/NP,NP,AP => S      (itself derived by [/L])
  ---------------------------- [Permutation]
  NP,((NP\S)/AP)/NP,AP,NP => S

The structural rules of Contraction and Expansion can be used to derive sequents with missing or redundant types in the antecedent, as would be called for in the case of deletion and copying phenomena. Gapping comes to mind as an illustration of Contraction. In the discussion of (non-constituent) coordination, we have seen that the associativity of L guarantees that arbitrary subsequences of a derivable sequence can be collected into a constituent, which can then be conjoined by the Boolean particles of type (X\X)/X, for variable X. However, the order-preserving nature of L-valid reductions requires the conjuncts to consist of material adjacent to the conjunction particles. Gapping, therefore, falls outside the scope of L, as it involves the deletion of material which is not adjacent to the conjunction particle: the right conjunct of a gapping sentence falls apart in disconnected subsequences that cannot be combined by means of L-valid reductions. The end-sequent of Example 1.3.3 is derived by means of a Contraction inference, which has deleted the types ((NP\S)/VP)/NP and VP from the second conjunct.
Example 1.3.3 Contraction: Gapping.
  John promised Mary to stop smoking and Fred (promised) Sue (to stop smoking)

  NP,((NP\S)/VP)/NP,NP,VP,(S\S)/S,NP,((NP\S)/VP)/NP,NP,VP => S      (itself derived by [/L])
  -------------------------------------------------------- [C]
  NP,((NP\S)/VP)/NP,NP,VP,(S\S)/S,NP,NP => S

Whereas Contraction results in a sequent where the contracted types do double duty semantically, the structural rule of Expansion leads to sequents with redundant types. One can think here of the phenomena discussed by Ross (1967) under the rubric Copying. Compare the Heavy NP Shift sentence of Example 1.3.2 with the case of Right Dislocation in Example 1.3.4. There are two occurrences of the direct object NP type to satisfy one role in the thematic structure: the dislocated NP provides the semantic content of the direct object, but from a syntactic point of view, the semantically empty object pronoun NP has already satisfied the first NP subtype of the functor ((NP\S)/AP)/NP.
Example 1.3.4 Expansion: Copying.
  He considers them incompetent, these new candidates who haven't...

  NP => NP    NP,(NP\S)/AP,AP => S      (right premise itself derived by [/L])
  -------------------------------- [/L]
  NP,((NP\S)/AP)/NP,NP,AP => S
  ------------------------------- [Expansion]
  NP,((NP\S)/AP)/NP,NP,AP,NP => S

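Read backward, each structural rule maps the antecedent of a conclusion to candidate antecedents for its premise. The sketch below makes that concrete for antecedents given as tuples of type names; it is an illustration under assumptions, not part of the calculus as defined: the function names are hypothetical, and Contraction and Expansion are allowed at any position rather than only at the right edge of the antecedent.

```python
from itertools import permutations

def permutation_premises(antecedent):
    """Backward Permutation: any reordering of the antecedent may serve as premise."""
    return set(permutations(antecedent))

def contraction_premises(antecedent):
    """Backward Contraction (U,X => Y if U,X,X => Y): duplicate one occurrence."""
    return {antecedent[:i + 1] + (antecedent[i],) + antecedent[i + 1:]
            for i in range(len(antecedent))}

def expansion_premises(antecedent):
    """Backward Expansion (U,X,X => Y if U,X => Y): drop one of two adjacent copies."""
    return {antecedent[:i] + antecedent[i + 1:]
            for i in range(len(antecedent) - 1)
            if antecedent[i] == antecedent[i + 1]}

# Heavy NP Shift: the shifted order has the standard order among its premises.
verb = "((NP\\S)/AP)/NP"
shifted = ("NP", verb, "AP", "NP")
standard = ("NP", verb, "NP", "AP")
print(standard in permutation_premises(shifted))   # True
print(expansion_premises(("A", "B", "B")))          # {('A', 'B')}
```

The number of candidate premises grows quickly, in line with the remark that unrestricted use of the structural inferences is linguistically undesirable.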
The empirical motivation for systems beyond L forms one of the main areas of research activity at the moment. It will be clear from the above examples that from a linguistic point of view we are interested in systems between L and the stronger calculi, since
blind application of the structural inferences would massively overgenerate. The characterization of such intermediary systems is far from trivial, as we will see in Chapter 3, where we will discuss discontinuity phenomena in greater detail.

Recognizing capacity
The study of the recognizing capacity of various categorial calculi in connection with the Chomsky hierarchy has yielded a rich body of results, from which we mention only the most essential here. From the point of view of recognizing capacity, both AB and F are weakly equivalent to CF Grammars. For AB this equivalence is well-known and not so surprising; for F the result is more striking: although the strong generative capacity increases dramatically, F stays within the CF realm. The major open mathematical question is whether L itself recognizes more than the CF languages. Buszkowski (see his 1988) has been able to show CF equivalence for unidirectional Lambek systems, but not for the full system L with right- and left-division. Van Benthem (1987a,b) proves that the non-directional calculus LP recognizes all permutation closures of CF languages, which includes non-CF languages, for example Bach's MIX language consisting of an equal number of a's, b's and c's in any order, seen as the permutation closure of CF (abc)+. Whether it recognizes more than the permutation closure of CF languages is open again. Finally, LPCE has very poor discriminatory power: it recognizes only a small subset of the regular languages.
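The description of MIX as a permutation closure can be checked by brute force on short strings. The following Python sketch is only an illustration: the function names `in_mix` and `closure_slice` are made up for this purpose, and the comparison covers just the strings of length 3 and 6, so it demonstrates the definition rather than proving anything about recognizing capacity.

```python
from itertools import permutations, product

def in_mix(word):
    """True if word consists of an equal, non-zero number of a's, b's and c's."""
    if not word or set(word) - set("abc"):
        return False
    return word.count("a") == word.count("b") == word.count("c")

def closure_slice(n):
    """All rearrangements of (abc)^n: the length-3n part of the permutation closure of (abc)+."""
    return {"".join(p) for p in permutations("abc" * n)}

# Compare the counting definition with the permutation closure for n = 1 and n = 2.
for n in (1, 2):
    by_count = {"".join(w) for w in product("abc", repeat=3 * n) if in_mix("".join(w))}
    assert by_count == closure_slice(n)
    print(n, len(by_count))   # prints "1 6", then "2 90"
```

The counting test is order-blind, which is exactly what a permutation-closed calculus such as LP can express and what the order-preserving system L cannot.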
BACKGROUND
For AB, see Ajdukiewicz (1935/1967) and Bar-Hillel et al. (1960). The calculus F of Cohen (1967) was the by-product of an attempt to prove conjectured CF-equivalence for L. Zielonka (1978, 1981) exposed the flaw in this proof: F does not have Division as a theorem whereas L does. The (1981) paper is interesting for a number of alternative formalizations of flexible calculi. Buszkowski (1988) has a useful survey of results on generative power. Van Benthem (1987a,b) should be consulted for LP and LPCE. See also Friedman et al. (1986) for a demonstration that weak subsystems of LP (based on the Steedman systems to be discussed in Chapter 3) recognize languages beyond CF.
1.3.2. POLYMORPHISM AND UNIFICATION
The category objects we have been studying so far were all fully instantiated structures. We will explore in this final section some consequences of allowing the category objects to be associated with partial information that is gradually made more specific in the course of a derivation by means of unification. This is an active area of categorial research, mainly in connection with logic programming and unification based approaches to grammar in general. Here are some open research questions. First, there are two sources for polymorphism in the categorial systems we have considered: the polymorphism following from the flexible calculus of type transitions, or polymorphism resulting from partial specification in categories right from the lexical type assignment. The proper balance between these two forms of polymorphism needs investigation: How much variable polymorphism is needed beyond the inherent polymorphism of the type-shifting calculus? A second open issue concerns complexity. In general, one can conceive polymorphic versions of the various calculi presented above. But the complexity properties of polymorphic variants of the flexible categorial calculi are as yet unknown. It is not clear, for example, whether a polymorphic version of L remains decidable. Finally, it has to be shown how the mapping between syntax and semantics discussed in Section 1.2.3 can be extended to polymorphic calculi. In Chapter 5 we investigate a combination of the Lambek-Gentzen calculus with the Resolution proof procedure to address these questions in greater detail. In the present section, we will informally characterize the crucial notions of polymorphism and unification. In order to deal with partially specified category objects, we extend the original definition of CAT with a set of variables VAR so that VAR is a subset of CAT. By the inductive definition of the set of possible categories, a category can now contain variables at each level of embedding. 
We call a category that is not completely instantiated a polymorphic type. To distinguish variables from the atomic categories, we will write the variables in lower case italic.

Example 1.3.5 Polymorphic types: (x/(NP\x)), y, ((AP/y)/(AP/y)), ...
Intuitively, the unification of two category objects will be a category that combines the information of both categories if this is compatible, by instantiating the variables in both categories to such values that the categories become identical.
Definition 1.3.2 Unification. Two types T1 and T2 unify as follows:
(1) If T1 and T2 belong to BASCAT, they unify if they are identical.
(2) If T1 is a variable and T2 anything, they unify, whereby T1 is instantiated to T2. Conversely, if T2 is a variable, then T2 is instantiated to T1.
(3) If T1 and T2 are complex types then they unify only if (a) T1 and T2 have the same main connective, and (b) their immediate subtypes unify.
Otherwise, unification of T1 and T2 fails.

The unification algorithm can be clarified on the basis of the graph representation introduced in Section 1.1. Consider Figure 1.3.3, where we try to unify two polymorphic categories.
((AP/x)/(AP/x)) ⊔ ((y/PP)/z) = ((AP/PP)/(AP/PP)) with {x = PP, y = AP, z = (AP/PP)}
Figure 1.3.3

In this example, an attempt is made to unify two complex categories. The main connectives match, so we can recursively descend to the immediate subtypes, and check whether these can be unified. In the case of the range subtypes (AP/x) and (y/PP), unification can proceed since the connectives are identical. The unification of AP and y succeeds with the instantiation {y = AP}; similarly for the unification of x and PP with instantiation {x = PP}. The unification of the domain subtypes of the top-level categories is immediately a non-recursive case of matching a complex category (AP/x) with a variable category z; unification succeeds with an instantiation {z = (AP/PP)}, as x was already constrained to equal PP. In the following paragraphs, we see unification at work for the two types of polymorphism introduced above.
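Definition 1.3.2 translates almost clause by clause into a short program. The Python sketch below is an illustration under assumptions that are not part of the text: types are encoded as nested tuples, with `("/", X, Y)` for X/Y and `("\\", Y, X)` for Y\X, basic categories are upper-case strings, and variables are lower-case strings. Like the definition, it performs no occurs check.

```python
def is_var(t):
    # Assumed encoding: variables are lower-case strings, basic categories upper-case.
    return isinstance(t, str) and t.islower()

def walk(t, subst):
    """Follow variable bindings until a non-variable or an unbound variable is reached."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(t1, t2, subst=None):
    """Return a substitution (dict) that makes t1 and t2 identical, or None on failure."""
    subst = {} if subst is None else subst
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if t1 == t2:                      # clause (1), and identical variables
        return subst
    if is_var(t1):                    # clause (2)
        return {**subst, t1: t2}
    if is_var(t2):
        return {**subst, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and t1[0] == t2[0]:
        for a, b in zip(t1[1:], t2[1:]):   # clause (3): same connective, unify subtypes
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None                       # otherwise unification fails

def resolve(t, subst):
    """Apply a substitution exhaustively to a type."""
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(a, subst) for a in t[1:])
    return t

# The two categories of Figure 1.3.3.
left = ("/", ("/", "AP", "x"), ("/", "AP", "x"))
right = ("/", ("/", "y", "PP"), "z")
s = unify(left, right)
print({v: resolve(v, s) for v in ("x", "y", "z")})
print(resolve(left, s) == resolve(right, s))   # True
```

The binding recorded for z is (AP/x); it only reads as (AP/PP) once the binding for x is applied, which is why the sketch separates `unify` from `resolve`.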
TYPE SHIFTING POLYMORPHISM
The reduction laws R1-R6 are to be interpreted as rule schemata with implicit universal quantification: they yield specific instances when we instantiate the variables in the schemata. With the notions of polymorphism and unification at our disposal, we can clarify the relation between the rule schemata and their specific instances. The derivations in Section 1.1.2 up till now have kept a mysterious tinge: what in fact we have shown there is the final result of all the variable instantiations in a successful derivation on the basis of R1-R6; we haven't actually shown how these instantiations were found. We assumed up till now that Application and Composition cancel a negative occurrence of a subtype against a positive occurrence under identity. Let us investigate here what happens when instead of identity, we require the cancelled subtypes to unify.

R1 Application: x/y, y => x
R2 Composition: x/y, y/z => x/z
R4 Lifting: x => y/(x\y)
Consider Example 1.3.6, the left-branching analysis for the Dutch embedded sentence discussed above in Section 1.1.2. In the first step of the derivation, we try to reduce S/S and NP, which in the case of Application rule R1 fails on the basic type assignment since the argument category S cannot unify with NP: unification of two atoms fails when these are non-identical. So we lift the NP type of John to the polymorphic (x/(NP\x)), a functor looking for a functor with NP domain and some range type x, to give an ultimate result x. Lifting itself does not instantiate the information concerning the range category: we want this information to be contributed by the context, through unification. Reduction of S/S with the lifted (x/(NP\x)) now succeeds by means of Composition R2. Composition requires the unification of S (the domain of type S/S) with x (the range of (x/(NP\x))). This unification leads to the instantiation {x = S} and gives the result type (S/(NP\S)) for the combination omdat John. In order to combine this functor with the AP type, the latter again can be lifted to (y/(AP\y)), so that the reduction of (S/(NP\S)) with (y/(AP\y)) can be attempted. Composition is successful here, because the middle term (NP\S) unifies with y, giving the result (S/(AP\(NP\S))) for the prefix omdat John gek. For the final step in the derivation no further variable instantiations are necessary: Application can cancel (AP\(NP\S)) under identity, the trivial case for unification. In the derivation below, we show the substitutions that have to be made at each reduction step.
Example 1.3.6
omdat John gek is
because John mad is
'because John is mad'

[R1] S
  [R2] (S/(AP\(NP\S)))   {y = (NP\S)}
    [R2] (S/(NP\S))   {x = S}
      (S/S) omdat
      [R4] (x/(NP\x))
        NP John
    [R4] (y/(AP\y))
      AP gek
  (AP\(NP\S)) is
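The left-branching derivation of omdat John gek is can be replayed mechanically once Application and Composition cancel subtypes under unification instead of identity. The Python sketch below is a hypothetical illustration: the tuple encoding of types, the helper names, and the fresh variable names v1, v2 (standing in for x and y) are assumptions made here, and only the rightward rules R1, R2 and R4 needed for this one example are covered.

```python
import itertools

_fresh = itertools.count(1)          # supplies fresh variable names v1, v2, ...

def is_var(t):
    return isinstance(t, str) and t.islower()

def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0]:
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def resolve(t, s):
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0], resolve(t[1], s), resolve(t[2], s))
    return t

def show(t):
    # ("/", X, Y) prints as (X/Y); ("\\", Y, X) prints as (Y\X)
    return t if isinstance(t, str) else "(" + show(t[1]) + t[0] + show(t[2]) + ")"

def lift(t):
    # R4 Lifting: t => v/(t\v), with v a fresh variable
    v = "v" + str(next(_fresh))
    return ("/", v, ("\\", t, v))

def application(f, arg, s):
    # R1 Application: x/y, y => x, cancelling y under unification
    if isinstance(f, tuple) and f[0] == "/":
        s2 = unify(f[2], arg, s)
        if s2 is not None:
            return resolve(f[1], s2), s2
    return None

def composition(f, g, s):
    # R2 Composition: x/y, y/z => x/z, cancelling the middle term y under unification
    if all(isinstance(t, tuple) and t[0] == "/" for t in (f, g)):
        s2 = unify(f[2], g[1], s)
        if s2 is not None:
            return resolve(("/", f[1], g[2]), s2), s2
    return None

omdat, john, gek = ("/", "S", "S"), "NP", "AP"
is_ = ("\\", "AP", ("\\", "NP", "S"))

assert application(omdat, john, {}) is None                   # S and NP do not unify
omdat_john, s1 = composition(omdat, lift(john), {})           # binds v1 to S
omdat_john_gek, s2 = composition(omdat_john, lift(gek), s1)   # binds v2 to (NP\S)
result, s3 = application(omdat_john_gek, is_, s2)             # cancels under identity
for t in (omdat_john, omdat_john_gek, result):
    print(show(t))
```

The sketch applies the rules in a fixed order chosen by hand; searching for that order is the parsing problem and is not attempted here.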
POLYMORPHISM: BASIC TYPE ASSIGNMENT
For basic type polymorphism, consider the Boolean expressions not, and, or. Given a category theory that allows for variable types, we can assign these chameleon words the type x/x (negation) and (x\x)/x (coordination), as suggested in Lambek (1958). It makes no sense to assign them a large supply of fully instantiated categories, since the discussion of non-constituent coordination has made it clear that in principle these expressions must be able to assume an infinite number of categorial identities depending on the context they find themselves in. Observe that for negation, the various instances for incomplete expressions can be obtained from a semantically primitive case S/S on the basis of L(P) valid type transitions (Division R5). But for conjunction, this does not work: a VP conjunction type ((NP\S)\(NP\S))/(NP\S) cannot be derived from (S\S)/S by L(P) valid means, since this type transition would violate count invariance, as the reader can check. Within L, then, conjunction can serve as an example of variable polymorphism which is not reducible to the inherent type-shifting case. (The alternative here, as shown in Van Benthem (1988c) is to step beyond the occurrence logic of L, and derive the Boolean type transitions within L(P)C on the basis of a restricted form of Contraction, allowing multiple use of premise types.)
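The count check left to the reader can be carried out mechanically. The sketch below assumes the usual count function for a basic type b: an occurrence of b counts 1, any other basic type counts 0, and a functor counts its range minus its domain. The tuple encoding and the names `count` and `same_counts` are illustrative assumptions. Equal counts are only a necessary condition for an L(P)-valid transition, so the check can refute a transition but never establish one.

```python
def count(b, t):
    """Count of basic type b in type t: range count minus domain count for functors."""
    if isinstance(t, str):
        return 1 if t == b else 0
    conn, left, right = t
    if conn == "/":                               # left/right: range left, domain right
        return count(b, left) - count(b, right)
    return count(b, right) - count(b, left)       # left\right: domain left, range right

def same_counts(source, target, basics=("S", "NP")):
    """Necessary condition for source => target: identical counts for every basic type."""
    return all(count(b, source) == count(b, target) for b in basics)

VP = ("\\", "NP", "S")                            # NP\S
neg_s, neg_vp = ("/", "S", "S"), ("/", VP, VP)    # S/S and (NP\S)/(NP\S)
conj_s = ("/", ("\\", "S", "S"), "S")             # (S\S)/S
conj_vp = ("/", ("\\", VP, VP), VP)               # ((NP\S)\(NP\S))/(NP\S)

print(same_counts(neg_s, neg_vp))                 # True: consistent with Division
print(same_counts(conj_s, conj_vp))               # False: the NP counts differ
print(count("NP", conj_s), count("NP", conj_vp))  # 0 1
```

The S counts agree for the two conjunction types (both are -1); it is the NP count, 0 against 1, that rules the transition out.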
Example 1.3.7 Unification in polymorphic conjunction type (x\x)/x.
omdat John gek en Mary dom is
because J. silly and M. stupid is

[R1] S   {x1 = S}
  [R1] (x1/(AP\(NP\x1)))   {x2 = x1}
    [R2] (x1/(AP\(NP\x1)))   {y1 = (NP\x1)}
      [R4] (x1/(NP\x1))
        NP John
      [R4] (y1/(AP\y1))
        AP gek
    [R1] ((x2/(AP\(NP\x2)))\(x2/(AP\(NP\x2))))   {x = x2/(AP\(NP\x2))}
      ((x\x)/x) en
      [R2] (x2/(AP\(NP\x2)))   {y2 = (NP\x2)}
        [R4] (x2/(NP\x2))
          NP Mary
        [R4] (y2/(AP\y2))
          AP dom
  (AP\(NP\S)) is

In Example 1.3.7 above we repeat the derivation of the nonconstituent coordination omdat John gek en Mary dom is from Example 1.1.13, making explicit the interaction of polymorphism and unification that was left implicit above. The conjuncts John gek and Mary dom are the prototypical examples of non-constituents: the combination of two atomic maximal projections, NP and AP. Lifting NP to (x/(NP\x)) and AP to (y/(AP\y)), and composing the higher order functors with the instantiation {y = (NP\x)} leads to the conjunct type (x/(AP\(NP\x))). At this stage, it is unclear what the final range of this higher-order functor will be: that it is going to be an S is a piece of information to be contributed later by the verb is with type (AP\(NP\S)). This illustrates an important point about unification: when unifying partially specified pieces of information, we look for the most general unification, i.e. we commit ourselves as little as possible in the instantiation of variables. The Boolean connective en now finds itself between two conjuncts of equal type, be it that this type, as we just saw, is still the partially instantiated (x/(AP\(NP\x))). The polymorphic conjunction (x\x)/x now has to commit itself to be a conjunction of two (x/(AP\(NP\x))) types in order to consume its two arguments by Application. Only when the complete conjunction John gek en Mary dom is combined with the verb of type (AP\(NP\S)) the identity of the range category x can be determined by instantiating the variable to S.
BACKGROUND
We have restricted our attention to the simplest form of unification (term unification) for category objects with a fixed number of components at each level of recursion. See Shieber (1986) for the more general case of unification that would be called for when we consider feature structures instead of unanalyzed basic categories. The designations 'Unification Categorial Grammar' (Zeevat, Klein & Calder 1987) and 'Categorial Unification Grammar' (Uszkoreit 1986, Bouma 1987) sufficiently stress the centrality of unification on partially specified category objects in the works mentioned. Van Benthem (1988b) discusses the effects of polymorphism on decidability. For unification-based parsing approaches to categorial calculi weaker than L, see Pareschi (1987), Wittenburg (1986,1987).
CONCLUSION
We started this chapter with a catalogue of reduction laws that have been introduced in the literature as a growing range of linguistic phenomena was subjected to categorial analysis. Then we moved to the more abstract perspective of the Lambek-Gentzen calculus which establishes the underlying coherence among these reduction laws by interpreting them as consequences of a logic of the type-forming connectives. Depending on the set of logical and structural rules one wishes to adopt, one can unfold a hierarchy of calculi, embodying distinct theories of categorial derivability. In the first part of this book, we will investigate the linguistic motivation for the flexible theory of categorial derivability. The next chapter adduces empirical support for the associativity property of the directional calculus L by studying restructuring phenomena in syntax and morphology. In Chapter 3, we turn to discontinuous dependencies which invite us to explore the theoretical space between the order-preserving system L and the permutation-closed variant LP. In the second part, we will shift to a logical perspective, and study the Lambek-Gentzen proof theory as an attractive representative of the 'parsing-as-deduction' approach (cf. Pereira & Warren 1983). We will complement the Lambek-Gentzen calculus with the Resolution proof procedure in order to deal with partial information in the course of categorial derivations. The two parts of the book rely on the material discussed in this introductory chapter; apart from that dependence, they are set up as self-contained and can be read in any order.
CHAPTER 2 ASSOCIATIVITY AND RESTRUCTURING
2.0. INTRODUCTION
The property which distinguishes the directional calculus L from weaker type-shifting systems is the full associativity of syntactic concatenation, as the discussion of the categorial hierarchy has demonstrated. In the present chapter, we adduce empirical motivation for the associativity property on the basis of an investigation of the relationship between prosodic constituent structure and semantic interpretation. Standard generative approaches to language structure distinguish a number of autonomous levels of representation, accounting for the properties of semantic, morphological, syntactic and phonological structure in an independent way. The autonomous design of the different representational levels makes the fit between levels imperfect, and additional non-trivial readjustment principles are needed to govern the mapping between them. Within the broad algebraic framework of Montague's Universal Grammar, more attractive options are available: we refer to Oehrle (1988) for a discussion of the general methodological background. As an alternative to the 'readjustment' approach towards the mapping between levels of representation we will study the directional calculus L. With respect to syntactic constituent structure, the associativity of L entails structural completeness: if a sequence of expressions reduces to a result type Y, given a set of initial (lexical) type assignments to the atoms, the resulting type can be derived for any grouping of the subexpressions into constituents. That is, from the point of view of the calculus L the notion of autonomous syntactic constituent structure becomes empirically void. We will argue that the observable structure that remains is to be interpreted as the autonomous prosodic organization of the string: the intonational rather than the syntactic phrasing. 
The Lambek calculus offers a theoretic framework in which the prosodic structure of an expression can be directly interpreted semantically, without the mediating agency of morphosyntactic constituents differing from the prosodic constituents. The autonomy of syntax, from this perspective, is replaced by the autonomy of types: the informational content of syntactic constituent structure in a conventional phrase structure approach is transmitted to the internal structure of the types in a flexible categorial system, and the valid type transitions preserve this information.
The structure of the chapter is as follows. In Section 2.1, we introduce the notion of structural completeness, and the consequences for the notion of autonomous syntactic structure. In Section 2.2, we augment the semantically interpreted Gentzen derivations of L sequents with a phonological interpretation procedure, which associates each Gentzen proof with a prosodic phrasing. In the remaining sections, we discuss some standard mismatches between prosodic and morphosyntactic structure from this perspective: conflicts between syntactic and intonational phrasing at the phrasal level, and between phonological and morphological structure at the word level, and cliticization restructurings.
2.1. STRUCTURAL COMPLETENESS
From the point of view of the type-shifting calculus L an expression does not belong just to one type (category) but to a family of related types. The lexical assignment function associates an expression with a basic type (abstracting from lexical ambiguity); from this basic assignment derived types can be generated with an interpretation which is completely reducible to the interpretation of the basic type. As a corollary of the above, a string of atoms (morphemes) is associated not with a rigid constituent structure, but with a set of simultaneous representations reflecting the different ways the string can be reduced. Buszkowski (1988) shows that in fact the set of available representations, as far as bracketing is concerned, is complete, in the following sense.

Structural completeness of L. If a sequence of categories X1,...,Xn reduces to Y, there is a reduction to Y for any bracketing of X1,...,Xn into constituents.

Among these representations, there is no privileged one as far as the categorial calculus is concerned. Compare the set of derivations for the three-word prepositional phrase in the garden. In terms of bracketing, there are two options: a left-branching or a right-branching structure. Example 2.1.1 presents these options first on the lexical type assignment to the atoms. On top of the branching alternatives, type-shifting can switch the functor-argument relations, as in the derivations with a lifted type for N which turns this element into the functor with the ultimate PP range. Semantically, the derivations all converge on one and the same thematic structure.
Example 2.1.1 Structural completeness.

Right-branching analysis:
[R1] PP: in(the(garden))
  (PP/NP): in
  [R1] NP: the(garden)
    (NP/N): the
    N: garden

Left-branching analysis:
[R1] PP: in(the(garden))
  [R2] (PP/N): λx.in(the(x))
    (PP/NP): in
    (NP/N): the
  N: garden

Right-branching, main functor garden ((PP/N)\PP):
[R1] PP: in(the(garden))
  (PP/NP): in
  [R2] ((PP/NP)\PP): λx1.x1(the(garden))
    [R6] ((PP/NP)\(PP/N)): λxλy.x(the(y))
      (NP/N): the
    [R4] ((PP/N)\PP): λz.z(garden)
      N: garden

Left-branching, main functor garden ((PP/N)\PP):
[R1] PP: in(the(garden))
  [R2] (PP/N): λx.in(the(x))
    (PP/NP): in
    (NP/N): the
  [R4] ((PP/N)\PP): λz.z(garden)
    N: garden
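That the four derivations converge on one thematic structure can be checked by evaluating their semantic recipes. In the Python sketch below the lexical meanings are modelled, purely for illustration, as functions that build a string representation of the term; the names `in_`, `the_divided` and so on are assumptions of the sketch, and Python's own function application plays the role of beta-reduction.

```python
def the(n):
    return f"the({n})"

def in_(np):
    return f"in({np})"

garden = "garden"

# Right-branching analysis: two applications (R1).
right_branching = in_(the(garden))

# Left-branching analysis: compose in and the (R2), then apply to garden (R1).
in_the = lambda x: in_(the(x))                  # λx.in(the(x))
left_branching = in_the(garden)

# Lifted noun (R4) and divided determiner (R6).
garden_lifted = lambda z: z(garden)             # λz.z(garden)
the_divided = lambda x: lambda y: x(the(y))     # λxλy.x(the(y))

# Right-branching with garden as main functor: compose (R2), then apply to in (R1).
the_garden = lambda x1: garden_lifted(the_divided(x1))   # λx1.x1(the(garden))
right_branching_lifted = the_garden(in_)

# Left-branching with garden as main functor: apply the lifted noun to in_the (R1).
left_branching_lifted = garden_lifted(in_the)

readings = {right_branching, left_branching, right_branching_lifted, left_branching_lifted}
print(readings)   # {'in(the(garden))'}
```

The derivations differ in bracketing and in which element acts as functor, yet the set of readings has a single member.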
Discussing the rebracketing potential of L, Buszkowski (1988:80) remarks that for this calculus the notions of strong generative capacity and weak generative capacity coincide: instead of a rigid notion of constituency, we have a notion which is flexible to the point that all possible bracketings are available. In the previous chapter, we have demonstrated that Boolean conjoinability forms strong empirical support for the availability of the unorthodox constituent bracketings. But one could rightly ask whether there is any motivation for this total analytic freedom in the case of simple
non-Boolean expressions. When we talk about constituents with respect to L-derivability, we in fact refer to derivational constituents, i.e. subproofs in the derivation of an L-sequent. It is an open issue whether one wants to associate these derivational entities with objects at some grammatical level of representation. The intuitive idea that a non-ambiguous expression is associated with a specific structural representation, not with all possible bracketings, can be given a theoretical underpinning if we interpret the relevant notion of structure in the prosodic rather than the morphosyntactic domain. The Lambek calculus, from this perspective, suggests a way to eliminate so-called mismatches between levels of representation, i.e. between ultimately the (autonomous) prosodic representation, and the semantic representation (or an isomorphous morphosyntactic representation). The basic type assignments determine the thematic structure of an expression, i.e. the government relations between the elements; these semantic dependencies are preserved (modulo scope ambiguities) by the valid combinatory operations. As a result, the prosodic structure itself can be directly interpreted compositionally. The level of autonomous syntax turns out to be a dispensable artefact between the two indispensable levels of sound and meaning.
2.2. GENTZEN PROOFS: PROSODIC INTERPRETATION
To give content to the discussion of syntax/prosody mismatches, our task now is to define the relevant aspects of a phonological interpretation procedure for Gentzen derivations. We will limit our attention to the grouping of the phonological material into coherent prosodic constituents: intonational phrasing at the phrasal level, hierarchical phonological (rather than morphological) structure at the subphrasal level. Selkirk (1984) shows that the morphosyntactic constituent structure, as defined by a pure application analysis, does not determine intonational phrasing, and therefore assumes that the surface structure of an expression can be freely partitioned into intonational phrases, that phrasing then being subject to prosodic well-formedness conditions. We will take this characterization of the syntax/prosody interface as our starting point, and show how the intonational phrasing can be semantically interpreted. Our investigations remain incomplete in one important respect: we will take the notion of prosodic well-formedness as given. An independent categorial characterization of the grammar of prosody goes well beyond the scope of this chapter; it could take its starting point in the work of Bach & Wheeler (1981), Wheeler (1981), Schmerling (1982), Dogil (1985), Wheeler (1988) on the phonological algebra.
The design of the phonological interpretation procedure can be completely analogous to the construction of the semantic recipes interpreting Gentzen proofs. The semantic interpretation procedure associates a term in the type-theoretic language of the semantic algebra with every Gentzen proof for an L sequent. The semantic interpretation is constructed compositionally: for every rule in the syntactic algebra (i.e. for every rule of inference in the construction of a proof) there is an operation in the semantic algebra, with functional application corresponding to Elimination inferences, and lambda abstraction to Conditionalization steps. Let us investigate then how, in the phonological algebra, we can compositionally assign a prosodic phrasing to a Gentzen derivation. We will designate the basic operation of phonological concatenation as '⊗'. Out of two prosodic entities a, b, the phonological operator '⊗' builds the prosodic constituent (a⊗b). We will assume that the phonological concatenation operator is non-commutative and non-associative. It is non-commutative since, obviously, the phonological concatenation of a and b is different from the concatenation of b and a. It is non-associative, because we want the prosodic phrasing to reflect the degree of coherence between prosodic (sub)constituents.

Definition 2.2.1 Operation '⊗': phonological concatenation
non-associative: (a⊗b)⊗c ≠ a⊗(b⊗c)
non-commutative: (a⊗b) ≠ (b⊗a)

For reasons that will become clear when we discuss the Introduction rules [/R],[\R], we will also assume the phonological algebra contains an identity element Id, with the self-evident properties with respect to the phonological concatenation operator '⊗'.

Definition 2.2.2 Identity element: Id
(Id⊗a) = a
(a⊗Id) = a

In order to associate a prosodic phrasing with a Gentzen derivation, we must associate the basic building blocks of the syntactic proof with basic operations in the phonological algebra.
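Definitions 2.2.1 and 2.2.2 can be made concrete with a very small model. In the Python sketch below, prosodic constituents are represented as nested pairs and the identity element as the string "Id"; both choices, like the function names `cat` and `show`, are assumptions made for illustration and not part of the phonological algebra as defined in the text.

```python
ID = "Id"

def cat(a, b):
    """Phonological concatenation: build the constituent (a, b), absorbing Id (Definition 2.2.2)."""
    if a == ID:
        return b
    if b == ID:
        return a
    return (a, b)

def show(p):
    """Print a prosodic structure with explicit bracketing."""
    return p if isinstance(p, str) else "(" + show(p[0]) + "⊗" + show(p[1]) + ")"

# Non-associative and non-commutative (Definition 2.2.1).
print(show(cat(cat("a", "b"), "c")), "vs", show(cat("a", cat("b", "c"))))
print(cat(cat("a", "b"), "c") != cat("a", cat("b", "c")))   # True
print(cat("a", "b") != cat("b", "a"))                       # True

# Identity element: concatenation with Id leaves the phrasing untouched.
print(show(cat(cat("the", "girl"), ID)))                    # (the⊗girl)
```

Because the pairs are nested rather than flattened, the bracketing records which elements cohere more closely, which is the information a prosodic phrasing is meant to carry.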
In Figures 2.2.1 and 2.2.2 below, the notational conventions introduced for the semantic interpretation procedure are adapted to our present purposes. A sequent consists of pairs (Type: Phonology), and the variables X,Y,Z are interpreted as standing either for such pairs or for the Type component of a pair, as the context makes clear. Consider first the Elimination rules [/L] and [\L]. The phonological interpretation is straightforward. Syntactically, Elimination inferences combine a functor expression X/Y (or Y\X) with the sequence T which makes up the argument expression Y. Prosodically, the phonological value of the functor expression is joined into a prosodic constituent with the phonological value which one obtains as a result of the derivation of the argument type Y from the sequence of types T.

T => Y:Ph2      U, X:(Ph1⊗Ph2), V => Z
[/L] U, X/Y:Ph1, T, V => Z

T => Y:Ph1      U, X:(Ph1⊗Ph2), V => Z
[\L] U, T, Y\X:Ph2, V => Z

Figure 2.2.1 Elimination Rules

Consider next the Introduction inferences [/R] and [\R]. Syntactically, if we can prove that a sequence of types T,Y (or Y,T) reduces to X, we may conclude that the sequence T without the peripheral element Y reduces to an incomplete functor expression X/Y (or Y\X). For the semantic interpretation of Introduction inferences, we prove the premise T,Y => X (alternatively, Y,T => X) on the basis of a variable of the semantic type corresponding to the conditionalized subtype Y, and we abstract over this place-holder variable in the conclusion. Likewise, in the phonological algebra, we construe the prosodic structure Phon for the premise sequent T,Y => X on the basis of the phonologically vacuous element Id, and require the prosodic structure derived from the premise to match the prosodic structure of the conclusion, which it will do as a result of the simplifications of Definition 2.2.2.
T, Y:Id => X:Phon
[/R] T => X/Y:Phon

Y:Id, T => X:Phon
[\R] T => Y\X:Phon

Figure 2.2.2 Introduction Rules
The interpretation procedure, as it stands, associates a cut-free Gentzen proof with a prosodic phrasing that coincides with a pure application derivation, i.e. a phrasing isomorphous to the orthodox concept of syntactic constituency. The reader should compare the prosodic interpretation of cut-free proofs in L with the phonological values associated with pure AB derivations in Zeevat, Klein & Calder (1987), where categorial derivations are also unfolded simultaneously in the phonological, syntactic and semantic dimension, but where the syntax is restricted to functional application. Let us illustrate this fact with some examples. Observe, first of all, that type-transitions on the basis of Conditionalization inferences are phonologically vacuous, as indeed we expect them to be. In Figure 2.2.3 we present a phonologically interpreted derivation of the Lifting theorem (R4): the noun phrase the girl has the same prosodic realization for the lifted type S/(NP\S) as it would have for the non-lifted NP type.

S:(the⊗girl)⊗Id => S:(the⊗girl)⊗Id      NP:(the⊗girl) => NP:(the⊗girl)
[\L] NP:(the⊗girl), NP\S:Id => S:(the⊗girl)⊗Id      N:girl => N:girl
[/L] NP/N:the, N:girl, NP\S:Id => S:(the⊗girl)⊗Id
[/R] NP/N:the, N:girl => S/(NP\S):(the⊗girl)

Figure 2.2.3
Consider next the Elimination inferences. We have seen in Chapter 1 that a non-ambiguous sequent, in cut-free L, has a number of alternative proofs, depending on the ordering of the Elimination steps. When faced with an antecedent with different candidate connectives eligible for elimination, the proof procedure can nondeterministically select an active type. Semantically, the alternative ways of ordering the Elimination steps for a non-ambiguous sequent are equivalent, i.e. they converge on a unique thematic structure representing the government relations between functors and arguments. Likewise, the prosodic representation ignores the ordering of [/L],[\L] inferences: of the total information contained in a proof tree, the prosodic representation preserves the identification of functor expressions and their argument sequences T. Figure 2.2.4 illustrates with the embedded sentence dat ie vertrekt ('that he leaves'). An alternative proof is obtained by switching the order of the [\L] and [/L] inferences, but the prosodic phrasing associated with this proof remains the same. In the cut-free system, there is no proof with the associated phrasing (dat⊗ie)⊗vertrekt, the phrasing we would like to obtain, given the fact that the complementizer forms a prosodic unity together with the enclitic form ie of the third person masculine singular pronoun.

S:dat⊗(ie⊗vertrekt) => S:dat⊗(ie⊗vertrekt)      S:(ie⊗vertrekt) => S:(ie⊗vertrekt)
S/S:dat, S:(ie⊗vertrekt) => S:dat⊗(ie ...

...

T => X:Phon      U, X:Phon, V => Y
[Cut] U, T, V => Y

Figure 2.2.5

In the next sections, we will analyze a number of standard mismatches between morphosyntactic and prosodic constituent structure on the basis of the Cut inference. An important problem that may puzzle the reader is not dealt with here, but treated in detail in the second part of this book. We refer to what can be called the parsing problem. The structural completeness of L gives us the theoretical assurance that cuts can be placed as the intonational phrasing requires. But how do we actually find the proof for the cut inferences T => X, and is this problem decidable? The derivations in this chapter will be presented in terms of the familiar reduction rules R1-R6, all theorems of L, as we have seen. In Chapter 5 we discuss a recursive axiomatization of an associative subsystem of L which directly addresses the parsing problem.
2.3. INTONATIONAL VERSUS MORPHOSYNTACTIC PHRASING
For a first illustration of conflicting structural demands, we turn to the much discussed mismatch between intonational and syntactic phrasing in complex noun phrases with postnominal modifiers, such as the lettering on the cover of the book (see for example Selkirk 1984, Neijt 1985). On the uncontroversial assumption that articles and prepositions are proclitics prosodically, the phonological component will break up the noun phrase in the three prosodic subphrases: (the⊗lettering)⊗(on⊗the⊗cover)⊗(of⊗the⊗book). This intonational phrasing is at odds with the uniformly right-branching structure which would be assigned by a conventional phrase-structure analysis, or by a cut-free L derivation, as in Example 2.3.1 below. What we would like to suggest is that the motivation for this right-branching syntactic representation is in the end purely semantic in nature. Semantically, the postnominal prepositional phrases are restricting functions, in the sense of Keenan & Faltz (1985): cover of the book entails (i.e. is a subset of, on the assumption that nouns denote sets) cover and similarly lettering on the cover of the book entails lettering. To get the proper interpretation of the complete noun phrase, we want these restricting functions to be in the scope of the determiner. The syntactic structure generated by a conventional phrase structure grammar, or its categorial equivalent, a pure application analysis, offers a level of representation where semantic scope can be defined in terms of the structural notion c-command.

Example 2.3.1 [R1] analysis. Semantic scope coincides with c-command.
Prosodic phrasing: the⊗(lettering⊗(on⊗(the⊗(cover⊗(of⊗(the⊗book))))))

[R1] NP: the(on(the(of(the(book))(cover)))(lettering))
  (NP/N): the
  [R1] N: on(the(of(the(book))(cover)))(lettering)
    N: lettering
    [R1] (N\N): on(the(of(the(book))(cover)))
      ((N\N)/NP): on
      [R1] NP: the(of(the(book))(cover))
        (NP/N): the
        [R1] N: of(the(book))(cover)
          N: cover
          [R1] (N\N): of(the(book))
            ((N\N)/NP): of
            [R1] NP: the(book)
              (NP/N): the
              N: book
To convince oneself of the necessity of a semantic level of representation where the modifiers are in the scope of the determiner, consider polarity environments. The idiomatic Dutch negative polarity item ook maar (any at all) is licensed by monotone decreasing (downward entailing) functions (cf. Zwarts 1986). The grammaticality of the examples below is accounted for on the assumption that the postnominal prepositional phrases are in the scope of the determiners geen (no) and alle (all), which are monotone decreasing in their first (nominal) argument. (A functor f is monotone decreasing if for X < Y, f(Y) < f(X), where [...]

[...] T => X/Y if T,Y => X, by a simple Conditionalization inference:

  T,Y => X
  ---------- [/R]
  T => X/Y

With respect to the cliticization problem, we can paraphrase the inference as follows: if a sequence T,Y with right-peripheral element Y reduces to X, then the sequence T reduces to X/Y, an expression which is looking for the missing right-peripheral Y to yield an expression of type X. But with respect to the peripheral element Y we can also infer that if X/Y,Y => X then Y => (X/Y)\X, an inference which the reader will recognize as the lifting theorem R4:

  X/Y,Y => X
  -------------- [\R]
  Y => (X/Y)\X

These two inferences together guarantee, in the abstract, that the prosodic restructuring required by peripheral cliticization is derivable within L: the sequence T,Y making up the argument of a peripheral clitic X\Z can always be partitioned into a peripheral element Y, which reduces to (X/Y)\X with the range subtype X yielding the domain subtype for the clitic X\Z, and a rest T, reducing to X/Y. Let us then discuss some worked-out examples of this general approach. Consider first verbal 's cliticization in the simple case, where the subject is a one-word phrase, as in Example 2.5.1. Assume that the full form is has the lexical type (NP\S)/PP, i.e. it maps PP into VP (= NP\S): it 'belongs' to the VP semantically. Assume further that the lexical entry for the reduced form 's has the same initial type assignment (the full and reduced forms denote the same function), but that the entry includes the independent prosodic characterization
of 's as a word-based enclitic. Nothing more need be said in the associative calculus to account for the migration of the clitic from the VP to the subject NP domain. For example, by the valid type transition R3 a right-oriented functor with left-oriented range can be shifted to a semantically equivalent left-oriented functor with right-oriented range. The type shift is triggered by the fact that 's, as an enclitic, must associate to the left prosodically. The right-branching derivation for the full form is and the left-branching derivation for the enclitic form reduce to the same semantic representation.

Example 2.5.1 's Cliticization.

Prosodic phrasing: (mary®(is®here))

[R1] S: is(here)(mary)
  NP ∋ Mary: mary
  [R1] (NP\S): is(here)
    ((NP\S)/PP) ∋ is: is
    PP ∋ here: here

Prosodic phrasing: ((mary®'s)®here)

[R1] S: is(here)(mary)
  [R1] (S/PP): λy.is(y)(mary)
    NP ∋ Mary: mary
    [R3] (NP\(S/PP)): λxλy.is(y)(x)
      ((NP\S)/PP) ∋ 's: is
  PP ∋ here: here
The hard case arises when the subject is not a simple word, but a phrasal expression, in which case the clitic forms a prosodic word with the right-peripheral element of that phrase, whatever the internal structure of this phrase may be. In Example 2.5.2 we present a derivation where the prosodic Cut inferences have partitioned the sentence in the major constituents (the®queen)®(of®(England®'s))®here, with the word-based peripheral clitic attached to the right-peripheral element of the subject NP. Let us inspect the syntax of the derivation first, and concentrate on the crucial subconstituent (of®(England®'s)). We have slightly simplified the problem by entering the basic N expression queen with the lifted type N/(N\N), which would be immediately reachable by means of R4. The orientation of the verbal clitic is shifted from (NP\S)/PP to NP\(S/PP) by R3, as
in the simple case of restructuring above. Clitic 's now is interpreted as a functor looking for an NP argument to the left. The expression England is of the right type, but it does not constitute the full subject NP: it is the right-peripheral element of this phrase. The type shift NP => (NP/NP)\NP reflects this syntactic role, and yields the middle term NP which can be cancelled by the Composition rule R2, resulting in the expression England's of type (NP/NP)\(S/PP). In order to collect the preposition of in the prosodic phrase (of®(England®'s)), the right-oriented basic type (N\N)/NP has to be turned into a left-oriented derived type with range subtype (NP/NP). The inverse Division rule R6 has that effect.

Example 2.5.2 Peripheral cliticization.
Prosodic phrasing: (((the®queen)®(of®(England®'s)))®here)

[R1] S
  [R1] (S/PP)
    [R2] (NP/(N\N))                          (the®queen)
      (NP/N) ∋ the
      (N/(N\N)) ∋ queen
    [R2] ((NP/(N\N))\(S/PP))                 (of®(England®'s))
      [R6] ((NP/(N\N))\(NP/NP))
        ((N\N)/NP) ∋ of
      [R2] ((NP/NP)\(S/PP))
        [R4] ((NP/NP)\NP)
          NP ∋ england
        [R3] (NP\(S/PP))
          ((NP\S)/PP) ∋ 's
  PP ∋ here                                  (here)
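The type transitions used in this derivation can be traced mechanically. The following is a minimal sketch, not part of the original text: the tuple encoding of types and the function names (r1_forward, r2_left, r6_left, and so on) are assumptions made for illustration, and each function implements only the single directional instance of the rule that the example needs.

```python
# Encoding (an assumption of this sketch): a basic type is a string,
# ("/", X, Y) stands for X/Y and ("\\", Y, X) stands for Y\X.

def r1_forward(f, a):      # X/Y, Y => X
    op, x, y = f
    assert op == "/" and y == a
    return x

def r1_backward(a, f):     # Y, Y\X => X
    op, y, x = f
    assert op == "\\" and y == a
    return x

def r2_right(f, g):        # X/Y, Y/Z => X/Z
    assert f[0] == "/" and g[0] == "/" and f[2] == g[1]
    return ("/", f[1], g[2])

def r2_left(g, f):         # Z\Y, Y\X => Z\X
    assert g[0] == "\\" and f[0] == "\\" and g[2] == f[1]
    return ("\\", g[1], f[2])

def r3(t):                 # (Y\X)/Z => Y\(X/Z)
    op, (op2, y, x), z = t
    assert op == "/" and op2 == "\\"
    return ("\\", y, ("/", x, z))

def r4_left(y, x):         # Y => (X/Y)\X
    return ("\\", ("/", x, y), x)

def r6_left(t, x):         # Y/Z => (X/Y)\(X/Z)
    op, y, z = t
    assert op == "/"
    return ("\\", ("/", x, y), ("/", x, z))

# Lexical types from the example (queen is entered in its lifted type).
N_MOD = ("\\", "N", "N")
the = ("/", "NP", "N")
queen = ("/", "N", N_MOD)
of = ("/", N_MOD, "NP")
clitic = ("/", ("\\", "NP", "S"), "PP")

the_queen = r2_right(the, queen)                      # NP/(N\N)
englands = r2_left(r4_left("NP", "NP"), r3(clitic))   # (NP/NP)\(S/PP)
of_englands = r2_left(r6_left(of, "NP"), englands)    # (NP/(N\N))\(S/PP)
s_pp = r1_backward(the_queen, of_englands)            # S/PP
sentence = r1_forward(s_pp, "PP")                     # S
```

Each assertion fails if the middle terms do not match, so running the sketch to completion confirms that the prosodic bracketing (the®queen)®(of®(England®'s))®here is type-consistent.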
The stepwise construction of the semantic interpretation for the derivation in Example 2.5.2 is presented below. As remarked above, we have abstracted from the type-shift N => N/(N\N) for the expression queen, which means that the primed constant translating the type N/(N\N) is to be interpreted as λx.x(queen). The interpretation for the complete sentence can therefore be further reduced as follows: S: is(here)(the(queen'(of(england)))) = is(here)(the(of(england)(queen))).
Example 2.5.3 Peripheral cliticization: semantic interpretation.

[R1] S: is(here)(the(queen'(of(england))))
  [R1] (S/PP): λy3.is(y3)(the(queen'(of(england))))
    [R2] (NP/(N\N)): λz1.the(queen'(z1))
      (NP/N): the
      (N/(N\N)): queen'
    [R2] ((NP/(N\N))\(S/PP)): λx3λy3.is(y3)(x3(of(england)))
      [R6] ((NP/(N\N))\(NP/NP)): λy2λx2.y2(of(x2))
        ((N\N)/NP): of
      [R2] ((NP/NP)\(S/PP)): λx1λy1.is(y1)(x1(england))
        [R4] ((NP/NP)\NP): λz.z(england)
          NP: england
        [R3] (NP\(S/PP)): λxλy.is(y)(x)
          ((NP\S)/PP): is
  PP: here
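The semantic recipes of the derivation above can be checked by direct computation. The sketch below is an illustration only: the Term class (a string that builds the text of an application when called) and all variable names are assumptions of the sketch, not notation from the text. Constants are Term objects, and the lambda terms attached to R2, R3, R4 and R6 are written as Python functions.

```python
class Term(str):
    """A constant or built-up term; calling it writes out the application."""
    def __call__(self, arg):
        return Term(f"{self}({arg})")

is_, here, the, of = Term("is"), Term("here"), Term("the"), Term("of")
england, queen = Term("england"), Term("queen")

queen_lifted = lambda z: z(queen)                 # queen' = λx.x(queen)
clitic = lambda x: lambda y: is_(y)(x)            # R3: λxλy.is(y)(x)
england_lifted = lambda z: z(england)             # R4: λz.z(england)
of_shifted = lambda y2: lambda x2: y2(of(x2))     # R6: λy2λx2.y2(of(x2))

def compose(f, g):
    """R2: λx.f(g(x))."""
    return lambda x: f(g(x))

the_queen = compose(the, queen_lifted)            # λz1.the(queen'(z1))
englands = compose(clitic, england_lifted)        # λx1λy1.is(y1)(x1(england))
of_englands = compose(englands, of_shifted)       # λx3λy3.is(y3)(x3(of(england)))

s_pp = of_englands(the_queen)                     # R1, type S/PP
sentence = s_pp(here)                             # R1, type S

# The right-branching, pure application reading for comparison.
right_branching = is_(here)(the(of(england)(queen)))
```

Both routes produce the same term text, which is the point of the example: the prosodically motivated bracketing and the conventional right-branching analysis receive one and the same interpretation.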
As we observed above, conservative subsystems of L with a finite number of axiom schemes, such as the calculus F, cannot guarantee that the clitic-host combination will be interpretable as a semantic unit; it is essential here to rely on the associativity of L. The crucial point in the derivation where a characteristic L theorem is called for is the Division type-transition R6, which switches the directionality of the preposition of. R6, as we have seen in Chapter 1, is not a theorem of F. In Cohen's system, of course, one could reduce the expression the queen of to NP/NP:

  NP/N, N/(N\N), (N\N)/NP => NP/NP,

by two applications of the (non-recursive) Composition scheme. And the expression England's can be reduced to a functor with NP/NP domain:

  NP, (NP\S)/PP => (NP/NP)\(S/PP),

on the basis of the reduction schemes R2, R3 and R4 used in Example 2.5.2, which are all valid within F. But F does not allow a rebracketing with the prosodic interpretation (the®queen)®(of®England®'s), as the reader can check. Since peripheral cliticization requires prosodic restructuring under full associativity, we cannot rely on a calculus weaker than L if we want to model the direct interpretation of prosodic constituency.
CONCLUSION
The associative calculus L suggests a line of research which would be doomed to fail in rigid constituent theories: instead of making the prosodic organization parasitic on syntax, one can independently characterize the grammar of prosody and provide a direct semantic interpretation for the structures it defines. The feasibility of such an approach derives from the structural completeness of the flexible categorial calculus, and from the fact that the information standardly encoded in tree geometry is locally available at the level of the internal structure of types. Against the background of this program, the aims of this chapter have been modest: we hope to have indicated how the flexible constituent theory of Lambek's categorial calculus resolves some typical mismatches between levels of representation, and to have stimulated the imagination of the reader to further explore the possibilities of this approach.
CHAPTER 3 BETWEEN L AND LP: DISCONTINUOUS DEPENDENCIES
3.0. STRATEGIES FOR EXTENDING L
In the present chapter we discuss two ways of extending the pure directional calculus L in the light of discontinuous dependencies. We will investigate the theoretical space in between L and LP and try to identify a calculus which is stronger than L in allowing for empirically motivated discontinuities, but weaker than LP, i.e. a system which retains part of the order-preserving quality of the L-valid type transitions. The first strategy consists in adding specific theorems from LP (or the stronger systems LPC/LPE) as extra axioms for L-derivability. We show that this form of extension forces one to give up the notion of a free syntactic algebra, which is such an attractive feature of pure L. We demonstrate that the extra axioms motivated by discontinuities make L collapse into LP, given a transitive notion of derivability. In order to avoid degeneration into LP, the type transitions borrowed from LP, instead of being universally quantified axiom schemes, must take the form of schemes with (in)equality constraints on the type parameters. We therefore limit the use of type-restricted axiom extensions to phenomena that can be analyzed in terms of lexically governed unary type transitions, in conformity with the view that lexical type assignment is the only locus for stipulation. We illustrate this approach with complement inheritance in morphology and verb-raising clusters in Standard Dutch.
[Figure 3.0.1 Adding type-restricted axioms from LP, LPCE. Diagram labels: L+{Cut}, Free syntactic algebra, LP, Discontinuity, LPCE, Copying/Deletion.]
The second approach is more radical and faces discontinuity at the syntactic level. It localizes the problem of dealing with discontinuity within L in the fact that the set of L connectives {/,*,\} with their logical rules of inference embody a concatenative theory of incomplete expressions, whereas discontinuous dependencies ask for a richer, non-concatenative theory of functors. We recast the proposals for non-concatenative operations of Bach, Schmerling, Oehrle, and others in terms of the Lambek-Gentzen sequent approach, by extending the set of L connectives with a set of extraction/infixation operators, {}. The behaviour of the new connectives is governed by the usual Elimination and Introduction inferences, and their application/abstraction interpretation, i.e. we carry over in the stronger system the decidable notion of Gentzen derivability and the lambda semantics for Gentzen proofs. We prove that the desired order-preservation properties of the non-concatenative operations, instead of being stipulated, follow as theorems in L + { < , t , > } . LP degeneration can be avoided by switching to a partial logic for the non-concatenative connectives, with one-way inferences regulating their behaviour instead of the two-way Elimination/Introduction logic. A full logic for the non-concatenative connectives is beyond the expressive power of the Gentzen implementation of L as a result of the monotonicity of Gentzen derivations.
[Figure 3.0.2 Extending L with non-concatenative connectives. L' = L + {<,>}: infixation/extraction operators. Diagram labels: L+{Cut}, non-concatenative connectives, L', LP.]
3.1. LP ADDITIONS TO THE L AXIOM BASE
The area beyond L is a field of intensive study because of the linguistic importance of phenomena of discontinuity and multiple dependency that require stronger combinatory powers than those of the directional calculus L. We have seen in Chapter 1 that from a formal point of view, discontinuity and multiple dependency require the addition of structural rules to the logical rules of inference.

Structural rules for L
  Permutation:  T => X if permute(T,P) and P => X      LP
  Contraction:  U,X => Y if U,X,X => Y                 LPC
  Expansion:    U,X,X => Y if U,X => Y                 LPE
The resulting formal systems LP, LPC, LPE have clear-cut mathematical properties, but are too crude from a linguistic point of view. Consider permutation phenomena. Linguistically, we are not so much interested in the permutation closure of L which results from the addition of the Permutation rule to the free syntactic algebra, but in systems between L and LP, that would be expressive enough to account for phenomena of discontinuity ('movement') in otherwise order-preserving systems. A useful concept in this connection is the notion of the permutation dual of an L-valid type transition.

Definition 3.1.1 Permutation duals
  X => Z\Y if X => Y/Z
  X => Y/Z if X => Z\Y
The permutation duals are LP pendants derived from L-valid type transitions by switching the directionality of the connective. They translate the manipulation of the antecedent sequence effected by a structural rule of Permutation into a manipulation on the connective. The type-shifting rules that have been proposed in the literature to deal with discontinuity can be reduced to the common denominator of permutation duals. Table 3.1.1 gives the characteristic theorems of lifting, main and subordinate division in their L-valid form (R4,R5,R6), and in the permuted LP variant (R4',R5',R6'). The reader will keep in mind that the combination of one of the division rules with the application rule R1 yields the composition rule, as we have seen in Chapter 1. We return to this equivalence below.
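As a small illustration of Definition 3.1.1, the sketch below flips the main connective of a succedent type. The tuple encoding and the name dual are assumptions made for this illustration and do not come from the text. Applied to the two L-valid Lifting results, it yields the two LP forms of Lifting.

```python
# Encoding (an assumption of this sketch): a basic type is a string,
# ("/", Y, Z) stands for Y/Z and ("\\", Z, Y) stands for Z\Y.

def dual(t):
    """Switch the directionality of the main connective: Y/Z <-> Z\\Y."""
    op, a, b = t
    if op == "/":
        return ("\\", b, a)   # Y/Z becomes Z\Y
    return ("/", b, a)        # Z\Y becomes Y/Z

def show(t):
    """Print a type with full bracketing."""
    if isinstance(t, str):
        return t
    op, left, right = t
    return "(" + show(left) + op + show(right) + ")"

# L-valid Lifting (R4): X => Y/(X\Y) and X => (Y/X)\Y.
lift_right = ("/", "Y", ("\\", "X", "Y"))
lift_left = ("\\", ("/", "Y", "X"), "Y")

lp_from_right = dual(lift_right)   # (X\Y)\Y
lp_from_left = dual(lift_left)     # Y/(Y/X)
```

Only the main connective of the succedent is switched; subtypes are left untouched, and applying dual twice returns the original type.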
R4: Lifting (LP variant: harmonic)
          RIGHT                       LEFT
  L       X => Y/(X\Y)                X => (Y/X)\Y
  LP      X => Y/(Y/X)                X => (X\Y)\Y

R5: Division, main functor (LP variant: disharmonic)
          RIGHT                       LEFT
  L       X/Y => (X/Z)/(Y/Z)          Y\X => (Z\Y)\(Z\X)
  LP      X/Y => (Z\X)/(Z\Y)          Y\X => (Y/Z)\(X/Z)

R6: Division, subordinate functor
          RIGHT                       LEFT
  L       Z\Y => (Z\X)/(Y\X)          Y/Z => (X/Y)\(X/Z)
  LP      Y/Z => (X/Z)/(Y\X)          Z\Y => (X/Y)\(Z\X)

Table 3.1.1

In the following section, we informally present some of the salient syntactic phenomena that have been adduced in support of combinatory rules that, from the Lambek perspective, amount to LP extensions to the axiom base. The proposals stem mainly from recent work by Steedman (e.g. 1987a, 1988) and Szabolcsi (e.g. 1987), where the reader can find careful linguistic analysis of the phenomena in question. We then investigate the formal consequences of such LP extensions with respect to the recognizing capacity of the type calculus.
3.1.1. PERMUTATION DUALS: LIFTING, COMPOSITION, SUBSTITUTION
Unbounded dependencies such as wh-extraction or topicalization represent the prototypical examples of discontinuous dependency. In the embedded sentence (I know) who John is related to the NP object of the preposition to does not appear in the position where it belongs semantically, but in front of the sentence. How can we derive this construction while sticking to the surface-based methodology, i.e. without introducing movement or abstract empty categories? Within L it is impossible to derive who John is related to as a permutation of John is related to who, since the L-valid type transitions are order-preserving. But staying within L, we can collect the adjacent string of categories for John is related to into a constituent given the Associativity of this calculus. Lifting R4 and Composition R2 will derive this sequence in the category (S/NP), i.e. an incomplete sentence that is still looking for an NP type at its right periphery. Suppose now that we assign the wh-element the type (S/(S/NP)), to make the complete embedded sentence derivable. This second-order type would be reachable from the primitive NP type by means of a version of the Lifting rule that, unlike the order-preserving R4, has two identical slashes, in this case right-division. Let us refer to the permutation form of Lifting as R4'; because the directionalities of the introduced connectives match, we call this version harmonic.

Definition 3.1.2 R4': Harmonic Lifting
  X: a => Y/(Y/X): λv.v(a)

Harmonic Lifting is a valid type transition in LP. It may be useful here to unfold the Gentzen proof in order to localize the precise role of the structural rule of Permutation. As the Gentzen proof below shows, R4' is assigned the same semantic recipe as the L-valid counterpart. The Permutation inference manipulates the order of the types in the antecedent, but it does not change the lambda terms interpreting these types.
  X:a => X:a     Y:v(a) => Y:v(a)
  -------------------------------- [/L]
  Y/X:v, X:a => Y:v(a)
  -------------------------------- [Permutation]
  X:a, Y/X:v => Y:v(a)
  -------------------------------- [/R]
  X:a => Y/(Y/X):λv.v(a)

Figure 3.1.1

For wh-elements we could assign the second-order type in the lexicon, as their basic type assignment. But for topicalization constructions like John, she wouldn't like to be related to we need a type-transition (triggered by the distinctive intonational pattern for topicalized phrases) from basic NP's to the permutation lifted type.

Example 3.1.1 Peripheral extraction: (I know) who John is related to

[R1] S
  [R4'] (S/(S/NP))
    NP ∋ who
  [R2] (S/NP)
    [R2] (S/PP)
      [R2] (S/AP)
        [R4] (S/(NP\S))
          NP ∋ John
        ((NP\S)/AP) ∋ is
      (AP/PP) ∋ related
    (PP/NP) ∋ to
The above example showed an instance of peripheral extraction. Unfortunately, the extension of L with R4' does not suffice to make non-peripheral cases of extraction derivable. In the sentence (I know) what John put on the table the extraction site is in between the subsequences John put and on the table. The Associativity of L allows one to compose any subsequence of adjacent elements of a derivable sequence into a constituent, but we cannot bridge the gap site in this case on the basis of the order-preserving form of Composition. Consider then a disharmonic variant of Composition, R2', which makes the non-peripheral case of extraction derivable.
Definition 3.1.3 R2': Disharmonic Composition
  Y/Z: g, Y\X: f => X/Z: λx.f(g(x))

Unlike the L-valid form of Composition, with matching directionalities of the connectives, R2' allows for the combination of functors with opposite directionalities. Like all the reduction laws discussed so far, R2' (and R4' above) come in symmetric duals. We present here just the instantiation needed for the example sentences; in this case, the combination of a left-oriented main functor Y\X with a right-oriented subordinate functor Y/Z. Figure 3.1.2 again localizes the Permutation step in the derivation of R2'.
                 Y:g(x) => Y:g(x)     X:f(g(x)) => X:f(g(x))
                 -------------------------------------------- [\L]
  Z:x => Z:x     Y:g(x), Y\X:f => X:f(g(x))
  ------------------------------------------ [/L]
  Y/Z:g, Z:x, Y\X:f => X:f(g(x))
  ------------------------------------------ [Permutation]
  Y/Z:g, Y\X:f, Z:x => X:f(g(x))
  ------------------------------------------ [/R]
  Y/Z:g, Y\X:f => X/Z:λx.f(g(x))

Figure 3.1.2

For the derivation of Example 3.1.2 below, the argument PP is lifted in the L-valid way into a functor with the second order type ((NP\S)/PP)\(NP\S), i.e. a functor that will give a verb phrase (=NP\S) with an ((NP\S)/PP) to its left. The lifted PP argument is combined with the lexical type for put through Disharmonic Composition, cancelling the middle term ((NP\S)/PP), so that the resultant ((NP\S)/NP) inherits the missing object argument NP.
Example 3.1.2 Non-peripheral extraction. (I know) what John put on the table

[R1] S
  (S/(S/NP)) ∋ what
  [R2] (S/NP)
    [R4] (S/(NP\S))
      NP ∋ John
    [R2'] ((NP\S)/NP)
      (((NP\S)/PP)/NP) ∋ put
      [R4] (((NP\S)/PP)\(NP\S))
        [R1] PP
          (PP/NP) ∋ on
          [R1] NP
            (NP/N) ∋ the
            N ∋ table
Again R2' would be valid in the permutation closed calculus LP. But note that the version of Disharmonic Composition R2' needed for the above phenomena is intermediate between the order-preserving form R2 and the variants in LP that are completely order-insensitive. Within L only functors with equal directionality can be composed. For R2', we want the individual order constraints of the main and the subordinate functor to be respected: the subordinate functor has to appear where the directionality of the main functor wants it, and the composed functor inherits the directionality of the subordinate functor. This gives two new possibilities, that of Definition 3.1.3 with a left-oriented main functor Y\X, and a symmetric case with a right-oriented main functor X/Y. Within the permutation closed calculus LP all of the above four variants are valid, but also all further permutations of main and subordinate functor in their different directional instantiations as long as the result respects count-invariance, i.e. the cancelled subtype occurs negatively in the main functor and positively in the subordinate functor. We return to this point in Section 3.3.
MULTIPLE DEPENDENCIES
The discussion of discontinuity still stays within the family of count-invariant calculi: discontinuous dependencies force us to relax the order-preserving constraint on type transitions in L, but the antecedent sequence is still treated as a multiset of premise occurrences. For phenomena that require L(P)C extensions because they
require double use of antecedent types, we turn to parasitic gaps, as discussed in Steedman (1987a) and Szabolcsi (1987). In the relative clause of Example 3.1.3, (the article) which John filed without reading, the fronted wh-element has to be associated with two gap sites in its right context, witness John filed (the articles) without reading (the articles). Semantically, the NP meaning associated with which does double service, as object of file and as object of read in the VP modifier without reading NP. The parasitic gap case is derivable as soon as a further combinatory rule of Substitution is added to the calculus.

Definition 3.1.4 Substitution.
  [Sub] Y/Z: g, (Y\X)/Z: f => X/Z: λx.f(x)(g(x))

Example 3.1.3 Substitution. (the article) which John filed without reading

[R1] (N\N)
  ((N\N)/(S/NP)) ∋ which
  [R2] (S/NP)
    [R4] (S/(NP\S))
      NP ∋ John
    [Sub] ((NP\S)/NP)
      ((NP\S)/NP) ∋ filed
      [R2] (((NP\S)\(NP\S))/NP)
        (((NP\S)\(NP\S))/Ving) ∋ without
        (Ving/NP) ∋ reading
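The semantic recipe for Substitution can be tried out on the example. This is a sketch under stated assumptions: the Term class (a string that writes out an application when called), the helper names, and the placeholder variable v are all introduced here for illustration. The recipes themselves, λx.f(x)(g(x)) for [Sub], λx.f(g(x)) for R2 and λv.v(a) for R4, are the ones given in the text.

```python
class Term(str):
    """A constant or built-up term; calling it writes out the application."""
    def __call__(self, arg):
        return Term(f"{self}({arg})")

john, filed = Term("john"), Term("filed")
without, reading = Term("without"), Term("reading")

def compose(f, g):
    """R2: λx.f(g(x))."""
    return lambda x: f(g(x))

def substitute(g, f):
    """[Sub]: Y/Z:g, (Y\\X)/Z:f => X/Z: λx.f(x)(g(x))."""
    return lambda x: f(x)(g(x))

def lift(a):
    """R4: λv.v(a)."""
    return lambda v: v(a)

without_reading = compose(without, reading)      # ((NP\S)\(NP\S))/NP
vp_gap = substitute(filed, without_reading)      # (NP\S)/NP
s_gap = compose(lift(john), vp_gap)              # S/NP

# Feed in a placeholder for the missing NP to inspect the body of the abstract.
body = s_gap(Term("v"))
```

The placeholder shows up twice in body, once as the object of filed and once as the object of reading, which is the double binding that the parasitic gap construction requires.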
Observe that the type transition of Definition 3.1.4 is not count-invariant: the antecedent Z-count is -2, as there are two negative occurrences of the subtype Z; the Z-count of the succedent is -1. From the Gentzen perspective, the succedent X/Z is derived by Conditionalization over the subtype Z, but two occurrences of this subtype have to be inserted in the antecedent sequence to make X derivable. Hence the lambda abstraction that corresponds to Conditionalization in the semantics binds two occurrences of a variable. See the crucial part of the Gentzen proof for our example in Figure 3.1.3 below, with semantics indicated for the relevant types. In order to derive the sequence John filed without reading in type S/NP, [/R] conditionalizes over the succedent subtype NP, abstracting over a
variable v of type NP corresponding to the NP subtype added to the antecedent. The Contraction step doubles this antecedent NP occurrence.
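  NP, VP/NP, NP:v, (VP\VP)/Ving, Ving/NP, NP:v => S:Term
  ------------------------------------------------------- [Contraction]
  NP, VP/NP, (VP\VP)/Ving, Ving/NP, NP:v => S:Term
  ------------------------------------------------------- [/R]
  NP, VP/NP, (VP\VP)/Ving, Ving/NP => S/NP:λv.Term

Figure 3.1.3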
Notice that from the point of view of order-preservation, the version of Substitution given in Definition 3.1.4 is of the disharmonic variety like R2': two functors with opposite directionality are combined, but the individual ordering requirements of the functors are respected. In Example 3.1.3, the main functor (Y\X)/Z is the left-oriented VP modifier phrase without reading, which combines to the left with the right-oriented secondary functor Y/Z, i.e. the transitive verb filed. The resulting X/Z wants its Z argument to the right, where the transitive filed wanted it.
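The count claims made above for Substitution, and the count-invariance of R2', can be verified with a few lines of code. The sketch assumes the usual recursive definition of the count of a basic type (1 for the type itself, 0 for other basic types, range count minus domain count for a functor) and a tuple encoding of types; both the encoding and the function names are assumptions of this illustration.

```python
# Encoding (an assumption of this sketch): a basic type is a string,
# ("/", X, Y) stands for X/Y and ("\\", Y, X) stands for Y\X.

def count(t, b):
    """Occurrences of basic type b in t: positive in ranges, negative in domains."""
    if isinstance(t, str):
        return 1 if t == b else 0
    op, left, right = t
    if op == "/":                          # X/Y: range X, domain Y
        return count(left, b) - count(right, b)
    return count(right, b) - count(left, b)   # Y\X: domain Y, range X

def seq_count(seq, b):
    """Count of b in an antecedent sequence of types."""
    return sum(count(t, b) for t in seq)

# R2': Y/Z, Y\X => X/Z
r2p_antecedent = [("/", "Y", "Z"), ("\\", "Y", "X")]
r2p_succedent = ("/", "X", "Z")

# [Sub]: Y/Z, (Y\X)/Z => X/Z
sub_antecedent = [("/", "Y", "Z"), ("/", ("\\", "Y", "X"), "Z")]
sub_succedent = ("/", "X", "Z")

r2p_invariant = all(
    seq_count(r2p_antecedent, b) == count(r2p_succedent, b) for b in "XYZ"
)
sub_z_counts = (seq_count(sub_antecedent, "Z"), count(sub_succedent, "Z"))
```

For R2' the counts of X, Y and Z agree on both sides of the arrow, whereas for Substitution the Z-count drops from -2 in the antecedent to -1 in the succedent, the mismatch that the Contraction step accounts for.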
3.1.2. COLLAPSE INTO LP
The general picture emerging from the discussion above can be characterized as follows: the directional calculus L forms a suitable basis for the syntax of configurational languages, but it needs to be extended with extra combinatory possibilities in order to handle phenomena such as discontinuity and multiple dependency. The question has to be asked now whether the addition of LP theorems to the axiom base of L indeed characterizes a calculus in between L and LP. In Moortgat (1988a) we conjectured that the answer is negative, and the proof of Theorem 3.1.1 below corroborates this conjecture. The addition of the permutation dual of Lifting R4' makes the directional calculus L collapse into the permutation closed variant LP, i.e. adding R4' as an extra axiom to L, one completely destroys the order-preserving quality of the directional system. Because the linguistic role of R4' is somewhat marginal, we will also show that addition of the disharmonic composition rule R2' alone
already entails LP collapse for sequences of length > 2.

Theorem 3.1.1 (Moortgat 1988a). Collapse into LP. L+R4' = LP.

For a demonstration of the theorem, it suffices to show that in the extended system arbitrary pairs of neighbors can be interchanged; any permutation can then be derived as the composition of such local interchanges, as in the original proof for the permutation closure of LP (Van Benthem 1986a). Consider first the harmonic form of Lifting R4'. Figure 3.1.4 shows in completely general terms that in the extended system L+R4', if X,Y => Z then also Y,X => Z.

                           X,Y => Z  [Assumption]
                           ---------- [/R]
                           X => Z/Y       Z => Z
                           ---------------------- [/L]
  Y => Z/(Z/Y)  [R4']      Z/(Z/Y), X => Z
  ----------------------------------------- [Cut]
  Y,X => Z  [Lemma]

Figure 3.1.4

We can use the above Lemma to demonstrate the permutability of arbitrary neighbors by embedding the local interchange into a longer sequence. Figure 3.1.5 shows, on the basis of the Lemma obtained with the R4' type transition, that if U,X,Y,V => Z then also U,Y,X,V => Z. The demonstration has been made compact by equivocating between U,V as sequences of types, and as the corresponding product type. By a number of Introduction inferences, we isolate the pair X,Y which can be permuted to Y,X on the basis of the Lemma inference. We then reinstall the conditionalized context material U,V by a number of self-evident Elimination inferences (abbreviated in Figure 3.1.5), leading to the desired conclusion.
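The step from local interchanges to arbitrary permutations is purely combinatorial and can be illustrated independently of the type calculus. The sketch below, whose function names are assumptions of the illustration, computes a series of neighbor swaps that turns one ordering of an antecedent into another; each swap corresponds to one use of the interchange of adjacent types established above.

```python
def adjacent_swaps(source, target):
    """Positions i such that swapping items i and i+1, in order, turns source into target.

    Assumes target is a rearrangement of source.
    """
    seq = list(source)
    swaps = []
    for i, wanted in enumerate(target):
        j = seq.index(wanted, i)       # where the wanted item currently sits
        while j > i:                   # bubble it leftwards, one neighbor at a time
            seq[j - 1], seq[j] = seq[j], seq[j - 1]
            swaps.append(j - 1)
            j -= 1
    return swaps

def apply_swaps(source, swaps):
    """Replay a list of neighbor swaps on a sequence."""
    seq = list(source)
    for i in swaps:
        seq[i], seq[i + 1] = seq[i + 1], seq[i]
    return seq

antecedent = ["U", "X", "Y", "V"]
one_step = adjacent_swaps(antecedent, ["U", "Y", "X", "V"])
reversal = adjacent_swaps(antecedent, ["V", "Y", "X", "U"])
```

Interchanging X and Y inside U,X,Y,V takes a single swap at position 1, while the full reversal of the four-element sequence takes six.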
  U,X,Y,V => Z            [Assumption]
  X,Y,V => U\Z            [\R]
  X,Y => (U\Z)/V          [/R]
  Y,X => (U\Z)/V          [Lemma]
  (U\Z)/V, V => U\Z       [/L]
  Y,X,V => U\Z            [Cut]
  U, U\Z => Z             [\L]
  U,Y,X,V => Z            [Cut]

Figure 3.1.5

Notice that the context elements U and V in the above proof could both be empty: for a sequence of length 2, the Lemma immediately applies to obtain the permuted sequence. But as we remarked above, a slightly weaker effect of permutation disturbance can already be obtained by extending L with the disharmonic composition rule R2'. In Figure 3.1.6 we show that, for sequences of length > 2, arbitrary pairs of neighbors can be permuted through the interaction of the order-preserving form of Lifting R4 and the disharmonic Composition rule R2'. The right branch of the proof projects the structure of the antecedent onto the last element by successive instances of Conditionalization [\R]. The antepenultimate type X is lifted in the L-valid order-preserving way to (U\Z)/(X\(U\Z)). See the node labeled [R4]; the self-evident L inferences leading to this sequent have been omitted. The lifted antepenultimate type (U\Z)/(X\(U\Z)) is then combined with the lifted right-peripheral type Y\(X\(U\Z)) through the disharmonic version of Composition R2', 'skipping' the type Y. A number of straightforward cuts then lead from the premise U,X,Y,V => Z to the conclusion U,Y,X,V => Z. In the above deduction, we use the lifted right-peripheral element as a support type to permute X and Y; the context element U could be empty. In the case where V is empty, one can analogously use U as supporting type, by switching to the symmetric duals for R2' and R4.
Left branch:
  X => (U\Z)/(X\(U\Z))                           [R4]
  (U\Z)/(X\(U\Z)), Y\(X\(U\Z)) => Y\(U\Z)        [Disharmonic composition R2']
  X, Y\(X\(U\Z)) => Y\(U\Z)                      [Cut]

Right branch:
  U,X,Y,V => Z                                   [Assumption]
  X,Y,V => U\Z                                   [\R]
  Y,V => X\(U\Z)                                 [\R]
  V => Y\(X\(U\Z))                               [\R]

Combination:
  X,V => Y\(U\Z)                                 [Cut]
  U,Y,Y\(U\Z) => Z                               [\L, etc.]
  U,Y,X,V => Z                                   [Cut]

Figure 3.1.6

Different conclusions can be drawn from the above discussion. Observe first of all that the demonstration of LP-collapse relies crucially on the validity of the Cut inference, i.e. on the transitivity of the derivability relation '=>'. As we have seen in the discussion of Cut Elimination (Theorem 1.2.1), the transitivity of '=>' is a derived property of L: the Cut inference does not increase the set of provable theorems in the pure directional system, since any proof with cuts can be transformed into a cut-free derivation. Whereas the cut-free implementation of L and the alternative version with the Cut inference are extensionally equivalent, we see here that these implementations exhibit different properties when we consider extensions with LP theorems, a point which deserves further logical scrutiny. From a linguistic point of view, it will be clear that we do not want to give up Cut. The Cut inference, as we observed in Chapter 1 and will elaborate in Chapter 5, is the motor behind a bottom-up perspective on categorial derivations; working with a cut-free calculus would make this perspective unavailable. The position defended by Steedman and Szabolcsi is that one can work with a calculus that has R2' and R4' among its combinatory rules by
abandoning the idea that the categorial reduction system is a free syntactic algebra: unlike the L-valid forms of Lifting and Composition that can be used in a derivation whenever the conditions for their application are met, R2' and R4' must be type-restricted by putting (in)equality constraints on the parameters of these laws, i.e. by limiting the categorial identity of the subtypes X,Y,Z to some range of permissible values. For example, the backward disharmonic form of composition R2' would be part of the grammar of English not in its universally quantified form of Definition 3.1.3, but with the following type constraints on the transmitted argument subtype (Steedman 1987b:11):

  Y/Z: g, Y\X: f => X/Z: λx.f(g(x)) if Z ∈ {NP,PP,AP,VP',S'}

We think the abandonment of a free syntactic algebra is too high a price to pay for LP extensions. In the following section, we therefore propose to confine the LP extensions to the lexicon, where they assume the form of lexically governed type-transitions. The implication of this proposal is that LP extensions to the axiom base are available only for discontinuity phenomena that can be tied to lexical atoms. Purely syntactic forms of discontinuity (such as the composition of the domain of unbounded extraction) require more drastic changes to the design of L, which we will discuss in Section 3.3.
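To make the idea of a type-restricted rule concrete, here is a minimal sketch of R2' with the constraint on the transmitted subtype Z. The tuple encoding of types, the function name r2_prime and the constant ALLOWED_Z are assumptions made for this illustration; only the list of permitted categories is taken from the constraint cited above.

```python
# Encoding (an assumption of this sketch): a basic type is a string,
# ("/", X, Y) stands for X/Y and ("\\", Y, X) stands for Y\X.

ALLOWED_Z = {"NP", "PP", "AP", "VP'", "S'"}

def r2_prime(subordinate, main, allowed=ALLOWED_Z):
    """Y/Z, Y\\X => X/Z, but only if Z is one of the allowed categories.

    Returns the composed type, or None when the rule does not apply.
    """
    if subordinate[0] != "/" or main[0] != "\\":
        return None
    _, y_sub, z = subordinate
    _, y_main, x = main
    if y_sub != y_main or z not in allowed:
        return None
    return ("/", x, z)

VP = ("\\", "NP", "S")
put = ("/", ("/", VP, "PP"), "NP")            # ((NP\S)/PP)/NP
lifted_pp = ("\\", ("/", VP, "PP"), VP)       # ((NP\S)/PP)\(NP\S)

put_on_the_table = r2_prime(put, lifted_pp)   # (NP\S)/NP
blocked = r2_prime(("/", ("/", VP, "PP"), "N"), lifted_pp)
```

With Z = NP the composition for put on the table goes through, while the same configuration with a transmitted N argument is rejected, so that the rule is no longer applicable whenever its structural conditions are met.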
3.2. DISCONTINUITY AT THE LEXICAL LEVEL
The cases of prosodic restructuring discussed in the previous chapter as support for the Associativity of L were effected in terms of order-preserving modes of combination: in general, morphological bracketing paradoxes, simple cliticization, etc. required rebracketing, not reordering, of a sequent. In a number of publications (Moortgat 1984, 1987a,b, 1988a) we have analyzed complement inheritance and Dutch verb-raising as lexical phenomena which call for both order-preserving and discontinuous forms of combination. We will review the argument, as far as it is relevant for our present concerns. We start with a presentation in terms of the binary composition rules R2('), and then investigate the consequences of recasting the analysis in terms of lexically governed unary type-transitions R5(')/R6(').
3.2.1. COMPLEMENT INHERITANCE
Complement inheritance occurs when an affix is combined with a stem requiring a complement, and the subcategorization for the complement is transmitted from the base to the derived expression. Example 3.2.1 gives illustrations from English and comparable Dutch cases. The inheritance constructions qualify as relatedness paradoxes in the sense of Williams (1981). Semantically, the affixes have a whole phrase in their scope (stem + complement), but morphosyntactically they combine with a proper subpart of this phrase (the head): the affix breaks the sisterhood relation between head and complement. Because of this mismatch, inheritance has been construed as a challenge to strong forms of the compositionality principle, and to Aronoff's (1976) theory of word-based morphology. Compositionality, in its intuitive form, requires that the meanings of the parts suffice for computing the meaning of a derived word; what we see here is that the affix has scope over the stem plus its complement, not just the stem. The common strategy to save the principle (advocated in e.g. Fabb (1984)) consists in rejecting the restrictive view on the interaction between lexicon and syntax: believer in magic would be construed as an -er derivation from the phrase believe in magic, by means of an affix movement operation which attaches the affix to the verbal head.

Example 3.2.1 Complement inheritance

believer in magic              tevredenheid met de soep
workers in GB                  dreiger met zelfmoord
searchers after truth          geroep om hulp
indebtedness to the king       vergelijkbaar met wijn
reliance on Bill               verzaking aan de duivel
removal from the board
accessibility to all students

Semantic scope: cf. NOM(cry(for help))
Constituent structure: [[ge roep] [om hulp]]
The discussion of Associativity in the previous chapter has made it clear that phrasal scope of bound morphemes in the semantic algebra does not preclude an analysis which respects the prosodic integrity of the word, i.e. an analysis which treats the combination of affix and base as an object in the morphosyntactic and the semantic algebra. The argument carries over to the case of complement inheritance, as soon as we envisage both harmonic and disharmonic forms of composition. Let us have a look at the left-hand column of Example 3.2.2. These are the basic cases of derivations with the affixes in question. (We present structural equivalents from Dutch and English rather than glosses.) Given the category of the base and the category of the derived expression, we can determine the category of the affixes by solving a number of simple equations like: if A • -ness = N, then -ness = A\N. The basic type assignments lead to unproblematic pure application derivations as far as the left-hand column is concerned. Complement inheritance occurs when the base is itself a functor, with the category required by the affixes as range. These combinations do not have a pure application analysis. On the assumption that composition R2(') is available as an alternative combinatory possibility, the affix functors can combine with the complex bases, cancelling the equal middle term and thus transmitting the argument of the base to the level of the derived expression. On the basis of R2(') the inheritance constructions get a surface derivation which respects the prosodic integrity of the base + affix combinations.

Example 3.2.2 Complement inheritance: harmonic and disharmonic

(a)                      (b)

on     gelukkig          on         afhankelijk   van NP
un     happy             un         committed     to NP
A/A    A                 A/A        A/PP          PP

ge     lach              ge         roep          om hulp
NOM    laugh             NOM        cry           for help
N/V    V                 N/V        V/PP          PP

blij   heid              tevreden   heid          met NP
deaf   ness              indebted   ness          to NP
A      A\N               A/PP       A\N           PP

slaap  er                dreig      er            met NP
dream  er                believe    er            in NP
V      V\N               V/PP       V\N           PP

lees   baar              vergelijk  baar          met NP
read   able              compare    able          with NP
TV     TV\A              TV/PP      TV\A          PP
Observe that if we want to give a unified analysis to inheritance phenomena, the composition rule cannot be restricted to the harmonic case R2, as the illustrations of Example 3.2.2 show. Prefixation to a right-oriented base (cf. un-) is derivable by means of the L-valid form of composition, because the composed functors harmonize in their directionality. Suffixation to a right-oriented base (cf. -ness) is not derivable within pure L, because of the conflicting directionalities of the composed functors, which gives rise to a discontinuous dependency between base and complement. The existence of the discontinuous forms of inheritance side by side with the order-preserving cases is exactly what motivates the addition of a mixed form of composition to L: inheritance is insensitive to the distinction between standard and mixed composition. The derivations of Example 3.2.3 show how the semantics of functional composition, in its order-preserving or mixed form, resolves the mismatch between morphosyntactic constituent structure and semantic scope that gives the construction the dubious reputation of a 'relatedness paradox'. Although the affixes form a prosodic constituent with a lexical base, they are interpreted semantically as having scope over the combination of base plus complement.

Example 3.2.3 Interpretation of complement inheritance.

uncommitted to NP

[R1] A: un(committed(to(mary)))
    [R2] (A/PP): λx.un(committed(x))
        (A/A): un
        (A/PP): committed
    PP: to(mary)

indebtedness to NP

[R1] N: ness(indebted(to(mary)))
    [R2'] (N/PP): λx.ness(indebted(x))
        (A/PP): indebted
        (A\N): ness
    PP: to(mary)

believer in NP

[R1] N: er(believe(in(ghosts)))
    [R2'] (N/PP): λx.er(believe(x))
        (V/PP): believe
        (V\N): er
    PP: in(ghosts)

comparable with NP

[R1] A: able(compare(with(mary)))
    [R2'] (A/PP): λx.able(compare(x))
        (TV/PP): compare
        (TV\A): able
    PP: with(mary)
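The two composition patterns used in Example 3.2.3 can be tried out mechanically. The following is only a minimal sketch: the names `Fn`, `rdiv`, `ldiv`, `compose` and `con` are illustrative assumptions, meanings are rendered as plain term strings, and only the two patterns visible in the derivations above (X/Y, Y/Z ⇒ X/Z for R2 and Y/Z, Y\X ⇒ X/Z for R2') are encoded.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Fn:
    """Functor type: slash '/' seeks its argument to the right, '\\' to the left."""
    res: "Type"
    slash: str
    arg: "Type"


Type = Union[str, Fn]


def rdiv(res, arg):
    """Build the type res/arg."""
    return Fn(res, "/", arg)


def ldiv(arg, res):
    """Build the type arg\\res."""
    return Fn(res, "\\", arg)


def compose(left, right):
    """Combine two adjacent (type, meaning) signs by R2 or R2', if possible.

    Returns (rule, type, meaning) or None. Hypothetical helper, not the
    author's formulation of the rules.
    """
    (lt, lsem), (rt, rsem) = left, right
    if isinstance(lt, Fn) and isinstance(rt, Fn) and lt.slash == "/":
        if rt.slash == "/" and lt.arg == rt.res:
            # Harmonic R2: X/Y: f, Y/Z: g => X/Z: λx.f(g(x))
            return "R2", rdiv(lt.res, rt.arg), lambda x: lsem(rsem(x))
        if rt.slash == "\\" and lt.res == rt.arg:
            # Mixed R2': Y/Z: g, Y\X: f => X/Z: λx.f(g(x))
            return "R2'", rdiv(rt.res, lt.arg), lambda x: rsem(lsem(x))
    return None


def con(name):
    """A constant meaning that wraps its argument in a term string."""
    return lambda x: f"{name}({x})"


un = (rdiv("A", "A"), con("un"))
committed = (rdiv("A", "PP"), con("committed"))
indebted = (rdiv("A", "PP"), con("indebted"))
ness = (ldiv("A", "N"), con("ness"))

rule1, type1, sem1 = compose(un, committed)
rule2, type2, sem2 = compose(indebted, ness)
print(rule1, sem1("to(mary)"))
print(rule2, sem2("to(mary)"))
```

In both cases the affix ends up with scope over base plus complement, although it combines with the base first; the only difference between the two calls is whether the directionalities of the composed functors agree.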
BOOLEAN EVIDENCE
Let us compare the approach defended here with alternative analyses that characterize inheritance as phrasal derivation and make use of non-concatenative operations to get the affixes in their surface position. Observe that the prosodic words indebtedness, believer, comparable, etc. in the derivations above are well-formed entities morphosyntactically and semantically. This is a crucial property of our surface-compositional approach, which makes it possible to generalize the analysis to Boolean combinations. The Boolean generalization, as we will see below, is unavailable under an affix-movement approach, because the prosodic words heading inheritance constructions do not form a conjoinable entity in the semantic algebra. To implement a non-concatenative affix-movement analysis, one can adopt a variety of methods: phrasal infixation (Bach 1984), affix-raising in logical form (Pesetsky 1985), syntactic affixation (Fabb 1984), head-adjunction (Hoeksema 1985), etc. For the purposes of our discussion, these proposals are alternative executions of the same idea. As a worked-out illustration of a non-concatenative analysis, Example 3.2.4 presents a derivation of inheritance in terms of Pollard's (1984) head-wrapping operations. (The constituent nodes are annotated with the corresponding headed string, with the head in italics, and the interpretation of the nodes in bold-face.)
Example 3.2.4 Inheritance: head-wrapping.

[Wrapping] N: devotedness to Bill: ness(devoted(to(Bill)))
    [Concatenation] A: devoted to Bill: devoted(to(Bill))
        A/PP: devoted: devoted
        PP: to Bill: to(Bill)
    A\N: -ness: ness
The first step combines devoted with its complement PP by a concatenation operation; the semantic operation is functional application of the head to the complement. The second (crucial) step adds -ness, by wrapping the first argument (the complex adjectival phrase) around -ness, placing the head of the wrapped argument immediately to the left of -ness. Notice that throughout the derivation semantic scope coincides with structural c-command, as a consequence of the fact that surface compositionality is abandoned; the notion of constituent structure (semantic/syntactic) has become abstract: 'parts' for the compositionality principle are no longer 'visible parts', and the derivation above nowhere contains a syntactic/semantic constituent devotedness. It is clear that the wrapping approach is equivalent to the use of functional composition for simple cases. But wrapping does not generalize to more complex cases of Boolean coordination. Consider Example 3.2.5. Recall from Chapter 1 that given the generalized conjunction theory of Keenan & Faltz (1985), Gazdar (1980), Partee & Rooth (1983), one may require of an adequate semantic theory that it can interpret any syntactic operation of conjunction/disjunction by means of the generalized semantic operators MEET and JOIN. When we combine the affix -ness with the base devoted by means of the composition rule R2('), the result is a syntactic constituent of type N/PP, with a corresponding semantic type, a function from expressions of type(PP) to type(N); in other words, there is a semantic object in the model corresponding to the prosodic constituent devotedness. The category N/PP is of a conjoinable type, so given the generalized conjunction hypothesis, one can
expect Boolean combinations, which can be straightforwardly interpreted using the generalized coordination operators. These Boolean combinations indeed occur: we see in Example 3.2.5 that composite functions can be conjoined with identical categories, basic or derived, and if derived, derived by the same or by a different rule of morphology. Under the wrapping approach, as we saw above, there is no semantic object corresponding to the phrasal infixation step in the derivation. It is impossible, then, to assign a surface interpretation to the conjunctions of Example 3.2.5.

Example 3.2.5 Inheritance: Boolean combinations

VP/N: ness+ness: their [preparedness and willingness] to start the fight
VP/N: ce+ity: John's [reluctance or inability] to accept the offer
PP/N: basic+ness: his [fidelity and devotedness] to the king
PP/N: heid+heid: [aansprakelijk+heid en verantwoordelijk+heid] voor schade
PP/N: heid+basic: [begaan+heid en medelijden] met de gewonde
PP/N: heid+basic: [on+voldaan+heid en spijt] over de mislukking

Sample derivation:

on     voldaan   heid   en         spijt   over de mislukking
A/A    A/PP      A\N    (X\X)/X    N/PP    PP
'un    happy     ness   and        grief   over the failure'

[R1] (N/PP): MEET(spijt)(λx.heid(on(voldaan(x))))
    [R2'] (N/PP): λx.heid(on(voldaan(x)))
        [R2] (A/PP): λy.on(voldaan(y))
            (A/A): on
            (A/PP): voldaan
        (A\N): heid
    [R1] ((N/PP)\(N/PP)): MEET(spijt)
        (((N/PP)\(N/PP))/(N/PP)): MEET
        (N/PP): spijt

MEET(spijt)(λx.heid(on(voldaan(x)))) = λv_PP.MEET(spijt(v))(heid(on(voldaan(v))))
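The last line of the sample derivation distributes MEET over the shared PP argument. A minimal sketch of such a pointwise conjunction is given below; the toy model (the sets `grief_facts` and `unhappy_facts`, the uncurried two-place `meet`) is an assumption made purely for illustration and is not part of the analysis itself.

```python
def meet(a, b):
    """Generalized conjunction: Boolean 'and' at the bottom, pointwise on functions."""
    if callable(a) and callable(b):
        return lambda v: meet(a(v), b(v))
    return a and b


# Toy model (hypothetical): an N/PP meaning maps a PP meaning to a property
# of individuals, i.e. to a function from individuals to truth values.
grief_facts = {("failure", "jan"), ("failure", "piet")}
unhappy_facts = {("failure", "jan"), ("weather", "piet")}

spijt = lambda pp: lambda e: (pp, e) in grief_facts
onvoldaanheid = lambda pp: lambda e: (pp, e) in unhappy_facts

# The conjoined N/PP meaning takes the PP argument once and hands it to both conjuncts.
both = meet(spijt, onvoldaanheid)
print(both("failure")("jan"), both("failure")("piet"))
```

Because both conjuncts are functions of the same type, the recursion in `meet` bottoms out at truth values after the PP and the individual have been supplied, which is the sense in which N/PP is a conjoinable type.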
Notice in particular the cases where the product of R2(') is conjoined with a basic expression. These form crucial evidence against phrasal infixation: one cannot say here that the affix is distributed over the heads in case the base is a conjoined phrase. The same point can be made for Example 3.2.6: conjunction of the lexical comparative with the syntactic comparative. Fabb (1984:120) uses the construction unhappier with me to show that er-affixation has to be done in the syntax in order to get the proper scope relations (= more not happy with me). As we have shown, functional composition allows one to reject the correlation between semantic scope and syntactic c-command. The conjunction in Example 3.2.6 moreover shows that we need a syntactic/semantic constituent unhappier if we want to be able to derive the Boolean combinations directly. (A' stands for the type of comparative adjective phrase.)

Example 3.2.6 Conjunction: lexical and syntactic comparative.

he became   un    happy   er     and   more   impatient   with me
            A/A   A/PP    A\A'         A'/A   A/PP        PP
3.2.2. VERB-RAISING, DUTCH VERSUS GERMAN
A strong form of evidence for functional composition in the lexicon is represented by the much-discussed verb-raising constructions in Dutch and German. Whereas complement inheritance phenomena exhibit the limited productivity characteristic of word-formation rules, verb-raising clusters are formed in a completely productive way. In a number of publications, Steedman (1984, 1985) has shown how this type of discontinuous dependency can be derived in a one-level syntax by means of functional composition. One can compare this approach to Bach's (1984) or Pollard's (1984) proposals to derive the verb-raising clusters by means of non-concatenative operations, such as postponed application or head-wrapping. Generalized conjunction evidence supports the composition approach: verb-raising clusters can appear in Boolean combinations with clusters of the same syntactic category/semantic type, hence they must be analyzed as well-formed constituents both in the syntactic and in the semantic algebra. As with the inheritance construction, it is crucial for our purposes to show that it is not necessary for the conjoined clusters to have the same derivational history: in the illustrations below, one conjunct is a basic expression, the other one a product of R2'. These examples show not only that the verb-clusters must be (syntactic/semantic) constituents that can be conjoined with basic expressions of the appropriate category, but also that the complement structure of verb-raising clusters must be identical to the complement structure of basic expressions of the same category. Surface approaches to verb-raising such as Bresnan et al. (1982) posit a complement structure for verb-raising clusters in which the NP objects are embedded in VP-complements lacking their verbal head; the headless VP structures are essential to determine the dependencies between the NP's and the verbs in the cluster. This view of the complement structure of verb-raising clusters cannot be reconciled with surface coordination in cases such as those presented below.

Example 3.2.7 Verb-raising clusters: Boolean conjoinability.

TTV: basic (voorlas) and verb-raising cluster (liet navertellen)

dat    ik   haar   een   verhaal   voorlas      en    liet         navertellen
                                   NP\(NP\VP)         (NP\VP)/VP   NP\VP
that   I    her    a     story     read         and   let          retell
'that I read her a story and had her retell it'

[R1] (NP\(NP\VP)): MEET(λx.liet(navertellen(x)))(voorlas)
    (NP\(NP\VP)): voorlas
    [R1] ((NP\(NP\VP))\(NP\(NP\VP))): MEET(λx.liet(navertellen(x)))
        (((NP\(NP\VP))\(NP\(NP\VP)))/(NP\(NP\VP))): MEET
        [R2'] (NP\(NP\VP)): λx.liet(navertellen(x))
            ((NP\VP)/VP): liet
            (NP\VP): navertellen

TV: basic (neerstak) + verb-raising cluster (liet ontsnappen)

dat ik een inbreker neerstak maar daarna liet ontsnappen
that I a burglar knocked-down but then let escape
'that I knocked down a burglar but let him escape afterwards'

TV: verb-raising cluster (zag binnenkomen) + basic (omhelsde)

dat hij een meisje zag binnenkomen en meteen omhelsde
that he a girl saw enter and immediately embraced
'that he saw a girl come in and embraced her immediately'

A comparison between the Dutch and the German version of the verb-raising construction further illustrates the relation between the L-valid form of composition R2 and the mixed permutation dual R2'.
Abstracting from certain stylistic permutations, the order of the verb cluster in German embedded clauses is the mirror-image of the Dutch order, as can be seen from the examples below. That is, in German the verb-raising triggers are left-oriented functors, just like the verbs they form a cluster with. This makes the German construction derivable by means of the L-valid composition rule: the directionalities of the composed functors harmonize. In Dutch, the triggers are right-oriented functors, whereas ordinary verbs take their arguments to the left: hence the crossed dependency in Dutch between arguments and verbs. A unified account of the syntax and semantics of verb-raising clusters in Dutch and German requires the strong form of composition which allows for harmonic and mixed realizations.

Example 3.2.8 Verb-raising: Dutch versus German.

Dutch: Disharmonic Composition R2'

wil     proberen   te lezen
wants   try        to-read
'wants to try to read'

[R2'] (NP\(NP\S)): λx.wil(proberen(te+lezen(x)))
    ((NP\S)/(NP\S)): wil
    [R2'] (NP\(NP\S)): λy.proberen(te+lezen(y))
        ((NP\S)/(NP\S)): proberen
        (NP\(NP\S)): te+lezen

German: Harmonic Composition R2

zu lesen   versuchen   möchte
to-read    try         would-like
'wants to try to read'

[R2] (NP\(NP\S)): λx.möchte(versuchen(zu+lesen(x)))
    (NP\(NP\S)): zu+lesen
    [R2] ((NP\S)\(NP\S)): λy.möchte(versuchen(y))
        ((NP\S)\(NP\S)): versuchen
        ((NP\S)\(NP\S)): möchte
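The contrast in Example 3.2.8 can be checked at the level of types alone. The sketch below encodes just the two patterns that the derivations above instantiate, Z\Y, Y\X ⇒ Z\X for the German cluster and X/Y, Z\Y ⇒ Z\X for the Dutch one; the tuple encoding and the names `over`, `under` and `compose_cluster` are assumptions made for illustration.

```python
def over(res, arg):
    """The type res/arg, encoded as ('/', res, arg)."""
    return ("/", res, arg)


def under(arg, res):
    """The type arg\\res, encoded as ('\\', arg, res)."""
    return ("\\", arg, res)


def compose_cluster(left, right):
    """Compose two adjacent verbal functor types; returns (rule, type) or None."""
    if not (isinstance(left, tuple) and isinstance(right, tuple)):
        return None
    # Harmonic (German): Z\Y, Y\X => Z\X, both functors look leftward.
    if left[0] == "\\" and right[0] == "\\" and left[2] == right[1]:
        return "R2", under(left[1], right[2])
    # Mixed (Dutch): X/Y, Z\Y => Z\X, the trigger looks right, the verb left.
    if left[0] == "/" and right[0] == "\\" and left[2] == right[2]:
        return "R2'", under(right[1], left[1])
    return None


VP = under("NP", "S")                      # NP\S

# Dutch: wil proberen te lezen
wil, proberen, te_lezen = over(VP, VP), over(VP, VP), under("NP", VP)
rule_nl_inner, inner_nl = compose_cluster(proberen, te_lezen)
rule_nl, cluster_nl = compose_cluster(wil, inner_nl)

# German: zu lesen versuchen möchte
zu_lesen, versuchen, moechte = under("NP", VP), under(VP, VP), under(VP, VP)
rule_de_inner, inner_de = compose_cluster(versuchen, moechte)
rule_de, cluster_de = compose_cluster(zu_lesen, inner_de)

print(rule_nl, cluster_nl == under("NP", VP))
print(rule_de, cluster_de == under("NP", VP))
```

Both clusters end up with the type NP\(NP\S); only the rule that builds them differs, in line with the mirror-image word order.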
The derivation of the verb-raising clusters in the lexicon, we claim, is empirically supported by the fact that in Standard Dutch, the clusters have to consist of basic expressions; it is impossible (to use the transformational metaphor) to raise non-maximal projections. On the assumption that the verb clusters are formed in the lexicon, this atom condition follows from standard lexicalist
assumptions about the lexicon/syntax interface: lexical rules cannot be fed by syntactic rules. For a syntactic derivation of verb-raising, the atom condition has to be stipulated (as it is, quite explicitly, in e.g. Houtman's (1984) analysis by means of the feature specification [+L], which requires the complement of a verb-raising trigger to be a lexical expression). In Section 3.3, we return to verb-raising constructions in Germanic dialects where the atom condition does not hold, and where a syntactic derivation is called for.

Example 3.2.9 Dutch verb-raising: atom condition.

dat    John   in   de    tuin     wil     liggen
that   John   in   the   garden   wants   lie
'that John wants to lie in the garden'
* ... wil in de tuin liggen
3.2.3. DIVISION VERSUS COMPOSITION
The analysis of complement inheritance and verb-raising in the previous section was presented in terms of the binary reduction rule R2('). Hoeksema (1985), discussing the proposals of Moortgat (1984), criticizes the composition approach, pointing out that if inheritance and verb-raising are indeed lexically governed, a combinatory rule such as R2(') does not accurately localize the dependence on specific lexical items, and he proposes as an alternative a type-assignment scheme that can be lexically constrained in the desired way. We take this criticism to be well-founded, and will show that it is easily incorporated in the approach defended here, since the type-assignment schemes Hoeksema proposes are, from the Lambek perspective, instances of the Division rule, a theorem of the calculus. Moreover, within the Lambek calculus, there are two options for a unary reformulation of composition, depending on whether the division type-transition is taken to affect the main or the subordinate functor. We will see that there may be empirical use for these options. The reader will recall from the discussion in Chapter 1 that the binary Composition rule can be decomposed into two more elementary subcomponents: Division of the main functor or of the subordinate functor, followed by simple application. As we have shown, the division type-transitions are more elementary than Composition in that any derivation on the basis of Composition is also derivable on the basis of Division, but not vice versa. The decomposition of L-valid R2 in Division plus Application carries over to the permutation dual R2' and the permutation duals of the division rules, R5'/R6'.
Example 3.2.10 Composition as Division plus Application.

[R2'] (X/Z): λv.f(g(v))
    (Y/Z): g
    (Y\X): f

[R1] (X/Z): λv.f(g(v))
    [R6'] ((X/Z)/(Y\X)): λuλv.u(g(v))
        (Y/Z): g
    (Y\X): f

[R1] (X/Z): λv.f(g(v))
    (Y/Z): g
    [R5'] ((Y/Z)\(X/Z)): λuλv.f(u(v))
        (Y\X): f
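That the three derivations of Example 3.2.10 end in the same meaning can be verified by running the lambda terms directly. The sketch below does this for the semantic side only; the function names and the use of term strings as meanings are illustrative assumptions.

```python
def r2_mixed(g, f):
    """R2': Y/Z: g, Y\\X: f  =>  X/Z: λv.f(g(v))"""
    return lambda v: f(g(v))


def r5_mixed(f):
    """R5' on the main functor: Y\\X: f  =>  (Y/Z)\\(X/Z): λuλv.f(u(v))"""
    return lambda u: lambda v: f(u(v))


def r6_mixed(g):
    """R6' on the subordinate functor: Y/Z: g  =>  (X/Z)/(Y\\X): λuλv.u(g(v))"""
    return lambda u: lambda v: u(g(v))


# Meanings as term strings, taken from the 'believer in NP' derivation.
g = lambda v: f"believe({v})"   # V/PP
f = lambda y: f"er({y})"        # V\N

direct = r2_mixed(g, f)("in(ghosts)")
via_r5 = r5_mixed(f)(g)("in(ghosts)")   # divide f, then apply it to g
via_r6 = r6_mixed(g)(f)("in(ghosts)")   # shift g, then apply it to f
print(direct, via_r5, via_r6)
```

The difference between the two unary options lies only in which of the two functors undergoes the type transition and therefore acts as the functor in the subsequent Application step.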
VERB-RAISING: R5'
Let us return then to the phenomena of verb-raising and complement inheritance, and rephrase the composition analysis in terms of lexically constrained unary type transitions. In the case of verb-raising, it is clear that the LP type transition responsible for the crossed dependency has to be associated with the main functor: the so-called verb-raising triggers form a closed class of lexical atoms, and the categorial identity of the subordinate functor puts no extra constraints on the construction. The type transition of Definition 3.2.1 (for triples Type:Phonology:Semantics) makes disharmonic Division available for expressions belonging to the restricted set of verb-raising triggers.

Definition 3.2.1 R5' for verb-raising triggers.

X/Y : Verb : Sem ⇒ (Z\X)/(Z\Y) : Verb : λuλv.Sem(u(v)) if Verb ∈ VR, where VR = {willen, moeten, kunnen, ...}

Observe that the R5' type transition introduces the division subtype Z, which is to be interpreted as a variable type that will assume a categorial identity on the basis of the context. The second part of this book is devoted to a detailed discussion of the Resolution Principle, which computes the required variable bindings in polymorphic types in the form of unifying answer substitutions. We confine ourselves here to some illustrations. In Example 3.2.11, the verb-raising trigger wil (want) of type (NP\S)/(NP\S) has to be combined with the bare infinitive liggen (lie) of type PP\(NP\S) into a verb-cluster wil liggen of type PP\(NP\S). The R5' type transition is instantiated here as (NP\S)/(NP\S) ⇒ (Z\(NP\S))/(Z\(NP\S)), which leads to a successful Application reduction with the subordinate functor PP\(NP\S) under the unifying substitution {Z=PP} for the variable division subtype. As the reader can verify, the R5' analysis yields the same semantic interpretation of the verb-cluster as the composition analysis on the basis of R2'.

Example 3.2.11 Verb-raising: Division versus Composition.

(omdat John)     in   de    tuin     wil     liggen
(because John)   in   the   garden   wants   lie
'(because John) wants to lie in the garden'

Division R5':

[R1] (NP\S): wil(liggen(in(de(tuin))))
    PP: in(de(tuin))
    [R1] (PP\(NP\S)): λy.wil(liggen(y))
        [R5'] ((PP\(NP\S))/(PP\(NP\S))): λxλy.wil(x(y))
            ((NP\S)/(NP\S)): wil
        (PP\(NP\S)): liggen

Composition R2':

[R1] (NP\S): wil(liggen(in(de(tuin))))
    PP: in(de(tuin))
    [R2'] (PP\(NP\S)): λy.wil(liggen(y))
        ((NP\S)/(NP\S)): wil
        (PP\(NP\S)): liggen
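The binding {Z=PP} in Example 3.2.11 can be computed by a small unification routine. What follows is a rough sketch under several assumptions: types are encoded as tuples, the variable is written `?Z`, the trigger set is keyed on the lemmas listed in Definition 3.2.1, there is no occurs check, and none of this is meant to reproduce the Resolution Principle developed later in the book.

```python
VR = {"willen", "moeten", "kunnen"}   # the triggers named in Definition 3.2.1


def over(res, arg):
    return ("/", res, arg)            # res/arg


def under(arg, res):
    return ("\\", arg, res)           # arg\res


def r5_mixed(lemma, typ, var="?Z"):
    """X/Y => (Z\\X)/(Z\\Y), licensed only for lemmas in VR."""
    if lemma not in VR or not (isinstance(typ, tuple) and typ[0] == "/"):
        return None
    _, x, y = typ
    return over(under(var, x), under(var, y))


def walk(t, s):
    """Follow variable bindings in substitution s."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t


def unify(a, b, s):
    """Return a substitution extending s that makes a and b equal, or None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if isinstance(a, str) and a.startswith("?"):
        return {**s, a: b}
    if isinstance(b, str) and b.startswith("?"):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0]:
        for p, q in zip(a[1:], b[1:]):
            s = unify(p, q, s)
            if s is None:
                return None
        return s
    return None


def resolve(t, s):
    """Apply substitution s throughout type t."""
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(p, s) for p in t[1:])
    return t


def apply_right(fn, arg):
    """R1: X/Y, Y => X, unifying the functor's domain with the argument type."""
    if not (isinstance(fn, tuple) and fn[0] == "/"):
        return None
    s = unify(fn[2], arg, {})
    return None if s is None else (resolve(fn[1], s), s)


VP = under("NP", "S")
wil = over(VP, VP)                    # (NP\S)/(NP\S)
liggen = under("PP", VP)              # PP\(NP\S)

cluster, binding = apply_right(r5_mixed("willen", wil), liggen)
print(cluster == liggen, binding)
```

Replacing `liggen` by a verb of type AP\(NP\S) would yield the binding for AP instead, with no change to the lexical entry of the trigger.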
In the case of Example 3.2.12, the Division rule scheme R5' is instantiated with a different binding for the division subtype Z. The subordinate functor zijn is of type AP\(NP\S); the verb cluster wil zijn is derivable if Z is instantiated as AP, the argument subtype of zijn that has to be transmitted.
Example 3.2.12

(omdat John)     alleen   wil     zijn
(because John)   alone    wants   be
'(because John) wants to be alone'

[R1] (NP\S): wil(zijn(alleen))
    AP: alleen
    [R1] (AP\(NP\S)): λy.wil(zijn(y))
        [R5'] ((AP\(NP\S))/(AP\(NP\S))): λxλy.wil(x(y))
            ((NP\S)/(NP\S)): wil
        (AP\(NP\S)): zijn
INHERITANCE: R6'

Complement inheritance, as we have shown in Moortgat (1987a), can be lexically governed in two respects. Compare first the following -er and -able derivations. Governed prepositional phrases (i.e. PP's with a semantically empty head selected by the base) are inheritable complements, qua type, whereas semantically controlled complements are not. That is, we regard the starred examples below as ungrammatical products of word-formation, not as instances of limited productivity. A type restriction of this kind can be captured, as we have seen above, by constraining the identity of the subtype that will be inherited.

Example 3.2.13 Inheritable vs. non-inheritable complements.

listeners to BBC programs            derivable from NP
believers in the power of words      comparable to NP
*stayers in town                     *regardable as stupid
*wishers to leave                    *persuadable to go
On the other hand, even for complements that qualify as inheritable qua type, the actual well-formedness of inheritance cases seems to be judged on an idiosyncratic one-by-one basis. Compare the illustrations below.

Example 3.2.14 Inheritance: lexical idiosyncrasies

dreiger met zelfmoord                *vrijer met Marietje
vertalers uit het Spaans             *walger van thee
sympathizers with this proposal      *counters on progress
searchers after truth                *longers for peace
There seems to be no alternative here but to assume with Hoekstra & Van der Putten (1988) that the language-learner acquires the existing cases of inheritance on the basis of positive evidence, and stores them into a set of bases with inheritable complement, the membership of which is as idiosyncratic as that of the class of verb-raising triggers. R6' would then be licensed for bases belonging to that set.

Example 3.2.15 Inheritance: R6'

[R1] (N/PP): λx.er(believe(x))
    [R6'] ((N/PP)/(V\N)): λyλx.y(believe(x))
        (V/PP): believe
    (V\N): er
3.3. NON-CONCATENATIVE CONNECTIVES IN THE SEQUENT CALCULUS L
The type-forming connectives of the directional system L embody a theory of concatenative operations on adjacent elements: product types X•Y are interpreted as concatenations of an expression of type X and an expression of type Y, and fractional types are the inverses of products, i.e. incomplete expressions yielding an expression of the range type when concatenated with an expression of the domain type. Discontinuous dependencies suggest enrichment in the form of non-concatenative operations besides left- and right-division. In the present section, we will study some prototypical non-concatenative operations in terms of the Lambek-Gentzen sequent calculus by extending the set of type-forming connectives with extraction and infixation operators. This reformulation makes it possible to investigate the logical consequences of the non-concatenative approach from the proof-theoretic Gentzen perspective, with surprising results.
3.3.1. EXTRACTION/INFIXATION
Recall from Chapter 1 the basic relationship between the Gentzen inference rules of L and the more primitive semantics of the type-forming operators: we started from the intended interpretation of the concatenative operators {/, •, \}, repeated below in Definitions 3.3.1 and 3.3.2, and then demonstrated that the Elimination and Introduction inferences were sound with respect to this intended interpretation, i.e. that the inferences faithfully reflected the logic of the type-forming connectives. If we want to enrich L with a non-concatenative component, we can extend the set of type-forming operators with a number of connectives that have a non-concatenative interpretation, and then investigate, as the Gentzen design requires, what the rules of inference for these connectives would be.

Definition 3.3.1 Interpretation of the concatenative connectives

A•B = {xy ∈ S | x ∈ A & y ∈ B}     [Def•]
C/B = {x ∈ S | ∀y ∈ B, xy ∈ C}     [Def/]
A\C = {y ∈ S | ∀x ∈ A, xy ∈ C}     [Def\]

Definition 3.3.2 Concatenation and its inverses

A•B ⊆ C if and only if A ⊆ C/B
A•B ⊆ C if and only if B ⊆ A\C

The calculus L deals with a two-place operation on strings, string concatenation. Let us write concat(x,y,z) if the string z is the result of concatenating the strings x and y in that order. Clearly, concat is a function: for given strings x,y the value of z is uniquely determined. Now we will consider a relation on strings: the substring relation, and more specifically, the infixation relation infix(x,y,z), which holds between strings x,y,z if x is a substring of z and extracting x from z leaves the string y. Whereas the string concatenation concat(x,y,z) yields a unique value z for given strings x,y, the relation infix(x,y,z) yields a set of values, depending on the division of the string y into substrings y₁y₂. For example, let x = ab and y = cde; then the following values for z fall under the relation infix(x,y,z):

abcde, cabde, cdabe, cdeab

Now, as we considered the left and right inverses for string concatenation, let us inspect the inverses for the first and for the second argument in the relation infix(x,y,z). Suppose x is of type A, y of type B and z of type C. Focussing on the x argument of the infix relation, we will assign the type C↓B to expressions x which, infixed into an expression y of type B, yield an expression z of type C. Focussing on the y argument, we assign the type C↑A to expressions y which, together with an infix x of type A, yield an expression z of type C.
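The difference between concat as a function and infix as a relation is easy to make concrete. In the sketch below, the names `concat`, `infixations` and `infix` are illustrative choices; strings of characters stand in for strings of expressions.

```python
def concat(x, y):
    """Concatenation is a function: one result z for given x and y."""
    return x + y


def infixations(x, y):
    """All z such that infix(x, y, z): x inserted at each split point y1·y2 of y.

    Duplicates (possible when x and y share material) are removed,
    keeping the left-to-right order of insertion points.
    """
    candidates = (y[:i] + x + y[i:] for i in range(len(y) + 1))
    return list(dict.fromkeys(candidates))


def infix(x, y, z):
    """The relation itself: does extracting x from z leave y?"""
    return z in infixations(x, y)


print(concat("ab", "cde"))
print(infixations("ab", "cde"))
```

For x = ab and y = cde this enumerates exactly the four values listed above, one for each way of dividing y into y₁y₂ (including the two peripheral divisions).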
Because infix is a relation rather than a function, we have two theoretical options for the interpretation of the C↓B and C↑A types. We could assign the type C↓B to expressions x which yield an expression z of type C for all possible ways of infixing x
into an expression y of type B (the universal interpretation), or for some way of infixing x into the expression y of type B (the existential reading). Similarly, the type C↑A could be assigned to expressions y which have an expression x of type A missing somewhere, and which give an expression z in combination with such an x (the existential interpretation), or to expressions y which yield an expression z for all insertions of an expression x of type A. The linguistic applications to be discussed below suggest the existential reading for the extraction types C↑A and the universal reading for the infixation types C↓B.

Definitions 3.3.3 Non-concatenative operators

Extraction:   C↑A = {y ∈ S | ∃ y₁y₂ = y, ∀x ∈ A, y₁xy₂ ∈ C}     [Def↑]
Infixation:   C↓B = {x ∈ S | ∀ y₁•y₂ ∈ B, y₁•x•y₂ ∈ C}          [Def↓]
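The two quantifier patterns of Definitions 3.3.3 can be compared on small finite languages. The following sketch assumes that A, B and C are given as finite Python sets of strings and that single characters stand in for the elements of a sequence; the function names are hypothetical.

```python
def splits(s):
    """All ways of dividing s into a left part y1 and a right part y2."""
    return [(s[:i], s[i:]) for i in range(len(s) + 1)]


def in_extraction(y, C, A):
    """y ∈ C↑A (existential): some split y1·y2 of y puts y1·x·y2 in C for every x in A."""
    return any(all(y1 + x + y2 in C for x in A) for y1, y2 in splits(y))


def in_infixation(x, C, B):
    """x ∈ C↓B (universal): y1·x·y2 is in C for every y in B and every split of y."""
    return all(y1 + x + y2 in C for y in B for y1, y2 in splits(y))


A = {"a", "b"}
C = {"xay", "xby", "axy"}

# "xy" has a gap between x and y that every member of A can fill.
print(in_extraction("xy", C, A))
# "a" is not a universal infix for {"xy"}: the insertion "xya" is not in C.
print(in_infixation("a", C, {"xy"}))
```

The first call succeeds because a single insertion point works for all of A, whereas the second fails because one of the three insertion points yields a string outside C; this asymmetry is the existential versus universal reading discussed above.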
We comment on Definitions 3.3.3. Consider first the interpretation of the most elementary extraction operator. Just like the fractional types C/A or A\C, a type C↑A (read 'C gap A') designates an incomplete expression that wants an argument expression of type A to form an expression of type C. But whereas fractional types C/A (or A\C) concatenate with their argument under adjacency, the type C↑A is assigned to incomplete expressions that have an argument expression of type A missing somewhere, not necessarily at the periphery, and that will yield an expression of type C in combination with such an argument. The affinity of the extraction operator and the GPSG gap feature SLASH of Gazdar et al. (1985) will be apparent; as we will see below, a variant of the extraction operator '↑' was proposed in Bach (1981), and further developed in Bouma (1987), to deal with long-distance dependencies that would require slash categories on a GPSG account. The extraction operator '↑' forms types to designate the second argument in the relation infix(x,y,z). To pick out the first argument, we have the infixation operator '↓'. A functor C↓B (read 'C infix B') forms an expression of type C in combination with an argument expression of type B by being infixed anywhere within the sequence that makes up the argument type B. The type A↓B would be appropriate for the most general infixation category. In the remainder, we will concentrate on the specialized forms A > B and B < A, which are interpreted as right-infixation before the last, and left-infixation after the first element of the argument type, respectively. (We refer to Bach (1984) and references cited there for the linguistic motivation of these specialized forms of infixation. See also Hoeksema and Janda (1988) for an inventory of
illustrations in the field of morphology.) In line with our remarks in Chapter 2, we would like to parametrize the notions first/last element in prosodic, rather than type-theoretic terms (i.e. first/last word, phonological phrase, etc.).

Definition 3.3.4 Peripheral Infixation.

A > B = {x ∈ S | ∀ y₁•…•yₙ ∈ B, n>1, y₁•x•y₂•…•yₙ ∈ A}       [Def>]
B < A = {x ∈ S | ∀ y₁•…•yₙ ∈ B, n>1, y₁•…•yₙ₋₁•x•yₙ ∈ A}     [Def<]

[...] {<, ↑, >} are functors, i.e. incomplete expressions just like the usual fractional types. The type-theoretical notions of domain/range subtype, order, complexity degree and category-count can be extended straightforwardly to the new types. Also, when we investigate the logical rules for the new connectives, we can take care that they comply with the required proof-theoretic invariance properties of the derivability relation '⇒'.

Definitions 3.3.5 Type-theoretic properties of {<, ↑, >}

dom(X>Y) = dom(Y<X) = dom(X↑Y) = Y
ran(X>Y) = ran(Y<X) = ran(X↑Y) = X

subtype(X↑Y) = {(X↑Y)} ∪ subtype(X) ∪ subtype(Y)
subtype(X>Y) = {(X>Y)} ∪ subtype(X) ∪ subtype(Y)
subtype(Y<X) = {(Y<X)} ∪ subtype(X) ∪ subtype(Y)

d(X↑Y) = 1 + d(X) + d(Y)
d(X>Y) = 1 + d(X) + d(Y)
d(Y<X) = 1 + d(X) + d(Y)

order(X↑Y) = max(order(X), order(Y) + 1)
order(X>Y) = max(order(X), order(Y) + 1)
order(Y<X) = max(order(X), order(Y) + 1)

count(X, Z↑Y) = count(X,Z) - count(X,Y)
count(X, Z>Y) = count(X,Z) - count(X,Y)
count(X, Y<Z) = count(X,Z) - count(X,Y)

[...]

T', Y:v, T'' ⇒ X:Term
------------------------ [↑R]
T', T'' ⇒ X↑Y: λv.Term

Figure 3.3.2 [↑R] Semantics: lambda abstraction
ILLUSTRATION: UNBOUNDED DEPENDENCIES
For an illustration of Extraction Introduction, we return to unbounded dependencies. We have seen in Section 3.1 that the extraction domain, in cases of non-peripheral extraction such as (I know) what John put on the table, cannot be put together on the basis of L-valid order-preserving modes of combination. The composition of the extraction domain, however, is clearly not a lexical matter, so we cannot adopt the proposal of Steedman (1987a) to derive the expression John put on the table on the basis of type-restricted LP extensions: as discussed in the previous section, we want to confine type-restricted LP type transitions to the lexicon. Bouma (1987) discusses the directionality problem in cases of non-peripheral extraction in detail, and proposes a hybrid categorial system, with the familiar directional connectives '/,\' augmented by a GPSG gap feature mechanism to deal with unbounded dependencies. As the foregoing discussion may suggest, a category-forming connective with the desired extraction interpretation seems to be a more congenial way of extending L, keeping the mapping between the algebra of proofs and the denotational recipes intact. Observe that the Gap Introduction schemes, which have to be postulated in Bach (1981), Huck (1988) and Bouma (1987) to change the type of a functor X/Y to that of an X with a Y argument missing somewhere, have the status of theorems in the system proposed here.

Y => Y      X => X                Y => Y      X => X
------------------ [/L]           ------------------ [\L]
X/Y, Y => X                       Y, Y\X => X
------------------ [↑R]           ------------------ [↑R]
X/Y => X↑Y                        Y\X => X↑Y

Figure 3.3.3
Figure 3.3.4 below gives a Gentzen proof for the example sentence (I know) what John put on the table on the basis of the [↑R] inference. We assume that the extracted wh-elements have the lexical type S/(S↑NP), a lifted second order type, which requires to its right an incomplete S expression with an NP missing somewhere. The expression John put on the table indeed derives type S↑NP,
because John put NP on the table derives type S. Semantically, the crucial subsequent NP, ((NP\S)/PP)/NP, PP => S↑NP is associated with the lambda term

S↑NP: λv.put(v)(on(the(table)))(john),

an abstraction over a variable of type NP in an expression of type S, which the semantics of the wh-element can then operate upon according to one's favorite theory of interrogatives, a subject we do not have to go into here.

NP, ((NP\S)/PP)/NP, NP, PP => S    [/L, etc.]
--------------------------------- [↑R]
NP, ((NP\S)/PP)/NP, PP => S↑NP          S => S
----------------------------------------------- [/L]
S/(S↑NP), NP, ((NP\S)/PP)/NP, PP => S

(I know) what    John    put                on the table
S/(S↑NP)         NP      ((NP\S)/PP)/NP     PP

Figure 3.3.4
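The derivation of Figure 3.3.4 can be replayed mechanically. The following is a minimal Python sketch, not the program of the Appendix: it implements a naive cut-free proof search for the product-free fragment of L extended with [↑R]. The tuple encoding of types and the helper names `over`, `under`, `gap` and `derives` are assumptions made for this illustration only.

```python
from functools import lru_cache

def over(x, y):    # X/Y: wants a Y to its right
    return ('/', x, y)

def under(y, x):   # Y\X: wants a Y to its left
    return ('\\', y, x)

def gap(x, y):     # X↑Y: an X with a Y missing somewhere
    return ('^', x, y)

@lru_cache(maxsize=None)
def derives(ant, suc):
    """True if the non-empty tuple of types `ant` derives the type `suc`."""
    if ant == (suc,):                                   # axiom X => X
        return True
    if isinstance(suc, tuple):                          # right rules
        op, a, b = suc
        if op == '/' and derives(ant + (b,), a):        # [/R]
            return True
        if op == '\\' and derives((a,) + ant, b):       # [\R]
            return True
        # [↑R]: put the missing Y back at some position of the antecedent
        if op == '^' and any(derives(ant[:k] + (b,) + ant[k:], a)
                             for k in range(len(ant) + 1)):
            return True
    for i, t in enumerate(ant):                         # left rules
        if not isinstance(t, tuple):
            continue
        op, a, b = t
        if op == '/':                                   # [/L], t = X/Y
            for j in range(i + 2, len(ant) + 1):
                if derives(ant[i + 1:j], b) and \
                   derives(ant[:i] + (a,) + ant[j:], suc):
                    return True
        elif op == '\\':                                # [\L], t = Y\X
            for j in range(i):
                if derives(ant[j:i], a) and \
                   derives(ant[:j] + (b,) + ant[i + 1:], suc):
                    return True
    return False

what = over('s', gap('s', 'np'))                  # S/(S↑NP)
put = over(over(under('np', 's'), 'pp'), 'np')    # ((NP\S)/PP)/NP
print(derives((what, 'np', put, 'pp'), 's'))      # what John put on the table
print(derives((over('x', 'y'),), gap('x', 'y')))  # Gap Introduction, Figure 3.3.3
```

Since there is no clause for antecedent occurrences of '↑', a type X↑Y can only be consumed as the argument of a higher-order functor such as S/(S↑NP), which is the behaviour the text describes.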
3.3.3. INFIXATION ELIMINATION
Let us turn now to the logic governing the infixation types. We have intuitively characterized an infixing functor X↓Y as an incomplete expression that penetrates into the argument expression Y to form a result expression of type X. The following inference rule captures the intended interpretation. In order to derive Z from an antecedent containing an occurrence of an infixing functor X↓Y, we have to find subsequences T',T" adjacent to the functor, which together derive the argument type Y. The concatenation T',T", as before, must be non-empty. The semantic operation interpreting the [↓L] inference is functional application of the meaning of the infixing functor X↓Y to the meaning one obtains for the argument type Y by executing the subproof T',T" => Y. The '↓' Elimination rule preserves decidability: the premises of the [↓L] inference have a lower degree than the conclusion, since the antecedent occurrence
of the connective has been removed, and the premise sequents consist of subtypes of the conclusion.

T', T" => Y        U, X, V => Z
-------------------------------- [↓L]
U, T', X↓Y, T", V => Z

T', T" => Y: a        U, X: f(a), V => Z
----------------------------------------- [↓L]
U, T', X↓Y: f, T", V => Z

Figure 3.3.5 Infixation Elimination: syntax and semantics

The special instances of right- and left-peripheral infixation '>' and '<' are derived from the general case of Figure 3.3.5 by requiring that either T' or T" be a single type, instead of a sequence of types. We present the syntax of the [>L] inference below, with the variable conventions that Y' is a type, T" a sequence of types, possibly empty. The reader can easily supply the symmetric dual for [<L], and the semantic interpretation for these inferences.

T", Y' => Y        U, X, V => Z
-------------------------------- [>L]
U, T", X > Y, Y', V => Z

Figure 3.3.6 Right-peripheral Infixation Elimination
ILLUSTRATION: VERB-PROJECTION RAISING
For an illustration of infixation, we can turn to the Germanic dialects where the atom condition on verb-raising clusters, which characterizes the construction in Standard Dutch, does not hold. In
several dialects of Flemish and Swiss German (see Haegeman and Van Riemsdijk (1986)), the verb-raising cluster can contain phrasal material, i.e. the subordinate functor can consume some of its phrasal arguments, before entering the crossed dependency with the main functor. In these cases, then, the construction does not qualify as lexical. The crossed dependency can be derived by assigning the verb-raising triggers a right-peripheral infixation type, i.e. by characterizing them as right-oriented phrasal infixes that penetrate into the argument expression.

Example 3.3.1 Phrasal infixation: Flemish verb-raising.

dat    hij    haar    van de stoel    af     wil        duwen
       NP     NP      PP              PRT    VP'>VP     PRT\(PP\(NP\VP))
that   he     her     of the chair    off    wants      push
'that he wants to push her off the chair'

dat hij haar van de stoel wil af duwen
dat hij haar wil van de stoel af duwen

Of the three alternatives in Example 3.3.1, the first two are acceptable in Standard Dutch, because the clusters af wil duwen and wil af duwen qualify as lexical prosodically. The third alternative, where the prepositional phrase van de stoel appears to the right of the verb-raising trigger, is acceptable in Flemish only. The derivations in Figures 3.3.7 and 3.3.8 below contrast the first and the last alternative. (To distinguish domain and range of the infixing functor, we have written VP' > VP.)

NP, PP, PRT, PRT\(PP\(NP\VP)) => VP        VP' => VP'
------------------------------------------------------ [>L]
NP, PP, PRT, VP' > VP, PRT\(PP\(NP\VP)) => VP'

Figure 3.3.7 (dat hij) haar van de stoel af wil duwen

The Elimination rule [>L] must find a type Y' to the right of VP'>VP, and a rest T" (possibly empty) to the left, so that T",Y' together derive the argument subtype VP. In the case of Figure 3.3.7, the [>L] inference can be performed immediately, with the bindings
{Y' = PRT\(PP\(NP\VP)), T" = [NP,PP,PRT]}. In Figure 3.3.8, two [\L] inferences reduce the functor PRT\(PP\(NP\VP)) to the type (NP\VP), before the application of [>L], with bindings {Y' = NP\VP, T" = [NP]}.

NP, NP\VP => VP        VP' => VP'
---------------------------------- [>L]
PP => PP        NP, VP' > VP, NP\VP => VP'
------------------------------------------- [\L]
PRT => PRT        NP, VP' > VP, PP, PP\(NP\VP) => VP'
------------------------------------------------------ [\L]
NP, VP' > VP, PP, PRT, PRT\(PP\(NP\VP)) => VP'

Figure 3.3.8 (dat hij) haar wil van de stoel af duwen

The derivations of Figures 3.3.7 and 3.3.8 concentrate on the logic of infixation Elimination. They do not reflect the prosodic and semantic constituency which we claimed to be necessary for Boolean coordination of verb (projection) clusters. But of course, the restructuring approach of Chapter 2 is applicable here too: for the cut-free derivations of Figures 3.3.7 and 3.3.8, there are alternative proofs with a Cut inference grouping the verb-raising trigger and the adjacent parameter Y' into a constituent, as exemplified by the following mixed form of composition, where a right-peripheral infixation functor penetrates into a left-oriented domain:

wil         duwen
VP' > VP,   PRT\(PP\(NP\VP))   =>   PRT\(PP\(NP\VP'))
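The three word orders of Example 3.3.1 can be checked with a small proof search that adds the [>L] inference of Figure 3.3.6 to the left rules for '/' and '\'. This is a hedged Python sketch rather than the author's implementation: the tuple encoding, the atom name `vp1` (standing in for VP') and the function names are assumptions of the illustration.

```python
from functools import lru_cache

def under(y, x):      # Y\X: wants a Y to its left
    return ('\\', y, x)

def rinfix(x, y):     # X > Y: right-peripheral infix with argument type Y
    return ('>', x, y)

@lru_cache(maxsize=None)
def derives(ant, suc):
    """True if the non-empty tuple of types `ant` derives the type `suc`."""
    if ant == (suc,):                                   # axiom X => X
        return True
    if isinstance(suc, tuple):                          # right rules
        op, a, b = suc
        if op == '/' and derives(ant + (b,), a):        # [/R]
            return True
        if op == '\\' and derives((a,) + ant, b):       # [\R]
            return True
    for i, t in enumerate(ant):                         # left rules
        if not isinstance(t, tuple):
            continue
        op, a, b = t
        if op == '/':                                   # [/L], t = X/Y
            for j in range(i + 2, len(ant) + 1):
                if derives(ant[i + 1:j], b) and \
                   derives(ant[:i] + (a,) + ant[j:], suc):
                    return True
        elif op == '\\':                                # [\L], t = Y\X
            for j in range(i):
                if derives(ant[j:i], a) and \
                   derives(ant[:j] + (b,) + ant[i + 1:], suc):
                    return True
        elif op == '>' and i + 1 < len(ant):            # [>L], t = X > Y
            y1 = ant[i + 1]                             # Y': one type to the right
            for j in range(i + 1):                      # T" = ant[j:i], may be empty
                if derives(ant[j:i] + (y1,), b) and \
                   derives(ant[:j] + (a,) + ant[i + 2:], suc):
                    return True
    return False

wil = rinfix('vp1', 'vp')                               # VP' > VP
duwen = under('prt', under('pp', under('np', 'vp')))    # PRT\(PP\(NP\VP))

print(derives(('np', 'pp', 'prt', wil, duwen), 'vp1'))  # haar van de stoel af wil duwen
print(derives(('np', 'pp', wil, 'prt', duwen), 'vp1'))  # haar van de stoel wil af duwen
print(derives(('np', wil, 'pp', 'prt', duwen), 'vp1'))  # haar wil van de stoel af duwen
```

The sketch checks derivability in the calculus only; the prosodic conditions that separate Standard Dutch from Flemish are not modelled.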
DIRECTIONALITY PRESERVATION PROPERTIES AS THEOREMS
Let us investigate more closely the relationship between the LP based forms of mixed composition R2' discussed in the previous section, and their alternative in the non-concatenative extension of
L. In the infixation calculus for the connectives '>' and '<', mixed composition, in its most elementary realization, assumes the form

X > Y, Z\Y => Z\X
Y/Z, Y < X => X/Z,

i.e. the main functor, instead of a right- or left-oriented concatenative type X/Y (or Y\X), is interpreted here as a right- or left-peripheral infixation type X > Y (or Y < X). [...]

X/Y, Y/Z => X/Z      |   Y/Z, X/Y => X/Z
X/Y, Y/Z => Z\X      |   Y/Z, X/Y => Z\X
X/Y, Z\Y => X/Z      |   Z\Y, X/Y => X/Z
X/Y, Z\Y => Z\X      |   Z\Y, X/Y => Z\X
---------------------+---------------------
Y\X, Y/Z => X/Z      |   Y/Z, Y\X => X/Z
Y\X, Y/Z => Z\X      |   Y/Z, Y\X => Z\X
Y\X, Z\Y => X/Z      |   Z\Y, Y\X => X/Z
Y\X, Z\Y => Z\X      |   Z\Y, Y\X => Z\X

Table 3.3.1

Figure 3.3.9 below presents the LP proof of a pathological case: the combination of a left-oriented main functor Y\X with, to its right (i.e. at the wrong side), a right-oriented subordinate functor Y/Z, with as result a left-oriented composite functor Z\X, i.e. a functor violating directionality inheritance. Notice the crucial Permutation inference step which permutes the antecedent types. The subproof above the Permutation inference is unfolded entirely in terms of L inferences.
Y => Y        X => X
-------------------- [\L]
Z => Z        Y, Y\X => X
------------------------- [/L]
Y/Z, Z, Y\X => X
------------------------- [Permutation]
Z, Y\X, Y/Z => X
------------------------- [\R]
Y\X, Y/Z => Z\X

Figure 3.3.9
The L-valid forms of composition are the order-preserving versions at the extreme ends of the spectrum. As we have seen in Section 3.1, the extension of L with mixed forms of composition envisages only two extra forms, which respect the directionality requirements of the main and the subordinate functor: X/Y,Z\Y => Z\X
Y/Z,Y\X=>X/Z.
The notion of LP derivability, in other words, is much too strong to characterize the empirically motivated extension of L. The required restrictions have to be explicitly stipulated in the form of two directionality principles in Steedman (1987a:407ff). Directional Consistency rules out the top right cell of Table 3.3.1, and the bottom left cell, i.e. the LP forms of composition where the main functor X/Y follows the subordinate functor, or where a principal functor Y\X precedes its argument. Directional Inheritance rules out the cases where the orientation of the inherited subtype Z in antecedent and succedent does not match.

Principle 3.3.1 Directional Consistency. All syntactic combinatory rules must be consistent with the directionality of the principal function.
Principle 3.3.2 Directionality Inheritance. If the category that results from the application of a combinatory rule is a function category, then the slash defining the directionality for a given argument in that category will be the same as the one defining directionality for the corresponding argument(s) in the input function(s).

If we reinterpret mixed composition in the extended calculus L + { <, > } as the combination of an infixation main functor with an ordinary concatenative subordinate functor, the principles of Directionality Inheritance and Consistency fall out as theorems with respect to the infixation connectives. Among the full set of LP-valid forms of composition, L + { <, > } carves out exactly those mixed forms that are motivated on the basis of the empirical evidence. Figure 3.3.10 unfolds the relevant Gentzen proof for the right-peripheral infixation case X > Y, Z\Y => Z\X; the proof of the symmetric dual proceeds as the mirror image.

Z => Z        Y => Y
-------------------- [\L]
Z, Z\Y => Y        X => X
------------------------- [>L]
Z, X > Y, Z\Y => X
------------------------- [\R]
X > Y, Z\Y => Z\X

Figure 3.3.10 Mixed Composition: Right Infixation

In Figure 3.3.11 we show why the unwanted cases of Table 3.3.1 are underivable: the extension of L with an Elimination inference for the infixation connectives { <, > } is conservative with respect to the concatenative operators {/,\}. In order to derive X > Y, Z\Y => X/Z, the [/R] inference for the succedent type X/Z adds the subtype Z to the right of the antecedent sequence, thus blocking a successful proof, since the Z subtype is now on the wrong side of its functor Z\Y.
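The semantic recipe that goes with the proof of Figure 3.3.10 can be written out step by step. The derivation below is a sketch under two assumptions: that [>L], like the general [↓L] inference, is interpreted by functional application, and that [\R], like [↑R], is interpreted by lambda abstraction. The term variables f, g and z are chosen for the illustration.

```latex
\begin{align*}
z : Z,\; g : Z\backslash Y &\Rightarrow g(z) : Y && [\backslash L] \\
z : Z,\; f : X > Y,\; g : Z\backslash Y &\Rightarrow f(g(z)) : X && [>L] \\
f : X > Y,\; g : Z\backslash Y &\Rightarrow \lambda z.\, f(g(z)) : Z\backslash X && [\backslash R]
\end{align*}
```

On these assumptions the mixed composition of an infixing main functor with a concatenative subordinate functor receives the ordinary function-composition term.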
fail               fail
Z\Y => Y           X, Z => X
---------------------------- [>L]
X > Y, Z\Y, Z => X
---------------------------- [/R]
X > Y, Z\Y => X/Z

Figure 3.3.11

3.3.4. EXTRACTION, INFIXATION: PARTIAL LOGIC
The attentive reader will have noticed that the inference rules for the non-concatenative connectives discussed so far represent a partial logic: we have discussed an Introduction rule for succedent occurrences of the extraction operator and Elimination inferences for antecedent occurrences of the infixation operators. As the following paragraphs will show, it turns out to be impossible, within the bounds of the Gentzen sequent calculus, to give a complete logic for the non-concatenative connectives, i.e. to complement the set of inference rules with an Elimination rule [↑L] for extraction, and an Introduction rule for infixation. These inferences are beyond the expressive power of the sequent approach because of the monotonicity of Gentzen derivations: a subtype obtained in the course of a Gentzen derivation, and the corresponding objects in the phonological and the semantic algebra, cannot be decomposed any more. Consider first what an Elimination inference for the extraction connective '↑' would have to look like. A little reflection on the semantics of this connective, repeated here for convenience, shows that an antecedent occurrence of an incomplete expression C↑A would have to consume its argument A by circumfixation, i.e. by wrapping itself around the argument expression.

C↑A = {y ∈ S | ∃ y1·y2 = y, ∀x ∈ A, y1·x·y2 ∈ C}    [Def↑]
Functor wrapping, in the sense intended by Bach (1981, 1984), Huck (1988), etc. is beyond the expressive power of the sequent calculus. The attempts at a [↑L] inference in Figure 3.3.12 represent degenerate cases, where wrapping reduces to simple left- or right-application, i.e. they do not faithfully reflect the full semantics of the connective.
T => Y        U, X, V => Z                  T => Y        U, X, V => Z
--------------------------- [↑L] ???        --------------------------- [↑L] ???
U, T, X↑Y, V => Z                           U, X↑Y, T, V => Z

Figure 3.3.12

As we remarked above, the inexpressibility of Extraction Elimination is a consequence of the monotonicity of Gentzen derivations. As soon as the type X↑Y has been derived (with corresponding objects in the semantic/prosodic algebra), the partial objects construed have the status of unanalyzable atoms that can be used as building blocks in the derivation of larger entities, but that cannot be changed any more. A Gentzen proof has no memory for past derivational history. The unavailability of Extraction Elimination, we would like to suggest, is a built-in formal constraint of the Gentzen approach which, from an empirical point of view, is a strength rather than a shortcoming. A full logic for the extraction connective immediately entails LP-collapse. The demonstration is straightforward: if next to the Introduction rule [↑R], we have an Elimination rule, even in the degenerate form of Figure 3.3.12, we can switch the directionality of the pure concatenative connectives, i.e. a full logic for the connective '↑' is no longer conservative with respect to the connectives '/,\'.
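B => B        A => A                B => B        A => A
-------------------- [/L]           -------------------- [↑L]
A/B, B => A                         B, A↑B => A
-------------------- [↑R]           -------------------- [\R]
A/B => A↑B                          A↑B => B\A
-------------------------------------------------------- [Cut]
A/B => B\A

Figure 3.3.13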
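The two premises of the Cut in Figure 3.3.13 can be checked with a proof search in which the degenerate [↑L] inferences of Figure 3.3.12 are switched on by a flag. This is a Python sketch under assumed names (`derives`, `with_gap_elim`); the search is cut-free, so the Cut step itself is not performed and the conclusion A/B => B\A has to be assembled by hand from the two premises.

```python
from functools import lru_cache

def over(x, y):    # X/Y
    return ('/', x, y)

def under(y, x):   # Y\X
    return ('\\', y, x)

def gap(x, y):     # X↑Y
    return ('^', x, y)

@lru_cache(maxsize=None)
def derives(ant, suc, with_gap_elim=False):
    """Cut-free search; with_gap_elim adds the degenerate [↑L] rules."""
    g = with_gap_elim
    if ant == (suc,):                                        # axiom
        return True
    if isinstance(suc, tuple):                               # right rules
        op, a, b = suc
        if op == '/' and derives(ant + (b,), a, g):          # [/R]
            return True
        if op == '\\' and derives((a,) + ant, b, g):         # [\R]
            return True
        if op == '^' and any(derives(ant[:k] + (b,) + ant[k:], a, g)
                             for k in range(len(ant) + 1)):  # [↑R]
            return True
    for i, t in enumerate(ant):                              # left rules
        if not isinstance(t, tuple):
            continue
        op, a, b = t
        # argument to the right: [/L], or degenerate [↑L] when enabled
        if op == '/' or (op == '^' and g):
            for j in range(i + 2, len(ant) + 1):
                if derives(ant[i + 1:j], b, g) and \
                   derives(ant[:i] + (a,) + ant[j:], suc, g):
                    return True
        # argument to the left: [\L], or degenerate [↑L] when enabled
        if op == '\\' or (op == '^' and g):
            arg, res = (a, b) if op == '\\' else (b, a)
            for j in range(i):
                if derives(ant[j:i], arg, g) and \
                   derives(ant[:j] + (res,) + ant[i + 1:], suc, g):
                    return True
    return False

a_over_b, a_gap_b, b_under_a = over('a', 'b'), gap('a', 'b'), under('b', 'a')
print(derives((a_over_b,), a_gap_b))          # left premise of the Cut, by [↑R]
print(derives((a_gap_b,), b_under_a, True))   # right premise, needs [↑L]
print(derives((a_gap_b,), b_under_a, False))  # not available in the partial logic
```

With the flag off, the second premise is underivable, so the directionality switch cannot be assembled in the partial logic.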
Turning to Infixation, the mirror image of the Extraction situation obtains: a succedent inference which would faithfully reflect the intended interpretation of the infixation connectives cannot be formulated in the sequent calculus. Figure 3.3.14 again represents the degenerate case where infixation (right-peripheral infixation, in the example) reduces to right-division.

T, Y => X
---------- [>R] ???
T => X > Y

Figure 3.3.14
CONCLUSION
Our explorations in this section are tentative but, we hope, relevant for further research in this area. Let us summarize the findings. The Lambek-Gentzen sequent calculus models the antecedent as an ordered list of type occurrences. The expressive power of such a system is limited when we want to take non-concatenative operations into account. Nevertheless, L can be extended with a partial logic for non-concatenative operations of infixation and extraction, offering a basis for the study of the linguistic phenomena which motivate such operations. A full logic is unavailable because of the monotonicity of Gentzen derivations, a built-in property of the sequent design. On empirical grounds, the full logic seems to be undesirable, since it would entail LP-collapse. We have seen in Chapter 1 that there is a precedent for the full/partial logic opposition. The distinction between a full and a partial logic for the division connectives {/,\} is exactly what makes the difference between (pure concatenative) L, where the connectives have Elimination and Introduction rules, and the classical categorial system AB, which has only Elimination rules. With the partial logic for the extraction/infixation operators presented here, Gap Introduction is derivable as a theorem on the basis of the [↑R] rule. But the missing element in an expression with a gap cannot be satisfied at arbitrary places, since there is no Elimination rule [↑L] for the extraction connective. The only way to use a type X↑Y in a derivation is to have it as the argument of a lifted gap-filler type, such as the fronted wh-type S/(S↑NP). Conversely, infixation types X↓Y can be assigned to atoms in the lexicon, and eliminated in the course of a derivation by means of [↓L] inferences, but they are not derivable.
A line of research which is strongly suggested by the discussion in this section, but left unexplored here, would be to change the design of the axiomatization of the sequent calculus itself, in such a way that the rules of inference for the type-forming connectives would treat the antecedent not as a list of type occurrences, but, for example, as a partially ordered multiset, with the inference steps preserving this partial order. We leave this as a topic for further research.
OUTLINE OF PART TWO
In the second part of this book, we shift from an empirical to a logical perspective. Within the framework of logic programming, we study the combination of the Lambek-Gentzen sequent calculus with the resolution proof procedure, a system which we will designate as LR. The central research question to be addressed here is the problem of variable type assignment (polymorphism) in L. Resolution and unification have been proposed at the type level as generalizations of functional application in weaker calculi, for example, within an AB framework in Zeevat, Klein & Calder (1987), or in Uszkoreit (1986) for what is essentially an F calculus. We will extend the use of resolution to the inference rules of the sequent calculus L as such, thus reducing the problem of polymorphism to that of computing answer substitutions for the unknowns in nonground sequents. Chapter 5 discusses these issues in depth, and Chapter 4 prepares the ground. Chapter 4 is devoted to the design of the conceptual tool for our investigation: a reformulation of the Lambek-Gentzen sequent calculus into Horn clause logic. We proceed in two stages of abstraction. First, a direct Horn clause embedding of the system L is presented by transforming the L inference rules into a clausal axiomatization of the derivability relation '=>'. The proof for the validity of a sequent assumes the form of a resolution refutation, i.e. a demonstration that the negation of the goal sequent is incompatible with the clauses for '=>'. We then move to a more abstract point of view, and build an interpreter for LR. The interpreter treats the rules of the sequent calculus L as a grammar formalism, and assumes full responsibility for turning these rules into Horn clause logic, automatically building a Gentzen proof tree as a record of a successful resolution refutation.
The move from object-language encoding of L to meta-level interpretation is motivated by a desire for modularity in the light of the extensions to be presented in the following sections, and for independence of Prolog-specific implementation aspects of the resolution principle. LR is extended with the semantic interpretation procedure, on the initial assumption of a rigid category-to-type mapping, an assumption that will be modified in the following chapter. The type-theoretic terms interpreting L sequents are built up compositionally together with their Gentzen proof tree. Finally, we make a brief excursus into the Lambek-Gentzen search space, studying two proof-theoretic properties which allow for significant reductions of the search complexity, while preserving the completeness of the proof procedure. The property of count-invariance makes it possible to prune away all paths that
cannot lead to a solution. The complexity degree of an inference yields an evaluation-function for the remaining nodes, so that the proof for a sequent can be reached with the least effort. Chapter 5 then turns to our central concern when proposing the system LR: the problem of solving L equations for sequents containing variable types. The discussion so far was restricted to ground goal sequents. In Chapter 5 this restriction is removed, and we face the problem of computing answer substitutions for non-ground sequents in polymorphic variants of L. Because L itself is a calculus of type-change, the discussion can focus on irreducible polymorphism, i.e. variable type-assignment which is not covered by the concept of count-preserving L-valid type transitions. Boolean coordination, as we saw in the previous chapters, presents the prime instance of irreducible polymorphism within L. In order to probe the set of answer substitutions for polymorphic sequents, we have to replace the incomplete search regime of the Prolog theorem prover by a complete one. The decidability problem is traced back to the logical infinity of L. We complement the top-down normal form control regime of LR with a bottom-up alternative, by adding the Cut, together with an inference engine responsible for generating auxiliary lemmas for the Cut inference. The bottom-up inference engine, the system M, recursively generalizes the rule of functional application on the basis of the monotonicity properties of the categorial operators, absorbing conditionalization. The M reduction laws have the status of L-valid derived inference rules. We show that M is properly embedded within L. From the parent system, it inherits the property of associativity: if a sequent A1,...,An => B is derivable, it is derivable for any bracketing of the antecedent A1,...,An types into constituents.
But the solution set for M equations is finite, unlike the solution set in the full L counterpart: for input types A1,...,An, the M reductions compute result types B in terms of the finite set of subtype occurrences of the antecedent types A1,...,An. The properties of the system M are illustrated with a discussion of left-associative parsing and conjunction of nonconstituents. Finally, we return to the relationship between the syntax of Gentzen proofs and their semantic interpretation. We replace the rigid category-to-type mapping by a flexible one, and complement the calculus of syntactic type-transitions with a calculus of semantic type change. Throughout this part of our study, a basic familiarity with logic programming and automatic theorem proving will be helpful, although it is no prerequisite for an understanding of the theoretical issues: the necessary concepts are informally introduced where needed. A useful introduction to logic programming in connection with natural language analysis is Pereira and Shieber (1987). The discussion of the LR interpreter in Section 4.2 closely follows their book. Our
major source for the foundations of automatic theorem proving has been Gallier (1986), where the reader can find an in-depth treatment of Gentzen sequent calculi and their relation to resolution theorem proving. In order to enable the inquisitive reader to experiment with the calculi discussed here, an Appendix has been added. It includes the complete programs for the extended LR interpreter together with the inference rules for the cut-free Lambek-Gentzen calculus, the extension to L+{Cut} and the lemma database M, and the interpretation procedure for these various systems. The examples in this book have been derived on the basis of the programs in the Appendix. The objective of the programs, we should stress, is to provide a flexible experimental tool for the theoretical issues discussed in the following chapters: procedural tricks that would be motivated by efficiency considerations have been omitted. For research on efficient implementations of the calculi discussed here we refer to Van der Wouden & Heylen (1988) or Van Paassen (1988).
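The count-invariance property mentioned in this outline lends itself to a short illustration. The Python sketch below assumes the usual count function for the concatenative connectives, in which a functor's count is the count of its range minus the count of its domain and a product's count is the sum of the counts of its parts; the product clause and all names (`count`, `atoms`, `balanced`) are assumptions of the sketch, not definitions taken from this text. A sequent that fails the check cannot be derivable in L, so the check can be used to discard search paths early; passing it is a necessary condition only.

```python
def over(x, y):    # X/Y, stored as ('/', X, Y)
    return ('/', x, y)

def under(y, x):   # Y\X, stored as ('\\', Y, X)
    return ('\\', y, x)

def count(atom, t):
    """Net number of occurrences of `atom` in type `t`: range minus domain."""
    if isinstance(t, str):
        return 1 if t == atom else 0
    op, a, b = t
    if op == '/':
        return count(atom, a) - count(atom, b)
    if op == '\\':
        return count(atom, b) - count(atom, a)
    if op == '*':                      # product: assumed to add up
        return count(atom, a) + count(atom, b)
    raise ValueError('unknown connective: %r' % (op,))

def atoms(t):
    """Set of basic types occurring in `t`."""
    if isinstance(t, str):
        return {t}
    return atoms(t[1]) | atoms(t[2])

def balanced(ant, suc):
    """Necessary condition for `ant => suc`: counts agree for every atom."""
    names = atoms(suc).union(*(atoms(t) for t in ant))
    return all(sum(count(x, t) for t in ant) == count(x, suc)
               for x in names)

clause = (over('s', 's'), over('np', 'n'), 'n', 'ap',
          under('ap', under('np', 's')))
print(balanced(clause, 's'))                      # counts agree
print(balanced(('np', 'np'), 's'))                # counts disagree: prune
print(balanced((under('np', 's'), 'np'), 's'))    # agrees, yet not derivable in L
```

The last call shows why the check only prunes: the order of the types is ignored, so a balanced sequent still has to be submitted to the proof procedure.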
CHAPTER 4 THE LAMBEK-GENTZEN CALCULUS WITH RESOLUTION
4.1. THE LAMBEK-GENTZEN SEQUENT CALCULUS AS HORN CLAUSE LOGIC
In this section, we transform the Gentzen implementation of L into a logic program, in order to combine the proof procedure for Horn clause logic (resolution) with the Gentzen sequent proof procedure. The transformation, as we have shown in Moortgat (1988b), can be very transparent. The rules of the sequent calculus as we have presented them in Chapter 1 can in fact be read as Horn clauses in thin disguise. Let us introduce informally the relevant concepts from logic programming (see e.g. the first chapter of Sterling and Shapiro 1986), and establish the correspondence with the concepts introduced in Chapter 1 in the discussion of the sequent calculus L. A logic program is a finite set of Horn clauses (clauses for short) collectively defining the relationships between objects. A clause (also called a rule) is a logical expression (which is implicitly taken to be universally quantified) of the form
A ← B1,...,Bn    (n ≥ 0)

A and Bi are atomic formulae (literals) of the form R(t1,...,tm), where R is the name of a relation with arity m and the ti its argument terms. We interpret clauses such as the above as 'A is implied by the conjunction of the Bi', i.e. as inference rules defining a relation R in the consequent A. Given the interpretation of '←' as logical implication, the consequent A counts as a positive literal, and the antecedent Bi's as negative literals. The consequent A is called the head of the clause, and the conjunction of Bi's the body. Clauses with an empty body (i.e. n = 0) are also called facts. Clauses with an empty head are negative, or goal clauses. A computation of a logic program is a demonstration that a given goal is logically implied by the program. We will see below how this demonstration of logical consequence proceeds. Recall now from Chapter 1 the definition of the sequent implementation for the Lambek calculus. For convenience, we repeat the necessary constructs of axiom and inference schemes as Definitions 4.1.1 and 4.1.2.
Definition 4.1.1 The axioms of L are sequents of the form X => X.

Definition 4.1.2 The inference rules of L (Type-forming connectives: {/,•,\}; X,Y,Z are types, P,T,Q,U,V sequences of types, P,T,Q non-empty).

[/R]   T => X/Y             if  T,Y => X
[\R]   T => Y\X             if  Y,T => X
[/L]   U,X/Y,T,V => Z       if  T => Y and U,X,V => Z
[\L]   U,T,Y\X,V => Z       if  T => Y and U,X,V => Z
[•L]   U,X•Y,V => Z         if  U,X,Y,V => Z
[•R]   P,Q => X•Y           if  P => X and Q => Y
This set of rules axiomatizes the two-place relation of Lambek derivability. For convenience, we write the relational symbol '=>' in infix notation between its arguments. The objects of which the relation holds are sequences of types Antecedent, Succedent, i.e. Antecedent => Succedent, where Antecedent is non-empty and Succedent restricted to sequences of length one, i.e. sequences consisting of a single type. There is one unit clause for the relation '=>': the axiom scheme X => X. The inference rules have the general pattern Conclusion if Premise(s), or, in Horn clause form, Conclusion :- Premise(s), where Conclusion and Premise(s) are sequents. Each inference rule picks out a type-forming operator, either in the Antecedent (left rules) or the Succedent (right rules) of the Conclusion sequent, and eliminates it by breaking up the Conclusion into one or two Premise sequents, in which the operator in question does not appear any more. In Definition 4.1.2, this factorization of sequents is represented informally, with the variable conventions that P,T,Q,U,V denote sequences of types, P,T,Q non-empty, and X,Y,Z types. The comma notation 'U,V' represents the concatenation of the types in U,V. The most direct way to turn the inference rules into executable Horn
clause logic is to make the informal representation of sequent factorization explicit, as is done in Program 4.1.1. Factorization of the Antecedent sequence into subsequences P,T,Q,U,V is effected by the familiar append relation which holds between two sublists List1, List2 and their list concatenation. (For example: append holds between the lists [a,b], [c,d] and [a,b,c,d].) We use the same variable name conventions as before except for the factors P,T,Q, where the Prolog notation [Head | Tail] for a list with a first element Head and rest Tail allows for a direct representation of the condition that these variables be non-empty. The terms [T | Rest], [P | Rest], [Q | Rest] will not unify with the empty list []. (Notice that here and in what follows, we will identify a predicate symbol p of arity n as p/n. Variable names start with an uppercase character, names of constants with a lower case symbol.)
% Lambek-Gentzen derivability =>/2: Antecedent => Succedent
% (Infix operator declarations for =>, /, \ and • are assumed; they are not shown here.)

% Axiom (cf. Definition 4.1.1)
[X] => [X].

% Inference rules (cf. Definition 4.1.2)

% [\R]
[T | Rest] => [Y\X] :-
    [Y,T | Rest] => [X].

% [/R]
[T | Rest] => [X/Y] :-
    append([T | Rest], [Y], PremiseAntecedent),
    PremiseAntecedent => [X].

% [•L]
Antecedent => [Z] :-
    append(U, [X•Y | V], Antecedent),
    append(U, [X,Y | V], PremiseAntecedent),
    PremiseAntecedent => [Z].

% [/L]
Antecedent => [Z] :-
    append(U, [X/Y | Right], Antecedent),
    append([T | Rest], V, Right),
    [T | Rest] => [Y],
    append(U, [X | V], PremiseAntecedent),
    PremiseAntecedent => [Z].

% [\L]
Antecedent => [Z] :-
    append(Left, [Y\X | V], Antecedent),
    append(U, [T | Rest], Left),
    [T | Rest] => [Y],
    append(U, [X | V], PremiseAntecedent),
    PremiseAntecedent => [Z].

% [•R]
[A,B | Rest] => [X•Y] :-
    append([P | RestP], [Q | RestQ], [A,B | Rest]),
    [P | RestP] => [X],
    [Q | RestQ] => [Y].

% append/3: append(List1, List2, ConcatenatedList)
append([], List, List).
append([H | List1], List2, [H | List3]) :-
    append(List1, List2, List3).

Program 4.1.1 The sequent calculus L: direct Prolog embedding.
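For readers who want to experiment outside Prolog, the six inference rules of Definition 4.1.2 can also be read as a recursive decision procedure. The following Python sketch is a rendering under stated assumptions and not a transcription of Program 4.1.1: list factorization by append/3 is replaced by tuple slicing, nondeterminism by explicit loops, and the names `over`, `under`, `prod` and `derives` are hypothetical.

```python
from functools import lru_cache

def over(x, y):    # X/Y
    return ('/', x, y)

def under(y, x):   # Y\X
    return ('\\', y, x)

def prod(x, y):    # X•Y
    return ('*', x, y)

@lru_cache(maxsize=None)
def derives(ant, suc):
    """True if the non-empty tuple of types `ant` derives the type `suc`."""
    if ant == (suc,):                                   # axiom X => X
        return True
    if isinstance(suc, tuple):                          # right rules
        op, a, b = suc
        if op == '/' and derives(ant + (b,), a):        # [/R]
            return True
        if op == '\\' and derives((a,) + ant, b):       # [\R]
            return True
        if op == '*' and any(derives(ant[:k], a) and derives(ant[k:], b)
                             for k in range(1, len(ant))):   # [•R], P and Q non-empty
            return True
    for i, t in enumerate(ant):                         # left rules
        if not isinstance(t, tuple):
            continue
        op, a, b = t
        if op == '/':                                   # [/L], T = ant[i+1:j]
            for j in range(i + 2, len(ant) + 1):
                if derives(ant[i + 1:j], b) and \
                   derives(ant[:i] + (a,) + ant[j:], suc):
                    return True
        elif op == '\\':                                # [\L], T = ant[j:i]
            for j in range(i):
                if derives(ant[j:i], a) and \
                   derives(ant[:j] + (b,) + ant[i + 1:], suc):
                    return True
        elif op == '*':                                 # [•L]
            if derives(ant[:i] + (a, b) + ant[i + 1:], suc):
                return True
    return False

iv = under('np', 's')                                       # np\s
print(derives(('np', iv), 's'))                             # application
print(derives(('np',), over('s', iv)))                      # lifting
print(derives((over('a', 'b'), over('b', 'c')), over('a', 'c')))  # composition
print(derives((iv, 'np'), 's'))                             # wrong order
```

Every rule application removes one connective occurrence, so the recursion terminates; the cache only avoids recomputing repeated subgoals.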
RESOLUTION
As we remarked above, the computation of a logic program is a proof that a given goal clause is a logical consequence of the program. We turn now to the inference procedure by which this proof is obtained. The inference procedure for Horn clause logic is a specialized form of Robinson's resolution procedure (Robinson 1965, 1979). Resolution theorem proving is guided by the same idea as the top-down implementation of the Gentzen sequent calculus, i.e. it is a proof by refutation. The resolution procedure, when confronted with the task of proving the validity of a goal Sequent, systematically tries to generate the empty clause from the negative goal Sequent and the database of clauses axiomatizing the derivability relation '=>'. The empty clause, i.e. the clause with empty head and empty body, is to be interpreted as a contradiction. Deriving the empty clause therefore amounts to the construction of a proof that the negation of the goal Sequent together with the clauses for '=>' is inconsistent, thereby in fact proving the validity of the goal. Whereas the Lambek-Gentzen sequent calculus consists of six inference rules (right and left rules for the three type-forming operators, cf. the six recursive clauses for '=>' in the Horn clause reformulation), the only inference rule used by the Horn clause theorem prover is resolution. In the face of a goal

:- Sequent,

i.e. a negative clause for '=>', the theorem prover matches the goal Sequent against the head Conclusion of a clause for '=>' in the database,

Conclusion :- Premise1,...,Premisem

by computing the most general unifier for Sequent and Conclusion. The most general unifier is the substitution θ = {X1 = Term1,...,Xn = Termn} for possible variables X1,...,Xn in Sequent and Conclusion which makes Sequent and Conclusion equal. Resolving Sequent and Conclusion produces a new goal

:- [Premise1,...,Premisem]{X1 = Term1,...,Xn = Termn},

the resolvent of the original goal Sequent and the database clause, obtained by applying the unifying substitution {X1 = Term1,...,Xn = Termn} to the body Premise1,...,Premisem of the clause for '=>'. Resolution can now resume its work on the new goal, selecting one of the subgoals Premisei from [Premise1,...,Premisem]θ, and resolving it against a database clause Head :- Body, until eventually the empty clause is derived. Observe that resolving a sequent against the axiom clause for Lambek-Gentzen derivability, i.e. the unit clause

[X] => [X] :-

immediately produces the empty clause (because the unit clause has no body). The resolution refutation of a conjunction Premise1,...,Premisem will have succeeded when it has produced the empty clause for each of the premises, i.e. reduced them all to the axiom case for '=>', as far as the premises concern derivability. It may be useful here to reflect on the precise interaction between the Lambek-Gentzen sequent calculus encoded in the clauses for '=>' and the resolution proof procedure of the Horn clause theorem prover. The clauses of the Gentzen sequent calculus, as we have seen in Chapter 1, have a pure declarative interpretation. A Gentzen proof can be read top-down as the analysis of an endsequent into premise sequents, or bottom-up as the synthesis of a conclusion sequent from premise sequents.
In its top-down implementation, the sequent calculus guarantees in the abstract that we can find out in a finite number of steps whether a sequent is valid or not by systematically removing the type-forming operators on the basis of the inference rules, until there are no more operators left. But again, the declarative reading of the Gentzen inference rules gives no indication as to how we will actually unfold a derivation in the face of choices, for example, the selection of a specific operator in an expansion step, or the selection of antecedent or succedent rules. The Horn clause formulation of L supplements the declarative Gentzen sequent calculus with a procedural interpretation by supplying the inference engine that will unfold the Gentzen derivation for a sequent as a resolution refutation.
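The unification step that drives resolution can be made concrete with a small sketch. The following is written in Python rather than in the Prolog of the text, and its term encoding is an assumption made for illustration only: variables are capitalised strings, constants are lower-case strings, and compound terms are tuples whose first element is the functor. Unlike standard Prolog unification, the sketch includes an occurs check.

```python
# Assumed encoding (not from the text): "X" is a variable, "np" a constant,
# ("/", "X", "Y") the compound term X/Y.

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    """True if variable v occurs in term t under the bindings in subst."""
    t = walk(t, subst)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t[1:])

def unify(a, b, subst=None):
    """Return a most general unifier of a and b extending subst, or None."""
    subst = dict(subst or {})
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        if occurs(a, b, subst):
            return None
        subst[a] = b
        return subst
    if is_var(b):
        return unify(b, a, subst)
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

# Matching the active type X/Y of a rule head against the ground type s/s.
print(unify(("/", "X", "Y"), ("/", "s", "s")))
```

Applied to the active type of the [/L] clause and the type s/s, the sketch yields the bindings X = s and Y = s, i.e. a substitution of the kind written {X/Y = s/s} in the text.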
ILLUSTRATION
To illustrate the combination of the resolution procedure and the Lambek-Gentzen calculus, Figure 4.1.1 traces the resolution refutation for the goal :- [s/s,np/n,n,ap,ap\(np\s)] => [s], where the antecedent would be the type sequence corresponding to a Dutch embedded clause:

omdat het meisje gek is
because the girl mad is
'because the girl is mad'

The tracing concentrates on the logical aspects of the proof, i.e. the recursive unfolding of the proof with respect to the derivability relation '=>'. The house-keeping aspects of factorizing the goal sequent by means of the auxiliary append/3 relation can be safely ignored, as the effects are visible in the subgoals generated by a given factorization. The valid subsequents that eventually contribute to the successful proof have been highlighted for further reference. In Example 4.2.1 we return to the relationship between the Gentzen proof and the resolution trace.
0 CALL: [s/s,np/n,n,ap,ap\(np\s)]=>[s]      [/L], {X/Y = s/s}
1 CALL: [np/n]=>[s]                          {[T|Rest] = [np/n]}
1 FAIL: [np/n]=>[s]
1 CALL: [np/n,n]=>[s]                        {[T|Rest] = [np/n,n]}
2 CALL: [n]=>[n]
2 EXIT: [n]=>[n]
2 CALL: [np]=>[s]
2 FAIL: [np]=>[s]
2 REDO: [n]=>[n]
2 FAIL: [n]=>[n]
1 FAIL: [np/n,n]=>[s]
1 CALL: [np/n,n,ap]=>[s]                     {[T|Rest] = [np/n,n,ap]}
2 CALL: [n]=>[n]
2 EXIT: [n]=>[n]
2 CALL: [np,ap]=>[s]
2 FAIL: [np,ap]=>[s]
2 REDO: [n]=>[n]
2 FAIL: [n]=>[n]
2 CALL: [n,ap]=>[n]
2 FAIL: [n,ap]=>[n]
1 FAIL: [np/n,n,ap]=>[s]
1 CALL: [np/n,n,ap,ap\(np\s)]=>[s]           {[T|Rest] = [np/n,n,ap,ap\(np\s)]}
2 CALL: [n]=>[n]
2 EXIT: [n]=>[n]
2 CALL: [np,ap,ap\(np\s)]=>[s]
3 CALL: [np,ap]=>[ap]
3 FAIL: [np,ap]=>[ap]
3 CALL: [ap]=>[ap]
3 EXIT: [ap]=>[ap]
3 CALL: [np,np\s]=>[s]
4 CALL: [np]=>[np]
4 EXIT: [np]=>[np]
4 CALL: [s]=>[s]
4 EXIT: [s]=>[s]
3 EXIT: [np,np\s]=>[s]
2 EXIT: [np,ap,ap\(np\s)]=>[s]
1 EXIT: [np/n,n,ap,ap\(np\s)]=>[s]
1 CALL: [s]=>[s]                             {U = [], X = s, V = [], Z = s}
1 EXIT: [s]=>[s]
0 EXIT: [s/s,np/n,n,ap,ap\(np\s)]=>[s]
yes

Figure 4.1.1

Observe the way the resolution refutation, in its Prolog incarnation, traverses the search space. The goal sequent does not resolve against the first clause for '=>', the base case [X]=>[X]. Neither does it resolve against the head of the [\R] or [/R] clauses (since the succedent of the goal, [s], does not unify with the succedents [Y\X] or [X/Y]). The goal sequent matches the head of the [•L] clause, but immediately fails when pursuing this path, since the antecedent of the goal does not resolve against a factorization containing a product type X•Y. Resolution of the goal sequent with the [/L] clause succeeds, so that we enter the first level of recursion with the unifying substitution {X/Y = s/s} for the active type. The first logical premise for the [/L] rule now generates a subgoal [T|Rest] => [s], which the resolution procedure attempts to prove for the different instantiations of [T|Rest]: [np/n], [np/n,n], [np/n,n,ap] and [np/n,n,ap,ap\(np\s)]. Each instantiation is fully explored until failure to derive the empty clause is detected, which then causes backtracking, and selection of an alternative instantiation. In the current example, it is the ultimate possibility of instantiating [T|Rest] that will eventually lead to a successful refutation of the first '=>' premise subgoal for the top sequent, [np/n,n,ap,ap\(np\s)] =>
[s], and subsequently to an immediate refutation of the second '=>' premise, U,[X],V=>[Z], which resolves against the axiom case with the unifying substitution {U = [], X = s, V = [], Z = s}, thus producing the empty clause.

The above example illustrates an important point. Resolution is an abstract theorem proving principle for Horn clause logic, based on (nondeterministic) selection of a database clause to resolve a goal against, and on (nondeterministic) selection of a subgoal from a conjunction of subgoals to generate new resolvents. The Prolog implementation of the resolution procedure selects subgoals left to right and tries database clause alternatives top to bottom, exploring them with a depth-first control regime, as Figure 4.1.1 makes clear. When in the next chapter we turn to the central problem of solving L equations, we are interested in the combination of the Lambek-Gentzen calculus L with the abstract resolution mechanism, and the companion operations of unification and substitution. We are not interested in the specific options taken by the Prolog inference engine when faced with nondeterminism, i.e. with choice points in its traversal of the resolution search space. As we will see in Section 5.1, the Prolog search regime is an incomplete implementation of the resolution proof procedure, which means, with respect to the problem of solving Lambek equations, that it may fail to find solutions where in fact these solutions exist.
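For ground sequents, the top-down search traced in Figure 4.1.1 can be imitated outside Prolog. The sketch below is a Python approximation, not Program 4.1.1 itself: it covers only the product-free rules, it represents types as nested tuples (an encoding assumed here for illustration), and it memoizes subgoals instead of relying on backtracking. The names expansions and derivable are hypothetical.

```python
from functools import lru_cache

# Assumed encoding: atoms are strings, ("/", X, Y) stands for X/Y and
# ("\\", Y, X) stands for Y\X. Antecedents are tuples of types.

def expansions(ant, succ):
    """Yield the premise list of every rule instance with conclusion ant => succ."""
    if isinstance(succ, tuple):
        op, a, b = succ
        if op == "/":                        # [/R]: T => X/Y from T,Y => X
            yield [(ant + (b,), a)]
        else:                                # [\R]: T => Y\X from Y,T => X
            yield [((a,) + ant, b)]
    for i, t in enumerate(ant):
        if not isinstance(t, tuple):
            continue
        op, a, b = t
        if op == "/":                        # [/L]: non-empty T to the right of X/Y
            for j in range(i + 2, len(ant) + 1):
                yield [(ant[i + 1:j], b), (ant[:i] + (a,) + ant[j:], succ)]
        else:                                # [\L]: non-empty T to the left of Y\X
            for k in range(i):
                yield [(ant[k:i], a), (ant[:k] + (b,) + ant[i + 1:], succ)]

@lru_cache(maxsize=None)
def derivable(ant, succ):
    """Decide ant => succ: axiom case, or some rule all of whose premises hold."""
    if ant == (succ,):
        return True
    return any(all(derivable(a, s) for a, s in premises)
               for premises in expansions(ant, succ))

S_S = ("/", "s", "s")                        # omdat
DET = ("/", "np", "n")                       # het
COP = ("\\", "ap", ("\\", "np", "s"))        # is
print(derivable((S_S, DET, "n", "ap", COP), "s"))   # True
print(derivable((DET, "n"), "s"))                   # False
```

Because every premise has fewer connectives than its conclusion, the recursion terminates on ground sequents; the order in which alternatives are tried here is a design choice of the sketch and need not coincide with the clause order of the Prolog database.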
4.2. AN INTERPRETER FOR THE LAMBEK-GENTZEN SYSTEM
In order to make it conceptually easier to abstract from Prolog-specific aspects in our theoretical exploration of the combination L + Resolution, we will now present an interpreter for the Horn clause version of L. For the interpreter, the logic program 4.1.1 axiomatizing the relation of L-derivability '=>' is just data. Our aim in this section is twofold. First, the interpreter makes us independent from proof strategies specific to the Prolog tool when this is desirable on theoretical grounds. As we will see in the next chapter, exploration of the solution space for Lambek equations requires a search regime different from the standard Prolog proof strategy. From the interpreter perspective, such direct manipulation of the proof strategy is straightforward. Secondly, we can obtain a modular architecture for the Lambek theorem prover by confining the resolution aspects in the proof of a sequent to the interpreter. When we shift to alternative versions of the Lambek-Gentzen calculus, for example the system L + {Cut} studied in Chapter 5, the resolution aspects of the proof procedure remain unchanged: we can simply substitute a different axiomatization for '=>' and offer that to the interpreter.
For a lucid exposition of interpreter techniques, and the motivation behind them, we refer the reader to Pereira and Shieber (1987, Chapter 6), which is the source for this section. We will gradually introduce the abstraction process involved in moving from an object-language logic program to a meta-level interpreter which treats the program as data. First we assign the interpreter a very modest task: the construction of a Gentzen proof tree as a record of the successful resolution refutation of a sequent. Next, we will delegate the factorization process to the interpreter, instead of explicitly coding it into the L inference rules by means of the append/3 predicate.

The direct execution of Program 4.1.1 is a little uninformative. When a sequent is valid, the system will reply with a polite 'yes', and when the resolution refutation fails, with a laconic 'no'. In exploring the properties of L, we are interested not so much in these answers as in the Gentzen proof tree which demonstrates the validity of a goal sequent, so that we can manipulate and study this object in its own right. One could also, of course, directly encode the construction of the Gentzen proof tree in the clauses for '=>', by adding an extra argument to the derivability relation which gradually accumulates the proof tree for a successful resolution refutation of a sequent. But as Pereira and Shieber (1987) demonstrate, the automatic generation of proof trees is a standard application when moving from an object-language program to a higher-level interpreter for that program. The interpreter of Program 4.2.1 below executes our earlier axiomatization of L-derivability, and automatically builds the Gentzen proof tree for a derivation. We comment on the clauses of prove/2 below. First we have to decide on a handy term representation for Gentzen proof trees.

prove/2: prove(Sequent, ProofTree)

prove(true, true).
prove(T => X, (T => X <- ProofTree)) :-
    clause(T => X, Premises),
    prove(Premises, ProofTree).
prove((Clause, Premises), ProofTree) :-
    extralogical(Clause),
    call(Clause),
    prove(Premises, ProofTree).
prove((Premise1, Premise2), (ProofTree1, ProofTree2)) :-
    prove(Premise1, ProofTree1),
    prove(Premise2, ProofTree2).

extralogical(Clause) :- not functor(Clause, '=>', 2).

Program 4.2.1 Gentzen proof trees.
In the discussion of Figure 4.1.1, we noticed that the clauses for the Lambek-Gentzen theorem prover can be divided into logical and auxiliary predicates. The clauses for '=>'/2 directly pertain to the logic of the Lambek-Gentzen calculus; the list manipulation performed by append/3 is necessary to make the program run, but external to the logical aspects of the calculus. When we construct the proof tree, we want to keep track of the recursive calls to the '=>' predicate that lead to the successful resolution refutation of a sequent, and ignore the extralogical factorization aspects. Program 4.2.1 performs this task, defining prove/2 as a relation between a sequent and the Gentzen proof tree for that sequent.

We have to settle on a well-formed representation for proof trees. A proof tree for a sequent will be straightforwardly represented here as a structure 'Conclusion <- Premise(s)' if Conclusion follows from Premise(s) by virtue of the Lambek-Gentzen inference rules. The symbol '<-' is just an arbitrary binary infix operator, representing a tree as a structure 'Root <- [...]' [...] as facts for a meta-predicate. We return to this point in the next section.) Resolving the sequent T => [X] against the head of a clause for '=>', prove/2 attempts a resolution refutation of the premises for that clause, constructing the proof tree T => [X] <- [...] [X] derivable and ignore the list manipulation. To prove a conjunction of subgoals (Goal, Premises), where Goal is an extralogical clause (i.e. a clause with a functor different from '=>'/2: append/3, in the case at hand), we just execute the extralogical goal, and construct the proof tree for Premises. The remaining clauses for prove/2 are self-explanatory.

We leave it to the reader to design an output routine that will display the Gentzen proof tree in a readable indented list format. The output routine given in the Appendix produces the unadorned trees of Example 4.2.1 below. In the text, we will generally use the more verbose annotated indented list format of Example 4.2.2, where nonterminal nodes are labeled with the L inference rule used to expand them.

Example 4.2.1.
[omdat,het,meisje,gek,is]
because the girl silly is
'because the girl is silly'

Lexical look-up associates the sequence of morphemes with a sequence of types:

[omdat,  het,   meisje,  gek,  is]
[s/s,    np/n,  n,       ap,   ap\(np\s)]

We ask the theorem prover whether this sequence of types forms a sentence by submitting the following question to the interpreter:

?- prove([s/s,np/n,n,ap,ap\(np\s)] => [s], ProofTree).

The computed answer substitution for ProofTree is then displayed by the output routine in the following indented list format.

THEOREM: [s/s,np/n,n,ap,ap\(np\s)]=>[s]
PROOF:
[s/s,np/n,n,ap,ap\(np\s)]=>[s]
  [np/n,n,ap,ap\(np\s)]=>[s]
    [n]=>[n]
    [...]
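An output routine of the kind left to the reader might look as follows. This is a Python sketch, not the routine of the Appendix: the tree encoding (a pair of a sequent string and a list of subtrees), the function name render and the two-space indentation are all assumptions. The tree used as input is read off the EXIT lines of the trace in Figure 4.1.1.

```python
def render(tree, indent=0):
    """Return the lines of an indented-list display for (sequent, subtrees)."""
    sequent, subtrees = tree
    lines = [" " * indent + sequent]
    for sub in subtrees:
        lines.extend(render(sub, indent + 2))
    return lines

# Proof tree assembled from the successful (EXIT) subgoals of Figure 4.1.1.
proof = ("[s/s,np/n,n,ap,ap\\(np\\s)]=>[s]", [
    ("[np/n,n,ap,ap\\(np\\s)]=>[s]", [
        ("[n]=>[n]", []),
        ("[np,ap,ap\\(np\\s)]=>[s]", [
            ("[ap]=>[ap]", []),
            ("[np,np\\s]=>[s]", [
                ("[np]=>[np]", []),
                ("[s]=>[s]", []),
            ]),
        ]),
    ]),
    ("[s]=>[s]", []),
])

print("\n".join(render(proof)))
```

Each premise is printed one level deeper than its conclusion, so that sister premises of a two-premise inference line up in the same column.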
This causes the proof tree to split, and the procedure starts to investigate the subgoals [21] and [22]. The second of these is an instance of the axiom scheme, i.e. it is a successfully terminated node.

[21] [(np/n),n,(pp/np),np,(pp\(np\s))] => [s]
[22] [s] => [s]
Node [21] is not terminated, so the leftmost connective is removed by factorizing the antecedent with the instantiations U = [], [T|Rest] = [n] and V = [(pp/np),np,(pp\(np\s))]. This generates the subgoals [211] and [212]. Node [211] can be successfully labeled as an axiom and the search continues with node [212]. The remaining steps will cause no problems. Observe that when the [\L] rule is applied (nodes [2122] and [21222]), the sublist T has to be identified to the left of the active type, as the directionality of the slash requires.
FULL INTERPRETER FOR THE LAMBEK-GENTZEN SEQUENT CALCULUS
The automatic construction of L proof trees is a very modest extension of the direct execution of Program 4.1.1, introduced here to familiarize the reader with the abstraction process when turning an object-level program into data for the meta-level interpreter. In the present section, we remove the discrepancy between the Horn clause encoding of the Lambek-Gentzen system (with the explicit append/3 calls) and the original formulation of the calculus by delegating the complete factorization process itself to the interpreter. The Lambek-Gentzen rules thereby get the status of a grammar formalism, comparable to the Definite Clause Grammar notation, which the L interpreter turns into executable Horn clause logic.

Figure 4.2.2 presents the Lambek-Gentzen calculus in the format which the extended L interpreter of Program 4.2.3 will work on. In our earlier version, the database for '=>' was inspected by means of the system predicate clause/2. In order to avoid confusion between the interpreter and the Lambek-Gentzen clauses which function as data for the interpreter, we will from now on explicitly distinguish implication and premise conjunction at these two levels. We write the clauses axiomatizing the (object-language) sequent calculus with the operators '<-' and '&' for Gentzen implication and conjunction of premises respectively, so that an inference rule

Sequent if PremiseSequent1 (and PremiseSequent2)

can be represented as

Sequent <- PremiseSequent1 (& PremiseSequent2).
The interpreter for the Lambek-Gentzen clauses is encoded with the standard Horn clause implication ':-', and the comma operator for goal conjunction in the body of its rules. Appropriate operator declarations for '<-' and '&' can be found in the Appendix. Compare then the Lambek-Gentzen format of Figure 4.2.2 with the original Definitions 4.1.1 and 4.1.2.

[X] => [X] <- true.
[T|Rest] => [X/Y] <- [T|Rest],[Y] => [X].
[T|Rest] => [Y\X] <- [Y,T|Rest] => [X].
U,[X/Y],[T|Rest],V => [Z] <- [T|Rest] => [Y] & U,[X],V => [Z].
U,[T|Rest],[Y\X],V => [Z] <- [T|Rest] => [Y] & U,[X],V => [Z].
U,[X•Y],V => [Z] <- [...] => [Z].
[P|R],[Q|R1] => [X•Y] <- [...]

[...] [Z]. Unifying a single factor with a sequence of types is just standard unification, as encoded in the base clause for unify/2. Unifying a sequence List with a conjunction of factors (Factor, Factors) involves the (non-deterministic) unification of Factor with a prefix of List, and the recursive unification of the remaining Factors with what is left of List. Resolve/3 then computes the unifying substitution for a goal sequent and the head of a Lambek-Gentzen clause. Their resolution succeeds if their succedent types unify, and if their antecedents (which can represent two distinct factorizations of a single sequence of types) can be matched with the same sequence.

prove/2: prove(Sequent, ProofTree)

prove(true, true).
prove(Sequent, (Sequent1 [...]

[...] [Y]) :-
    unify(Sequence1, Sequence3),
    unify(Sequence2, Sequence3).

unify/2: unify(Factorization, Sequence)

unify(T, T) :-
    not functor(T, ',', 2), !.
unify((Factor, Factors), List) :-
    append(Factor, Rest, List),
    unify(Factors, Rest).

Program 4.2.2 Resolution for factorized sequents.
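The non-deterministic prefix matching performed by unify/2 can be pictured as the enumeration of all ways to cut a flat sequence into consecutive factors. The Python sketch below is an illustration under assumptions of our own: the helper names splits and match_slash_left are hypothetical, types use a tuple encoding in which ("/", X, Y) stands for X/Y, and only the [/L] head factorization U,[X/Y],[T|Rest],V is matched.

```python
def splits(seq, n):
    """Yield every way of cutting seq into n consecutive, possibly empty, segments."""
    if n == 1:
        yield (seq,)
        return
    for i in range(len(seq) + 1):
        for rest in splits(seq[i:], n - 1):
            yield (seq[:i],) + rest

def match_slash_left(antecedent):
    """Match a flat antecedent against the factorization U,[X/Y],[T|Rest],V."""
    for u, active, t, v in splits(antecedent, 4):
        # The active factor is a single type X/Y; [T|Rest] must be non-empty.
        if (len(active) == 1 and isinstance(active[0], tuple)
                and active[0][0] == "/" and t):
            _, x, y = active[0]
            yield {"U": u, "X": x, "Y": y, "T": t, "V": v}

# The sequence np/n, n, np\s admits two [/L] factorizations with np/n active.
for bindings in match_slash_left((("/", "np", "n"), "n", ("\\", "np", "s"))):
    print(bindings)
```

The first factorization found binds U to the empty sequence, the argument sequence to n and V to np\s; the second takes both n and np\s as the argument sequence and leaves V empty.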
RESOLUTION, UNIFYING SUBSTITUTION: EXAMPLES
We illustrate the effect of unify/2 and resolve/3 with some examples. Suppose we ask the L interpreter to compute an answer for the following query:

?- Sequent = [np/n,n,ap,ap\(np\s)]=>[s],
   (Head [...]

[...] [ap]
Premise2 = ([np/n,n],[np\s],[])=>[s]
Resolution = [np/n,n,ap,ap\(np\s)]=>[s]

The input sequent can be matched with the head of the [\L] rule. Resolving Sequent and Head leads to a unifying substitution which instantiates the factorization of Head as U = [np/n,n], [T|Rest] = [ap], Y\X = ap\(np\s), V = [] and Z = s. Since the head of the L inference rule and the body share these variables, the resolvent (Premise1 & Premise2) is instantiated with the same bindings. Observe that the Resolution sequent in the above example happens to be identical to the goal Sequent, because the antecedent of the latter consisted of a single factor, i.e. sequence of types. In the example below, we repeat the query for the value of Premise2 just obtained. Premise2 = ([np/n,n],[np\s],[])=>[s] and Head = ([],[np/n],[n],[np\s])=>[s] in this case represent two distinct factorizations of a single sequent Resolution, which will appear in the Gentzen proof tree.

?- Sequent = ([np/n,n],[np\s],[])=>[s],
   (Head [...]

[...] [n]
Premise2 = ([],[np],[np\s])=>[s]
Resolution = [np/n,n,np\s]=>[s]
SEMANTIC INTERPRETATION
In Chapter 1 we discussed the mapping between the algebra of proofs and the lambda terms serving as denotational recipes. The system just described can be extended with the semantic interpretation procedure without changes to the interpreter. We will discuss semantics in detail in the next chapter. Here we just introduce the notational conventions for the Prolog encoding of the lambda terms constructed for a Gentzen proof. For the product-free part of L, the inference rules of the sequent calculus are associated with the lambda semantics given in Figure 4.2.3.

[X] [...] B' and conclusion A''=>B'', the tree T whose root is labeled with A''=>B'' and whose subtree T/1 is equal to T1 is a proof tree. (2) For any pair of proof trees T1, T2 whose roots are labeled with sequents A=>B and C=>D respectively, and for every instance of a two-premise inference with premises A=>B and C=>D and conclusion E=>F, the tree T whose root is labeled with E=>F and whose subtrees T/1 and T/2 are equal to T1 and T2 respectively is a proof tree.

Definition 4.3.2 The set of L deduction trees is defined inductively as the least set of trees containing all one-node trees (not necessarily labeled with an axiom sequent), and closed under the tree-building operations (1) and (2) as above.

The LR interpreter of Program 4.2.1 can be easily extended to produce deduction trees rather than proof trees, thus providing us with a tool to plot the complete search space. The base case for the construction of proof trees is the axiom case; to yield deduction trees, we can add a second base case which labels a node as finished if it cannot be expanded any further, i.e. if there is no clause for '=>' it can resolve against. See Program A.3 for an implementation. In the examples below, dead nodes (i.e. unexpandable non-axiom sequents) will be represented by the structure Sequent <- fail.
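The distinction between finished deduction trees and proof trees can be checked mechanically. The following Python sketch is not Program A.3: it assumes a tuple encoding of types, covers only the product-free rules, and merely counts trees instead of building them. A node that is neither an axiom nor expandable counts as one dead leaf, corresponding to Sequent <- fail.

```python
from math import prod

# Assumed encoding: atoms are strings, ("/", X, Y) is X/Y, ("\\", Y, X) is Y\X.

def expansions(ant, succ):
    """Yield the premise list of every rule instance with conclusion ant => succ."""
    if isinstance(succ, tuple):
        op, a, b = succ
        if op == "/":                        # [/R]
            yield [(ant + (b,), a)]
        else:                                # [\R]
            yield [((a,) + ant, b)]
    for i, t in enumerate(ant):
        if not isinstance(t, tuple):
            continue
        op, a, b = t
        if op == "/":                        # [/L]
            for j in range(i + 2, len(ant) + 1):
                yield [(ant[i + 1:j], b), (ant[:i] + (a,) + ant[j:], succ)]
        else:                                # [\L]
            for k in range(i):
                yield [(ant[k:i], a), (ant[:k] + (b,) + ant[i + 1:], succ)]

def count_trees(ant, succ, proofs_only=False):
    """Count finished deduction trees (or only proof trees) rooted in ant => succ."""
    total = 1 if ant == (succ,) else 0       # axiom leaf
    for premises in expansions(ant, succ):
        total += prod(count_trees(a, s, proofs_only) for a, s in premises)
    if total == 0 and not proofs_only:
        total = 1                            # dead node: Sequent <- fail
    return total

N_MOD = ("/", ("\\", "n", "n"), "np")        # (n\n)/np
goal = ((("/", "np", "n"), "n", N_MOD, "np"), "np")
print(count_trees(*goal))                    # 7
print(count_trees(*goal, proofs_only=True))  # 3
```

For the goal sequent [np/n,n,(n\n)/np,np] => [np] of Example 4.3.1 the sketch counts seven finished deduction trees, three of which are proof trees, in agreement with the figures reported for that example.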
Given the distinction between the full set of deduction trees for a sequent and the subset of proof trees, we can depict the total Lambek-Gentzen search space for the simple goal sequent of Example 4.3.1 as the OR state space graph of Figure 4.3.1.

Example 4.3.1 Finished deduction trees for

?- prove([np/n,n,(n\n)/np,np]=>[np], DeductionTree).

The search space contains seven fully expanded deduction trees, three of which are proof trees. The numbering refers to the tree addresses in the OR search graph representation of Figure 4.3.1.

[111]
[np/n,n,(n\n)/np,np]=>[np] <-
  [n]=>[n] <- true
  [np,(n\n)/np,np]=>[np] <-
    [np]=>[np] <- true
    [np,n\n]=>[np] <-
      [np]=>[n] <- fail
      [n]=>[np] <- fail

[2]
[np/n,n,(n\n)/np,np]=>[np] <-
  [n,(n\n)/np]=>[n] <- fail
  [np,np]=>[np] <- fail

[311] (proof tree)
[np/n,n,(n\n)/np,np]=>[np] <-
  [n,(n\n)/np,np]=>[n] <-
    [np]=>[np] <- true
    [n,n\n]=>[n] <-
      [n]=>[n] <- true
      [n]=>[n] <- true
  [np]=>[np] <- true

[411]
[np/n,n,(n\n)/np,np]=>[np] <-
  [np]=>[np] <- true
  [np/n,n,n\n]=>[np] <-
    [n]=>[n] <- true
    [np,n\n]=>[np] <-
      [np]=>[n] <- fail
      [n]=>[np] <- fail
[421]
[np/n,n,(n\n)/np,np]=>[np] <-
  [np]=>[np] <- true
  [np/n,n,n\n]=>[np] <-
    [n,n\n]=>[n] <-
      [n]=>[n] <- true
      [n]=>[n] [...]

[...] in other words, that the negation of the goal sequent together with the database for '=>' are unsatisfiable. For ground sequents, then, the resolution refutation comes down to answering the true-or-false question whether the sequent follows from the database; in the case of a positive answer, the Gentzen proof tree is constructed by the interpreter as a by-product of the successful proof. Our interest in the Horn clause version of the Lambek-Gentzen system rests on the fact that we can use the resolution procedure not just to answer yes-no questions about L-derivability, but also to compute answer substitutions for non-ground goal sequents. So we can turn to polymorphic variants of L, where sequents can contain variable subtypes at any level of recursion. The resolution refutation then results in a set of bindings {X1 = Term1,...,Xn = Termn} for the variable subtypes X1,...,Xn in a sequent such that the database clauses for L-derivability '=>' logically imply the sequent one obtains by substituting these bindings into the original non-ground goal:

[Antecedent => Succedent]{X1 = Term1,...,Xn = Termn}.

Where the resolution refutation for ground sequents demonstrates L-validity, the computation of answer substitutions in the non-ground case represents the problem of solving L equations. Since the decidability problem is open for the polymorphic variant of L in general (Van Benthem 1988b), we will try to demarcate a decidable subsystem of L which is significant enough to resolve irreducible polymorphism, i.e. polymorphism which does not reduce to the notion of count-preserving L-valid type transitions. As we have seen in Chapter 1, this is essentially the polymorphism of the generalized
Boolean conjunction and disjunction types (X\X)/X. Polymorphic type assignments to non-Boolean expressions, such as the higher-order noun-phrase types X/(np\X) proposed in Zeevat, Klein & Calder (1987) are not essential for our purposes, as they are validly derivable from the basic noun-phrase type in the calculus L. The present chapter is organized as follows. When one tries to investigate the set of answer substitutions produced by resolution refutation for non-ground sequents, one immediately faces a practical problem: non-termination for the LR proof procedure as defined above. The non-termination problem is caused by the incompleteness of the particular Prolog depth-first search algorithm. Although SLD resolution as a proof procedure for Horn clause logic is complete (see Lloyd 1987), the Prolog search procedure is not, i.e. it may fail to find a proof when in fact there is one. This problem can be alleviated (at the cost of efficiency!) by instructing the LR interpreter to switch from the standard depth-first backtracking search to a consecutively bounded depth-first regime which finds solutions where Prolog search is lost in non-terminating branches. The real problem with L then turns out to be the logical infinity of the calculus: for a given non-ground sequent goal, LR produces an infinite set of non-equivalent correct answer substitutions. However, for the purposes of solving Boolean polymorphism, we are interested in a finite solution space: the minimal solutions for a generalized syntactic account of Boolean conjoinability. Section 5.2 is devoted to the characterization of a finite subsystem of L with the desired properties: the system M. We illustrate the properties of M in Section 5.3, on the basis of strict left-associative bottom-up parsing and conjunction of non-constituents. 
Finally, in Section 5.4, we return to the division of labour between the syntactic and the semantic algebra, and we complement the syntactic treatment of Boolean conjoinability with a flexible theory of the relationship between syntactic and semantic types, so that the scope ambiguities which were ignored so far can be taken into account.

5.1. COMPLETE AND INCOMPLETE SEARCH STRATEGIES
A successful Lambek-Gentzen proof has to deal with two complexity factors: the degree (number of connectives) and the length of the end-sequent. In order to find a finite path to axiom leaves X => X, the expansion steps in the construction of the proof tree have to decrease these complexity factors in a systematic way. We have already seen that the recursive clauses for '=>' (i.e. the L inference rules) decrease the degree of a sequent at each expansion step. But observe that the one-premise rules [/R], [\R] and [•L] are length-increasing: the premise sequent for these inferences is longer than the conclusion.
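The two complexity measures can be computed directly. The sketch below is in Python, with an assumed tuple encoding of types in which ("/", X, Y) stands for X/Y and ("\\", Y, X) for Y\X; the function names are hypothetical. It compares the conclusion and the premise of one ground [/R] inference.

```python
def degree(t):
    """Number of type-forming connectives in a type."""
    if isinstance(t, tuple):
        _, left, right = t
        return 1 + degree(left) + degree(right)
    return 0

def sequent_degree(ant, succ):
    """Degree of the sequent ant => succ: connectives on both sides."""
    return sum(degree(t) for t in ant) + degree(succ)

# One [/R] step: conclusion [np] => [s/(np\s)], premise [np, np\s] => [s].
NP_S = ("\\", "np", "s")
conclusion = (("np",), ("/", "s", NP_S))
premise = (("np", NP_S), "s")

print(sequent_degree(*conclusion), len(conclusion[0]))   # 2 1
print(sequent_degree(*premise), len(premise[0]))         # 1 2
```

The degree drops from 2 to 1 while the antecedent grows from one type to two, which is the length-increasing behaviour of the one-premise rules noted above.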
[/L]   U,X/Y,T,V => Z   from   T => Y   and   U,X,V => Z
[\L]   U,T,Y\X,V => Z   from   T => Y   and   U,X,V => Z
[/R]   T => X/Y   from   T,Y => X
[\R]   T => Y\X   from   Y,T => X
[•L]   U,X•Y,V => Z   from   U,X,Y,V => Z
[•R]   [...] P => X [...]

[...] [X]. The resolution refutation has to compute answer substitutions for the succedent variable X which make the sequent derivable, i.e.
answers to the question: What types are L-derivable from the antecedent sequence [pp/np,np/n]? Recall that the resolution procedure tries to resolve the goal sequent against a clause for '=>', trying the clauses top-to-bottom as they occur in the database. The first clause matching the goal is the [/R] Introduction rule [T|Rest] => [X/Y] [...]

[...] >= 0,
    (Head <- Premises),
    resolve(Sequent, Head, Sequent1),
    NewDepth is Depth-1,
    prove(NewDepth, Premises, ProofTree).
prove(Depth, (Premise1 & Premise2), (ProofTree1, ProofTree2)) :-
    prove(Depth, Premise1, ProofTree1),
    prove(Depth, Premise2, ProofTree2).

cbdf_prove/3: cbdf_prove(InitialBound, Sequent, Proof)

cbdf_prove(Start, Sequent, Proof) :-
    prove(Start, Sequent, Proof).
cbdf_prove(Max, Sequent, Proof) :-
    NewMax is Max+1,
    cbdf_prove(NewMax, Sequent, Proof).

Program 5.1.1 Consecutively bounded depth-first search.

Prove/3 limits the depth of a Gentzen deduction by adding a Depth argument to the original prove/2. The depth limit is decreased at each inference step, and not allowed to become negative. The new top predicate cbdf_prove/3 tries to find resolution refutations for a sequent with a given initial depth bound. Only when no more solutions can be found at this initial depth level does the second clause for cbdf_prove/3 become operative, which increments the depth bound and explores the search space with the new limit. A suitable initial value for the depth limit in the case of LR proofs for non-ground sequents would be the degree of the goal sequent (where the degree of an uninstantiated type equals zero). For a ground sequent, as we have seen in Chapter 1, the complexity degree sets the maximal depth of the proof tree, so that the first clause of cbdf_prove/3 would immediately produce all proofs. For a non-ground sequent, answer substitutions for variable types are computed with an increasing complexity.
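The control regime itself is independent of the Lambek calculus and can be sketched for an arbitrary search tree. The Python generator below is an illustration, not a transcription of Program 5.1.1: the names bounded and cbdf, and the callback parameters children and is_goal, are assumptions. It also departs from the Prolog version in one respect: it reports only goals lying exactly at the current bound, so that a solution is not found again when the bound is raised.

```python
from itertools import count, islice

def bounded(node, depth, children, is_goal):
    """Depth-first search yielding the goals exactly `depth` steps below node."""
    if depth == 0:
        if is_goal(node):
            yield node
        return
    for child in children(node):
        yield from bounded(child, depth - 1, children, is_goal)

def cbdf(root, children, is_goal, start=0):
    """Consecutively bounded depth-first search: retry with bound+1 forever."""
    for depth in count(start):
        yield from bounded(root, depth, children, is_goal)

# A tree whose leftmost branch "a", "aa", "aaa", ... is infinite: plain
# depth-first search would never return, the bounded regime does.
children = lambda s: [s + "a", s + "b"]
is_goal = lambda s: s.endswith("b")
print(list(islice(cbdf("", children, is_goal), 3)))   # ['b', 'ab', 'bb']
```

Solutions come out in order of increasing depth, at the price of re-exploring the shallower part of the tree for every new bound.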
LOGICAL INFINITY OF L
The consecutively bounded search regime, needless to say, is hopelessly inefficient, as complete implementations of SLD resolution in general will be. The point of the discussion, however, is theoretical: exploration of the solution set for L equations will confront us with the logical infinity of the calculus, an intrinsic property of the polymorphic Lambek-Gentzen system which no search optimization can eliminate. Observe that we are dealing here with two different kinds of infinity. On the one hand, when the search space is explored with the standard depth-first strategy, we are confronted with infinite search branches, as we saw above. On the other hand, when we switch to the consecutively bounded search regime, we are confronted with an infinite number of success branches, yielding alternative solutions for polymorphic types in non-ground sequents. To close this section, we give a general characterization of this second source of infinity, illustrating the infinity of the solution set for L-equations with unknown types in the antecedent (Example 5.1.2) or in the succedent (Example 5.1.3). In the next section, we zoom in on the specific subproblems that will be addressed in the remainder of this book.
Example 5.1.2 Answer substitutions: unknown succedent.

As a first illustration of the answers produced by the consecutively bounded depth-first control regime, we return to the goal sequent [pp/np,np/n] => [Type], a non-ground goal with an unknown succedent type. Below one finds the third generation answers to the query ?- prove(3, [pp/np,np/n] => [Type], ProofTree), i.e. answer substitutions for the variable succedent Type with a proof tree of depth 3.

[pp/np,np/n]=>[((Y/(np/n))/(pp/np))\Y] <-
  [(Y/(np/n))/(pp/np),pp/np,np/n]=>[Y] [...]

[...] Y have a finite set of generators, in the sense of the following theorem (demonstrated for LP but transferable to L if we take directionality into account).
Theorem 5.2.1 (Van Benthem 1986a) For any sequence of types X1,...,Xn with atomic types {A1,...,Ak}, X1,...,Xn => Y if and only if Y is derivable from one of the following types derivable from X1,...,Xn: ((X1,(...(Xn,Ai)...)),Ai) (1 ≤ i ≤ k) and at most one of the atoms Ai (1 ≤ i ≤ k).

The answer substitution Z/((np/n)\((pp/np)\Z)) with the instantiation {Z=pp}, which we discussed above, represents the directional version of one of the inflated generators ((X1,(...(Xn,Ai)...)),Ai). The theorem, although it guarantees that all possible outcomes for a sequence X1,...,Xn are derivable from a finite set of generators, is of little practical use when we turn to the parsing problem below: the generators ((X1,(...(Xn,Ai)...)),Ai) just shift the complexity of the input sequence X1,...,Xn onto an inflated single type, without performing any cancellation of matching subtypes in X1,...,Xn.

Example 5.1.2 looked for bindings for an unknown succedent type. But infinity also affects the solution set for unknowns in the antecedent, as the following illustration shows.

Example 5.1.3 Infinite solution set for antecedent variables.

Consider the following non-ground goal sequent, where answer substitutions for the antecedent variable Type have to be computed.

?- [pp/np,Type,n] => [pp].

Exploration of the solution space for this equation (with the consecutively bounded depth-first regime, i.e. in ascending order of depth complexity) produces an infinite set of substitutions for Type, starting with the following answers:

Type = np/n
Type = ((pp/np)\pp)/n
Type = (pp/np)\(pp/n)
Type = np/((X/n)\X)
Type = ((pp/np)\pp)/((X/n)\X)
Type = np/(X/(n\X))
Type = ((pp/np)\pp)/(X/(n\X))
Type = ((X/(pp/np))\X)\(pp/n)
Type = (X/((pp/np)\X))\(pp/n)
Type = (((X/(pp/np))\X)\pp)/n
Type = ((X/((pp/np)\X))\pp)/n
Type = (((X/(pp/np))\X)\pp)/((Y/n)\Y)
Type = ((X/((pp/np)\X))\pp)/((Y/n)\Y)
Type = (((X/(pp/np))\X)\pp)/(Y/(n\Y))
Type = ((X/((pp/np)\X))\pp)/(Y/(n\Y))
Type = (pp/np)\(pp/((X/n)\X))
Type = (pp/np)\(pp/(X/(n\X)))
Type = ((X/(pp/np))\X)\(pp/((Y/n)\Y))
Type = ((X/(pp/np))\X)\(pp/(Y/(n\Y)))
Type = (X/((pp/np)\X))\(pp/((Y/n)\Y))
Type = (X/((pp/np)\X))\(pp/(Y/(n\Y)))
Type = np/((X\(Y/n))\(X\Y))
Type = ((pp/np)\pp)/((X\(Y/n))\(X\Y))
Type = (((X/(pp/np))\X)\pp)/((Y\(Z/n))\(Y\Z))
Type = ((X/((pp/np)\X))\pp)/((Y\(Z/n))\(Y\Z))
Type = ...
Again the derivability relation '=>' can establish some order in this set of substitutions. The minimal answer (in terms of complexity degree) is the determiner instantiation {Type = np/n}. The other solutions are either derivable from this type, or they can be simplified to np/n or types derivable from it. Examples of higher-order answers derivable from np/n are represented by the following arrows: np/n => (pp/np)\(pp/n), np/n => ((pp/np)\pp)/n, where the first type transition is an instance of the familiar inverse division theorem R6, the second of lifting of the range subtype. The second case, answers that can be simplified to np/n, is represented by derivability arrows such as np/((X/n)\X) => np/n, ((pp/np)\pp)/((X/n)\X) => ((pp/np)\pp)/n. Both of these represent instances of argument-lowering of a negative subtype. As the set of bindings for Type shows, all negative subtypes also appear in a polymorphic lifted form, and the lifted version can be of arbitrary order. This higher-order information would be relevant if we required the solutions to generate all possible semantic readings. But as we indicated above, our aim in the following sections is more modest: we will be looking for the most simple syntactic solutions to L equations, i.e. solutions that make a non-ground sequent syntactically derivable. In Section 5.5, we will account for higher-order interpretative readings for argument subtypes in the semantic algebra, and demonstrate that their elevated semantic status is reconcilable with the modest syntactic rank that results from argument-lowering.
5.2. BOTTOM-UP PROOFS IN L + {CUT}
In the preceding section, we briefly considered the general problem of solving L-equations as summed up in the question: What substitutions make a non-ground sequent L-derivable? The general problem is beyond the scope of the present chapter; its complexity, as demonstrated above, is caused by the logical infinity of the calculus L. In the following sections, we will study the problem of variable substitutions from the perspective of categorial parsing. In this context the general problem of variable substitutions manifests itself in the form of two subproblems which any empirically adequate categorial theory has to face.

• First of all, we want to have a general solution for Boolean polymorphism in L, i.e. we want to be able to derive arbitrary Boolean coordinations. To solve the problem of Boolean conjoinability in L, we must be able to decide, for arbitrary conjunct sequences P and Q, whether there is a type X such that X is L-derivable from P and from Q.

• Second, it has been claimed repeatedly (e.g. Steedman 1985, Haddock 1987) that flexible categorial systems can directly model the left-to-right mental processing of expressions and the concomitant incremental construction of the semantic interpretation, thus offering an attractive competence-based processing theory. An incremental left-associative derivation for a sequent A1,A2,...,An => B requires a succession of reductions of the left corner A1,A2 => X, followed by the derivation of X,A3,...,An => B, where again the parsing algorithm has to compute answer substitutions for the unknown types X.

Both problems, we will argue, can be fruitfully approached in the Gentzen theorem-proving framework if we shift from cut-free L to L + {Cut}. A cut-free proof is unfolded on the basis of a top-down problem-solving strategy, starting with the goal sequent (the root of the proof tree), and breaking up this goal into smaller subgoal sequents by means of the L inference rules, until axiom leaves are reached.
The top-down proof procedure yields a decision procedure in the case of ground sequents, but, as we have seen above, it is inefficient for non-ground sequents, because it lacks goal-directedness. Faced with a non-ground sequent, the cut-free proof procedure blindly enumerates the infinite set of answer substitutions that follow logically from L. In the following sections, we will extend L with the Cut inference of Figure 5.2.1 and with a set of derived inference rules (the Lemma database) to derive the cut formula X as the answer substitution in the proof of the premise T => X.
    T => X        U,X,V => Y
    ------------------------- [Cut]
           U,T,V => Y

    Figure 5.2.1 Cut rule

Whereas the cut-free proofs are developed top-down, working backward from the goal sequent toward axiom sequents, the Lemma database for the Cut inference will result in a bottom-up problem-solving strategy, starting from axiom leaves, and computing new types (the cut formulae X) as the synthesis of information yielded by given types. It is important to realize that the proposed extensions do not change the logical properties of L. The Lemma database will consist of L-valid axiom schemes and inference rules, i.e. rules that can be proved in (cut-free) L. And first of all, the Cut rule itself is a derived rule of inference, as we know from the Cut Elimination Theorem: all theorems derivable in L+{Cut} also have a cut-free proof. But, although the logical properties of L are unaffected, the bottom-up resolution strategy yields an efficient search procedure to compute the minimal answer substitutions for the unknown cut formula X in the cut premise T => X.

Once we have made the shift from L to L+{Cut}, the problems of incremental left-to-right processing and Boolean coordination can be represented as special cases of the Cut inference. Consider first left-associative parsing. In L + {Cut}, a left-associative derivation can be represented as a succession of Cut inferences of the following form, which instantiates the parameters of the general Cut scheme as {U = ∅, T = [X1,X2]}. To prove a sequent [X1,X2|V] => [Y], we reduce the left corner [X1,X2] to the cut formula X, and then prove that [X|V] => [Y], where the proof of the second premise in its turn proceeds by left-associative Cut inferences.

    X1,X2 => X        X,V => Y
    --------------------------- [Cut]
          X1,X2,V => Y

    Figure 5.2.2 L + {Cut}: left-associative derivation
The second manifestation of the Cut inference concerns Boolean polymorphism. As we saw in Chapter 1, within the bounds of the count-invariant calculus L, no finite ground type assignment to the Boolean particles can provide an adequate account of Boolean conjoinability, i.e. an account for the fact that the Boolean particles can conjoin arbitrary types X. The basis for our generalized account of Boolean conjoinability will therefore be a polymorphic type (X\X)/X, where the type of the conjunction is an unknown X for which resolution will have to compute suitable bindings in a given context. The problem of solving Boolean polymorphism, then, can be represented as the set of L-equations of Figure 5.2.3, an inference scheme which again is an instantiation of the Cut rule, with {T = P,(X\X)/X,Q} as the cut premise, and X as the unknown conjunction type that has to be computed.

    P => X      Q => X
    -------------------
    P,(X\X)/X,Q => X        U,X,V => Y
    ----------------------------------- [Cut]
          U,P,(X\X)/X,Q,V => Y

    Figure 5.2.3 L+{Cut}: Boolean coordination

The inference steps of Figure 5.2.3 can be paraphrased as follows. In order to prove a goal sequent U,P,(X\X)/X,Q,V => Y, where (X\X)/X is the variable conjunction particle, the proof procedure has to identify subsequences P and Q, for the left and right conjunct respectively, which can be reduced to a common type X. A successful solution for these two equations immediately validates the Cut premise P,(X\X)/X,Q => X, and will make the goal sequent derivable if also the second premise U,X,V => Y can be proved, with as cut formula the computed conjunction type X.
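The search for a common conjunct type can be sketched in Python. This is only an illustration under strong assumptions: the lemma database is restricted to the Application rule, types are encoded as nested tuples, and the names `over`, `under`, `apply_pair`, `reducts` and `conjunction_types` are hypothetical helpers, not part of the interpreter discussed in the text.

```python
def over(x, y):
    """The type X/Y: looks for a Y to its right and yields X."""
    return ('/', x, y)

def under(y, x):
    """The type Y\\X: looks for a Y to its left and yields X."""
    return ('\\', y, x)

def apply_pair(left, right):
    """Results of the Application lemma for two adjacent types."""
    results = []
    if isinstance(left, tuple) and left[0] == '/' and left[2] == right:
        results.append(left[1])       # X/Y, Y => X
    if isinstance(right, tuple) and right[0] == '\\' and right[1] == left:
        results.append(right[2])      # Y, Y\X => X
    return results

def reducts(seq):
    """All single types X such that seq reduces to X by repeated Application."""
    seq = list(seq)
    if len(seq) == 1:
        return {seq[0]}
    found = set()
    for i in range(len(seq) - 1):
        for x in apply_pair(seq[i], seq[i + 1]):
            found |= reducts(seq[:i] + [x] + seq[i + 2:])
    return found

def conjunction_types(p, q):
    """Candidate bindings for X in the two equations P => X and Q => X."""
    return reducts(p) & reducts(q)

# Constituent conjuncts: [np/n, n] and [np] share the type np.
print(conjunction_types([over('np', 'n'), 'n'], ['np']))

# Non-constituent conjuncts such as [np, (np\s)/np] have no Application
# reduct at all, so this restricted sketch finds no conjunction type.
tv = over(under('np', 's'), 'np')
print(conjunction_types(['np', tv], ['np', tv]))
```

With Application as the only lemma the second call returns the empty set, which illustrates why a stronger lemma database is needed for non-constituent coordination.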
In the remainder of this section, the set of L inference rules is extended with the Cut rule, and the reader is familiarized with the bottom-up unfolding of the Gentzen proof trees on the basis of a simple illustration, with the Application rule scheme R1 as the only auxiliary lemma. In the next section, we then turn to the requirements the lemma database must fulfill to solve the variable substitution problem as it presents itself in the case of left-associative processing and Boolean conjoinability.
EXTENDING L WITH CUT
Extending the Lambek-Gentzen system with Cut requires minimal modifications of the interpreter, as Program 5.2.1 shows. The Cut inference can be added straightforwardly to the data for the interpreter, i.e. the clauses axiomatizing the derivability relation '=>'. To stress the fact that the Cut premise [T|Rest] => [X] is an auxiliary lemma, i.e. a branch of the proof tree for which a cut-free proof could be given, we use '=>*'/2 as the derivability arrow in the clauses of the lemma database. The interpreter prove/2 is instructed to consider '=>*' sequents as finished axiom leaves: the point of the Cut proof is not to unfold a complete proof tree for these sequents until basic axioms X => X are reached, but to abbreviate that part of the proof tree in the form of an auxiliary lemma. In the illustration, the lemma database is kept quite modest: it is restricted to the functional Application rule R1. The cut inferences, in other words, just abbreviate the small subproofs

    Y => Y        X => X
    --------------------- [/L]
        X/Y, Y => X

(and likewise for Y\X eliminations) which demonstrate that Application is an L-valid reduction. The second clause for prove/2 cuts off these subtrees at the root node after inspection of the lemma database, and successful execution of the '=>*' clauses found there. When we make the lemma database non-trivial in the next section, by adding derived rules of inference, i.e. recursive clauses for '=>*' that abbreviate arbitrarily large chunks of the proof tree, the cut inference will start saving us real effort.
Cut Inference:
    U,[T | Rest],V => [Z] <- [T | Rest] =>* [X] & U,[X],V => [Z].

Lemma database '=>*'/2: L-valid reduction schemes
    [X/Y,Y] =>* [X].        Rightward application R1
    [Y,Y\X] =>* [X].        Leftward application R1
prove/2: prove(Sequent, ProofTree) prove(true, true). prove(T [Y], (T =>* [Y]* [Y]). prove(Sequent, (Sequentl[s][s]*[pp] [X]
true .
Introduction:
    [T | Rest] => [Y\X] <- [Y,T | Rest] => [X].
    [T | Rest] => [X/Y] <- ... => [X].

Elimination: Application + Cut
    U,[X/Y,Y],V => [Z] <- U,[X],V => [Z].
    U,[Y,Y\X],V => [Z] <- U,[X],V => [Z].

Figure 5.2.4 Cut + Application: partial execution.

Example 5.2.3 [omdat,het,meisje,in,de,tuin,ligt] (cf. Example 5.2.1)
THEOREM: [s/s,np/n,n,pp/np,np/n,n,pp\(np\s)] => [s]
PROOF: [s/s,np/n,n,pp/np,np/n,n,pp\(np\s)] => [s]

Figure 5.2.4, then, defines a calculus stronger than AB (since it has the Introduction inferences), but weaker than L, and neither included in nor including F. We leave its exact location within the categorial hierarchy as a subject for further study.
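The Elimination clauses of Figure 5.2.4 can be tried out on the theorem of Example 5.2.3 with a small Python sketch. It is not the Prolog interpreter of the text: it covers only the Cut + Application clauses (no Introduction), encodes types as nested tuples, and the names `over`, `under`, `apply_pair` and `prove` are hypothetical. The returned "proof" is simply the chain of antecedents from the goal sequent down to the axiom.

```python
def over(x, y):
    """The type X/Y (range X, domain Y)."""
    return ('/', x, y)

def under(y, x):
    """The type Y\\X (domain Y, range X)."""
    return ('\\', y, x)

def apply_pair(left, right):
    """Application lemma R1 for two adjacent types."""
    results = []
    if isinstance(left, tuple) and left[0] == '/' and left[2] == right:
        results.append(left[1])       # X/Y, Y => X
    if isinstance(right, tuple) and right[0] == '\\' and right[1] == left:
        results.append(right[2])      # Y, Y\X => X
    return results

def prove(antecedent, goal):
    """Return the chain of antecedents reached by Cut + Application, or None."""
    if antecedent == [goal]:          # axiom [X] => [X]
        return [antecedent]
    for i in range(len(antecedent) - 1):
        # U,[X/Y,Y],V => Z  <-  U,[X],V => Z   (and the leftward dual)
        for x in apply_pair(antecedent[i], antecedent[i + 1]):
            rest = prove(antecedent[:i] + [x] + antecedent[i + 2:], goal)
            if rest is not None:
                return [antecedent] + rest
    return None

# omdat het meisje in de tuin ligt
sequent = [over('s', 's'), over('np', 'n'), 'n', over('pp', 'np'),
           over('np', 'n'), 'n', under('pp', under('np', 's'))]
for step in prove(sequent, 's'):
    print(step)
```

Each line printed is one Cut inference further down the proof; the sequent shrinks by one type per step until the axiom [s] => [s] is reached.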
5.3. THE LEMMA DATABASE: SYSTEM M
Let us turn now to the design of the lemma database for the bottom-up inference engine. In the course of a bottom-up derivation, the theorem-prover will be confronted with cut premises T => X, where substitutions have to be computed for the unknown succedent type X. An adequate account of Boolean conjoinability and left-associative processing presupposes that the algorithm meets a number of requirements.

• The bottom-up inference engine must inherit the associativity of L. As we have seen in Chapter 2, the associativity of L guarantees that if a sequent A1,...,An => B is derivable, it is derivable for any bracketing of A1,...,An into constituents. Some of these bracketings will coincide with conventional constituents, corresponding to subproofs derivable solely in terms of Elimination inferences. Other bracketings partition the antecedent into non-constituents, corresponding to subproofs that crucially involve Conditionalization inferences. In the case of a left-associative derivation, there is no guarantee that the left-corner Cut premise X1,X2 => X covers a conventional constituent; likewise, in the case of Boolean coordination, the conjunct sequents P => X and Q => X can turn out to be non-constituents. The bottom-up inference engine must be strong enough to produce answer substitutions for the unknown cut formula X for arbitrary Cut inferences T => X in the course of a derivation U,T,V => Y.

• The solution set for the unknown cut formula in the bottom-up inferences must be finite. Within the standard axiomatization of L, the ability to collect arbitrary subsequences of a derivable sequent into constituents is a consequence of the Conditionalization inferences [/R],[\R]. But as the foregoing section has shown, Conditionalization is responsible for the logical infinity of L. We want the bottom-up inference engine to absorb just as much of the power of Conditionalization inferences as needed to obtain derivations with the required proof shape; the parsing algorithm should not explore the logically infinite set of L-valid answer substitutions for Cut premises T => X.

• The lemma database can restrict itself to minimal answer substitutions for Cut premises T => X. Suppose we prove a sequent U,T,V => Y on the basis of cut premises T => X and U,X,V => Y. From the cut formula X an infinite number of higher-order types X' are L-derivable, which may also support the premise U,X',V => Y. However, there is no point in investigating types derivable from X: if U,X',V => Y and X => X', we may conclude from the transitivity of '=>' that also U,X,V => Y, i.e. that the premise is already provable with the simpler cut formula. The restriction to minimal answers presupposes a division of labour between the syntactic and the semantic algebra, where syntactic type assignment is maximally simple, and higher-order types needed to obtain alternative readings for a syntactically non-ambiguous expression are derivable in the semantic algebra.

To realize the above program, we will pursue the following strategy. We start from the base case for categorial reduction, the Application rule, and we propose a recursive generalization of the derivability relation, based on the monotonicity properties of the categorial operators. When two types cannot be directly combined by simple Application, they are recursively broken up into their immediate subtypes. The recursive decomposition of compound types, clearly, is a finite process that will lead either to the Application base case, or to atomic subtypes that cannot be further decomposed. In order to obtain a reduction also in the case where atoms are reached, a second base case is added to the bottom-up inference engine, which reduces the atoms to the smallest functor type derivable from them.
MONOTONICITY
To introduce the concept of monotonicity with respect to the type-forming connectives, it may be useful to return for a moment to the original set-theoretic perspective on types, and reflect on the relationship between the derivability of a sequent

    X1,...,Xn => Y

and the validity of the corresponding set-theoretic inequality

    X1·...·Xn ⊆ Y.

As we have seen in Chapter 1, categorial types are to be interpreted as sets of expressions, more specifically, as subsets of a set S, the set of vocabulary items closed under concatenation. Basic types identify a small number of such subsets of expressions. In order to single out arbitrary subsets of S, we have at our disposal the set of type-forming operators {/, ·, \}, inductively characterizing the infinite set of possible types. The interpretation of the type-forming operators is repeated here for convenience.

Definition 5.3.1 Interpretation of the type-forming operators.

    A·B = {xy ∈ S | x ∈ A & y ∈ B}      [Def ·]
    C/B = {x ∈ S | ∀y ∈ B, xy ∈ C}      [Def /]
    A\C = {y ∈ S | ∀x ∈ A, xy ∈ C}      [Def \]

Recall now the fundamental laws of categorial arithmetics as they follow immediately from the semantics of the operators. Concatenation is an associative operation. But unlike numerical multiplication, categorial concatenation is non-commutative. Therefore, the product operator has a left-inverse and a right-inverse, the operations of left-division and right-division. The axioms below state these fundamental laws with respect to the basic relationship between types seen as sets of expressions: the inclusion relation '⊆'.

    A·(B·C) ⊆ (A·B)·C and (A·B)·C ⊆ A·(B·C)      [Axiom 1]
    A·B ⊆ C if and only if A ⊆ C/B               [Axiom 2]
    A·B ⊆ C if and only if B ⊆ A\C               [Axiom 3]
The demonstration of the validity of familiar reduction laws, such as the set R1-R6 which we used to illustrate the concept of flexible constituency, is immediate. But to decide on the validity of set-theoretic inequalities X1·...·Xn ⊆ Y of arbitrary complexity, we shifted to the proof-theoretic Gentzen perspective, where a provable sequent X1,...,Xn => Y in fact asserts that the inequality X1·...·Xn ⊆ Y is a theorem of categorial arithmetics. Lambek (1958) formally proves that the sequent implementation of L is equivalent to the original set-theoretic formulation, i.e. that it is a faithful alternative axiomatization.

The derivability relation '⊆' (equivalently, '=>') imposes a partial ordering on the set of types. We will now investigate the monotonicity properties of the categorial operators {/, ·, \} with respect to this ordering. As we just observed, the type-forming operators are non-commutative, which means that we have to consider the monotonicity of the left and the right argument separately. Definition 5.3.2 gives a general characterization of the relevant concepts.

Definition 5.3.2 On a given set K, an operation ⊗ is monotone increasing (decreasing) in its first argument with respect to a binary relation R if for arbitrary members of K,

    if Y R Z, then (Y ⊗ X) R (Z ⊗ X)      (left monotone increasing)
    if Y R Z, then (Z ⊗ X) R (Y ⊗ X)      (left monotone decreasing)

Monotonicity in the second argument is defined analogously:

    if Y R Z, then (X ⊗ Y) R (X ⊗ Z)      (right monotone increasing)
    if Y R Z, then (X ⊗ Z) R (X ⊗ Y)      (right monotone decreasing)

For our specific purposes, the set K is the set of types, the binary relation R the relation of L-derivability, and the operations ⊗ the type-forming connectives. Consider first the division operators '/' and '\'. In Chapter 1, we have discussed an alternative axiomatization of L, proposed in Zielonka (1981), based on the following derived inference rules for unary type-transitions.

    [Z1]  X/Z => Y/Z if X => Y      Z\X => Z\Y if X => Y
    [Z2]  Z/Y => Z/X if X => Y      Y\Z => X\Z if X => Y

From the set-theoretic perspective on types, the validity of Z1 and Z2 follows from the fact that fractional types are monotone decreasing in their first argument (the domain argument), and monotone increasing in their second (range) argument. The following inference patterns, which define the monotonicity of a functor, are indeed validated by Z1 and Z2. (We write f(Dom,Ran) for Ran/Dom and Dom\Ran, abstracting from directionality.)

    if X ⊆ Y, then f(Y,Z) ⊆ f(X,Z),
    if X ⊆ Y, then f(Z,X) ⊆ f(Z,Y).
As to categorial multiplication, Lambek (1988) proves the following derived rule of inference for product types.

    [L·]  X1·X2 => Y1·Y2 if X1 => Y1 and X2 => Y2

[L·] establishes the fact that the product operator is monotone increasing both in its first and in its second argument. The derived inference rule can be straightforwardly translated into the following inference pattern for double monotonicity:

    if X1 ⊆ Y1 and X2 ⊆ Y2, then (X1·X2) ⊆ (Y1·Y2).

We summarize our findings with regard to the monotonicity properties of the type-forming operators in Table 5.3.1. Product types are monotone increasing both in their left and right subtype. Fractional types are monotone decreasing in the domain subtype and monotone increasing in the range subtype.
             X    Y
    X·Y      +    +
    X/Y      +    −
    X\Y      −    +

    Table 5.3.1
RECURSIVE AXIOMATIZATION: SYSTEM M
In order to provide the recursive generalization of the elementary binary reduction laws (left and right-application), we combine the monotonicity properties of the categorial multiplication and division algebras. The problem then assumes the following form: Given functional application [R1] as the base case for the reduction relation '⊆', what recursive generalizations can be deduced from the monotonicity laws for division and product formation?

    [R1]  X/Y,Y ⊆ X      Y,Y\X ⊆ X      Reduction, base case
As in the case of the unary type-transitions from the Zielonka system, two forms of recursion can be distinguished, depending on whether we descend into the domain or the range subtype of a complex type. Consider first recursion on the range subtype. Instead of a unary type transition, the premise for our deduction now is a combination of types Z·X ⊆ W. The monotone increasing nature of the range subtype in fractional categories implies that

    if Z·X ⊆ W, then (Z·X)/Y ⊆ W/Y,

where the left-hand term and the right-hand term of '⊆' in the premise are both divided by Y. One can easily check that (Z·X)/Y includes the set of expressions of type Z·(X/Y), i.e. Z·(X/Y) ⊆ (Z·X)/Y, so that from the transitivity of '⊆' we may conclude that Z·(X/Y) ⊆ W/Y.

The transformation of this deduction into the Horn clause format of the derived inference rules M1 turns the conclusion into the head of the clause, and the premise Z·X ⊆ W into the body. The left case of M1 represents the right division instance for recursion on the range subtype X, as discussed above; the right rule is the symmetric dual for left division. (Throughout this section, we use the informal clausal notation of Chapter 1, omitting list brackets, etc. The official format for the LR interpreter is presented in the next section.)

Definition 5.3.3 Recursion on the range subtype.

    [M1]  Z,X/Y => W/Y if Z,X => W      Y\X,Z => Y\W if X,Z => W
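The deduction behind the right-division case of M1 can be laid out as a chain of inclusions. The justification of the third line, which the text leaves to the reader, is spelled out here from Definition 5.3.1 and the associativity of concatenation; the layout is a sketch of the argument, not a reproduction of a figure from the original.

```latex
\begin{align*}
Z \cdot X &\subseteq W
  && \text{[Assumption]}\\
(Z \cdot X)/Y &\subseteq W/Y
  && \text{[Z1]: division is monotone increasing in the range}\\
Z \cdot (X/Y) &\subseteq (Z \cdot X)/Y
  && \text{if } z \in Z,\ u \in X/Y,\ y \in Y,
     \text{ then } (zu)y = z(uy) \in Z \cdot X\\
Z \cdot (X/Y) &\subseteq W/Y
  && \text{transitivity of } \subseteq
\end{align*}
```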
Let us turn now to the second case of recursion, where we descend into the domain subtype of a complex type. The premise assumption is the inequality Z·W ⊆ Y. Since categorial division is monotone decreasing in its domain argument, we infer that

    if Z·W ⊆ Y, then X/Y ⊆ X/(Z·W).

In this inference, we have replaced the more inclusive type Y by the smaller type Z·W. In the conclusion so derived, the left-hand term X/Y and the right-hand term X/(Z·W) can both be multiplied by a common factor Z, an operation which preserves inclusion since multiplication is monotone increasing in both arguments, as we saw in [L·]. But now the right-hand type (X/(Z·W))·Z can be simplified by cancelling out the factor Z, i.e.

    (X/(Z·W))·Z ⊆ X/W,

which, on the basis of the transitivity of '⊆', leads to the desired conclusion

    if Z·W ⊆ Y, then (X/Y)·Z ⊆ X/W.
Figure 5.3.1 gives a succinct representation of the complete deduction, specifying the monotonicity laws that validate the inferences. (The reader may observe that a cut-free Gentzen proof for the end-sequent leads to the assumption premise even more directly. Our purpose here, however, is to clearly establish the link between the derived inference rules and the monotonicity properties of the category-forming operators.)

        Z·W ⊆ Y   [Assumption]
    ------------------ [Z2]
    X/Y ⊆ X/(Z·W)          Z ⊆ Z
    ------------------------------ [L·]
    (X/Y)·Z ⊆ (X/(Z·W))·Z          (X/(Z·W))·Z ⊆ X/W
    -------------------------------------------------- [Cut]
                  (X/Y)·Z ⊆ X/W

    Figure 5.3.1

Transforming the above deduction into Horn clause logic leads to a second set of derived rules of inference, M2 below, which generalize the binary application law to the domain subtype of functor types. As usual, the deduction we just gave for the right-division case is completed by a symmetric dual for left-division.

Definition 5.3.4 Recursion on the domain subtype.

    [M2]  X/Y,Z => X/W if Z,W => Y      Z,Y\X => W\X if W,Z => Y
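Read as Horn clauses, the base case R1 and the recursive rules M1 and M2 form a small logic program. The sketch below runs them with a hand-written unification and depth-first search in Python. It is an illustration only: the tuple encoding of types and all names (`Var`, `over`, `under`, `unify`, `solve`, `m_reduce`, `derivable`) are assumptions of this sketch, and the occurs check is a precaution that the text does not discuss.

```python
from itertools import count

class Var:
    """A logic variable; every use of a clause gets fresh ones."""
    _ids = count()
    def __init__(self):
        self.id = next(Var._ids)

def over(x, y):    # the type X/Y (range X, domain Y)
    return ('/', x, y)

def under(y, x):   # the type Y\X (domain Y, range X)
    return ('\\', y, x)

def walk(t, s):
    """Follow variable bindings in substitution s."""
    while isinstance(t, Var) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    return t is v or (isinstance(t, tuple) and any(occurs(v, u, s) for u in t[1:]))

def unify(a, b, s):
    """Return an extended substitution, or None if a and b do not unify."""
    a, b = walk(a, s), walk(b, s)
    if a is b or (isinstance(a, str) and a == b):
        return s
    if isinstance(a, Var):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, Var):
        return None if occurs(b, a, s) else {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0]:
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def clauses():
    """Fresh copies of the clauses; a head or body is (Left, Right, Result)."""
    X, Y, Z, W = Var(), Var(), Var(), Var()
    return [
        ((over(X, Y), Y, X), None),                     # R1: X/Y, Y => X
        ((Y, under(Y, X), X), None),                    # R1: Y, Y\X => X
        ((Z, over(X, Y), over(W, Y)), (Z, X, W)),       # M1: if Z, X => W
        ((under(Y, X), Z, under(Y, W)), (X, Z, W)),     # M1 dual: if X, Z => W
        ((over(X, Y), Z, over(X, W)), (Z, W, Y)),       # M2: if Z, W => Y
        ((Z, under(Y, X), under(W, X)), (W, Z, Y)),     # M2 dual: if W, Z => Y
    ]

def solve(goal, s):
    """Yield every substitution under which goal follows from R1, M1, M2."""
    for head, body in clauses():
        s1 = s
        for g, h in zip(goal, head):
            s1 = unify(g, h, s1)
            if s1 is None:
                break
        else:
            if body is None:
                yield s1
            else:
                yield from solve(body, s1)

def resolve(t, s):
    t = walk(t, s)
    return (t[0], resolve(t[1], s), resolve(t[2], s)) if isinstance(t, tuple) else t

def show(t):
    if isinstance(t, Var):
        return '_G%d' % t.id
    if isinstance(t, str):
        return t
    wrap = lambda u: show(u) if not isinstance(u, tuple) else '(' + show(u) + ')'
    return wrap(t[1]) + t[0] + wrap(t[2])

def m_reduce(left, right):
    """All result types computed for the two given input types."""
    cut = Var()
    return [show(resolve(cut, s)) for s in solve((left, right, cut), {})]

def derivable(left, right, result):
    return any(True for _ in solve((left, right, result), {}))

print(m_reduce(over('a', 'b'), 'c'))
print(m_reduce('a', 'b'))
```

For the inputs a/b and c the search finds a single result type, a/(c\b), via M2 followed by leftward application; for two atoms it finds none, since no clause head matches.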
M SEMANTICS: PARTIAL EXECUTION
In the preceding paragraph, we have demonstrated the syntactic validity of the reduction laws M1 and M2. Consider now the semantic interpretation of the recursive clauses. Recall that the interpretation of a cut-free proof in L is determined by the correspondence between the Elimination rules and the semantic operation of functional application, and between the Introduction rules and lambda abstraction over a variable for the introduced subtype. What we have to show now is how the interpretation of the recursive clauses M1 and M2 is construed out of the interpretation of the input parameters, on the basis of the primitive semantic operations of functional application and lambda abstraction. To achieve this goal, we can simply unfold the cut-free Gentzen proofs for M1 and M2 with respect to the Elimination and Introduction rules, until we have traversed a path from the conclusion to the premise sequents. Unfolding the resolution proof for the Elimination and Introduction steps results in a number of mutual bindings for the lambda terms interpreting the inferences. As soon as we have determined these bindings, we can remove the proof steps leading to their identification, and substitute the bindings directly in the clauses for M1 and M2. Proof unfolding (or partial execution) is a familiar technique in logic programming (see for example Pereira and Shieber 1987). The Gentzen perspective on categorial derivability here offers a transparent illustration of the technique.

As an elementary example of partial execution, consider the semantics for R1. We are interested in finding the unifying substitution for the variable Translation in the sequent

    X/Y: f, Y: a => X: Translation.

The sequent contains one operator '/' in the antecedent, i.e. the proof can be unfolded deterministically by virtue of the Elimination rule [/L], leading to the premise sequents

    Y: a => Y: a      and      X: f(a) => X: f(a),

both of which match the axiom case. The interpretation for the [/L] inference step fixes the semantics of X as f(a). R1 is obtained by performing the substitution {Translation = f(a)} in the end-sequent of the proof, removing the Elimination inference that led to this substitution.

Partial execution fixes the interpretation of the M reduction laws as the lambda terms of Definition 5.3.5, which presents the right-oriented case for the clauses R1, M1 and M2. The dual left-oriented case is completely analogous. As in the previous section, we represent the proof steps informally, using the conventional λ notation for the semantic recipes. The official Prolog format for the lambda term structures (with the application operator and the lambda abstractor) is given at the end of this section.

Definition 5.3.5 M Semantics.

    [R1]  X/Y: f, Y: a => X: f(a)
    [M1]  Z: f, X/Y: g => W/Y: λv.h      if Z: f, X: g(v) => W: h
    [M2]  X/Y: f, Z: g => X/W: λv.f(h)   if Z: g, W: v => Y: h
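The semantic recipes of Definition 5.3.5 can be mimicked with ordinary Python functions. In this sketch the premise of a recursive clause is passed in as a function that computes the term h; the names `r1_right`, `r1_left`, `m1` and `m2` are hypothetical, and `r1_left` assumes the left-oriented dual of R1, which the text describes only as completely analogous.

```python
def r1_right(f, a):
    """[R1]  X/Y: f, Y: a => X: f(a)."""
    return f(a)

def r1_left(a, f):
    # Assumed dual of R1:  Y: a, Y\X: f => X: f(a)
    return f(a)

def m1(premise):
    """[M1]  Z: f, X/Y: g => W/Y: λv.h,  where h = premise(f, g(v))."""
    return lambda f, g: lambda v: premise(f, g(v))

def m2(premise):
    """[M2]  X/Y: f, Z: g => X/W: λv.f(h),  where h = premise(g, v)."""
    return lambda f, g: lambda v: f(premise(g, v))

# M1 with R1 as its premise: W/X: f, X/Y: g => W/Y: λv.f(g(v))
compose = m1(r1_right)
print(compose(lambda x: x + 1, lambda y: y * 2)(5))     # f(g(5))

# M2 with leftward R1 as its premise: X/Y: f, Z: g => X/(Z\Y): λv.f(v(g))
raise_arg = m2(r1_left)
print(raise_arg(lambda y: y + 1, 3)(lambda z: z * 2))   # f(v(3))
```

Closing the recursion of M1 with rightward application yields functional composition of f and g, while M2 closed with leftward application abstracts over a function v that is applied to the given argument g.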
The partial unfolding of the Gentzen proofs for M1 and M2 is displayed in Figures 5.3.2 and 5.3.3. The boxed nodes are labeled with the conclusions and premises of the recursive reduction laws. Consider first M1 (in Figure 5.3.2 below). In order to find a path from the end-sequent to the premise, we blindly expand the end-sequent with the inference rules of L. The right term W/Y of M1 is a complex type with operator '/' and subtypes Y,W. The first expansion step removes the operator by means of [/R], which has the effect of adding the subtype Y to the antecedent. The semantic value v of the conditionalized subtype Y is a variable of type Y; the interpretation of the conclusion is derived by applying lambda abstraction over v in the term h, the semantic value that will be computed for the derivation of the subtype W. The second expansion step of the proof switches to the antecedent, since the succedent W is finished. The antecedent is deterministically expanded by applying [/L] to the only functor type X/Y, successfully identifying an argument Y to the right of the functor. The [/L] inference causes branching of the proof tree: the left branch is closed with a sequent matching the axiom scheme; in the right branch we have reached the premise of M1. The application interpretation of the [/L] inference fixes the semantic value of the premise sequent as a term h of type W, obtained from the semantic value f of the factor Z and the term g(v) which interprets the factor X. Recall that the free occurrence of v in h is bound together with the withdrawal of Y in the conclusion. Substituting the bindings found by partially executing the proof into the boxed premise and conclusion sequents gives M1.
    Y: v => Y: v        Z: f, X: SemX => W: h   {SemX = g(v)}
    ---------------------------------------------------------- [/L]
    Z: f, X/Y: g, Y: v => W: h   {Result = λv.h}
    ---------------------------------------------- [/R]
    Z: f, X/Y: g => W/Y: Result

    Figure 5.3.2 [M1] Recursion: range subtype

The proof tree for M2, the case of monotonic descent in the domain subtype, can be deterministically unfolded in a similar manner. We trace the proof of Figure 5.3.3 bottom-up, starting from the boxed premise sequent. By assumption, we can construe a term h with a free occurrence of a variable v of type W on the basis of the factors Z and W. This assumption determines the branching of the proof at the second expansion step, [/L], where the functor X/Y finds its Y argument in the sequence Z,W. The right branch of the proof is then successfully closed with an axiom sequent of type X, corresponding to the functional application of f, the semantic value of X/Y, to the term h which was obtained from the assumption Z,W => Y. The conclusion of the proof is obtained by withdrawing the W subtype, thereby binding the free occurrence of v in h. We perform the bindings for SemX and Result in the clause for M2, and remove the proof by virtue of which these bindings were computed.
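    Z: g, W: v => Y: h        X: f(h) => X: f(h)   {SemX = f(h)}
    ------------------------------------------------------------- [/L]
    X/Y: f, Z: g, W: v => X: SemX   {Result = λv.SemX}
    ---------------------------------------------------- [/R]
    X/Y: f, Z: g => X/W: Result

    Figure 5.3.3 [M2] Recursion: domain subtype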
ASSOCIATIVITY: ATOMIC BOUNDARY CASES
In the preceding paragraphs, we have studied the clauses of the lemma database {R1,M1,M2} from a declarative perspective. But as with the original axiomatization of L, we can combine the clauses {R1,M1,M2} with the resolution proof procedure. In conjunction with resolution, the clauses {R1,M1,M2} yield an algorithm which, confronted with a goal sequent

    Type1,Type2 =>* Cut

with given input types Type1, Type2 and an unknown result type Cut, will compute the answer substitutions for the unknown Cut type, i.e. the bindings which turn the goal sequent into a logical consequence of the clauses {R1,M1,M2}. Let us reflect on the procedural aspects of the derived inference rules discussed so far, and see what has been realized of the desiderata for the bottom-up inference engine summed up at the beginning of this section.

Consider first the finiteness of the solution set for M reductions. Confronted with the above goal sequent, the M laws {R1,M1,M2}, driven by the resolution procedure, synthesize the unknown result type in terms of the information carried by the input categories Type1 and Type2. The input types are finite structures, i.e. the building blocks for the M computation are the finite subcomponents of the input categories: the occurrences of atomic subtypes in Type1 and Type2, their signature (positive or negative occurrence), and the directionality requirements encoded in the orientation of the operators. The set of well-formed types that can be construed from these building blocks is a priori finite in nature. The relation of M-derivability '=>*' holds between Type1, Type2 and Cut if the latter is a member of this set which preserves the logical invariance properties of L: count-invariance and directionality.

Consider a simple example where the goal sequent is a/b, c =>* Cut. The multiset of atomic subtype occurrences is {a,b,c}. The a-count of the antecedent is 1, the b-count -1, and the c-count 1. On the basis of the atom multiset a finite set of conceivable well-formed types can be built (finite because we are not allowed to use a type occurrence more than once):

    {a,b,c, a/b,a/c,a\b,a\c,b/c,b/a,b\c,b\a, a/(b/c),(a/b)/c,a/(c\b),...}
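The count values in this example can be checked mechanically. The following Python sketch assumes the usual recursive definition of the count (an atom contributes +1, and the domain of a fractional type is counted with reversed sign); the tuple encoding and the names `atom_counts`, `sequence_counts` and `respects_count` are hypothetical. It tests count-invariance only and says nothing about directionality.

```python
def over(x, y):    # the type X/Y (range X, domain Y)
    return ('/', x, y)

def under(y, x):   # the type Y\X (domain Y, range X)
    return ('\\', y, x)

def atom_counts(t, sign=1, acc=None):
    """Add the signed atom occurrences of type t to the dictionary acc."""
    acc = {} if acc is None else acc
    if isinstance(t, str):
        acc[t] = acc.get(t, 0) + sign
    else:
        op, left, right = t
        rng, dom = (left, right) if op == '/' else (right, left)
        atom_counts(rng, sign, acc)       # range keeps the sign
        atom_counts(dom, -sign, acc)      # domain reverses it
    return acc

def sequence_counts(types):
    """Counts for a sequence of types, with zero entries dropped."""
    acc = {}
    for t in types:
        atom_counts(t, 1, acc)
    return {atom: n for atom, n in acc.items() if n != 0}

def respects_count(antecedent, succedent):
    return sequence_counts(antecedent) == sequence_counts([succedent])

antecedent = [over('a', 'b'), 'c']
print(sequence_counts(antecedent))
for candidate in (over('a', 'c'), over('a', over('b', 'c')),
                  over('a', under('c', 'b'))):
    print(candidate, respects_count(antecedent, candidate))
```

Of the three candidates taken from the set above, a/c fails the count test, while a/(b/c) and a/(c\b) both pass it.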
Clearly, only a subset of this set is L-derivable from the antecedent sequence a/b,c. For the sequent a/b,c => Cut, the substitution {Cut = a/c} does not qualify as an L-valid answer: it violates count-invariance (since the b-count of a/c is 0, but the b-count of the antecedent -1). Neither can we derive {Cut = a/(b/c)}. Count-invariance is respected in this case, but directionality preservation is violated: if the antecedent sequence a/b,c would reduce to a/(b/c), then by transitivity we would also have

    a/b, c, b/c => a,

which is not L-derivable, since the functor b/c does not find its argument at the right side. The substitution {Cut = a/(c\b)} turns out to be the only type formed on the basis of the atom multiset {a,b,c} which preserves both count-invariance and directionality, and indeed it is the single answer substitution that can be computed here by the resolution procedure on the basis of {R1,M1,M2}.

Let us trace the resolution steps. In order to compute an answer substitution for the result type Cut, the resolution procedure tries to match the goal sequent a/b,c =>* Cut against one of the database clauses defining the relation '=>*', i.e. against the base case R1, or one of the recursive cases M1 or M2. The goal sequent does not match the Application base case R1: there is no unifying substitution which makes the goal sequent equal to either

    X/Y,Y =>* X      or      Y,Y\X =>* X      [R1]

Neither does it match the head of the inference rule M1:

    Z,X/Y =>* W/Y      if      Z,X =>* W,      or
    Y\X,Z =>* Y\W      if      X,Z =>* W.      [M1]

But the goal sequent can be matched against the head of the right-oriented version of M2,

    X/Y,Z =>* X/W      if      Z,W =>* Y      [M2]

with the unifying substitution {X = a, Y = b, Z = c, Cut = a/W}, which generates the resolvent premise goal

    [Z,W =>* Y]{X = a, Y = b, Z = c, Cut = a/W},  i.e.  c,W =>* b.
The premise goal, in its turn, can be resolved against the Application base case R1, in its left-oriented version:

    c,W =>* b  =  [Y1,Y1\X1 =>* X1]{Y1 = c, X1 = b, W = c\b},

which further instantiates the variable W with the binding {W = c\b}. The unit clause R1 has no body, so at this point, the resolution procedure has produced the empty clause, and the computed answer {Cut = a/(c\b)} is obtained as the composition of the unifying substitutions for the resolution steps. Alternative answer substitutions would be produced if the resolution procedure, upon backtracking, would find different ways of matching the goal sequent against the database clauses for '=>*'. In our example, no alternative answer substitutions can be found, as the reader can check.

The general effect of the resolution steps for {R1,M1,M2} can be described as follows. When a goal sequent matches the base case R1, the result type is a subtype of the antecedent types. For the inference rules M1 and M2, the resolution step generates a premise goal in terms of the subtypes of the original goal. This decomposition into subtypes comes to an end when there are no more subtypes left that would license M1/M2 recursion (e.g. the subgoal c,W =>* b in the example). At this point, the decomposed sequent will either match the base case R1, or failure will be detected.

For an example of an L-derivable answer which is not reachable on the basis of {R1,M1,M2}, one can think of non-minimal solutions such as

    a/b, c => a/((a/(c\b))\a),

where the outcome a/((a/(c\b))\a) is a lifted type derivable from the M solution a/(c\b). The answer a/((a/(c\b))\a) cannot be computed on the basis of M: it uses more atom occurrences than available in the antecedent (three occurrences of the subtype a, whereas the antecedent has just one). To convince oneself of the fact that the above sequent does not follow from {R1,M1,M2}, one can try a resolution refutation. The goal sequent matches the head of the right-oriented case of M2:

    X/Y,Z =>* X/W      if      Z,W =>* Y      [M2]

with the unifying substitution {X = a, Y = b, Z = c, W = (a/(c\b))\a}, which instantiates the premise subgoal as

    [Z,W =>* Y]{X = a, Y = b, Z = c, W = (a/(c\b))\a},  i.e.  c, (a/(c\b))\a =>* b.
This subgoal immediately fails: it does not match any database clause for '=>*'. The fact that M computes result types in terms of the antecedent subtypes makes it into a finite subsystem of L, as required.
We must investigate now whether the lemma database {R1,M1,M2} is strong enough to compute solutions Z for arbitrary X,Y, such that X,Y =>* Z. It will be clear that for atomic X and Y, the derived inference rules {R1,M1,M2} cannot compute a product-free answer substitution which has the required preservation properties. For a goal sequent a,b =>* X, with atomic a,b, the total set of constructible product-free types is {a/b,a\b,b/a,b\a}, none of which has a positive count value for both subtype a and subtype b, matching the antecedent count values. The only boundary case for M reductions in the set {R1,M1,M2} is the Application rule R1. R1 cannot be the only boundary case if the reduction engine is required to be associative: R1 captures the cases of categories that have connectedness, either immediately, in the sense of Ajdukiewicz (1935), or through monotonic recursion. A sequence of atoms is not connected in this sense: there is no subtype that can be cancelled. To obtain associativity for sequences of atomic types, we have two options, both with a price.
The first option, as suggested by the preceding discussion, is to leave the product-free subsystem of L. We can add a boundary case to the set {R1,M1,M2}, which for atomic X,Y yields the smallest type including the concatenation of two atoms in L with product, i.e.
X,Y => X•Y.
If we decide to tolerate type-internal products, we must generalize the basic reduction rule R1 to cover such category objects, a generalization which is quite straightforward:
X,Y•Z => W   if   X,Y => W1 and W1,Z => W
X•Y,Z => W   if   Y,Z => W1 and X,W1 => W
The second option stays within the bounds of product-free L, and adds a boundary case to {R1,M1,M2} which, when confronted with atoms X,Y, reduces them to the smallest product-free result category. Let us call this boundary case R*.
Definition 5.3.6 R*
X,Y =>* Z/(Y\(X\Z))
X,Y =>* ((Z/Y)/X)\Z
Whereas reduction on the basis of {R1,M1,M2} cancels a matching subtype in the antecedent types, R* takes together the antecedent types without cancellation as arguments of a functor with unknown range Z. In its general form, R* produces the inflated generator types of Theorem 5.2.1. In what follows, we will reserve R* for atomic input types X,Y, and leave it as a subject for further study whether there is empirical need for the unrestricted general case.
The extension of the Lemma database to the fully associative {R*} U {R1,M1,M2} makes it possible for the bottom-up derivation to proceed when confronted with types X,Y that have no connectedness in terms of {R1,M1,M2}, on the basis of one assumption external to the antecedent types: the variable range subtype Z introduced by R*. The price to be paid here is that we have to ensure that the conditionalized variable subtype Z cannot be inflated in the course of {M1,M2} recursion, if we want to avoid infinite regress by unification. We have seen an illustration of such infinite regress in Example 5.1.1. In Program 5.3.1 at the end of this section we give an implementation which avoids this unwanted inflation by performing a metalogical check on the subtype which forms the target of the {M1,M2} recursion.
The choice between these options to obtain associativity, we would suggest, must be made on empirical grounds. Consider their respective implications with respect to the interpretation of the type-forming connectives. A product type X•Y is assigned to the concatenation of an expression of type X and an expression of type Y. The reduction X,Y => X•Y in fact comes down to the claim that no coherence can be established between atomic antecedent types X,Y and that reduction must be suspended.
On the other hand, the functional answer substitutions Z/(Y\(X\Z)) (or ((Z/Y)/X)\Z) generated by R* link together the atoms X,Y as arguments of an anticipated functor Y\(X\Z) that will consume them (or as arguments of a past functor (Z/Y)/X). The examples of non-constituent coordination in Chapter 1 indicate that Boolean conjoinability does not discriminate between sequences of atoms and incomplete constituents derivable on the basis of {R1,M1,M2}. Compare the examples below. In the remainder of this chapter we will therefore assume that the lemma database {R1,M1,M2} is extended with R*, rather than leaving the product-free part of L.
Example 5.3.1 Non-constituent coordination: atom sequences.
omdat Jan gek en Marie dom is ('because J. silly and M. stupid is')
NP,AP => S/(AP\(NP\S))
he considers John bright but Mary dull
NP,AP => ((VP/AP)/NP)\VP
The semantic interpretation of the R* reductions is given by the lambda recipes below, obtained by partial execution of the Gentzen proofs, which can be unfolded deterministically, as Figure 5.3.4 makes clear for the Z/(Y\(X\Z)) case. The proof for the symmetric dual proceeds analogously. The conditionalized variable v is of type Y\(X\Z) (or (Z/Y)/X), the type of the functor that will consume the arguments X and Y.
Definition 5.3.7 Semantics of R* reduction.
X:a, Y:b => Z/(Y\(X\Z)): λv.v(b)(a)
X:a, Y:b => ((Z/Y)/X)\Z: λv.v(a)(b)

X:a => X:a     Z:v(b)(a) => Z:v(b)(a)
------------------------------------- [\L]
Y:b => Y:b     X:a, X\Z:v(b) => Z:v(b)(a)
----------------------------------------- [\L]
X:a, Y:b, Y\(X\Z):v => Z:v(b)(a)
----------------------------------------- [/R]
X:a, Y:b => Z/(Y\(X\Z)): λv.v(b)(a)
Figure 5.3.4
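The two lambda recipes of Definition 5.3.7 can be tried out with ordinary Python closures. The sketch below is an illustration under stated assumptions: meanings are rendered as strings, the functors `is_` and `considers` are hypothetical curried stand-ins for items of type ap\(np\s) and (vp/ap)/np, and the names `r_star_forward` and `r_star_backward` are introduced here, not taken from the book.

```python
def r_star_forward(a, b):
    # X:a, Y:b => Z/(Y\(X\Z)) : λv.v(b)(a)
    # The awaited functor v consumes the right-hand atom b first, then a.
    return lambda v: v(b)(a)

def r_star_backward(a, b):
    # X:a, Y:b => ((Z/Y)/X)\Z : λv.v(a)(b)
    # The preceding functor v consumes a first, then b.
    return lambda v: v(a)(b)

# Hypothetical curried functors that render their result as a string.
def is_(ap):                      # type ap\(np\s)
    return lambda np: f"is({ap})({np})"

def considers(np):                # type (vp/ap)/np
    return lambda ap: f"considers({np})({ap})"

jan_gek = r_star_forward("jan", "gek")            # 'Jan gek' awaiting its verb
john_bright = r_star_backward("john", "bright")   # 'John bright' after its verb

print(jan_gek(is_))
print(john_bright(considers))
```

In both cases the atoms end up in the argument positions of the functor that eventually combines with them, which is the property the non-constituent coordination examples rely on.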
Let us consider some examples. For a goal sequent [np,ap] => [Type], the boundary case R* will compute the answer substitutions
{Type = (X/(ap\(np\X)))}
{Type = (((X/ap)/np)\X)}
Compare the R* reduction with a derivation in terms of {R2,R4}, composition and lifting. The atoms would both be lifted by R4, in such a way that they have a unifiable middle term, which can then be cancelled by means of R2.
[R2] (X/(ap\(np\X)))   {Y = (np\X)}
  [R4] (X/(np\X))
    np
  [R4] (Y/(ap\Y))
    ap

[R2] (((X/ap)/np)\X)   {Y = (X/ap)}
  [R4] ((Y/np)\Y)
    np
  [R4] ((X/ap)\X)
    ap
Composition of lifted atoms is not the only role for R*. Consider the following goal sequent: [n\n,pp/np] => [Type]. In this example, no reduction is possible on the basis of {R1,M1,M2} because the functors are standing 'back to back': recursive descent by means of {M1,M2} does not lead to the Application base case R1, because the functors look for their arguments in opposite directions. The base case R*, however, can be reached, giving rise to four possible answer substitutions, depending on whether we enter the recursion on the n\n or the pp/np functor, and on the choice of the left or right oriented version of R*.
{Type = (n\(Z/(pp\(n\Z))))/np}
{Type = (n\(((Z/pp)/n)\Z))/np}
{Type = n\((Z/(pp\(n\Z)))/np)}
{Type = n\((((Z/pp)/n)\Z)/np)}
The first answer in this set, for example, is obtained as follows. We know that
n\n,pp/np =>* X/np   if   n\n,pp =>* X,
where we descend into the range subtype of the functor pp/np on the basis of M1. Again by M1,
n\n,pp =>* n\Y   if   n,pp =>* Y,
where this time we descend into the range subtype of the functor n\n. The premise sequent now represents the atomic boundary case, which by means of R* can be reduced as follows:
n,pp =>* Z/(pp\(n\Z))
The answer {Type = (n\(Z/(pp\(n\Z))))/np} is obtained as the composition of the substitutions {Type = X/np}, {X = n\Y} and {Y = Z/(pp\(n\Z))}.
CONCLUSION
To summarize the findings of this section, Programs 5.3.1 and 5.3.2 present the Horn clause axiomatization of the extension of the Lemma database with the M inference rules, first in their purely syntactic form, and then in the interpreted form. Since the lambda terms for left-associative derivations or Boolean coordination will be complex expressions subject to simplification, lambda conversion has been added as a utility in the reduced predicate of Program A.1. As in the preceding chapters, the semantic representation of the sample theorems to be discussed below has been simplified by lambda conversion where possible. The metalogical check on the subtype which is the target of recursion in the case of M1 and M2 has the effect of freezing this subtype, preventing unification from filling in potentially infinite category structures through recursion in the M1 and M2 clauses.
R1: Application
[X/Y,Y] =>* [X].
[Y,Y\X] =>* [X].
M1: Recursion on the range subtype
[Z,X/Y] =>* [W/Y] :- nonvar(X), [Z,X] =>* [W].
[Y\X,Z] =>* [Y\W] :- nonvar(X), [X,Z] =>* [W].
M2: Recursion on the domain subtype
[X/Y,Z] =>* [X/W] :- nonvar(Y), [Z,W] =>* [Y].
[Z,Y\X] =>* [W\X] :- nonvar(Y), [W,Z] =>* [Y].
R*: Connectedness for atoms
[X,Y] =>* [(Z/(Y\(X\Z)))] :- atom(X), atom(Y).
[X,Y] =>* [(((Z/Y)/X)\Z)] :- atom(X), atom(Y).
Program 5.3.1 Lemma Database: system M.
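For readers without a Prolog system at hand, the clauses of Program 5.3.1 can be approximated by a small unification-based interpreter. The Python sketch below is one possible reading, not the book's implementation: the tuple encoding of types, the `Var` class, and the helpers `over`, `under`, `walk`, `unify`, `clauses`, `m_reduce`, `show` and `cuts` are all assumptions introduced for illustration. Like standard Prolog it performs no occurs check, and the `nonvar` and `atom` guards are tested after head unification. An unbound variable in an answer is printed as Z.

```python
class Var:
    """Logic variable; each instance is distinct (compared by identity)."""

def over(x, y):             # X/Y: combines with a Y on its right to give X
    return ('/', x, y)

def under(y, x):            # Y\X: combines with a Y on its left to give X
    return ('\\', y, x)

def walk(t, s):
    """Follow the bindings of substitution s until t is no longer bound."""
    while isinstance(t, Var) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    """Return an extension of s that unifies a and b, or None on failure."""
    a, b = walk(a, s), walk(b, s)
    if a is b:
        return s
    if isinstance(a, Var):
        return {**s, a: b}
    if isinstance(b, Var):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple):
        if a[0] != b[0]:
            return None
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return s if a == b else None

def clauses():
    """Fresh clause copies: (left, right, result, nonvar guard, atom guards, premise)."""
    X, Y, Z, W = Var(), Var(), Var(), Var()
    return [
        (over(X, Y), Y, X, None, (), None),                          # R1
        (Y, under(Y, X), X, None, (), None),                         # R1
        (Z, over(X, Y), over(W, Y), X, (), (Z, X, W)),               # M1
        (under(Y, X), Z, under(Y, W), X, (), (X, Z, W)),             # M1
        (over(X, Y), Z, over(X, W), Y, (), (Z, W, Y)),               # M2
        (Z, under(Y, X), under(W, X), Y, (), (W, Z, Y)),             # M2
        (X, Y, over(Z, under(Y, under(X, Z))), None, (X, Y), None),  # R*
        (X, Y, under(over(over(Z, Y), X), Z), None, (X, Y), None),   # R*
    ]

def m_reduce(left, right, out, s):
    """Yield every substitution that solves  left,right =>* out."""
    for l, r, res, nonvar, atoms, premise in clauses():
        s1 = s
        for given, pattern in ((left, l), (right, r), (out, res)):
            if s1 is not None:
                s1 = unify(given, pattern, s1)
        if s1 is None:
            continue
        if nonvar is not None and isinstance(walk(nonvar, s1), Var):
            continue                      # nonvar/1 guard fails
        if not all(isinstance(walk(a, s1), str) for a in atoms):
            continue                      # atom/1 guard fails
        if premise is None:
            yield s1
        else:
            yield from m_reduce(*premise, s1)

def show(t, s):
    """Render a type; an unbound variable is printed as Z."""
    t = walk(t, s)
    if isinstance(t, Var):
        return 'Z'
    if isinstance(t, str):
        return t
    op, a, b = t
    def wrap(u):
        text = show(u, s)
        return '(' + text + ')' if isinstance(walk(u, s), tuple) else text
    return wrap(a) + op + wrap(b)

def cuts(left, right):
    """All answer substitutions for Cut in  left,right =>* Cut, as strings."""
    out = Var()
    return [show(out, s) for s in m_reduce(left, right, out, {})]

print(cuts(over('a', 'b'), 'c'))
print(cuts('np', 'ap'))
print(cuts(under('n', 'n'), over('pp', 'np')))
```

Under these assumptions the three calls should reproduce the answers discussed in this section: the single solution for a/b,c, the two R* solutions for np,ap, and the four solutions for n\n,pp/np.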
R1
[X/Y:Functor, Y:Arg] =>* [X:Functor@Arg].
[Y:Arg, Y\X:Functor] =>* [X:Functor@Arg].
M1
[Z:SemZ, X/Y:Functor] =>* [W/Y:VarY^SemW] :- nonvar(X), [Z:SemZ, X:Functor@VarY] =>* [W:SemW].
[Y\X:Functor, Z:SemZ] =>* [Y\W:VarY^SemW] :- nonvar(X), [X:Functor@VarY, Z:SemZ] =>* [W:SemW].
M2
[X/Y:Functor, Z:SemZ] =>* [X/W:VarW^(Functor@SemY)] :- nonvar(Y), [Z:SemZ, W:VarW] =>* [Y:SemY].
[Z:SemZ, Y\X:Functor] =>* [W\X:VarW^(Functor@SemY)] :- nonvar(Y), [W:VarW, Z:SemZ] =>* [Y:SemY].
R*
[X:SemX, Y:SemY] =>* [(Z/(Y\(X\Z))):Var^(Var@SemY@SemX)] :- atom(X), atom(Y).
[X:SemX, Y:SemY] =>* [(((Z/Y)/X)\Z):Var^(Var@SemX@SemY)] :- atom(X), atom(Y).
Program 5.3.2 Derived inference rules: semantics
5.4. EXAMPLES: INCREMENTAL PROCESSING, BOOLEAN CONJOINABILITY
In order to illustrate the properties of M derivations, we return to the prototypical problem cases where a minimal associative subsystem of L is required: incremental left-associative parsing and interpretation, and Boolean coordination. In the following paragraphs, we discuss these matters with a simplified model of the relation between the syntactic and the semantic algebra. We assume, as we have done so far in the discussion of the lambda semantics for Gentzen proofs, that there is a rigid mapping between syntactic and semantic types given by the following induction. (Recall from Chapter 1 that the predicative AP and PP types stand for adjectival and prepositional complements, and that attributive use of these phrases requires an alternative categorization with modifier types X/X.)
Definition 5.4.1 Mapping from syntactic to semantic types.
Basic categories:
type(s,t).
type(n,(e,t)).
type(ap,(e,t)).
type(pp,(e,t)).
type(np,((e,t),t)).
Compound categories:
type(X/Y,(TypY,TypX)) if type(Y,TypY) and type(X,TypX).
type(Y\X,(TypY,TypX)) if type(Y,TypY) and type(X,TypX).
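The induction of Definition 5.4.1 translates directly into a recursive function. The Python sketch below is an illustration only: the tuple encoding of categories and the names `BASIC`, `over`, `under` and `sem_type` are assumptions made here, with semantic types written as nested pairs (domain, range).

```python
# Basic categories of Definition 5.4.1; a pair (a, b) is the type of
# functions from objects of type a to objects of type b.
BASIC = {
    's': 't',
    'n': ('e', 't'),
    'ap': ('e', 't'),
    'pp': ('e', 't'),
    'np': (('e', 't'), 't'),
}

def over(x, y):             # X/Y
    return ('/', x, y)

def under(y, x):            # Y\X
    return ('\\', y, x)

def sem_type(cat):
    """Semantic type of a syntactic category; direction of the slash is ignored."""
    if isinstance(cat, str):
        return BASIC[cat]
    op, left, right = cat
    value, arg = (left, right) if op == '/' else (right, left)
    return (sem_type(arg), sem_type(value))

print(sem_type(under('np', 's')))    # type of an intransitive verb phrase
print(sem_type(over('np', 'n')))     # type of a determiner
```

Because np is mapped to the second-order type ((e,t),t), a verb phrase np\s comes out as a function over generalized quantifiers, which is the 'worst case' assignment discussed next.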
This type assignment corresponds to Montague's 'worst case' second-order treatment for noun phrases, and the concomitant higher-order assignment to predicates with noun phrase arguments. In the absence of a theory to relate these higher-order objects to entities and first-order relations among them, the worst-case type assignment offers no basis for an adequate account of quantifier scope phenomena. We will argue in Section 5.5 that quantifier scope phenomena can be dealt with in the semantic algebra, without complicating the syntax. In this section, then, we abstract from scope issues and we adjust the presentation of the syntax-semantics interface in the next section. The simplified type-assignment view brings out in a clear way the crucial semantic property of M derivations which we want to focus on here: the bottom-up M inferences make the proof procedure independent of the branching of the proof tree, but no matter what branching is selected, they preserve the thematic structure.
INCREMENTAL LEFT-ASSOCIATIVE PROCESSING
The running example throughout this section is the Dutch embedded clause of Example 5.4.1 below.
Example 5.4.1 Left-associative parsing.
s/s • np • pp/np • np/n • n • pp\ap • ap\(np\s) => s
omdat opa van het meisje afhankelijk is
because Granddad on the girl dependent is
'because Granddad is dependent on the girl'
In English, approximations of incremental left-to-right processing can be obtained on the basis of non-associative subsystems of L, with a finite reduction system consisting of Application, (generalized) Composition and simple Lifting (see for example Pareschi 1987, Wittenburg 1986, 1987). The claim that a flexible categorial system can directly model the incremental left-to-right processing of expressions can only be supported in a strong (i.e. universal) form by a calculus with the associativity of L. Example 5.4.1 is designed to illustrate the typical problems that have to be solved in the course of a non-trivial strict left-to-right derivation. Dutch is a mixed language with respect to the directionality of functor expressions: the verbal and adjectival functors in the example are left-oriented, the nominal and prepositional functors are right-oriented. The M derivation has to accommodate the mixed directionalities in a uniform left-to-right traversal of the sequent, cancelling matching subtypes where possible, and keeping non-matching subtypes in suspension until their conjugates are reached, where types qualify as conjugates if they have opposed signature. We will contrast the left-associative derivation on the basis of M lemmas with a cut-free top-down proof, which yields the proof tree of Example 5.4.2.
Example 5.4.2 Cut-free derivation: top-down
THEOREM: (s/s),np,(pp/np),(np/n),n,(pp\ap),(ap\(np\s)) => s
PROOF: (s/s),np,(pp/np),(np/n),n,(pp\ap),(ap\(np\s)) => s
[...]
[...] =>* (s/(ap\(np\s)))
(s/(ap\(np\s))),(ap\(np\s)) => s
[...] =>* Cut, i.e. to compute the smallest type Cut for the combination of the complementizer and the subject. In our discussion of the axiom schemes R1-R6 in Chapter 1, we have seen how a reduction of this kind can be effected by lifting the subject to a polymorphic second order type X/(np\X), and composition of the lifted subject with the complementizer type, under the unifying substitution {X = s}. The system M has no independent unary lifting rule, but it absorbs the unary type transitions necessary to achieve binary reduction. The goal sequent does not resolve against the application base case R1: the domain subtype of the complementizer functor does not match the argument type. But the goal sequent matches the head of clause M2,
X/Y,Z =>* X/W,   {X = s, Y = s, Z = np, Cut = s/W}
which through monotonic descent into the domain subtype generates the premise resolvent np,W =>* s. This inequality contains no more operators licensing deeper recursion. But the new goal sequent matches the boundary case R1 under the unifying substitution {W = np\s}. This solution proves the premise goal; the original goal is solved with the answer substitution {Cut = s/(np\s)}, which is obtained as the composition of the binding for W with the bindings already computed. The complete trace for the resolution computation of the cut formula is presented below (with integers indicating the level of recursion, and list brackets omitted for legibility). Observe that the computed cut formula consists of all the subtype occurrences of the input types: at this stage, there are no conjugate subtypes to be resolved away.
?- s/s,np =>* Cut.
(1) CALL: s/s,np =>* X1 ?
(2) CALL: np,Y1 =>* s ?
(2) EXIT: np,np\s =>* s ?
(1) EXIT: s/s,np =>* s/(np\s) ?
Answer substitution: {Cut = s/(np\s)}
The resolution trace focuses on the syntactic aspect of M reduction. The semantic interpretation is displayed below together with the parse tree for the first reduction step. As before, the orthographic representation stands for the semantic interpretation of non-logical constants, the atoms omdat and opa in the present case. Observe that the atoms occupy their proper thematic role in the function-argument structure: opa is the subject argument of a one-place predicate P, still unknown at this stage; the propositional modifier omdat takes the proposition 'P(opa)' as its argument. The interpretation of the functor (s/(np\s)) is obtained by lambda abstraction over the predicate variable P.
(s/(np\s))          λP.omdat(P(opa))
  (s/s) ∋ omdat     omdat
  np ∋ opa          opa
Step 2: omdat opa van
The second cut inference combines the incomplete expression omdat opa of type s/(np\s) with the next element, the preposition van of type pp/np. The cut formula can be obtained after two recursion steps, as the resolution trace shows. The goal sequent, at the first level of recursion, resolves against clause M1, with the unifying substitution
Z,X/Y =>* W/Y   {Z = s/(np\s), X = pp, Y = np, Cut = W/np}.
The resolvent premise, in its turn, matches the head of clause M2, starting the computation of the unknown W as the solution of the equation s/(np\s),pp =>* W, which yields the answer substitution {W = s/(pp\(np\s))} at the next recursion level. (Actually, as the careful reader may have noticed, there is an alternative Cut solution at this point. We return to the non-deterministic nature of the M reductions below.)
Resolution:
?- s/(np\s),pp/np =>* Cut.
(1) CALL: s/(np\s),pp/np =>* X1 ?
(2) CALL: s/(np\s),pp =>* Y1 ?
(3) CALL: pp,Z1 =>* np\s ?
(3) EXIT: pp,pp\(np\s) =>* np\s ?
(2) EXIT: s/(np\s),pp =>* s/(pp\(np\s)) ?
(1) EXIT: s/(np\s),pp/np =>* (s/(pp\(np\s)))/np ?
{Cut = (s/(pp\(np\s)))/np}
As in the first cut inference, no conjugate subtypes can be cancelled: although the two occurrences of subtype np are of the right signature (negative in pp/np and positive in s/(np\s)), their cancellation would not preserve the directionality properties of the input functors, as the reader can easily verify. The cut formula obtained at this stage accumulates all subtypes of the consumed input types, with the proper signature and directionality properties. But in the semantic interpretation obtained at this stage, the thematic dependencies between the atoms are faithfully represented. The predicate variable P has assumed the more fine-grained structure 'R(van(T))', i.e. the preposition van is identified as the head functor of a prepositional complement of some predicate R. The semantic content of R, and of the noun phrase argument of the preposition, remain unknown.
((s/(pp\(np\s)))/np)     λTλR.omdat(R(van(T))(opa))
  (s/(np\s))             λP.omdat(P(opa))
    (s/s) ∋ omdat
    np ∋ opa
  (pp/np) ∋ van          van
Step 3: omdat opa van het
The next reduction step shows how the familiar functional composition law is subsumed under M reduction. Composition cancels a unifiable middle term of two equidirectional functors, in the present case
X/Y,Y/Z => X/Z.
From the perspective of M reduction, composition is derivable from M1, the upward monotonicity of categorial division for the range subtype. Observe that this is the first instance of a computed cut formula which is properly smaller than the antecedent types: the conjugate occurrences of the np subtype have been resolved away, under preservation of count and directionality invariance.
Resolution:
?- (s/(pp\(np\s)))/np,np/n =>* Cut.
(1) CALL: (s/(pp\(np\s)))/np,np/n =>* X1 ?
(2) CALL: (s/(pp\(np\s)))/np,np =>* Y1 ?
(2) EXIT: (s/(pp\(np\s)))/np,np =>* s/(pp\(np\s)) ?
(1) EXIT: (s/(pp\(np\s)))/np,np/n =>* (s/(pp\(np\s)))/n ?
{Cut = (s/(pp\(np\s)))/n}
The semantic representation for the subproof obtained so far has the following structure.
λQλR.omdat(R(van(het(Q)))(opa))
  λTλR.omdat(R(van(T))(opa))
  het
Step 4: omdat opa van het meisje
No comments for this simple application reduction.
(s/(pp\(np\s)))                 λR.omdat(R(van(het(meisje)))(opa))
  ((s/(pp\(np\s)))/n)           λQλR.omdat(R(van(het(Q)))(opa))
    ((s/(pp\(np\s)))/np)
      (s/(np\s))
        (s/s) ∋ omdat
        np ∋ opa
      (pp/np) ∋ van
    (np/n) ∋ het
  n ∋ meisje                    meisje
Step 5: omdat opa van het meisje afhankelijk
The present resolution step is of crucial importance for a thorough understanding of the system M. The left corner omdat opa van het meisje has been reduced to an incomplete expression of type s/(pp\(np\s)), i.e. a functor looking to the right for a left-oriented functor pp\(np\s) to yield a complete expression of the sentence type. The prepositional phrase van het meisje, in other words, is taken to be the immediate prepositional complement of a functor, as yet unknown. The unknown functor appears as the conditionalized subtype (pp\(np\s)), corresponding to the variable R, bound by lambda abstraction in the semantic representation of the preceding step in the computation. The new element afhankelijk of type (pp\ap) forces the bottom-up inference engine to refine its expectations, since the prepositional phrase now turns out to be the complement of a functor with ap range, rather than of the main predicate.
Resolution:
?- s/(pp\(np\s)),pp\ap =>* Cut.
(1) CALL: s/(pp\(np\s)),pp\ap =>* X1 ?
(2) CALL: pp\ap,Y1 =>* pp\(np\s) ?
(3) CALL: ap,Z1 =>* np\s ?
(3) EXIT: ap,ap\(np\s) =>* np\s ?
(2) EXIT: pp\ap,ap\(np\s) =>* pp\(np\s) ?
(1) EXIT: s/(pp\(np\s)),pp\ap =>* s/(ap\(np\s)) ?
{Cut = s/(ap\(np\s))}
As the resolution trace shows, the new expression of type (pp\ap) can be accommodated by the functor s/(pp\(np\s)) without backtracking on the earlier reduction decisions. The goal sequent s/(pp\(np\s)),pp\ap =>* Cut contains conjugate occurrences of the pp subtype, which occurs negatively in (pp\ap), but with positive signature in s/(pp\(np\s)). The matching subtype can be cancelled under preservation of the directionality requirements, yielding the answer substitution {Cut = s/(ap\(np\s))}, with the semantic interpretation displayed below. Observe how the original subterm R(van(het(meisje))) has been further instantiated to S(afhankelijk(van(het(meisje)))), where S now is a variable of type ap\(np\s), the conditionalized subtype.
(s/(ap\(np\s)))                   λS.omdat(S(afhankelijk(van(het(meisje))))(opa))
  (s/(pp\(np\s)))                 λR.omdat(R(van(het(meisje)))(opa))
    ((s/(pp\(np\s)))/n)
      ((s/(pp\(np\s)))/np)
        (s/(np\s))
          (s/s) ∋ omdat
          np ∋ opa
        (pp/np) ∋ van
      (np/n) ∋ het
    n ∋ meisje
  (pp\ap) ∋ afhankelijk           afhankelijk
Step 6: omdat opa van het meisje afhankelijk is
The incomplete expression omdat opa van het meisje afhankelijk has been reduced to a functor of type s/(ap\(np\s)), i.e. the expectation of the bottom-up inference engine has been adjusted to look for a functor of type (ap\(np\s)). Indeed, this is the type of the last expression to be parsed, and the left-associative proof of the original sequent can be successfully terminated by a simple application reduction. The semantic interpretation of the sentence is equivalent to the semantic interpretation of the pure application analysis given above.
s                                   omdat(is(afhankelijk(van(het(meisje))))(opa))
  (s/(ap\(np\s)))                   λS.omdat(S(afhankelijk(van(het(meisje))))(opa))
    (s/(pp\(np\s)))
      ((s/(pp\(np\s)))/n)
        ((s/(pp\(np\s)))/np)
          (s/(np\s))
            (s/s) ∋ omdat
            np ∋ opa
          (pp/np) ∋ van
        (np/n) ∋ het
      n ∋ meisje
    (pp\ap) ∋ afhankelijk
  (ap\(np\s)) ∋ is                  is
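The six steps above amount to folding the binary M reduction over the input from left to right while keeping every candidate cut formula alive. The Python sketch below illustrates this on the syntactic side only. It is a reading of the clauses R1, M1 and M2 under assumptions made here (tuple-encoded types, a minimal unifier without occurs check, and the hypothetical helpers `m_reduce`, `cuts` and `left_associative`); R* is left out because this example needs no atom-atom reduction, and all candidates are explored breadth-first rather than by Prolog backtracking.

```python
class Var:
    """Logic variable; each instance is distinct (compared by identity)."""

def over(x, y):             # X/Y
    return ('/', x, y)

def under(y, x):            # Y\X
    return ('\\', y, x)

def walk(t, s):
    while isinstance(t, Var) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    """Return an extension of s that unifies a and b, or None on failure."""
    a, b = walk(a, s), walk(b, s)
    if a is b:
        return s
    if isinstance(a, Var):
        return {**s, a: b}
    if isinstance(b, Var):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple):
        if a[0] != b[0]:
            return None
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return s if a == b else None

def clauses():
    """Fresh clause copies: (left, right, result, nonvar guard, premise)."""
    X, Y, Z, W = Var(), Var(), Var(), Var()
    return [
        (over(X, Y), Y, X, None, None),                      # R1
        (Y, under(Y, X), X, None, None),                     # R1
        (Z, over(X, Y), over(W, Y), X, (Z, X, W)),           # M1
        (under(Y, X), Z, under(Y, W), X, (X, Z, W)),         # M1
        (over(X, Y), Z, over(X, W), Y, (Z, W, Y)),           # M2
        (Z, under(Y, X), under(W, X), Y, (W, Z, Y)),         # M2
    ]

def m_reduce(left, right, out, s):
    """Yield every substitution that solves  left,right =>* out."""
    for l, r, res, nonvar, premise in clauses():
        s1 = s
        for given, pattern in ((left, l), (right, r), (out, res)):
            if s1 is not None:
                s1 = unify(given, pattern, s1)
        if s1 is None:
            continue
        if nonvar is not None and isinstance(walk(nonvar, s1), Var):
            continue                      # nonvar/1 guard fails
        if premise is None:
            yield s1
        else:
            yield from m_reduce(*premise, s1)

def resolve(t, s):
    """Apply substitution s to t throughout."""
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0], resolve(t[1], s), resolve(t[2], s))
    return t

def cuts(left, right):
    """Set of distinct cut formulae for  left,right =>* Cut."""
    out = Var()
    return {resolve(out, s) for s in m_reduce(left, right, out, {})}

def left_associative(types):
    """Fold M reduction left to right; return final candidates and frontier sizes."""
    frontier, sizes = {types[0]}, []
    for nxt in types[1:]:
        frontier = {c for acc in frontier for c in cuts(acc, nxt)}
        sizes.append(len(frontier))
    return frontier, sizes

# omdat opa van het meisje afhankelijk is
sentence = ['s', 'np', over('pp', 'np'), over('np', 'n'), 'n',
            under('pp', 'ap'), under('ap', under('np', 's'))]
sentence[0] = over('s', 's')
final, sizes = left_associative(sentence)
print('s' in final, sizes)
```

The recorded frontier sizes make the non-determinism of M visible: under these assumptions one candidate survives the first step, two the second, and five the third, and the sentence type s is among the final candidates.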
NON-DETERMINISM AND STRUCTURAL AMBIGUITY
The example above is a non-ambiguous expression. What we have to show now is how a strictly left-incremental M reduction can compute semantically non-equivalent readings for expressions qualifying as structurally ambiguous from a pure Application perspective. The crucial property of M reductions in this respect is that they are not (and should not be) deterministic: faced with a sequent Type1,Type2 =>* Cut there may be alternative solutions for the cut formula, depending on the choice of the M clause selected for goal reduction, and these alternative cut formulae are associated with different semantic recipes. In this respect, the derivation of Example 5.4.4 represents a successful path through the M search space, which fans out as the resolution refutation of a sequent proceeds. The first reduction step has one solution; at the second cut inference, the search tree forks in two branches, of which the proof of Example 5.4.4 has pursued the first. Figure 5.4.1 below is the initial portion of the M search tree, for the first three cut inferences.
s/s • np =>* s/(np\s)
  • pp/np =>* (s/(pp\(np\s)))/np
      • np/n =>* (s/(pp\(np\s)))/n
      • np/n =>* ((s/(pp\(np\s)))/(np\np))/n
      • np/n =>* (s/(pp\(np\s)))/((np/n)\np)
  • pp/np =>* s/((pp/np)\(np\s))
      • np/n =>* (s/(np\((pp/np)\(np\s))))/n
      • np/n =>* s/((np/n)\((pp/np)\(np\s)))
Figure 5.4.1 M search tree.
Notice that for a realistic modelling of the human processing mechanism, one would have to pursue the various possibilities in parallel fashion: the Prolog search algorithm is lost in hopeless backtracking when it commits itself to a wrong choice early in the left-to-right M reduction of a sequent. The choices in the M reductions of Figure 5.4.1 differ in whether conjugate subtypes are cancelled, or whether cancellation is suspended. Early closure phenomena represent the standard examples where one wants to keep both options open. Consider a noun phrase with postnominal modification (a book about John), corresponding to the goal sequent
np/n,n,(n\n)/np,np => Solution.
A greedy cancellation of the conjugate noun subtypes in the first reduction step closes off the noun phrase before the postnominal modifier can consume its argument. But system M leaves two options at the first reduction step np/n,n =>* Cut. The solution {Cut = np} cancels the matching n subtypes, but the alternative substitution {Cut = np/(n\n)} obtained by M2 anticipates a potential nominal postmodifier, which in fact turns up at the next reduction step. Figure 5.4.2 displays the complete M search space for the goal sequent. Among the third generation solutions one finds the offspring of the incorrectly closed noun phrase at the first reduction (solutions (n/np)\n and ((n/np)\n)/(np\np)), and the complete noun phrase expression together with various incomplete variants, anticipating further modification of different subexpressions, represented by continuations such as (a book about John) the logician/with many illustrations/which nobody wants to buy/etc.
np/n • n =>* np
  • (n\n)/np =>* ((n/np)\n)/np
      • np =>* (n/np)\n
      • np =>* ((n/np)\n)/(np\np)
np/n • n =>* np/(n\n)
  • (n\n)/np =>* np/np
      • np =>* np
      • np =>* np/(np\np)
  • (n\n)/np =>* (np/((n\n)\(n\n)))/np
      • np =>* np/((n\n)\(n\n))
      • np =>* (np/((n\n)\(n\n)))/(np\np)
  • (n\n)/np =>* (np/(n\n))/np
      • np =>* np/(n\n)
      • np =>* (np/(n\n))/(np\np)
  • (n\n)/np =>* ((n/(np/(n\n)))\n)/np
      • np =>* (n/(np/(n\n)))\n
      • np =>* ((n/(np/(n\n)))\n)/(np\np)
  • (n\n)/np =>* np/(((n\n)/np)\(n\n))
      • np =>* np/(np\(((n\n)/np)\(n\n)))
Figure 5.4.2 M search tree for a book about John.
Consider now an ambiguous expression, e.g. the noun phrase a new book about John, corresponding to the goal sequent
np/n:a, n/n:new, n:book, (n\n)/np:about, np:john => np:Semantics.
We want to obtain two non-equivalent answer substitutions for Semantics, depending on whether the adjective new or the postnominal modifier about John has wider scope:
Semantics = a(new(about(john)(book))),
Semantics = a(about(john)(new(book))).
From the point of view of a cut-free Gentzen proof, this is a case of structural ambiguity, i.e. semantic ambiguity corresponding to branching options in the proof tree. The first reading (with wide scope new) is obtained if the proof of the sequent contains the subproof
n:book, (n\n)/np:about, np:john => n:about(john)(book),
whereas the second reading (with wide scope about John) involves a proof containing the subproof
n/n:new, n:book => n:new(book).
In Example 5.4.5 we juxtapose the Application parse trees that lead to the different readings.
Example 5.4.5 Structural ambiguity: a new book about John
Wide scope about John:
[R1] np: a(about(john)(new(book)))
  (np/n): a
  [R1] n: about(john)(new(book))
    [R1] n: new(book)
      (n/n): new
      n: book
    [R1] (n\n): about(john)
      ((n\n)/np): about
      np: john
Wide scope new:
[R1] np: a(new(about(john)(book)))
  (np/n): a
  [R1] n: new(about(john)(book))
    (n/n): new
    [R1] n: about(john)(book)
      n: book
      [R1] (n\n): about(john)
        ((n\n)/np): about
        np: john
The system M translates this structural ambiguity into the non-determinism of M reductions, thus making it possible to associate the non-equivalent readings with a uniformly left-branching derivation. The left corner
np/n:a, n/n:new =>* Type:Semantics
has the following non-equivalent M solutions:
{Type = np/n, Semantics = λx.a(new(x))},
{Type = (np/(n\n))/n, Semantics = λxλy.a(y(new(x)))}.
The first solution leads to a derivation with wide scope new, the second one to a derivation with wide scope about John, as the following trees, both strictly left-branching, show.
Example 5.4.6 Non-determinism of M reductions: ambiguity.
Wide scope new:
[R1] np: a(new(about(john)(book)))
  [M1] (np/np): λz.a(new(about(z)(book)))
    [M2] (np/(n\n)): λy.a(new(y(book)))
      [M1] (np/n): λx.a(new(x))
        (np/n): a
        (n/n): new
      n: book
    ((n\n)/np): about
  np: john
Wide scope about John:
[R1] np: a(about(john)(new(book)))
  [M1] (np/np): λz.a(about(z)(new(book)))
    [R1] (np/(n\n)): λy.a(y(new(book)))
      [M1] ((np/(n\n))/n): λxλy.a(y(new(x)))
        (np/n): a
        (n/n): new
      n: book
    ((n\n)/np): about
  np: john
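The two left-branching derivations of Example 5.4.6 can be replayed on the semantic side with Python closures. The sketch below is illustrative only: lexical meanings are rendered as strings, the curried functions `a`, `new` and `about` are hypothetical stand-ins for the constants of the example, and each step follows the lambda recipe of the clause named in the corresponding tree node.

```python
# Hypothetical lexical meanings, rendered as strings for inspection.
def a(n):
    return f"a({n})"

def new(n):
    return f"new({n})"

def about(obj):                       # type (n\n)/np: object first, then noun
    return lambda n: f"about({obj})({n})"

book, john = "book", "john"

# Wide scope new: the left corner 'a new' is np/n : λx.a(new(x)).
new_1 = lambda x: a(new(x))                 # [M1]  np/n
new_2 = lambda y: new_1(y(book))            # [M2]  np/(n\n) : λy.a(new(y(book)))
new_3 = lambda z: new_2(about(z))           # [M1]  np/np    : λz.a(new(about(z)(book)))
wide_new = new_3(john)                      # [R1]  np

# Wide scope about John: 'a new' is (np/(n\n))/n : λxλy.a(y(new(x))).
abt_1 = lambda x: lambda y: a(y(new(x)))    # [M1]  (np/(n\n))/n
abt_2 = abt_1(book)                         # [R1]  np/(n\n) : λy.a(y(new(book)))
abt_3 = lambda z: abt_2(about(z))           # [M1]  np/np    : λz.a(about(z)(new(book)))
wide_about = abt_3(john)                    # [R1]  np

print(wide_new)
print(wide_about)
```

Both derivations consume the words strictly from left to right; the readings differ only in which cut formula was chosen for the left corner a new.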
NON-CONSTITUENT COORDINATION
We turn now to the second test case for the Lemma database M: the solution of Boolean polymorphism. Let us recapitulate the structure of the argument. Empirical study shows that the Boolean particles can conjoin subsequences of grammatical expressions. The precise empirical restrictions on what counts as a conjoinable subsequence are as yet unsettled (see e.g. Houtman (1987) for restrictions in Dutch); the relevant property of Boolean conjoinability for our purposes is that it cross-cuts the conventional distinctions between constituents and non-constituents, thus necessitating a flexible categorial calculus. The calculus L has the right logical properties for a generalized account of Boolean conjoinability, assuming a polymorphic type assignment to the Boolean particles:
'en' (and)   Type: (X\X)/X   Semantics: MEET
'of' (or)    Type: (X\X)/X   Semantics: JOIN.
The associativity of L guarantees that arbitrary subsequences of a grammatical expression can be collected into a single type, i.e. an object in the syntactic and the semantic representation. The polymorphic type assignment to the Boolean particles then makes coordination of such subsequences derivable, provided that the three equations at the leaf nodes of the Boolean Cut in Figure 5.4.3 can be solved.
P => X     Q => X
---------------------
P,((X\X)/X),Q => X     U,X,V => Y
--------------------------------- [Cut]
U,P,((X\X)/X),Q,V => Y
Figure 5.4.3 L+{Cut}: Boolean coordination
Adding a polymorphic functor type (X\X)/X for the Boolean particles is a decidable extension of L. One can approach the identification of the unknown conjunction type X either in a top-down or in a bottom-up fashion. Consider first the top-down argument. The Boolean inference scheme of Figure 5.4.3 says that a sequent containing the coordination type (X\X)/X is provable if (and only if) the antecedent can be factorized into subsequences U, P, Q, V, such that for a certain type X
P => X, Q => X and U,X,V => Y.
But in that case we also have
U,P,V => Y and U,Q,V => Y,
i.e. (by Conditionalization)
P => (U\Y)/V and Q => (U\Y)/V,
where U,V are the product types corresponding to the sequences U,V. In other words, for the unknown conjunction type X we can always take the instantiation (U\Y)/V, and the search space for this instantiation is finitely bounded by the input parameters U,P,Q,V.
The top-down identification of the unknown conjunction type X as (U\Y)/V is uncommitted, in the sense that it simply projects the total complexity of the context sequences U,V onto a single type (cf. our remarks on the generator types in the discussion of Theorem 5.2.1). Let us therefore investigate the bottom-up alternative to decidably identify the unknown conjunction type X by solving the conjunct premises P => X and Q => X on the basis of M. As we have demonstrated in Section 5.2, the set of answer substitutions for an unknown type Solution in an equation Antecedent => Solution is infinite, as a result of the Introduction rules [/R] and [\R]. In order to solve the Boolean premises, we have to compute an answer substitution for the variable conjunction type X, such that the left conjunct P and the right conjunct Q reduce to X. The search algorithm, to be decidable, cannot be a blind traversal of the infinite solution space for these equations. Neither can it be based on the Elimination subsystem of L, since this has no solutions for non-constituents. The bottom-up inference engine M, a subsystem of L, reduces the search problem for the conjunct premises to finite dimensions.
As we just observed, the number of instantiations of the factorization of the end sequent in the Boolean Cut is finite, yielding the candidate conjunct sequences P and Q. The conjuncts can form complete subphrases of the end sequent, or incomplete expressions. Because M inherits the associativity of L, the premise equations
P => LeftConjunct and Q => RightConjunct are solvable, and the answer substitutions form a finite set of types constructed from the atomic subtype occurrences in P and Q (with the possible addition of the unknown range type in the case of R* reductions). When the conjuncts are incomplete expressions, their cut-free derivation would involve Conditionalization inferences, i.e. [/R] and [\R] Introductions in a standard cut-free proof. But the conditionalization of M reductions is implicit, i.e. absorbed by the recursive generalization of the Elimination inferences, and it is bounded in terms of the subtype occurrences of the conjunct antecedents. The computed answer substitutions for LeftConjunct and RightConjunct can be tested for unifiability in the course of the resolution refutation of a Boolean sequent. Unifiable answers form candidates for the conjunction type X, and the proof can proceed with a right premise U,X,V => Y, with an instantiated conjunction type X (and the corresponding semantic object).
Figure 5.4.4 recasts the Boolean Cut inference in the Horn clause format required by the interpreter. Observe that we have simplified the representation of Figure 5.4.3 above, by unfolding the left premise. The variable Boolean in the semantic interpretation of the Boolean Cut stands for the generalized MEET and JOIN operations that interpret the particles en (and) and of (or).
Syntax:
U,[P | Left],[(X\X)/X],[Q | Right],V => [Z] <-
    [P | Left] => [X] &
    [Q | Right] => [X] &
    U,[X],V => [Z].
Syntax plus Interpretation:
U,[P | Left],[(X\X)/X:Boolean],[Q | Right],V => [Z] <-
    [P | Left] => [X:Conjunct1] &
    [Q | Right] => [X:Conjunct2] &
    U,[X:Boolean@Conjunct2@Conjunct1],V => [Z].
Figure 5.4.4 Boolean Cut.
We can further reduce the Boolean search space on the basis of count-invariance, as we did for the original cut-free system L. The count-invariant imposes the following extra constraints on the cut premises. If the left and right conjuncts have non-matching counts, there cannot be a common conjunction type X derivable from them: the M reductions are count-preserving. Moreover, the context sequences U and V, together with one of the conjuncts, must have counts matching the result type Y. These three constraints can be checked prior to the unfolding of any of the premise sequents, and instantiations of the cut factorization that do not satisfy them can be rejected immediately. (The count-constraints are added in the Appendix.)
We close this section with a worked-out example, concentrating on the syntactic aspects of the derivation, and postponing a treatment of Boolean scope ambiguity to the next section. Consider the Dutch prepositional phrase of Example 5.4.7, with left-recursive prenominal modification (a subphrase of e.g. hij rekent op het met Jan verwante meisje, 'he counts on the girl related to John').
Example 5.4.7
op     het   met    Jan  verwante  meisje
pp/np  np/n  pp/np  np   pp\(n/n)  n       => pp
on     the   to     Jan  related   girl
'on the girl related to John'
The pure Application parse tree for this expression is given below. In Example 5.4.8, we form the Boolean coordination of the boxed non-constituent subsequence np/n,pp/np,np.
pp [R1]
  (pp/np) ∋ op
  np [R1]
    (np/n) ∋ het
    n [R1]
      (n/n) [R1]
        pp [R1]
          (pp/np) ∋ met
          np ∋ Jan
        (pp\(n/n)) ∋ verwante
      n ∋ meisje
Example 5.4.8 Non-constituent coordination.
[op,het,met,Jan,of,het,met,Marie,verwante,meisje]
'on the with John or the with Mary related girl'
'on the girl related to Jan or to Mary'
The theorem prover has to find a resolution refutation for the sequent
[pp/np,np/n,pp/np,np,(X\X)/X,np/n,pp/np,np,pp\(n/n),n] => [pp],
thus computing an answer substitution for the polymorphic disjunction particle (X\X)/X. Of twenty possible instantiations of the factorization for the Boolean cut inference of Figure 5.4.3, there is just one which satisfies the count-constraints:
U = [pp/np]
P = [np/n,pp/np,np]
Conj = [(X\X)/X]
Q = [np/n,pp/np,np]
V = [pp\(n/n),n]
Premise1 = [np/n,pp/np,np] => [X]
Premise2 = [np/n,pp/np,np] => [X]
Premise3 = [pp/np],[X],[pp\(n/n),n] => [pp]
This factorization correctly identifies the left and right conjunct sequences P and Q, and instantiates the resolvent premises. The next task for the theorem prover is to compute an answer substitution for the conjunction type X. The conjuncts
Premise1 = [np/n,pp/np,np] => [X],
Premise2 = [np/n,pp/np,np] => [X],
are non-constituent subsequences, i.e. they have no pure Application derivation. But they both have an M-reduction converging on the binding {X = np/(pp\n)} for the conjunction type. The proof can now proceed with Premise3:
Premise3 = [pp/np,X,pp\(n/n),n] => [pp] {X = np/(pp\n)}
which succeeds under the substitution {X = np/(pp\n)} for the conjunction type. The complete Gentzen proof tree is presented below.
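The count check that singles out this factorization can be replayed mechanically. The following is a minimal sketch in Python, not the Prolog of the Appendix: the tuple encoding of types, the marker "CONJ" for the particle, and the helper names count, seq_count, factorizations and count_ok are all assumptions made for illustration. The two tests in count_ok mirror the two count comparisons stated for the Boolean cut (conjunct against conjunct, and context plus one conjunct against the result type).

```python
from collections import Counter

# Assumed encoding: an atom is a string, X/Y is ("/", X, Y), Y\X is ("\\", Y, X).
def count(t):
    """Atom counts of a type: +1 for an atom, range minus domain for a functor."""
    if isinstance(t, str):
        return Counter({t: 1})
    op, a, b = t
    rng, dom = (a, b) if op == "/" else (b, a)
    c = Counter(count(rng))
    c.subtract(count(dom))          # counts may become negative
    return c

def seq_count(seq):
    """Summed counts of a sequence, with zero entries dropped for comparison."""
    total = Counter()
    for t in seq:
        total.update(count(t))
    return {atom: n for atom, n in total.items() if n != 0}

def factorizations(seq, conj="CONJ"):
    """All splits U,P,conj,Q,V with non-empty conjuncts P and Q."""
    i = seq.index(conj)
    left, right = seq[:i], seq[i + 1:]
    for p in range(len(left)):
        for q in range(1, len(right) + 1):
            yield left[:p], left[p:], right[:q], right[q:]

def count_ok(U, P, Q, V, Y):
    """Conjuncts agree in count, and U,Q,V has the count of the goal type Y."""
    return seq_count(P) == seq_count(Q) and seq_count(U + Q + V) == seq_count([Y])

pp_np, np_n = ("/", "pp", "np"), ("/", "np", "n")
verwante = ("\\", "pp", ("/", "n", "n"))            # pp\(n/n)
sequent = [pp_np, np_n, pp_np, "np", "CONJ", np_n, pp_np, "np", verwante, "n"]

candidates = list(factorizations(sequent))
survivors = [f for f in candidates if count_ok(*f, "pp")]
print(len(candidates), len(survivors))              # 20 1
```

Under these assumptions the single survivor has U = [pp/np], P = Q = [np/n,pp/np,np] and V = [pp\(n/n),n], which is the factorization listed in the example.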
PROOF:
[pp/np,np/n,pp/np,np,(X\X)/X,np/n,pp/np,np,pp\(n/n),n] => [pp]
[(np/(pp\n))/np] true 'het met Jan'
[(np/(pp\n))/np,np] => [np/(pp\n)] [pp]
[pp/(pp\n),pp\(n/n)] =>* [pp/n] true
[pp/n,n] => [pp] [pp]
X/Y => X/((Z/Y)\Z) if X/Y,(Z/Y)\Z => X
Y\X => (Z/(Y\Z))\X if Z/(Y\Z),Y\X => X
In the non-directional calculus LP the symmetric duals of Definition 5.5.1 would collapse to:
(Y,X) => (((Y,Z),Z),X) if (Y,X),((Y,Z),Z) => X
Permissible values of the exponent Z can now be determined by looking for the answer-substitutions for the premise sequent X/Y, (Z/Y)\Z => X (or Z/(Y\Z), Y\X => X). This leads to interesting differences, depending on the complexity (atom vs. functor type) and the directionality of the range subtype in the antecedent type X/Y. Suppose first the range subtype X is atomic. In this case, there is only one solution, in the directional and the non-directional system alike: the answer-substitution {Z=X}. Example 5.5.1 illustrates this with the lifting of subject noun-phrases from argument expressions into functors over verb-phrase expressions np\s.
Example 5.5.1 Argument lifting: np\s
Solution: (np\s) => ((Z/(np\Z))\s)
(np\s) => ((s/(np\s))\s) (np\s) true s => s (np\s) true
Solution 2: ((np\s)/np) => ((np\s)/((Z/np)\Z))
((np\s)/np) => ((np\s)/((s/np)\s))
((np\s)/np),((s/np)\s) => (np\s)
np,((np\s)/np),((s/np)\s) => s (s/np)
Z if T => Y:TypeY:a and U, X:TypeX:f(a), V => Z
INTRODUCTION RULES: LAMBDA ABSTRACTION
In order to give a general account of Boolean conjoinability, we have to be able to derive non-constituents. The crucial theorems for this (Division, Composition, Lifting, Associativity, or, in their generalized form, the system M) are consequences of Introduction inferences. The Introduction rules for the connectives / and \ correspond to lambda-abstraction over a variable v of type TypeY. The difference with our earlier formulation is that the conditionalized TypeY is no longer the unique type corresponding to the subcategory Y which is introduced in the conclusion of the inference rule. For the premise sequent to be derivable, it suffices that TypeY belongs to the type-set associated with the subcategory Y.
Definition 5.5.6 Introduction inferences.
[\R] T => Y\X:(TypeY,TypeX):λv.TermX if Y:TypeY:v, T => X:TypeX:TermX
[/R] T => X/Y:(TypeY,TypeX):λv.TermX if T, Y:TypeY:v => X:TypeX:TermX
CUT: SUBSTITUTION
Finally, we will need the Cut inference to perform semantic type-shifting in the course of a proof, and to drive the bottom-up proofs on the basis of the lemma database M. The semantic operation for Cut inferences is substitution as before.
Definition 5.5.7 Cut Rule.
[Cut] U, T, V => Z if T => X:TypeX:TrX and U, X:TypeX:TrX, V => Z
EXAMPLES: TYPE-SHIFTING, SCOPE AMBIGUITY, NON-CONSTITUENT CONJUNCTION
In the remainder of this section, we will illustrate the interaction of syntactic and semantic derivability by presenting a number of cases where either the basic syntactic calculus L is too poor to produce desired readings without the added semantic flexibility argued for here, or where the calculus of semantic type change H needs to be supplemented by the rebracketing potential of L in the derivation of 'non-constituents'. First, we discuss some simple examples of type-shifting and scope ambiguity from Hendriks (1988), to familiarize the reader with the more complicated form of Gentzen proofs. These first examples are based on the Elimination fragment of L, i.e. on an AB type syntax. Then we turn to theorems derived from the Introduction Rule, which is essential for the syntactic flexibility of our calculus. Finally, a discussion of scope-ambiguities in non-constituent coordination displays the combined syntactic and semantic flexibility of the system proposed here.
Example 5.5.4 Intensional verbs with e-type argument
Consider first the sentence Jan zoekt Marie ('John is-looking-for Mary') below, where an intensional verb is combined with a proper noun object. Assume that the lexical entry for zoekt ('is-looking-for') specifies the following information for the syntactic and semantic type and basic translation. (Notice that the basic translation of the intensional verb is of type (((et)t)(et)), i.e. not of the basic type for categories ((np\s)/np), which is (e(et)). But, as required, the type (((et)t)(et)) belongs to the type set of the transitive verb category ((np\s)/np), so that there will be a reading in type (e(et)) derived from the basic non-logical constant zoekt in type (((et)t)(et)).)
'zoekt'
Category: ((np\s)/np)
Type: (((et)t)(et)) (second-order ((et)t) object)
Translation: zoekt
Gentzen proofs will be displayed in indented list format as before. Nodes in the proof tree are now labeled by sequents consisting of triples (Cat:Type:Translation). For perspicuity, the syntactic categories and the corresponding semantic types are printed on separate lines (with suppressed comma operator in the semantic types). At the critical points, one finds the (reduced) translations of the computed succedent types, on a node-by-node basis. When the proof is unexciting semantically, we will content ourselves with the translation of the end-sequents. The root of the proof tree carries the basic category, type and translation of the lexical expressions. In the example, the place where we need the added semantic flexibility of H is the highlighted [Axiom] sequent for the object noun phrase. Since the verb wants a higher-order ((et)t) object, and the object 'Marie' is of basic type e, the object is lifted to type ((et)t) to fulfill the type requirements of the [/L] rule. (This is one way to derive the sentence. For an alternative, with argument lowering of the verb, see Hendriks 1988.)
[Jan,zoekt,Marie] 'John is looking for Marie'
THEOREM:
np,((np\s)/np),np => s
e,(((et)t)(et)),e => t
PROOF and SEMANTICS:
np,((np\s)/np),np => s [/L]
e,(((et)t)(et)),e => t
jan,zoekt,marie => zoekt(λP.P(marie))(jan)
  np => np [Axiom]
  e => ((et)t) A1: Lifting
  marie => λP.P(marie)
  np,(np\s) => s [\L]
  e,(et) => t
  jan,zoekt(λP.P(marie)) => zoekt(λP.P(marie))(jan)
    np => np [Axiom]
    e => e
    jan => jan
    s => s [Axiom]
    zoekt(λP.P(marie))(jan) => zoekt(λP.P(marie))(jan)
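The lifting step A1 can be mimicked with ordinary closures. The sketch below is a deliberately simplified, extensional toy model in Python: the relation SEEKS, the function names and the choice of entities as strings are assumptions for illustration only, and the intensionality of zoekt is not modeled. It shows only how an e-type object, lifted to type ((et)t), feeds a verb of type (((et)t)(et)).

```python
# Toy model; entities are plain strings (an assumption of this sketch).
def lift(x):
    """Type shift e => ((et)t): map x to the function λP.P(x)."""
    return lambda P: P(x)

# Hypothetical extension for the verb: pairs (subject, object).
SEEKS = {("jan", "marie")}

def zoekt(T):
    """Verb of type (((et)t)(et)): takes a lifted object T, then an e-type subject."""
    return lambda subj: T(lambda obj: (subj, obj) in SEEKS)

reading = zoekt(lift("marie"))("jan")   # corresponds to zoekt(λP.P(marie))(jan)
print(reading)                          # True
```

The application order follows the proof: the lifted object is consumed first, the subject last.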
Example 5.5.5 Subject versus object wide scope.
In order to introduce the role of the type-shifting Cut, and quantifier expressions, we turn to another standard example which is slightly more complex: subject versus object wide scope for a simple extensional verb combined with generalized quantifier arguments:
omdat elk meisje een boek steelt
because every girl a book steals
'because every girl steals a book'
Assume the following lexical entries for the crucial atoms in this sentence: the verb steelt ('steals'), an extensional two-place relation between e-type objects, and the universal and existential quantifiers elk ('every') and een ('a'), translated as the logical constants ALL and EXIST, which meaning postulates can decompose as shown.
'steelt' (np\(np\s)) (e(et)) steelt
'elk' (np/n) ((et)((et)t)) ALL (≈ λPλQ.∀x(P(x) → Q(x)))
'een' (np/n) ((et)((et)t)) EXIST (≈ λPλQ.∃x(P(x) & Q(x)))
The (e(et)) verb steelt occurs in the context of two arguments of generalized quantifier type ((et)t). Therefore, in the course of the proof, a type-shifting axiom has to perform argument raising, on both the subject and the object argument. As Hendriks shows, there are two non-equivalent H-derivations for the double argument lifting, depending on whether the subject or object subtype of the verb undergoes argument-lifting first.
[steelt]
THEOREM:
(np\(np\s)) => (np\(np\s))
(e(et)) => (((et)t)(((et)t)t))
PROOF:
(np\(np\s)) => (np\(np\s)) [Axiom]
(e(et)) => (((et)t)(((et)t)t)) A2: Argument lifting (twice)
SEMANTICS 1 (Subject wide scope): λT1λT2.T2(λx.T1(λy.steelt(y)(x)))
SEMANTICS 2 (Object wide scope): λT1λT2.T1(λy.T2(λx.steelt(y)(x)))
The two readings for the complete sentence of Example 5.5.5 have a common proof skeleton. The general structure is given in Figure 5.5.1. At the lexical level, a type-shifting Cut X => X' changes the type of the verb from first to third order. See the highlighted [Axiom] sequent below, which is associated with one of the lambda terms for the type-shift just given. The right premise of the Cut can then proceed as a completely straightforward Elimination proof for the type-shifted sequent U,X',V => Y.
X => X'
np,((np\s)/np) => (s/np) [/R]
e,(e(et)) => (et)
  np,((np\s)/np),np => s [/L]
  e,(e(et)),e => t
    np => np [Axiom]
    e => e
    np,(np\s) => s [\L]
    e,(et) => t
      np => np [Axiom]
      e => e
      s => s [Axiom]
      t => t
SEMANTICS:
bemint => λxλy.bemint(y)(x)
x,bemint => λy.bemint(y)(x)
x,bemint,y => bemint(y)(x)
y => y
x,bemint(y) => bemint(y)(x)
x => x
bemint(y)(x) => bemint(y)(x)
Example 5.5.7 Incomplete expressions
The Associativity of concatenation in L makes arbitrary substrings of a grammatical expression derivable, a crucial property for a generalized treatment of Boolean conjoinability. As an example, we present the [/R] based derivation of the incomplete expression Jan bemint, a non-constituent subsequence of e.g. Jan bemint Marie/een meisje ('John loves Mary/a girl'), first on a first order reading where this left corner could be continued by a proper noun object (e.g. Marie), and then on a third order reading (with argument lifting of the verb) for a generalized quantifier continuation (e.g. een meisje).
THEOREM: First order reading
np,((np\s)/np) => (s/np)
e,(e(et)) => (et)
PROOF:
np,((np\s)/np) => (s/np) [/R]
e,(e(et)) => (et)
  np,((np\s)/np),np => s [/L]
  e,(e(et)),e => t
    np => np [Axiom]
    e => e
    np,(np\s) => s [\L]
    e,(et) => t
      np => np [Axiom]
      e => e
      s => s [Axiom]
      t => t
SEMANTICS:
jan,bemint => λy.bemint(y)(jan)
  jan,bemint,y => bemint(y)(jan)
    y => y
    jan,bemint(y) => bemint(y)(jan)
      jan => jan
      bemint(y)(jan) => bemint(y)(jan)
THEOREM: Third order reading
np,((np\s)/np) => (s/np)
e,(e(et)) => (((et)t)t)
PROOF:
np,((np\s)/np) => (s/np) [Cut]
e,(e(et)) => (((et)t)t)
  ((np\s)/np) => ((np\s)/np) [Axiom]
  (e(et)) => (((et)t)(et))
  np,((np\s)/np) => (s/np) [/R]
  e,(((et)t)(et)) => (((et)t)t)
    np,((np\s)/np),np => s [/L]
    e,(((et)t)(et)),((et)t) => t
      np => np [Axiom]
      ((et)t) => ((et)t)
      np,(np\s) => s [\L]
      e,(et) => t
        np => np [Axiom]
        e => e
        s => s [Axiom]
        t => t
SEMANTICS: (third-order reading)
jan,bemint => λT.T(λy.bemint(y)(jan)) [Cut]
  bemint => λQλx.Q(λy.bemint(y)(x)) [Axiom]
  jan,λQλx.Q(λy.bemint(y)(x)) => λT.T(λy.bemint(y)(jan)) [/R]
    jan,λQλx.Q(λy.bemint(y)(x)),T => T(λy.bemint(y)(jan)) [/L]
      T => T [Axiom]
      jan,λx.T(λy.bemint(y)(x)) => T(λy.bemint(y)(jan)) [\L]
        jan => jan [Axiom]
        T(λy.bemint(y)(jan)) => T(λy.bemint(y)(jan)) [Axiom]
Example 5.5.8 Left-associative proof: M Lemmas.
We have demonstrated in Section 5.3 how the Elimination and Introduction inferences can be transformed into the bottom-up reduction system M. Within M, the ordering of the expansion-steps of the proof tree is no longer dependent on the directionality requirements of the atoms: the M reductions preserve directionality and thematic structure. Example 5.5.8 gives a strict left-associative proof for the wide-scope object reading of Example 5.5.5, for which the cut-free derivation was given above. Partial combination proceeds left-to-right on the basis of M Lemmas. The final interpretation computed at the root node is equivalent to the interpretation of the cut-free proof. (The translations of the determiner expressions are left unanalysed, so as not to obscure the central point.)
omdat elk meisje een boek steelt
because every girl a book steals
'because every girl steals a book'
THEOREM:
(s/s),(np/n),n,(np/n),n,(np\(np\s)) => s
(tt),((et)((et)t)),(et),((et)((et)t)),(et),(e(et)) => t
PROOF:
(s/s),(np/n),n,(np/n),n,(np\(np\s)) => s [Cut]
(tt),((et)((et)t)),(et),((et)((et)t)),(et),(e(et)) => t
  (np\(np\s)) => (np\(np\s)) [Axiom] [1]
  (e(et)) => (((et)t)(((et)t)t)) Type shifting axiom
  (s/s),(np/n),n,(np/n),n,(np\(np\s)) => s [Cut]
  (tt),((et)((et)t)),(et),((et)((et)t)),(et),(((et)t)(((et)t)t)) => t
    (s/s),(np/n) => ((s/(np\s))/n) [Lemma: M] [2]
    (tt),((et)((et)t)) => ((et)((((et)t)t)t))
    ((s/(np\s))/n),n,(np/n),n,(np\(np\s)) => s [Cut]
    ((et)((((et)t)t)t)),(et),((et)((et)t)),(et),(((et)t)(((et)t)t)) => t
      ((s/(np\s))/n),n => (s/(np\s)) [Lemma: M] [3]
      ((et)((((et)t)t)t)),(et) => ((((et)t)t)t)
      (s/(np\s)),(np/n),n,(np\(np\s)) => s [Cut]
      ((((et)t)t)t),((et)((et)t)),(et),(((et)t)(((et)t)t)) => t
        (s/(np\s)),(np/n) => ((s/(np\(np\s)))/n) [Lemma: M] [4]
        ((((et)t)t)t),((et)((et)t)) => ((et)((((et)t)(((et)t)t))t))
        ((s/(np\(np\s)))/n),n,(np\(np\s)) => s [Cut]
        ((et)((((et)t)(((et)t)t))t)),(et),(((et)t)(((et)t)t)) => t
          ((s/(np\(np\s)))/n),n => (s/(np\(np\s))) [Lemma: M] [5]
          ((et)((((et)t)(((et)t)t))t)),(et) => ((((et)t)(((et)t)t))t)
          (s/(np\(np\s))),(np\(np\s)) => s [Cut]
          ((((et)t)(((et)t)t))t),(((et)t)(((et)t)t)) => t
            (s/(np\(np\s))),(np\(np\s)) => s [Lemma: M] [6]
            ((((et)t)(((et)t)t))t),(((et)t)(((et)t)t)) => t
            s => s [Axiom]
            t => t
SEMANTICS: omdat(EXIST(boek)(λy.ALL(meisje)(λx.steelt(y)(x))))
[1] λT1λT2.T1(λy.T2(λx.steelt(y)(x))) (= steelt', of type (((et)t)(((et)t)t)))
[2] λPλQ.omdat(Q(ALL(P)))
[3] λQ.omdat(Q(ALL(meisje)))
[4] λSλR.omdat(R(EXIST(S))(ALL(meisje)))
[5] λR.omdat(R(EXIST(boek))(ALL(meisje)))
[6] λR.omdat(R(EXIST(boek))(ALL(meisje)))(steelt') ≈ omdat(EXIST(boek)(λy.ALL(meisje)(λx.steelt(y)(x))))
The subject wide scope reading would be obtained by the alternative instantiation of the argument lifting type-shift (cf. Example 5.5.5), and would have exactly the same proof skeleton as the above derivation. Notice that the type-shifting cut is performed at the lexical level, as in the previous examples. If one wants the proof to model incremental processing, one can adapt the M algorithm, so that it interacts with semantic type shifting. After processing of the left corner omdat elk meisje een boek (with two generalized quantifier noun phrases of type ((et)t), cf. node [5]), the computed type s/(np\(np\s)) is associated with type ((((et)t)(((et)t)t))t), i.e. a functor looking for a transitive verb np\(np\s) not with the first order type (e(et)), but with the higher-order type (((et)t)(((et)t)t)). The final application step [6] would then automatically trigger argument lifting for the verb steelt.
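That the two instantiations of the argument-lifting type-shift are not equivalent can be checked in a small finite model. The following Python sketch is an illustration under stated assumptions: the domains GIRLS and BOOKS, the relation STEALS, and the representation of noun denotations as sets (rather than characteristic functions) are all invented for the purpose. Only the two lambda terms themselves are taken from Example 5.5.5, with T1 the object and T2 the subject quantifier.

```python
# Assumed toy model: each girl steals a different book.
GIRLS = {"g1", "g2"}
BOOKS = {"b1", "b2"}
STEALS = {("g1", "b1"), ("g2", "b2")}      # (subject, object) pairs

def ALL(P):
    return lambda Q: all(Q(x) for x in P)

def EXIST(P):
    return lambda Q: any(Q(x) for x in P)

def steelt(y):
    """First-order verb, object first: steelt(y)(x) holds iff x steals y."""
    return lambda x: (x, y) in STEALS

# The two argument-lifted versions of the verb from Example 5.5.5.
subject_wide = lambda T1: lambda T2: T2(lambda x: T1(lambda y: steelt(y)(x)))
object_wide = lambda T1: lambda T2: T1(lambda y: T2(lambda x: steelt(y)(x)))

sws = subject_wide(EXIST(BOOKS))(ALL(GIRLS))   # every girl steals some book
ows = object_wide(EXIST(BOOKS))(ALL(GIRLS))    # some book is stolen by every girl
print(sws, ows)                                # True False
```

In this model the subject wide scope reading holds while the object wide scope reading fails, so the two terms cannot be interchanged.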
Example 5.5.9 Non-constituent coordination. We are ready now to turn to non-constituent coordination where we need the combined flexibility of the syntactic and the semantic calculus. Syntactically, we want to be able to derive incomplete expressions; semantically, we have to account for ambiguities resulting from the interaction of the Boolean meet and join operations with other scope bearing elements, e.g. generalized quantifiers. As an example, consider the Right-Node-Raising sentence below. Jan bemint en Marie haat een meisje 'John loves and Mary hates a girl' Syntactically, this is a case of non-constituent coordination of the subject-verb combinations Jan bemint ('John loves') and Marie haat ('Mary hates'). The sentence has two readings, depending on whether the existential quantifier has scope over the Boolean meet or vice versa.
EXIST/MEET: there is a girl which John loves and Mary hates
MEET/EXIST: John loves a girl and Mary hates a girl
The two readings correspond to non-equivalent proofs for the sequent
THEOREM:
np,((np\s)/np),((Cat\Cat)/Cat),np,((np\s)/np),(np/n),n => s
e,(e(et)),(Typ,(Typ,Typ)),e,(e(et)),((et)((et)t)),(et) => t
The two derivations have a common proof skeleton, depicted in Figure 5.5.2. The sequent is initialized with the lexical type assignment. For the Boolean particles en ('and'), of ('or'), the lexical assignments are the polymorphic structures below (where MEET/JOIN are the generalized conjunction/disjunction operations for arbitrary conjoinable types).
'en' ((Cat\Cat)/Cat) (Typ,(Typ,Typ)) MEET
'of' ((Cat\Cat)/Cat) (Typ,(Typ,Typ)) JOIN
The proof has to identify the syntactic category, semantic type and interpretation of the conjunction by finding a unifying answer substitution for the variables Cat and Typ. The Boolean Cut performs this task by proving the subgoals
P => X:TypX:Tr1 and Q => X:TypX:Tr2
i.e. by identifying conjuncts P and Q which reduce to a common category:type X:TypX with interpretations Tr1 and Tr2 respectively. The subproofs for the conjuncts then determine the succedent of the conjunction as X:TypX:MEET(Tr1)(Tr2), where the interpretation of the conjunction is the Boolean meet of the interpretations Tr1 and Tr2 of the conjuncts. Since we are dealing with non-constituent coordination, the conjunct subgoals are proved on the basis of lemmas from M, the subsystem of L which decidably solves the equations P => X and Q => X, with an unknown succedent, as we demonstrated in Section 5.4.
P => X:TypX:Tr1    Q => X:TypX:Tr2
[Boole] P,CONJ,Q => 𝒳    U,𝒳,V => Y
{ 𝒳 = X:TypX:MEET(Tr1)(Tr2), CONJ = (X\X)/X:(TypX,(TypX,TypX)):MEET }
[Cut] U,P,CONJ,Q,V => Y
Figure 5.5.2
Reading 1: EXIST/MEET
The verbs bemint ('loves') and haat ('hates') are simple extensional transitive verbs of type (e(et)). The object een meisje ('a girl'), however, is a second-order expression of type ((et)t). Hence, somewhere in the course of the proof type-shifting axioms will have to raise the object arguments from type e to ((et)t). The different readings are obtained by applying the type-shifting axioms at different places in the proof. For the EXIST/MEET reading, the subproof for the conjuncts Jan bemint and Marie haat is given on the first order basic assignment to the verbs, which together with the subject reduce to an incomplete expression (s/np) of type (et), i.e. a functor expecting a direct object np of type e to the right. Conjunction is applied at the first order level, leading to a first answer substitution for the polymorphic Boolean en ('and'):
{Cat = (s/np), Typ = (et)}
The interpretation obtained at the [Boole] node has the following form:
MEET(λy.haat(y)(marie))(λx.bemint(x)(jan))
  λx.bemint(x)(jan)
  λy.haat(y)(marie)
In the right premise of the Boolean Cut U,X,V => Y, the complete conjunction (s/np) of type (et) is argument-lifted from first order (et) to third order (((et)t)t) by means of a type-shifting axiom. The crucial node in the proof (highlighted below) corresponds to the lambda term λT.T(λx.MEET(haat(x)(marie))(bemint(x)(jan))).
[Jan,bemint,en,Marie,haat,een,meisje]
THEOREM:
np,((np\s)/np),((Cat\Cat)/Cat),np,((np\s)/np),(np/n),n => s
e,(e(et)),(Typ,(Typ,Typ)),e,(e(et)),((et)((et)t)),(et) => t
PROOF:
np,((np\s)/np),((Cat\Cat)/Cat),np,((np\s)/np),(np/n),n => s [Cut]
e,(e(et)),(Typ,(Typ,Typ)),e,(e(et)),((et)((et)t)),(et) => t
  np,((np\s)/np),(((s/np)\(s/np))/(s/np)),np,((np\s)/np) => (s/np) [Boole]
  e,(e(et)),((et)((et)(et))),e,(e(et)) => (et) (et) conjunction
    np,((np\s)/np) => (s/np) [Cut]
    e,(e(et)) => (et)
      np,((np\s)/np) => (s/np) [Lemma]
      e,(e(et)) => (et)
      (s/np) => (s/np) [Axiom]
      (et) => (et)
    np,((np\s)/np) => (s/np) [Cut]
    e,(e(et)) => (et)
      np,((np\s)/np) => (s/np) [Lemma]
      e,(e(et)) => (et)
      (s/np) => (s/np) [Axiom]
      (et) => (et)
  (s/np),(np/n),n => s [Cut]
  (et),((et)((et)t)),(et) => t
    (s/np) => (s/np) [Axiom]
    (et) => (((et)t)t) Argument lifting
    (s/np),(np/n),n => s [Cut]
    (((et)t)t),((et)((et)t)),(et) => t
      (s/np),(np/n) => (s/n) [Lemma]
      (((et)t)t),((et)((et)t)) => ((et)t)
      (s/n),n => s [Cut]
      ((et)t),(et) => t
        (s/n),n => s [Lemma]
        ((et)t),(et) => t
        s => s [Axiom]
        t => t
SEMANTICS: (reduced)
∃x[meisje(x) & MEET(haat(x)(marie))(bemint(x)(jan))]
Reading 2: MEET/EXIST
For the reading where Boolean meet has scope over the existential quantifier of the object argument, the argument-lifting axiom (et) => (((et)t)t) is applied to the individual conjuncts, which are then conjoined on the higher-order interpretation. See the highlighted Axiom sequents in the proof below. This leads to a second unifying substitution for the polymorphic conjunction type:
{Cat = (s/np), Typ = (((et)t)t)}
Here is the interpretation obtained at the [Boole] node:
MEET(λT2.T2(λy.haat(y)(marie)))(λT1.T1(λx.bemint(x)(jan)))
  λT1.T1(λx.bemint(x)(jan))
    λx.bemint(x)(jan)
    λT1.T1(λx.bemint(x)(jan)) Argument lifting
  λT2.T2(λy.haat(y)(marie))
    λy.haat(y)(marie)
    λT2.T2(λy.haat(y)(marie)) Argument lifting
This lambda term can be further reduced by means of the equivalence MEET(P)(Q) ≈ λv.MEET(P(v))(Q(v)), discussed in Chapter 1.
MEET(λT2.T2(λy.haat(y)(marie)))(λT1.T1(λx.bemint(x)(jan)))
≈ λT.MEET(T(λy.haat(y)(marie)))(T(λx.bemint(x)(jan)))
Semantically, the conjunction now looks for a term-type argument, which after lambda conversion will be distributed over the conjuncts.
[Jan,bemint,en,Marie,haat,een,meisje]
THEOREM:
np,((np\s)/np),((Cat\Cat)/Cat),np,((np\s)/np),(np/n),n => s
e,(e(et)),(Typ,(Typ,Typ)),e,(e(et)),((et)((et)t)),(et) => t
PROOF:
np,((np\s)/np),((Cat\Cat)/Cat),np,((np\s)/np),(np/n),n => s [Cut]
e,(e(et)),(Typ,(Typ,Typ)),e,(e(et)),((et)((et)t)),(et) => t
  np,((np\s)/np),(((s/np)\(s/np))/(s/np)),np,((np\s)/np) => (s/np) [Boole]
  e,(e(et)),((((et)t)t)((((et)t)t)(((et)t)t))),e,(e(et)) => (((et)t)t)
    np,((np\s)/np) => (s/np) [Cut]
    e,(e(et)) => (((et)t)t)
      np,((np\s)/np) => (s/np) [Lemma]
      e,(e(et)) => (et)
      (s/np) => (s/np) [Axiom]
      (et) => (((et)t)t) Argument lifting
    np,((np\s)/np) => (s/np) [Cut]
    e,(e(et)) => (((et)t)t)
      np,((np\s)/np) => (s/np) [Lemma]
      e,(e(et)) => (et)
      (s/np) => (s/np) [Axiom]
      (et) => (((et)t)t) Argument lifting
  (s/np),(np/n),n => s [Cut]
  (((et)t)t),((et)((et)t)),(et) => t
    (s/np),(np/n) => (s/n) [Lemma]
    (((et)t)t),((et)((et)t)) => ((et)t)
    (s/n),n => s [Cut]
    ((et)t),(et) => t
      (s/n),n => s [Lemma]
      ((et)t),(et) => t
      s => s [Axiom]
      t => t
SEMANTICS:
(λT.MEET(T(λy.haat(y)(marie)))(T(λx.bemint(x)(jan))))(EXIST(meisje))
≈ MEET(EXIST(meisje)(λy.haat(y)(marie)))(EXIST(meisje)(λx.bemint(x)(jan)))
≈ MEET(∃x[meisje(x) & haat(x)(marie)])(∃x[meisje(x) & bemint(x)(jan)])
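The difference between the EXIST/MEET and MEET/EXIST readings can be made concrete in a toy model. The Python sketch below is an illustration only: the sets GIRLS, LOVES and HATES and all function names are assumptions, and MEET is implemented as the pointwise generalization of Boolean conjunction described by the equivalence MEET(P)(Q) ≈ λv.MEET(P(v))(Q(v)). The model is chosen so that John loves one girl and Mary hates a different one.

```python
def MEET(p):
    """Generalized conjunction: 'and' on truth values, pointwise on functions."""
    def with_q(q):
        if isinstance(p, bool):
            return p and q
        return lambda v: MEET(p(v))(q(v))
    return with_q

# Assumed toy model.
GIRLS = {"g1", "g2"}
LOVES = {("jan", "g1")}                      # (subject, object) pairs
HATES = {("marie", "g2")}

EXIST = lambda P: lambda Q: any(Q(x) for x in P)
bemint = lambda y: lambda x: (x, y) in LOVES
haat = lambda y: lambda x: (x, y) in HATES

jan_bemint = lambda y: bemint(y)("jan")      # λy.bemint(y)(jan), type (et)
marie_haat = lambda y: haat(y)("marie")      # λy.haat(y)(marie), type (et)
lift = lambda f: lambda T: T(f)              # argument lifting (et) => (((et)t)t)

# Reading 1 (EXIST/MEET): conjoin at type (et), then lift the conjunction.
reading1 = lift(MEET(marie_haat)(jan_bemint))(EXIST(GIRLS))
# Reading 2 (MEET/EXIST): lift each conjunct, then conjoin at type (((et)t)t).
reading2 = MEET(lift(marie_haat))(lift(jan_bemint))(EXIST(GIRLS))
print(reading1, reading2)                    # False True
```

No single girl is both loved by John and hated by Mary in this model, so Reading 1 is false, while Reading 2 only requires a girl for each conjunct and is true.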
CONCLUSIONS
Our findings with regard to the syntax/semantics interface can be summarized as follows. The calculus LP is overgenerating, both syntactically and semantically, because of its Commutativity. In syntax, a grammatical expression remains derivable under permutation of its atoms. Semantically, LP does not preserve thematic
structure: arguments can be permuted in the function-argument structure of their predicate. The flexible syntactic calculus L mends most of these problems. The associativity of L guarantees a generalized account of Boolean conjoinability; order and thematic structure are preserved, because there is no structural rule of Permutation. But given a rigid category-to-type mapping, L cannot account for the full set of quantifier scope phenomena: order-preservation excludes a general rule of argument raising. The quantification calculus H was presented as a specialized system of semantic typechange, generating all possible quantifier scope readings. This semantic calculus is over-restrictive when combined with the syntactic system AB, the Elimination fragment of L. By shifting to a relational approach for the category-to-type mapping, and redefining the interpretation procedure for Gentzen proofs accordingly, we can incorporate the semantic flexibility offered by H without committing the L syntax. The resulting combination L + H obeys Frege's Principle: it divides the explanatory burden between the syntactic and the semantic algebras, within the bounds of compositionality. At this point, we can complete the inventory of the sources of ambiguity that have been encountered in this study. At the level of the syntactic algebra, we have to distinguish lexical and structural ambiguity. Lexical ambiguity is caused by the fact that a lexical item may belong to different syntactic categories. As a special form of lexical ambiguity, we have studied the polymorphic (X\X)/X type assignment to the Boolean coordination particles: instead of associating these particles with a set of ground types, we have assigned them a variable type which assumes a categorial identity depending on the context it finds itself in. 
Structural ambiguity arises from the fact that a cut-free normal form derivation for a sequent may be unfolded in a number of alternative ways: some of the branchings of the cut-free proof may result in semantically non-equivalent interpretations for the sequent. When L is extended with the Cut inference, these readings can be obtained in a way which is essentially independent of the branching of the proof tree, as the discussion of the system M has made clear. In this section, we have added an important source of ambiguity, by relaxing the correspondence between syntactic categories and semantic types. A derivation, in the system L+H, can involve purely semantic forms of type-change, in addition to the type-change associated with alternative syntactic derivations for a sequent. The Boolean polymorphism (X\X)/X of the coordination particles interacts in interesting ways with these different forms of ambiguity, suggesting questions for further research. For the standard, non-polymorphic form of L(P), the number of distinct readings
(modulo logical equivalence) is finite, as we have seen in Theorem 1.2.2 from Van Benthem (1986a). It remains to be shown whether this form of finiteness carries over to the extension of L with Boolean polymorphism, as investigated in these chapters.
APPENDIX
/* PROGRAM A1 General Utilities */
/* Operator declarations */
?-op(300,xfx,/).
?-op(300,xfx,'\').
?-op(300,xfx,'•').
?-op(1200,xfx,'
[Y]*72 below gives the full system M, as discussed in Section 5.3. The user can substitute a lemma database with different combinatory possibilities. For example, by restricting '=>*'/2 to the base cases, one obtains an AB calculus. */
/* Application */
[X/Y,Y] =>* [X].
[Y,Y\X] =>* [X].

/* M1: Recursion on range subtype */
[Z,X/Y] =>* [W/Y] :-
    nonvar(X),
    [Z,X] =>* [W].
[Y\X,Z] =>* [Y\W] :-
    nonvar(X),
    [X,Z] =>* [W].

/* M2: Recursion on domain subtype */
[X/Y,Z] =>* [X/W] :-
    nonvar(Y),
    [Z,W] =>* [Y].
[Z,Y\X] =>* [W\X] :-
    nonvar(Y),
    [W,Z] =>* [Y].

/* Connectedness for atoms */
[X,Y] =>* [(Z/(Y\(X\Z)))] :-
    atom(X), atom(Y).
[X,Y] =>* [(((Z/Y)/X)\Z)] :-
    atom(X), atom(Y).
/* Boolean Cut */
/* As an example of extra constraints within Gentzen inferences, here is the Boolean cut, discussed in Section 5.4. Before attempting the premise subproofs, a count-check is performed to make sure that the premise instantiations have compatible count values. To equip the bottom-up theorem prover with the Boolean cut, one can add this inference as the second clause for ' [X] & [Q | Right] => [X] & U,[X],V=>[Z] ),1).
/* Lexical look-up */
look_up([],[]).
look_up([Word|T],[Type|R]) :-
    type(Word,Type),
    look_up(T,R).

/* LR4.PRO. L+{Cut}. Lambda semantics */
/* Semantically interpreted version of LR3.PRO */

/* Axiom */
[X] => [X] <- true.

/* Cut */
U,[X:Sem_X,Y:Sem_Y],V => [Z] <-
    [X:Sem_X,Y:Sem_Y] =>* [Cut:Sem_Cut] &
    U,[Cut:Sem_Cut],V => [Z].
/* Lemma Database: M */
[X/Y:Functor, Y:Arg] =>* [X:Functor@Arg].
[Y:Arg, Y\X:Functor] =>* [X:Functor@Arg].
[Z:Sem_Z, X/Y:Functor] =>* [W/Y:Var_Y^Sem_W] :-
    nonvar(X),
    [Z:Sem_Z, X:Functor@Var_Y] =>* [W:Sem_W].
[Y\X:Functor, Z:Sem_Z] =>* [Y\W:Var_Y^Sem_W] :-
    nonvar(X),
    [X:Functor@Var_Y, Z:Sem_Z] =>* [W:Sem_W].
[X/Y:Functor, Z:Sem_Z] =>* [X/W:Var_W^(Functor@Sem_Y)] :-
    nonvar(Y),
    [Z:Sem_Z, W:Var_W] =>* [Y:Sem_Y].
[Z:Sem_Z, Y\X:Functor] =>* [W\X:Var_W^(Functor@Sem_Y)] :-
    nonvar(Y),
    [W:Var_W, Z:Sem_Z] =>* [Y:Sem_Y].
[X:Sem_X, Y:Sem_Y] =>* [(Z/(Y\(X\Z))):F^(F@Sem_Y@Sem_X)] :-
    atom(X), atom(Y).
[X:Sem_X, Y:Sem_Y] =>* [(((Z/Y)/X)\Z):F^(F@Sem_X@Sem_Y)] :-
    atom(X), atom(Y).
/* Boolean Cut */
assert((
    U,[P | Left],[(X\X)/X:Boolean],[Q | Right],V => [Z] <-
    { var(X),
      match_count((U,[Q | Right],V),[Z]),
      match_count([P | Left],[Q | Right]) } &
    [P | Left] => [X:Conjunct1] &
    [Q | Right] => [X:Conjunct2] &
    U,[X:Boolean@Conjunct2@Conjunct1],V => [Z]
),1).
/* Lexical look-up */
look_up([],[]).
look_up([Word|T],[Type:Word|R]) :-
    type(Word,Type),
    look_up(T,R).

/* LEXICON.PRO Sample lexical type assignment */

/* bascat/1: basic types */
bascat(s).
bascat(np).
bascat(n).
bascat(ap).
bascat(pp).

/* boolean/1 */
/* The Boolean particles have a special clause for X reduction. */
boolean(en).
boolean(of).
boolean(and).
boolean(or).

/* type/2. Lexical type assignment: type(Word,Type) */

/* Dutch */
/* Polymorphic types */
type(en,(X\X)/X).
type(of,(X\X)/X).
type(niet,X/X).
/* Ground types */
type(afhankelijk,pp\ap).
type(van,pp/np).
type(gek,ap).
type(dom,ap).
type(is,ap\(np\s)).
type(vindt,ap\(np\(np\s))).
type(tuin,n).
type(meisje,n).
type(boek,n).
type(het,np/n).
type(de,np/n).
type(een,np/n).
type(in,pp/np).
type(naast,pp/np).
type(leest,np\(np\s)).
type(kust,np\(np\s)).
type(ligt,pp\(np\s)).
type(legt,pp\(np\(np\s))).
type(opa,np).
type(oma,np).
type(omdat,s/s).

/* English */
/* Polymorphic types */
type(and,(X\X)/X).
type(or,(X\X)/X).
type(not,X/X).
/* Ground types */
type(john,np).
type(mary,np).
type(loves,(np\s)/np).
type(hates,(np\s)/np).
type(puts,((np\s)/pp)/np).
type(on,pp/np).
type(the,np/n).
type(table,n).
type(bed,n).
type(record,n).
type(book,n).
type(girl,n).
SAMPLE SESSION
?- start.
CALCULUS > LR1. /* Cut-free L, no semantic interpretation */
MODE (full/semantics) > full.
ANTECEDENT > [john,puts,the,book,on,the,table].
SUCCEDENT > s.
[np,((np\s)/pp)/np,np/n,n,pp/np,np/n,n] => [s] [pp]
[np/n,n] => [np] [n]